Pull requests: NVIDIA-NeMo/Curator

Pull requests list

docs(fern): add local library autodocs without Fern auth
#2102 opened Jun 22, 2026 by lbliii Contributor
2 of 3 tasks
[WIP] Feat/cc lancedb pipeline
#2101 opened Jun 22, 2026 by VibhuJawa Contributor Draft
3 tasks
docs: fix tutorial doc links and add data curation challenges
#2098 opened Jun 22, 2026 by lbliii Contributor
2 tasks done
docs: expand tutorials Quick Start with docs and Core Concepts links
#2097 opened Jun 22, 2026 by lbliii Contributor
2 tasks
docs: clarify fuzzy dedup input blocksize [labels: community-request, waiting-on-customer]
#2096 opened Jun 22, 2026 by nightcityblade Contributor
3 tasks done
Aaftabv/qwen1967 global bucketing benchmark
#2094 opened Jun 22, 2026 by mohammadaaftabv Contributor Draft
3 tasks
Aaftabv/qwen1967 local noio control
#2093 opened Jun 22, 2026 by mohammadaaftabv Contributor Draft
3 tasks
fix: make quickstart Ray startup Xenna-safe [labels: community-request, waiting-on-maintainers]
#2089 opened Jun 18, 2026 by nightcityblade Contributor
3 tasks done
Add option to drop deduplication id field [labels: community-request, waiting-on-maintainers]
#2078 opened Jun 16, 2026 by nightcityblade Contributor
3 tasks done
fix: open remote JSONL files before cuDF reads [labels: community-request, waiting-on-maintainers]
#2076 opened Jun 14, 2026 by nightcityblade Contributor
2 of 3 tasks
Add Omni-Fuse Tutorial
#2069 opened Jun 11, 2026 by hk1510
3 tasks done
Draft: ASR Open Source Datasets Processing Pipeline
#2067 opened Jun 11, 2026 by sushmitha-deva-09 Contributor
3 tasks
add heuristic initial replica
#2066 opened Jun 11, 2026 by weijiac0619 Contributor Draft
3 tasks
Pipeline resumability via source-level counter checkpointing
#2063 opened Jun 10, 2026 by abhinavg4 Contributor
test: cover remote pairwise file paths [labels: community-request, waiting-on-maintainers]
#2061 opened Jun 10, 2026 by nightcityblade Contributor
3 tasks done
Add LLM judge filter stages
#2060 opened Jun 9, 2026 by arhamm1 Contributor
Add support for Slurm arrays
#2059 opened Jun 9, 2026 by sarahyurick Contributor
5 of 6 tasks
fix: auto-detect Ray fanout stages [labels: community-request, waiting-on-customer]
#2056 opened Jun 8, 2026 by nightcityblade Contributor
3 tasks done
Add review-curator-audio-pr Cursor skill for reviewing audio Curator PRs
#2051 opened Jun 5, 2026 by mohammadaaftabv Contributor
4 of 5 tasks
Dynamo Server Fixes + Nemotron Parsing PDF Benchmark changes
#2050 opened Jun 5, 2026 by praateekmahajan Contributor
3 tasks
fix: default workflow input extensions by filetype [labels: community-request]
#2045 opened Jun 3, 2026 by nightcityblade Contributor
3 tasks done