NovaSky-AI / SkyRL Public

Pull requests: NovaSky-AI/SkyRL

161 Open 1,276 Closed

Pull requests list

[megatron] Enable Nemotron-3-Ultra-550B GRPO RL + fix multi-rank (EP>16/PP>2) weight sync

#1816 opened Jun 19, 2026 by erictang000 Collaborator

[feat] Add a fully async trainer simulation script for iterating on generation settings

#1814 opened Jun 19, 2026 by SumanthRH Member • Draft

[feat] Skip policy fwd logprobs for rollout-based losses (rollout_is, dppo) + worker train/rollout logprob diff metric

#1813 opened Jun 19, 2026 by erictang000 Collaborator

[fully_async] fix DAPO postprocess_generator_output to accept metrics kwargs

#1812 opened Jun 19, 2026 by erictang000 Collaborator

[algorithm] add rollout KL loss + fix padding-microbatch field drops

#1811 opened Jun 19, 2026 by erictang000 Collaborator

[train] Async batch collation (double-buffering) for the SFT trainer

#1809 opened Jun 18, 2026 by dyurk-lila

2 tasks done

[train] Vectorize controller-side training-batch collation (SFT + RL)

#1808 opened Jun 18, 2026 by dyurk-lila

3 tasks done

[train] Skip building unused per-token loss_fn_outputs when the caller does not consume them

#1807 opened Jun 18, 2026 by dyurk-lila

[megatron] Stream ChunkedDistributedLogprob.backward into a preallocated buffer (lower peak memory)

#1806 opened Jun 18, 2026 by dyurk-lila

[megatron] Accept dtype-string optimizer_config_kwargs (coerce exp_avg_dtype etc. to torch.dtype)

#1805 opened Jun 18, 2026 by dyurk-lila

[chore][logging] Add trajectory and group completion metrics for async RL

#1804 opened Jun 18, 2026 by SumanthRH Member

1 task done

[chore] Upgrade vllm to 0.23.0

Labels: run_h100_gpu_ci, run_train_gpu_ci, run_train_megatron_gpu_ci

#1800 opened Jun 17, 2026 by SumanthRH Member

[Fix] vLLM Metrics Scrapper throughput calculation

#1794 opened Jun 16, 2026 by zanderjiang

rename adv_estimator param to advantage_estimator in compute_advantages_and_returns

#1793 opened Jun 16, 2026 by KTanmay1

1 task

[train] Save HF processor on checkpoint export for VLMs

#1785 opened Jun 14, 2026 by dinhxuanvu Contributor

1 of 2 tasks

[fix] Honor served_model_name and surface HTTP errors in RemoteInferenceEngine

#1783 opened Jun 13, 2026 by discobot Contributor

[fix] Use masked mean in advantage batch normalization

#1782 opened Jun 12, 2026 by discobot Contributor

Support top-K distillation (SDFT/OPSD): teacher top-K sampling + soft-target CE training

#1777 opened Jun 11, 2026 by atemaguer Contributor

Rollout Routing Replay (R3) for the fsdp backend, using vllm==0.22

#1772 opened Jun 10, 2026 by jamesbraza Contributor

Attach the failed actor's log tail when inference engines die during init

#1771 opened Jun 10, 2026 by jamesbraza Contributor

[wip] add router replay tests to CI

#1770 opened Jun 10, 2026 by erictang000 Collaborator • Draft

[megatron] Fused LM-head log-prob + entropy (avoid full [*, seq, vocab] logit materialization)

#1765 opened Jun 9, 2026 by dyurk-lila

4 tasks done

[WIP] RDT skyrl integration

#1753 opened Jun 5, 2026 by hao-aaron Collaborator • Draft

[train] VLM SFT support on Megatron backend (Qwen3-VL)

#1752 opened Jun 4, 2026 by s-chundi

feat(profiler): drive torch.profiler around the training loop

#1750 opened Jun 4, 2026 by dyurk-lila
