Skip to content

rename adv_estimator param to advantage_estimator in compute_advantages_and_returns#1793

Open
KTanmay1 wants to merge 1 commit into
NovaSky-AI:mainfrom
KTanmay1:fix/rename-adv-estimator-param
Open

rename adv_estimator param to advantage_estimator in compute_advantages_and_returns#1793
KTanmay1 wants to merge 1 commit into
NovaSky-AI:mainfrom
KTanmay1:fix/rename-adv-estimator-param

Conversation

@KTanmay1

Copy link
Copy Markdown

Summary

Closes #1792.

The function compute_advantages_and_returns in ppo_utils.py used adv_estimator as its parameter name, while the corresponding config field on AlgorithmConfig is advantage_estimator. These two names diverged silently — the internal plumbing worked correctly because trainer.py reads the config field and passes it positionally-by-keyword, but the mismatch is a footgun for anyone writing a new integration or launch script.

If a developer uses trainer.algorithm.adv_estimator=<value> on the CLI (copying the function parameter name), validate_dict_keys_against_dataclass raises at startup:

ValueError: Invalid fields {'adv_estimator'} for AlgorithmConfig.
            Valid fields are {'advantage_estimator', ...}

Changes

  • skyrl/backends/skyrl_train/utils/ppo_utils.py: rename parameter adv_estimatoradvantage_estimator and update its one internal use
  • skyrl/train/trainer.py: update both keyword-argument call sites from adv_estimator=advantage_estimator=

No behavior change — purely a naming fix.

Test plan

  • Existing CPU tests pass: uv run --extra dev pytest tests/train/ tests/backends/skyrl_train/ --ignore=tests/backends/skyrl_train/gpu/

…es_and_returns

The function parameter name `adv_estimator` diverged from the config field name
`advantage_estimator` (AlgorithmConfig). This mismatch causes a confusing startup
crash (ValueError from validate_dict_keys_against_dataclass) when writing new
integrations that copy the function parameter name as a CLI config key.

Rename the parameter and its one internal use to match the config field.
Update the two keyword-argument call sites in trainer.py accordingly.

Fixes NovaSky-AI#1792

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request renames the parameter adv_estimator to advantage_estimator in the compute_advantages_and_returns function signature and updates its corresponding usages in skyrl/train/trainer.py to maintain consistency. I have no feedback to provide as the changes are straightforward and there are no review comments.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

adv_estimator (function param) vs advantage_estimator (config field) naming mismatch causes confusing startup crash

1 participant