fix(experiments): flatten propagated metadata by hassiebp · Pull Request #1641 · langfuse/langfuse-python

hassiebp · 2026-04-24T14:30:57Z

Summary

Flatten experiment metadata and experiment item metadata before propagating it to child spans.
Validate each flattened metadata value independently against the existing propagated-attribute length limit.
Update propagation tests and add an e2e guard for server-side handling of flattened experiment metadata on non-SDK child spans.

Root Cause

run_experiment() serialized the entire experiment metadata dict into one langfuse.experiment.metadata string before propagation. The propagation layer enforces a 200-character limit per propagated string, so moderately sized metadata dicts were dropped even when each individual metadata value was short enough.

Impact

Experiment metadata now propagates as attributes like langfuse.experiment.metadata.job_name, matching the per-key behavior used for trace metadata. This avoids dropping valid metadata just because the combined serialized object is larger than 200 characters.

Validation

uv run --frozen ruff check langfuse/_client/attributes.py langfuse/_client/client.py langfuse/_client/propagation.py tests/unit/test_propagate_attributes.py tests/e2e/test_experiments.py
uv run --frozen mypy langfuse --no-error-summary
uv run --frozen pytest tests/unit/test_propagate_attributes.py -q
Commit hook also ran ruff-check, ruff-format, and mypy successfully.

Notes

The new e2e test is intentionally a server-ingestion guard. In this local environment I could not run it end-to-end because no Langfuse server or LANGFUSE_* e2e credentials were available. The adjacent server checkout currently appears to parse only the legacy JSON langfuse.experiment.metadata attribute, so this SDK PR should stay draft until matching server ingestion support for langfuse.experiment.metadata.<key> is available in the CI server image.

Disclaimer: Experimental PR review

Greptile Summary

Replaces the single-blob serialization of experiment_metadata / experiment_item_metadata with per-key flattening (langfuse.experiment.metadata.<key>), matching the pattern already used for trace metadata and allowing each value to be validated independently against the 200-char propagation limit. The propagation, context-read, and baggage-decode paths are all updated consistently, and the _get_propagated_span_key map now correctly includes "metadata" so it is no longer misresolved through the fallback.

Confidence Score: 4/5

Safe to merge; only P2 findings in tests (redundant import, stale assertion pattern).

Core logic changes across attributes.py, propagation.py, and client.py are coherent and well-tested. The two P2 issues are confined to the test file and do not affect production behavior.

tests/unit/test_propagate_attributes.py — redundant import and stale warning-text assertion.

Important Files Changed

Filename	Overview
langfuse/_client/attributes.py	Adds `_flatten_and_serialize_metadata_values` to recursively flatten nested dicts into dot-notation string pairs; logic is correct and handles None/empty inputs safely.
langfuse/_client/client.py	Switches `experiment_metadata` and `experiment_item_metadata` from a single serialized JSON string to a flattened dict; `item_metadata` correctly guarded with `isinstance(..., dict)` check.
langfuse/_client/propagation.py	Correctly routes experiment metadata fields into the dict-based validation loop, fixes the hardcoded TRACE_METADATA span key in `_get_propagated_attributes_from_context`, and adds `"metadata"` to `_get_propagated_span_key`; baggage decoding handles multiple metadata namespaces without ambiguity.
tests/unit/test_propagate_attributes.py	Tests updated to verify per-key propagation; contains a redundant local import (rule violation) and a warning-text assertion that targets the old format rather than the new per-key format.
tests/e2e/test_experiments.py	New e2e server-ingestion guard for flattened experiment metadata; intentionally draft (no live server) per PR notes, and the server-side ingestion contract is described as still pending.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: tests/unit/test_propagate_attributes.py
Line: 2791

Comment:
**Redundant local import — `_serialize` is already at module level**

`_serialize` is already imported at the top of this file (line 15: `from langfuse._client.attributes import LangfuseOtelSpanAttributes, _serialize`). This local re-import is unnecessary and violates the team's rule to keep imports at the module level.

```suggestion
```

**Rule Used:** Move imports to the top of the module instead of p... ([source](https://app.greptile.com/review/custom-context?memory=c960fc07-9928-409f-a18b-a780cbdded12))

**Learned From**
[langfuse/langfuse-python#1387](https://github.com/langfuse/langfuse-python/pull/1387)

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: tests/unit/test_propagate_attributes.py
Line: 2904

Comment:
**Warning-text assertion matches the old format, not the new one**

With the new per-key validation, the warning key is `experiment_metadata.<subkey>` (e.g., `experiment_metadata.job_name`), so the warning text would read `"Propagated attribute 'experiment_metadata.job_name' value is over 200 characters"`. The substring being checked — `"experiment_metadata' value is over 200 characters"` — can never appear under the new code path, making the assertion effectively a no-op guard. In this specific test the individual values are ≤200 chars so no warning fires, but if a test is later added with an oversized individual value the assertion wouldn't catch a spurious warning.

Consider updating the assertion to match the new format, e.g.:
```python
assert "experiment_metadata." not in caplog.text or "value is over 200 characters" not in caplog.text
```
or checking that none of the new-format patterns appear in `caplog.text`.

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix(experiments): flatten propagated met..." | Re-trigger Greptile}

Greptile also left 1 inline comment on this PR.

Context used:

Rule used - Move imports to the top of the module instead of p... (source)

Learned From
langfuse/langfuse-python#1387

github-actions · 2026-04-24T14:36:10Z

@claude review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c94c5a69fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70892a36f9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

claude

Deferring to a human — per your own PR notes this should stay draft until the server accepts flattened langfuse.experiment.metadata.<key> attributes, and the rollout/legacy-compat angle raised by another reviewer is a cross-system call worth a human sign-off.

Extended reasoning...

Overview

The PR switches experiment metadata (experiment_metadata / experiment_item_metadata) from a single JSON-serialized attribute to per-key flattened attributes (langfuse.experiment.metadata.<key>), so each value is independently validated against the 200-char propagation limit. The change spans langfuse/_client/attributes.py (new _flatten_and_serialize_metadata_values helper), client.py (call-site swap in _process_experiment_item), and propagation.py (dict routing in _propagate_attributes, fixed span-key lookup in _get_propagated_attributes_from_context, and multi-namespace baggage decoding in _get_span_key_from_baggage_key). Tests are updated and a new e2e guard is added.

Security risks

No meaningful new security surface. Metadata keys/values are still length-validated per entry; baggage propagation (as_baggage=True) continues to be opt-in, and the new baggage-decode loop only adds additional known metadata namespaces — no user-controlled prefix expansion.

Level of scrutiny

Higher than a typical bug fix. This is a wire-format change in trace/experiment propagation that interacts with server-side ingestion. You explicitly noted the PR should stay draft until the server parses the flattened attribute, and a peer reviewer raised a P1 about mixed-version deployments dropping legacy langfuse.experiment.metadata. Those are exactly the kinds of cross-system decisions (dual-write vs. gate on server rollout) that warrant a human call rather than a shadow approval.

Other factors

The per-file logic looks internally consistent: _get_propagated_span_key now includes "metadata" (fixing the previous fallback path), and _set_propagated_attribute uses the resolved span_key instead of hardcoded TRACE_METADATA. The one nit worth flagging separately from the ingestion question: in tests/unit/test_propagate_attributes.py the substring "experiment_metadata' value is over 200 characters" can no longer appear under the new per-key warning format (experiment_metadata.<subkey>' value is over 200 characters), so that assertion is effectively a no-op — greptile already flagged this. Outstanding P1/P2 comments from other reviewers have not yet been addressed, which is another reason to hold off on auto-approval.

hassiebp changed the title ~~[codex] fix(experiments): flatten propagated metadata~~ fix(experiments): flatten propagated metadata Apr 24, 2026

hassiebp marked this pull request as ready for review April 24, 2026 14:36

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread langfuse/_client/client.py

Comment thread langfuse/_client/attributes.py

greptile-apps Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread tests/unit/test_propagate_attributes.py

fix(experiments): flatten propagated metadata

70892a3

hassiebp force-pushed the codex/flatten-experiment-metadata-propagation branch from c94c5a6 to 70892a3 Compare April 24, 2026 14:47

chatgpt-codex-connector Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread tests/e2e/test_experiments.py

claude Bot reviewed Apr 24, 2026

View reviewed changes

hassiebp merged commit 9244e14 into main Apr 24, 2026
18 checks passed

hassiebp deleted the codex/flatten-experiment-metadata-propagation branch April 24, 2026 15:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(experiments): flatten propagated metadata#1641

fix(experiments): flatten propagated metadata#1641
hassiebp merged 1 commit intomainfrom
codex/flatten-experiment-metadata-propagation

hassiebp commented Apr 24, 2026 •

edited by greptile-apps Bot

Loading

Uh oh!

github-actions Bot commented Apr 24, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

claude Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hassiebp commented Apr 24, 2026 • edited by greptile-apps Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root Cause

Impact

Validation

Notes

Disclaimer: Experimental PR review

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

github-actions Bot commented Apr 24, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hassiebp commented Apr 24, 2026 •

edited by greptile-apps Bot

Loading