fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up) by Copilot · Pull Request #232 · microsoft/Agent365-python

Copilot · 2026-04-17T18:22:41Z

Addresses review feedback on PR #229: tool-call arguments were being stringified even when already structured, and the LangChain mapper lacked unit test coverage.

Tool-call argument schema consistency

Both mappers now prefer structured (dict/list) arguments over JSON strings:

AgentFramework mapper — added _parse_arguments(): if arguments is a JSON string that parses to dict/list, store the parsed value; on failure, keep the string.
LangChain mapper — replaced inline json.dumps(args) with _parse_tool_call_args(): dict/list kept as-is, JSON strings parsed, unparseable strings kept unchanged with a debug log.

# Before — dict args were stringified
ToolCallRequestPart(name="search", arguments='{"query": "hello"}')  # str

# After — dict args stay structured
ToolCallRequestPart(name="search", arguments={"query": "hello"})    # dict

Integration test improvements

test_tool_call_message_mapping now asserts real structure when the versioned format is present (version == "0.1.0", role/parts shape, at least one tool_call part). Module docstring updated to reflect that versioned format is canonical; raw-list is backward-compatible fallback only.

New unit tests

Added tests/observability/extensions/langchain/test_message_mapper.py (16 tests) covering the LangChain mapper without requiring Azure credentials:

system / user / assistant / tool roles
tool-call args as dict (kept structured), valid JSON string (parsed), invalid JSON string (kept as-is)
ToolMessage → tool_call_response part
empty/whitespace content filtered, None/empty inputs, unknown role defaulting

Original prompt

Create a PR fix on top of microsoft/Agent365-python PR #229 (branch nikhilc/messageStructureForAuto, head SHA 01c32df5fadfacd27b6a92374234f3a7c53bb3c4) to address tool-call argument schema consistency and related review feedback.

Requirements:

Prefer structured tool-call arguments in versioned messages.
- For both AgentFramework and LangChain mappers, when a tool-call's arguments value is a string, attempt to parse it with json.loads().
- If parsing succeeds and yields a dict or list, store that structured value in ToolCallRequestPart.arguments.
- If parsing fails, keep the original string.
- If the tool-call args are already a dict/list, do not stringify.
Fix Ruff lint issues
- In both .../agentframework/message_mapper.py and .../langchain/message_mapper.py, ensure logger = logging.getLogger(__name__) is either used (e.g., for parse failures) or removed so ruff check passes.
Improve tests
- Update tests/observability/extensions/langchain/integration/test_message_format.py so test_tool_call_message_mapping has real assertions verifying tool_call and tool_call_response parts are present in the enriched/versioned message format (when versioned dict format is present). Keep backwards-compat behavior allowing legacy raw-list formats.
- Add unit tests for the LangChain message mapper so that mapping behavior is covered even when integration tests are skipped (no secrets). Include cases:
  - system/user/assistant roles
  - tool calls with args as dict and as JSON string
  - tool response messages (ToolMessage)
  - empty/whitespace content ignored
  - unknown role handling
Documentation nits (optional but recommended)
- Fix outdated docstrings noted in review comments (e.g., “will verify once mapper is added”) if they still exist after changes.

Constraints / repo rules:

Follow repository Python style and typing.
Ensure Microsoft copyright header exists in any new .py file.
Ensure the keyword "Kairo" does not appear in any .py file.

Deliverables:

Commit changes and open a new PR to main from a new branch.
Keep changes focused to the above; do not refactor unrelated code.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: review the PR : @microsoft/Agent365-python/pull/229

Assistant: PR #229 (“Nikhilc/message structure for auto”) looks directionally good: it moves AgentFramework + LangChain auto-instrumentation from “string list” message attributes to the A365 versioned message format ({"version":"0.1.0","messages":[...]}) and adds integration coverage.

That said, I would not merge as-is until a couple of schema/lint/test gaps are addressed.

High-level summary

Goal: Auto instrumentation should emit the new message format for:
- AgentFramework spans via export-time enricher mapping
- LangChain spans via new mapper + updated attribute extraction
Scope: 17 files changed, +2041/-87, with new message mappers and multiple integration tests.

What’s good

Correct architectural choice for OTel Python: do mutation at export-time via EnrichedReadableSpan + enricher because ReadableSpan is immutable after on_end. The updated span_processor.py doc reflects that.
AgentFramework span enricher now maps message spans (invoke_agent + chat) and separately normalizes execute_tool attributes.
Integration tests added for both AgentFramework and LangChain pipelines to validate end-to-end behavior (invoke → chat → tool execution) and message-format assertions.

Blocking / should-fix items

1) Tool-call arguments typing / schema consistency (likely needs a change)

You updated the core schema to allow:

ToolCallRequestPart.arguments: dict | list | str | None (in .../core/models/messages.py)

That’s pragmatic because upstream emitters often send args as JSON strings, but it creates a compatibility question: do we want “arguments” to be structured whenever possible?

Right now:

LangChain mapper turns tool call args into a string (json.dumps(args)), even when args started as a dict.
AgentFramework mapper passes through part.get("arguments") (often a JSON string).

Recommendation (strong):

Attempt json.loads() when arguments is a string that looks like JSON, and only keep it as a string when parsing fails.
If args are already a dict/list, keep them structured (don’t stringify).

This matches the Copilot review comments and will avoid downstream consumers having to re-parse.

2) Ruff lint: unused `logger` in both message mappers

Both new modules declare logger = logging.getLogger(__name__) but don’t use it. That will typically fail ruff (F841), unless the repo config ignores it.

Fix: either remove logger or use it (e.g., debug log on parse failures).

3) LangChain integration test has a “...

This pull request was created from Copilot chat.

… into copilot/fix-tool-call-argument-schema Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>

…lint, tests Agent-Logs-Url: https://github.com/microsoft/Agent365-python/sessions/3ac8e71e-b951-44ea-bca8-3eab3997c5dc Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>

nikhilc-microsoft and others added 9 commits April 14, 2026 19:42

support message format for agent framework

bc06877

support message format for langchain

3d24c26

Merge branch 'main' into nikhilc/messageStructureForAuto

1e869a2

add integration tests

58966df

fix tests

7a1aa33

Merge branch 'main' into nikhilc/messageStructureForAuto

33d864c

fix tests linting

5918d7a

address pr comments

01c32df

Initial plan

1f6c296

Copilot AI assigned Copilot and fpfp100 Apr 17, 2026

Copilot started work on behalf of fpfp100 April 17, 2026 18:22 View session

Copilot AI and others added 2 commits April 17, 2026 18:23

Merge remote-tracking branch 'origin/nikhilc/messageStructureForAuto'…

5a0f047

… into copilot/fix-tool-call-argument-schema Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>

fix: tool-call argument schema consistency, cleaner type signatures, …

d92cf9f

…lint, tests Agent-Logs-Url: https://github.com/microsoft/Agent365-python/sessions/3ac8e71e-b951-44ea-bca8-3eab3997c5dc Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix tool-call argument schema consistency and lint issues~~ fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up) Apr 17, 2026

Copilot finished work on behalf of fpfp100 April 17, 2026 18:34

Copilot AI requested a review from fpfp100 April 17, 2026 18:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)#232

fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)#232
Copilot wants to merge 11 commits intomainfrom
copilot/fix-tool-call-argument-schema

Copilot AI commented Apr 17, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

Copilot AI commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Tool-call argument schema consistency

Integration test improvements

New unit tests

High-level summary

What’s good

Blocking / should-fix items

1) Tool-call arguments typing / schema consistency (likely needs a change)

2) Ruff lint: unused logger in both message mappers

3) LangChain integration test has a “...

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Copilot AI commented Apr 17, 2026 •

edited

Loading

2) Ruff lint: unused `logger` in both message mappers