Skip to content

fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)#232

Draft
Copilot wants to merge 11 commits intomainfrom
copilot/fix-tool-call-argument-schema
Draft

fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)#232
Copilot wants to merge 11 commits intomainfrom
copilot/fix-tool-call-argument-schema

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 17, 2026

Addresses review feedback on PR #229: tool-call arguments were being stringified even when already structured, and the LangChain mapper lacked unit test coverage.

Tool-call argument schema consistency

Both mappers now prefer structured (dict/list) arguments over JSON strings:

  • AgentFramework mapper — added _parse_arguments(): if arguments is a JSON string that parses to dict/list, store the parsed value; on failure, keep the string.
  • LangChain mapper — replaced inline json.dumps(args) with _parse_tool_call_args(): dict/list kept as-is, JSON strings parsed, unparseable strings kept unchanged with a debug log.
# Before — dict args were stringified
ToolCallRequestPart(name="search", arguments='{"query": "hello"}')  # str

# After — dict args stay structured
ToolCallRequestPart(name="search", arguments={"query": "hello"})    # dict

Integration test improvements

test_tool_call_message_mapping now asserts real structure when the versioned format is present (version == "0.1.0", role/parts shape, at least one tool_call part). Module docstring updated to reflect that versioned format is canonical; raw-list is backward-compatible fallback only.

New unit tests

Added tests/observability/extensions/langchain/test_message_mapper.py (16 tests) covering the LangChain mapper without requiring Azure credentials:

  • system / user / assistant / tool roles
  • tool-call args as dict (kept structured), valid JSON string (parsed), invalid JSON string (kept as-is)
  • ToolMessagetool_call_response part
  • empty/whitespace content filtered, None/empty inputs, unknown role defaulting
Original prompt

Create a PR fix on top of microsoft/Agent365-python PR #229 (branch nikhilc/messageStructureForAuto, head SHA 01c32df5fadfacd27b6a92374234f3a7c53bb3c4) to address tool-call argument schema consistency and related review feedback.

Requirements:

  1. Prefer structured tool-call arguments in versioned messages.

    • For both AgentFramework and LangChain mappers, when a tool-call's arguments value is a string, attempt to parse it with json.loads().
    • If parsing succeeds and yields a dict or list, store that structured value in ToolCallRequestPart.arguments.
    • If parsing fails, keep the original string.
    • If the tool-call args are already a dict/list, do not stringify.
  2. Fix Ruff lint issues

    • In both .../agentframework/message_mapper.py and .../langchain/message_mapper.py, ensure logger = logging.getLogger(__name__) is either used (e.g., for parse failures) or removed so ruff check passes.
  3. Improve tests

    • Update tests/observability/extensions/langchain/integration/test_message_format.py so test_tool_call_message_mapping has real assertions verifying tool_call and tool_call_response parts are present in the enriched/versioned message format (when versioned dict format is present). Keep backwards-compat behavior allowing legacy raw-list formats.
    • Add unit tests for the LangChain message mapper so that mapping behavior is covered even when integration tests are skipped (no secrets). Include cases:
      • system/user/assistant roles
      • tool calls with args as dict and as JSON string
      • tool response messages (ToolMessage)
      • empty/whitespace content ignored
      • unknown role handling
  4. Documentation nits (optional but recommended)

    • Fix outdated docstrings noted in review comments (e.g., “will verify once mapper is added”) if they still exist after changes.

Constraints / repo rules:

  • Follow repository Python style and typing.
  • Ensure Microsoft copyright header exists in any new .py file.
  • Ensure the keyword "Kairo" does not appear in any .py file.

Deliverables:

  • Commit changes and open a new PR to main from a new branch.
  • Keep changes focused to the above; do not refactor unrelated code.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: review the PR : @microsoft/Agent365-python/pull/229

Assistant: PR #229 (“Nikhilc/message structure for auto”) looks directionally good: it moves AgentFramework + LangChain auto-instrumentation from “string list” message attributes to the A365 versioned message format ({"version":"0.1.0","messages":[...]}) and adds integration coverage.

That said, I would not merge as-is until a couple of schema/lint/test gaps are addressed.

High-level summary

  • Goal: Auto instrumentation should emit the new message format for:
    • AgentFramework spans via export-time enricher mapping
    • LangChain spans via new mapper + updated attribute extraction
  • Scope: 17 files changed, +2041/-87, with new message mappers and multiple integration tests.

What’s good

  • Correct architectural choice for OTel Python: do mutation at export-time via EnrichedReadableSpan + enricher because ReadableSpan is immutable after on_end. The updated span_processor.py doc reflects that.
  • AgentFramework span enricher now maps message spans (invoke_agent + chat) and separately normalizes execute_tool attributes.
  • Integration tests added for both AgentFramework and LangChain pipelines to validate end-to-end behavior (invoke → chat → tool execution) and message-format assertions.

Blocking / should-fix items

1) Tool-call arguments typing / schema consistency (likely needs a change)

You updated the core schema to allow:

  • ToolCallRequestPart.arguments: dict | list | str | None (in .../core/models/messages.py)

That’s pragmatic because upstream emitters often send args as JSON strings, but it creates a compatibility question: do we want “arguments” to be structured whenever possible?

Right now:

  • LangChain mapper turns tool call args into a string (json.dumps(args)), even when args started as a dict.
  • AgentFramework mapper passes through part.get("arguments") (often a JSON string).

Recommendation (strong):

  • Attempt json.loads() when arguments is a string that looks like JSON, and only keep it as a string when parsing fails.
  • If args are already a dict/list, keep them structured (don’t stringify).

This matches the Copilot review comments and will avoid downstream consumers having to re-parse.

2) Ruff lint: unused logger in both message mappers

Both new modules declare logger = logging.getLogger(__name__) but don’t use it. That will typically fail ruff (F841), unless the repo config ignores it.

Fix: either remove logger or use it (e.g., debug log on parse failures).

3) LangChain integration test has a “...

This pull request was created from Copilot chat.

Copilot AI and others added 2 commits April 17, 2026 18:23
… into copilot/fix-tool-call-argument-schema

Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix tool-call argument schema consistency and lint issues fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up) Apr 17, 2026
Copilot AI requested a review from fpfp100 April 17, 2026 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants