fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)#232
Draft
fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)#232
Conversation
… into copilot/fix-tool-call-argument-schema Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>
…lint, tests Agent-Logs-Url: https://github.com/microsoft/Agent365-python/sessions/3ac8e71e-b951-44ea-bca8-3eab3997c5dc Co-authored-by: fpfp100 <126631706+fpfp100@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Fix tool-call argument schema consistency and lint issues
fix: structured tool-call arguments and LangChain mapper unit tests (PR #229 follow-up)
Apr 17, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Addresses review feedback on PR #229: tool-call
argumentswere being stringified even when already structured, and the LangChain mapper lacked unit test coverage.Tool-call argument schema consistency
Both mappers now prefer structured (
dict/list) arguments over JSON strings:_parse_arguments(): ifargumentsis a JSON string that parses todict/list, store the parsed value; on failure, keep the string.json.dumps(args)with_parse_tool_call_args():dict/listkept as-is, JSON strings parsed, unparseable strings kept unchanged with a debug log.Integration test improvements
test_tool_call_message_mappingnow asserts real structure when the versioned format is present (version == "0.1.0", role/parts shape, at least onetool_callpart). Module docstring updated to reflect that versioned format is canonical; raw-list is backward-compatible fallback only.New unit tests
Added
tests/observability/extensions/langchain/test_message_mapper.py(16 tests) covering the LangChain mapper without requiring Azure credentials:dict(kept structured), valid JSON string (parsed), invalid JSON string (kept as-is)ToolMessage→tool_call_responsepartNone/empty inputs, unknown role defaultingOriginal prompt
Create a PR fix on top of
microsoft/Agent365-pythonPR #229 (branchnikhilc/messageStructureForAuto, head SHA01c32df5fadfacd27b6a92374234f3a7c53bb3c4) to address tool-call argument schema consistency and related review feedback.Requirements:
Prefer structured tool-call arguments in versioned messages.
argumentsvalue is a string, attempt to parse it withjson.loads().dictorlist, store that structured value inToolCallRequestPart.arguments.Fix Ruff lint issues
.../agentframework/message_mapper.pyand.../langchain/message_mapper.py, ensurelogger = logging.getLogger(__name__)is either used (e.g., for parse failures) or removed soruff checkpasses.Improve tests
tests/observability/extensions/langchain/integration/test_message_format.pysotest_tool_call_message_mappinghas real assertions verifyingtool_callandtool_call_responseparts are present in the enriched/versioned message format (when versioned dict format is present). Keep backwards-compat behavior allowing legacy raw-list formats.Documentation nits (optional but recommended)
Constraints / repo rules:
.pyfile..pyfile.Deliverables:
mainfrom a new branch.The following is the prior conversation context from the user's chat exploration (may be truncated):
User: review the PR : @microsoft/Agent365-python/pull/229
Assistant: PR #229 (“Nikhilc/message structure for auto”) looks directionally good: it moves AgentFramework + LangChain auto-instrumentation from “string list” message attributes to the A365 versioned message format (
{"version":"0.1.0","messages":[...]}) and adds integration coverage.That said, I would not merge as-is until a couple of schema/lint/test gaps are addressed.
High-level summary
What’s good
EnrichedReadableSpan+ enricher becauseReadableSpanis immutable afteron_end. The updatedspan_processor.pydoc reflects that.invoke_agent+chat) and separately normalizesexecute_toolattributes.Blocking / should-fix items
1) Tool-call arguments typing / schema consistency (likely needs a change)
You updated the core schema to allow:
ToolCallRequestPart.arguments: dict | list | str | None(in.../core/models/messages.py)That’s pragmatic because upstream emitters often send args as JSON strings, but it creates a compatibility question: do we want “arguments” to be structured whenever possible?
Right now:
json.dumps(args)), even when args started as a dict.part.get("arguments")(often a JSON string).Recommendation (strong):
json.loads()whenargumentsis a string that looks like JSON, and only keep it as a string when parsing fails.dict/list, keep them structured (don’t stringify).This matches the Copilot review comments and will avoid downstream consumers having to re-parse.
2) Ruff lint: unused
loggerin both message mappersBoth new modules declare
logger = logging.getLogger(__name__)but don’t use it. That will typically failruff(F841), unless the repo config ignores it.Fix: either remove
loggeror use it (e.g., debug log on parse failures).3) LangChain integration test has a “...
This pull request was created from Copilot chat.