
Feat/add litellm provider #222

Open

RheagalFire wants to merge 2 commits into THUDM:main from RheagalFire:feat/add-litellm-provider

Conversation

@RheagalFire

Summary

  • Adds LiteLLMAgent (extending AgentClient), which routes requests to 100+ LLM providers via litellm.completion()
  • Follows the existing agent pattern (HTTPAgent, Claude, FastChatAgent)
  • Works with the YAML config / InstanceFactory system out of the box

Motivation

AgentBench currently supports OpenAI-compatible HTTP endpoints, Claude (legacy completions API), and FastChat. Researchers who want to benchmark against Groq, Mistral, AWS Bedrock, Azure, Together, Fireworks, or
any of 100+ other providers must set up a separate HTTP proxy. LiteLLM handles provider routing natively, letting researchers benchmark any model with a single config change.

Changes

  • src/client/agents/litellm_agent.py -- new LiteLLMAgent(AgentClient) with an inference() method
  • src/client/agents/__init__.py -- registered import
  • requirements.txt -- added litellm>=1.60.0,<2.0
  • tests/test_litellm_agent.py -- 21 unit tests

Implementation details

  • Maps AgentBench's "agent" role to LiteLLM's "assistant" role
  • Uses check_context_limit() from http_agent.py for context-limit detection (same AND-rule as HTTPAgent, avoids false positives on rate-limit/auth errors)
  • Guards against None content from providers (returns "" instead of crashing AgentOutput validator)
  • Catches AgentClientException and re-raises immediately (matches HTTPAgent pattern)
  • Non-retryable errors (AuthenticationError, BadRequestError, NotFoundError) fail immediately without retrying
  • drop_params=True for cross-provider compatibility
  • Lazy import litellm inside inference() so the base install works without litellm
  • Exposes model_name attribute for task.py error reporting
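
The details above can be sketched as a minimal, self-contained outline. This is an illustration of the described behavior, not the PR's actual code: the helper name to_litellm_messages and the class structure are assumptions, while litellm.completion() and its drop_params flag are real LiteLLM API surface.

```python
def to_litellm_messages(history):
    """Map AgentBench's "agent" role to LiteLLM's "assistant" role."""
    role_map = {"agent": "assistant"}
    return [
        {"role": role_map.get(m["role"], m["role"]), "content": m["content"]}
        for m in history
    ]

class LiteLLMAgentSketch:  # hypothetical name; the PR's class is LiteLLMAgent
    def __init__(self, model, api_args=None, api_key=None, api_base=None):
        if not model:
            raise ValueError("model is required")
        self.model_name = model  # exposed for task.py error reporting
        self.api_args = api_args or {}
        self.api_key = api_key
        self.api_base = api_base

    def inference(self, history):
        import litellm  # lazy import: base install works without litellm

        kwargs = dict(self.api_args)
        if self.api_key:
            kwargs["api_key"] = self.api_key
        if self.api_base:
            kwargs["api_base"] = self.api_base

        resp = litellm.completion(
            model=self.model_name,
            messages=to_litellm_messages(history),
            drop_params=True,  # drop params a given provider doesn't support
            **kwargs,
        )
        content = resp.choices[0].message.content
        return content if content is not None else ""  # guard None content
```

Because the litellm import happens inside inference(), constructing the agent (e.g. via InstanceFactory) never requires litellm to be installed.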

Testing

Unit tests (21 passed):

TestInit::test_default PASSED
TestInit::test_with_params PASSED
TestInit::test_model_required PASSED
TestInference::test_basic_completion PASSED
TestInference::test_role_mapping PASSED
TestInference::test_drop_params PASSED
TestInference::test_api_key_forwarded PASSED
TestInference::test_api_key_omitted PASSED
TestInference::test_api_args_forwarded PASSED
TestInference::test_api_base_forwarded PASSED
TestInference::test_model_forwarded PASSED
TestInference::test_none_content_returns_empty_string PASSED
TestInference::test_empty_string_content PASSED
TestRetryAndErrors::test_retries_3_times_then_raises PASSED
TestRetryAndErrors::test_context_limit_raises_immediately PASSED
TestRetryAndErrors::test_auth_error_not_retried PASSED
TestRetryAndErrors::test_rate_limit_not_misclassified_as_context_limit PASSED
TestRegistration::test_importable PASSED
TestRegistration::test_subclass PASSED
TestRegistration::test_instance_factory PASSED
TestRegistration::test_model_name_attribute PASSED
======================== 21 passed in 0.16s ========================
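
The retry and error behavior exercised by TestRetryAndErrors can be illustrated with a short sketch. The exception class names below come from litellm, but the helper names and the keyword lists are illustrative assumptions, not the actual check_context_limit() reused from http_agent.py:

```python
# Errors that retrying cannot fix (matches the non-retryable list above).
NON_RETRYABLE = {"AuthenticationError", "BadRequestError", "NotFoundError"}

def is_retryable(exc):
    """Fail immediately on auth/config errors instead of burning retries."""
    return type(exc).__name__ not in NON_RETRYABLE

def looks_like_context_limit(message):
    """Illustration of the AND-rule idea: require both a context keyword AND
    a length/token keyword, so rate-limit or auth messages that merely
    mention "tokens" are not misclassified as context-limit errors."""
    msg = message.lower()
    return "context" in msg and any(
        k in msg for k in ("length", "token", "maximum")
    )
```

Under this rule, "maximum context length is 8192 tokens" is treated as a context-limit error, while "rate limit exceeded: too many tokens per minute" is not, since it lacks the "context" keyword.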

Example usage

YAML config (configs/agents/litellm.yaml):

module: src.client.agents.litellm_agent.LiteLLMAgent
parameters:
  model: anthropic/claude-sonnet-4-20250514
  api_args:
    temperature: 0
    max_tokens: 4096

Python:

from src.client.agents.litellm_agent import LiteLLMAgent

# Uses ANTHROPIC_API_KEY from env automatically
agent = LiteLLMAgent(model="anthropic/claude-sonnet-4-20250514")
response = agent.inference([{"role": "user", "content": "What is 2+2?"}])
print(response)  # "4"

Via InstanceFactory (same as the benchmark runner):

from src.typings import InstanceFactory

factory = InstanceFactory(
    module="src.client.agents.litellm_agent.LiteLLMAgent",
    parameters={"model": "openai/gpt-4o"},
)
agent = factory.create()

Risk / Compatibility

  • Additive only. Existing agents untouched.
  • litellm>=1.60.0,<2.0 added to requirements.txt.
  • Provider API keys read from standard env vars (no new env vars introduced).

@RheagalFire
Author

cc @Xiao9905

