feat(sample-app): add FastAPI + LiteLLM tracing example#4227

Open
pm32900 wants to merge 3 commits into traceloop:main from pm32900:feat/fastapi-litellm-example
Conversation

pm32900 commented Jun 10, 2026

Summary

Adds a minimal FastAPI example (fastapi_litellm_example.py) to the sample-app package
demonstrating how to trace an HTTP LLM endpoint using OpenLLMetry.

All existing examples are run-once scripts. This is the first example showing
tracing inside a running HTTP service, which is how most production LLM
applications are structured.

What's new

packages/sample-app/sample_app/fastapi_litellm_example.py

  • FastAPI app with a single POST /chat endpoint
  • LiteLLM completion() call wrapped with @task + @workflow decorators
  • Traceloop.init(disable_batch=True) for easy local debugging
  • LLM_MODEL and LLM_API_BASE env vars allow routing to any OpenAI-compatible
    backend (OpenAI, vLLM, Ollama, Groq, etc.)
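
The environment-variable routing described above can be sketched in isolation. This is a minimal illustration, not the PR's code: `resolve_llm_settings` is a hypothetical helper, and only the variable names `LLM_MODEL` and `LLM_API_BASE` and the default `openai/gpt-4o-mini` come from the PR.

```python
import os
from typing import Mapping, Optional


def resolve_llm_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the model/backend keyword arguments from environment variables.

    Hypothetical helper: the PR reads these variables inline in call_llm.
    """
    env = os.environ if env is None else env
    return {
        # Falls back to the default model named in the PR when LLM_MODEL is unset.
        "model": env.get("LLM_MODEL", "openai/gpt-4o-mini"),
        # Stays None when LLM_API_BASE is unset, mirroring os.environ.get(..., None).
        "api_base": env.get("LLM_API_BASE"),
    }


# Illustrative values only; the model name and URL below are assumptions.
local_settings = resolve_llm_settings(
    {"LLM_MODEL": "ollama/llama3", "LLM_API_BASE": "http://localhost:11434"}
)
```

Passing a plain mapping instead of `os.environ` keeps the lookup easy to exercise without touching the real environment.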

packages/sample-app/pyproject.toml

  • Added fastapi>=0.115.0,<1 and uvicorn>=0.32.0,<1

How to test

cd packages/sample-app
uv run uvicorn sample_app.fastapi_litellm_example:app --reload --port 8000

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain OpenTelemetry in one sentence"}'
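
For readers who prefer Python over curl, the same request can be sketched with the standard library. This assumes the server from the command above is listening on port 8000 and that the endpoint returns a JSON object with a `reply` key, as in the handler shown in the review; `build_chat_request` and `send_chat` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(
    message: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build the same POST /chat request that the curl command sends."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message: str) -> dict:
    """Send the request and decode the JSON reply; needs the server running."""
    with urllib.request.urlopen(build_chat_request(message), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send_chat("Explain OpenTelemetry in one sentence")` would then play the role of the curl invocation.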

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

## New Features
- Added a new FastAPI `POST /chat` endpoint to send a message and receive an AI-generated reply.
- Integrated LiteLLM-backed chat responses with configurable model and API base via environment settings (defaulting to OpenAI GPT-4o mini).

## Bug Fixes
- Improved robustness by validating model output and returning a clear error response when content is missing.

## Chores
- Added FastAPI and Uvicorn runtime dependencies to support deployment of the sample app.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

CLAassistant commented Jun 10, 2026

CLA assistant check
All committers have signed the CLA.

coderabbitai Bot commented Jun 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 250827c1-f2dc-4995-a050-03538452171c

📥 Commits

Reviewing files that changed from the base of the PR and between ed05986 and 8e2c035.

📒 Files selected for processing (1)
  • packages/sample-app/sample_app/fastapi_litellm_example.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/sample-app/sample_app/fastapi_litellm_example.py

📝 Walkthrough

Walkthrough

This PR introduces a new FastAPI example application in the sample app package that demonstrates tracing LLM calls using Traceloop and LiteLLM. The app exposes a /chat endpoint, coordinates a traced workflow that calls an LLM, and returns the response. FastAPI and Uvicorn dependencies were also added to support the server.

Changes

FastAPI LiteLLM Example Application

| Layer / File(s) | Summary |
|---|---|
| Dependencies and module initialization: packages/sample-app/pyproject.toml, packages/sample-app/sample_app/fastapi_litellm_example.py | FastAPI and Uvicorn dependencies added to project. Imports for LiteLLM, Traceloop, FastAPI, and Pydantic configured. Traceloop initialized with batch disabled and FastAPI app created at module load time. |
| LLM task and request contract: packages/sample-app/sample_app/fastapi_litellm_example.py | ChatRequest Pydantic model defined to accept message input. call_llm task implemented with @task decorator to load model and API base from environment, build a chat message, and invoke LiteLLM completion. |
| Chat workflow and POST endpoint: packages/sample-app/sample_app/fastapi_litellm_example.py | chat_workflow created with @workflow decorator to orchestrate the message through call_llm. POST /chat endpoint handler accepts ChatRequest, runs the workflow, and returns the LLM reply as JSON. |

Sequence Diagram

sequenceDiagram
  participant Client
  participant ChatEndpoint as POST /chat
  participant ChatWorkflow
  participant CallLLM as call_llm task
  participant LiteLLM
  
  Client->>ChatEndpoint: POST ChatRequest(message)
  ChatEndpoint->>ChatWorkflow: chat_workflow(message)
  ChatWorkflow->>CallLLM: call_llm(message)
  CallLLM->>LiteLLM: completion(model, api_base, messages)
  LiteLLM-->>CallLLM: response content
  CallLLM-->>ChatWorkflow: reply text
  ChatWorkflow-->>ChatEndpoint: reply text
  ChatEndpoint-->>Client: {reply}

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A FastAPI app hops into the fold,
With LLM calls traced and Traceloop bold,
Fast endpoints respond, workflows run true,
Sample code sparkling, refreshingly new!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a FastAPI example with LiteLLM tracing to the sample-app package, which aligns with the file additions and PR objectives. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
packages/sample-app/sample_app/fastapi_litellm_example.py (1)

33-36: ⚡ Quick win

Avoid blocking the event loop with synchronous I/O in async endpoint.

The async endpoint handler calls synchronous chat_workflow(), which performs blocking I/O (LLM API call). This blocks FastAPI's event loop and prevents the server from handling concurrent requests efficiently.

For a sample application that users will reference, it's important to demonstrate the correct async pattern.

⚡ Proposed fix using asyncio.to_thread for non-blocking execution
+import asyncio
+
 @app.post("/chat")
 async def chat(request: ChatRequest):
-    reply = chat_workflow(request.message)
+    reply = await asyncio.to_thread(chat_workflow, request.message)
     return {"reply": reply}

Alternatively, use LiteLLM's async API (litellm.acompletion) and refactor the functions to be fully async:

 @task(name="call_llm")
-def call_llm(message: str) -> str:
-    response = litellm.completion(
+async def call_llm(message: str) -> str:
+    response = await litellm.acompletion(
         model=os.environ.get("LLM_MODEL", "openai/gpt-4o-mini"),
         messages=[{"role": "user", "content": message}],
         api_base=os.environ.get("LLM_API_BASE", None),
     )
     return response.choices[0].message.content

 @workflow(name="chat_workflow")
-def chat_workflow(message: str) -> str:
-    return call_llm(message)
+async def chat_workflow(message: str) -> str:
+    return await call_llm(message)

 @app.post("/chat")
 async def chat(request: ChatRequest):
-    reply = chat_workflow(request.message)
+    reply = await chat_workflow(request.message)
     return {"reply": reply}
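
The effect of the `asyncio.to_thread` variant can be illustrated without FastAPI or LiteLLM. In this sketch `blocking_workflow` is a hypothetical stand-in for the synchronous `chat_workflow` (it sleeps instead of calling an LLM), and `heartbeat` is a hypothetical coroutine that counts how often the event loop gets to run while the blocking call is in flight. It requires Python 3.9 or later for `asyncio.to_thread`.

```python
import asyncio
import time
from typing import Tuple


def blocking_workflow(message: str) -> str:
    """Stand-in for the synchronous chat_workflow; sleeps instead of calling an LLM."""
    time.sleep(0.2)
    return f"echo: {message}"


async def heartbeat(stop: asyncio.Event) -> int:
    """Count event-loop turns until asked to stop."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.01)
        ticks += 1
    return ticks


async def handle(message: str) -> Tuple[str, int]:
    """Run the blocking call in a worker thread while the loop keeps ticking."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(heartbeat(stop))
    # The blocking function runs in a thread, so the loop stays free for other tasks.
    reply = await asyncio.to_thread(blocking_workflow, message)
    stop.set()
    ticks = await ticker
    return reply, ticks


reply, ticks = asyncio.run(handle("hello"))
```

If `blocking_workflow(message)` were called directly inside `handle`, the heartbeat would not advance during the sleep, which is the concurrency problem the nitpick describes.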
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sample-app/sample_app/fastapi_litellm_example.py` around lines 33 -
36, The async FastAPI handler chat currently calls the synchronous, blocking
function chat_workflow which performs LLM I/O and will block the event loop;
change chat to offload blocking work to a worker thread (e.g., use
asyncio.to_thread to call chat_workflow) or refactor chat_workflow and
downstream calls to async (e.g., use litellm async API like acompletion if
available) so the endpoint does not perform synchronous I/O on the event loop;
update references to chat_workflow and any LLM call sites accordingly to ensure
non-blocking behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/sample-app/pyproject.toml`:
- Around line 40-41: Update the FastAPI dependency constraint in
packages/sample-app pyproject.toml to exclude vulnerable releases: replace the
current fastapi spec (the string "fastapi>=0.115.0,<1") with a range that sets
the minimum to 0.65.2 (e.g. "fastapi>=0.65.2,<1" or an equivalent exclusion of
<0.65.2) so the project no longer allows versions affected by the CSRF advisory;
leave the uvicorn constraint unchanged.

In `@packages/sample-app/sample_app/fastapi_litellm_example.py`:
- Around line 20-27: call_llm currently calls litellm.completion without error
handling or validating the response; wrap the litellm.completion call in a
try/except that catches runtime/network/auth errors (e.g., Exception) and
logs/raises a controlled error, and validate the returned object before using it
by checking response.choices exists and is non-empty and that
response.choices[0].message.content is not None (return a clear error or
fallback if validation fails); update the return path to safely extract and
return the content only after these checks and ensure any caught exceptions
produce a meaningful error message for the caller.
- Line 13: The Traceloop.init call currently sets disable_batch=True for
debugging but does not install a ConsoleSpanExporter; add a ConsoleSpanExporter
from opentelemetry.sdk.trace.export and attach it via a SimpleSpanProcessor to
the active TracerProvider so spans are emitted to the console. Locate the
Traceloop.init usage (Traceloop.init(app_name="fastapi_litellm_example",
disable_batch=True)) and before or immediately after it, import
ConsoleSpanExporter and SimpleSpanProcessor, create a ConsoleSpanExporter
instance, wrap it in a SimpleSpanProcessor, and add that processor to the global
tracer provider (or the provider used by Traceloop) so tracing output is visible
locally.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 65c1248a-3f18-4528-aa42-b2d23e0b0d56

📥 Commits

Reviewing files that changed from the base of the PR and between 4643b88 and ed05986.

⛔ Files ignored due to path filters (1)
  • packages/sample-app/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • packages/sample-app/pyproject.toml
  • packages/sample-app/sample_app/fastapi_litellm_example.py

Comment on lines +40 to +41
"fastapi>=0.115.0,<1",
"uvicorn>=0.32.0,<1",

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify FastAPI and Uvicorn versions and check for security advisories

echo "=== Checking FastAPI latest version ==="
curl -s https://pypi.org/pypi/fastapi/json | jq -r '.info.version'

echo -e "\n=== Checking Uvicorn latest version ==="
curl -s https://pypi.org/pypi/uvicorn/json | jq -r '.info.version'

echo -e "\n=== Checking for FastAPI security advisories ==="
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: PIP, package: "fastapi") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}'

echo -e "\n=== Checking for Uvicorn security advisories ==="
gh api graphql -f query='
{
  securityVulnerabilities(first: 5, ecosystem: PIP, package: "uvicorn") {
    nodes {
      advisory {
        summary
        severity
        publishedAt
      }
      vulnerableVersionRange
      firstPatchedVersion {
        identifier
      }
    }
  }
}'

Repository: traceloop/openllmetry

Length of output: 1175


Bump FastAPI minimum version (current range includes HIGH-severity CSRF advisory)

In packages/sample-app/pyproject.toml (lines 40-41), fastapi>=0.115.0,<1 includes versions <0.65.2, which are affected by a HIGH CSRF advisory (patched in 0.65.2). Set the lower bound to >=0.65.2 (or otherwise exclude <0.65.2).
The listed Uvicorn HIGH advisories affect <0.11.7, so uvicorn>=0.32.0,<1 is not impacted by those.
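
Whether the range `fastapi>=0.115.0,<1` can admit a release below 0.65.2 depends on how version segments are ordered. The check below is a minimal sketch that assumes plain numeric release segments compared as integers, a simplification of PEP 440 ordering; `release_tuple` and `in_range` are hypothetical helpers, not part of any packaging library.

```python
from typing import Tuple


def release_tuple(version: str) -> Tuple[int, ...]:
    """Split a plain numeric version such as '0.115.0' into integer segments."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str = "0.115.0", upper: str = "1") -> bool:
    """Check lower <= version < upper using integer-wise segment comparison."""
    v = release_tuple(version)
    return release_tuple(lower) <= v < release_tuple(upper)


# Segments compare as integers, so 115 sorts above 65.
lower_bound_is_newer = release_tuple("0.115.0") > release_tuple("0.65.2")
```

Under that assumption, 0.115.0 sorts above 0.65.2 because 115 > 65, so the existing lower bound would already exclude releases older than 0.65.2.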

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sample-app/pyproject.toml` around lines 40 - 41, Update the FastAPI
dependency constraint in packages/sample-app pyproject.toml to exclude
vulnerable releases: replace the current fastapi spec (the string
"fastapi>=0.115.0,<1") with a range that sets the minimum to 0.65.2 (e.g.
"fastapi>=0.65.2,<1" or an equivalent exclusion of <0.65.2) so the project no
longer allows versions affected by the CSRF advisory; leave the uvicorn
constraint unchanged.

Comment thread packages/sample-app/sample_app/fastapi_litellm_example.py Outdated
Comment thread packages/sample-app/sample_app/fastapi_litellm_example.py