Add context to traces

Adding context to traces enables you to track execution details, analyze user behavior, debug issues across environments, and monitor application performance. MLflow provides standardized metadata fields for common context types plus the flexibility to add custom metadata specific to your application.

Prerequisites

Choose the appropriate installation method based on your environment:

Production

For production deployments, install the mlflow-tracing package:

pip install --upgrade mlflow-tracing

The mlflow-tracing package is optimized for production use with minimal dependencies and better performance characteristics.

Development

For development environments, install the full MLflow package with Databricks extras:

pip install --upgrade "mlflow[databricks]>=3.1"

The full mlflow[databricks] package includes all features needed for local development and experimentation on Databricks.

Note

MLflow 3 is required for context tracking. MLflow 2.x is not supported due to performance limitations and missing features essential for production use.

Types of context metadata

Production applications need to track multiple pieces of context simultaneously. MLflow has standardized metadata fields to capture important contextual information:

Context Type	Use Cases	MLflow Field
Client request ID	Link traces to specific client requests or API calls for end-to-end debugging	`TraceInfo` parameter `client_request_id`
User session ID	Group traces from multi-turn conversations, allowing you to analyze the full conversational flow	`mlflow.trace.session` metadata
User ID	Associate traces with specific users for personalization, cohort analysis, and user-specific debugging	`mlflow.trace.user` metadata
Environment data	Track deployment context (environment, version, region) for operational insights and debugging across different deployments	automatic metadata and custom metadata
Custom metadata	Add application-specific metadata for organization, search, and filtering traces	(your metadata keys)

Track users and sessions

Tracking users and sessions in your GenAI application provides essential context for understanding user behavior, analyzing conversation flows, and improving personalization.

Why track users and sessions?

User and session tracking enables powerful analytics and improvements:

User behavior analysis - Understand how different users interact with your application
Conversation flow tracking - Analyze multi-turn conversations and context retention
Personalization insights - Identify patterns to improve user-specific experiences
Quality per user - Track performance metrics across different user segments
Session continuity - Maintain context across multiple interactions

Standard metadata fields for users and sessions

MLflow provides two standard metadata fields for session and user tracking:

mlflow.trace.user - Associates traces with specific users
mlflow.trace.session - Groups traces belonging to multi-turn conversations

When you use these standard metadata fields, MLflow automatically enables filtering and grouping in the UI. Unlike tags, metadata cannot be updated once the trace is logged, making it ideal for immutable identifiers like user and session IDs.

Basic implementation

Here's how to add user and session tracking to your application:

import mlflow

@mlflow.trace
def chat_completion(user_id: str, session_id: str, message: str):
    """Process a chat message with user and session tracking."""

    # Add user and session context to the current trace
    # The @mlflow.trace decorator ensures there's an active trace
    mlflow.update_current_trace(
        metadata={
            "mlflow.trace.user": user_id,      # Links this trace to a specific user
            "mlflow.trace.session": session_id, # Groups this trace with others in the same conversation
        }
    )

    # The trace will capture the execution time, inputs, outputs, and any errors
    # Your chat logic here
    response = generate_response(message)
    return response

# Example usage in a chat application
def handle_user_message(request):
    # Extract user and session IDs from your application's context
    # These IDs should be consistent across all interactions
    return chat_completion(
        user_id=request.user_id,        # e.g., "user-123" - unique identifier for the user
        session_id=request.session_id,   # e.g., "session-abc-456" - groups related messages
        message=request.message
    )

# Placeholder chat logic
def generate_response(message: str) -> str:
    """Your chat logic here"""
    return "Placeholder response"

# Run the chat completion with user and session context
result = chat_completion(
    user_id="user-123",
    session_id="session-abc-456",
    message="What is MLflow and how does it help with machine learning?"
)

Key points:

The @mlflow.trace decorator automatically creates a trace for the function execution
mlflow.update_current_trace() adds the user ID and session ID as metadata to the active trace
Using metadata ensures these identifiers cannot be changed after the trace is logged
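In practice, user and session IDs can arrive missing, blank, or as non-string values such as integers. The following is a minimal sketch of a helper that builds the metadata dictionary defensively. The helper name `user_session_metadata` is hypothetical, and the choice to omit missing IDs rather than log them is an assumption about your application.

```python
from typing import Dict, Optional

USER_KEY = "mlflow.trace.user"
SESSION_KEY = "mlflow.trace.session"


def user_session_metadata(
    user_id: Optional[object], session_id: Optional[object]
) -> Dict[str, str]:
    """Build the metadata dict, skipping IDs that are missing or blank."""
    metadata: Dict[str, str] = {}
    for key, value in ((USER_KEY, user_id), (SESSION_KEY, session_id)):
        if value is None:
            continue
        text = str(value).strip()  # coerce ints, UUIDs, etc. to plain strings
        if text:
            metadata[key] = text
    return metadata


print(user_session_metadata(123, "  session-abc-456 "))
print(user_session_metadata("user-123", None))
```

The returned dictionary can be passed as the `metadata` argument of `mlflow.update_current_trace()` inside a traced function.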

Track environments and versions

Tracking the execution environment and application version of your GenAI application lets you tie performance and quality issues to the specific code version and deployment that produced them. This metadata enables:

Environment-specific analysis across development, staging, and production
Performance/quality tracking and regression detection across app versions
Faster root cause analysis when issues occur

Note

For a comprehensive overview of how versioning works, see Version Tracking.

Automatically populated metadata

These standard metadata fields are automatically captured by MLflow based on your execution environment.

Important

If the automatic capture logic does not meet your requirements, you can override any of these automatically populated metadata fields manually, for example with mlflow.update_current_trace(metadata={"mlflow.source.name": "custom_name"}).

Category	Metadata Field	Description	Automatic Setting Logic
Execution environment	`mlflow.source.name`	The entry point or script that generated the trace.	Automatically populated with the filename for Python scripts, notebook name for Databricks/Jupyter notebooks.
	`mlflow.source.git.commit`	Git commit hash.	If run from a Git repository, the commit hash is automatically detected and populated.
	`mlflow.source.git.branch`	Git branch.	If run from a Git repository, the current branch name is automatically detected and populated.
	`mlflow.source.git.repoURL`	Git repo URL.	If run from a Git repository, the repository URL is automatically detected and populated.
	`mlflow.source.type`	Captures the execution environment.	Automatically set to `NOTEBOOK` if running in a Jupyter or Databricks notebook, `LOCAL` if running a local Python script, and `UNKNOWN` otherwise. In your deployed app, we suggest overriding this field based on the environment, e.g., `PRODUCTION`, `STAGING`, etc.
Application version	`mlflow.modelId`	MLflow LoggedModel ID.	Automatically set to the model ID value in the environment variable `MLFLOW_ACTIVE_MODEL_ID` or the model ID set via the `mlflow.set_active_model()` function.

Customizing automatically populated metadata

You can override any of the automatically populated metadata fields using mlflow.update_current_trace(). This is useful when the automatic detection doesn't meet your requirements or when you want to add additional context:

import mlflow
import os

# We suggest populating metadata from environment variables rather than hard coding the values

@mlflow.trace
def my_app(user_question: str) -> dict:
    # Override automatically populated metadata and add custom context
    mlflow.update_current_trace(
        metadata={
            # Use any of the keys from above
            "mlflow.source.type": os.getenv("APP_ENVIRONMENT", "development"),  # Override default LOCAL/NOTEBOOK
        }
    )

    # Application logic

    return {"response": user_question + "!!"}

my_app("test")

Add custom metadata

You can attach custom metadata to capture any application-specific context. For example, you might want to attach information such as:

app_version: e.g., "1.0.0" (from APP_VERSION environment variable)
deployment_id: e.g., "deploy-abc-123" (from DEPLOYMENT_ID environment variable)
region: e.g., "us-east-1" (from REGION environment variable)
(Other custom metadata like feature flags can also be added)

import mlflow
import os

# We suggest populating metadata from environment variables rather than hard coding the values

@mlflow.trace
def my_app(user_question: str) -> dict:
    # Add custom context
    mlflow.update_current_trace(
        metadata={
            # Use any key
            "app_version": os.getenv("APP_VERSION", "1.0.0"),
            "deployment_id": os.getenv("DEPLOYMENT_ID", "unknown"),
            "region": os.getenv("REGION", "us-east-1")
        }
    )

    # Application logic

    return {"response": user_question + "!!"}

my_app("test")
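If several entry points need the same custom metadata, the environment lookups can be collected in one place. The following is a minimal sketch under stated assumptions: the mapping `CUSTOM_METADATA_SOURCES` and the function `custom_metadata_from_env` are hypothetical names, and the sketch omits a key when its variable is unset instead of substituting a default such as "1.0.0", so a missing variable is not mistaken for a real value.

```python
import os
from typing import Dict, Mapping, Optional, Tuple

# Hypothetical mapping: metadata key -> (environment variable, default or None)
CUSTOM_METADATA_SOURCES: Dict[str, Tuple[str, Optional[str]]] = {
    "app_version": ("APP_VERSION", None),
    "deployment_id": ("DEPLOYMENT_ID", "unknown"),
    "region": ("REGION", None),
}


def custom_metadata_from_env(
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Collect custom metadata, omitting keys that are unset and have no default."""
    env = os.environ if environ is None else environ
    metadata: Dict[str, str] = {}
    for key, (var_name, default) in CUSTOM_METADATA_SOURCES.items():
        value = env.get(var_name, default)
        if value:  # skip unset or empty values
            metadata[key] = value
    return metadata


# Passing a dict instead of os.environ keeps the example reproducible
print(custom_metadata_from_env({"APP_VERSION": "1.0.0", "REGION": "us-east-1"}))
```

Accepting the environment as a parameter is a design choice that makes the helper easy to unit test; calling it with no arguments reads from `os.environ`.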

Production web application example

In production applications, you typically track user, session, and environment context simultaneously. The following FastAPI example demonstrates how to capture all context types together:

import mlflow
import os
from fastapi import FastAPI, Request
from pydantic import BaseModel

# Initialize FastAPI app
app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")  # FastAPI's route decorator must be outermost so it registers the traced function
@mlflow.trace  # Applied first, so the handler FastAPI calls is the traced wrapper
def handle_chat(request: Request, chat_request: ChatRequest):
    # Retrieve all context from request headers
    client_request_id = request.headers.get("X-Request-ID")
    session_id = request.headers.get("X-Session-ID")
    user_id = request.headers.get("X-User-ID")

    # Update the current trace with all context and environment metadata
    # The @mlflow.trace decorator ensures an active trace is available
    mlflow.update_current_trace(
        client_request_id=client_request_id,
        metadata={
            # Session context - groups traces from multi-turn conversations
            "mlflow.trace.session": session_id,
            # User context - associates traces with specific users
            "mlflow.trace.user": user_id,
            # Override automatically populated environment metadata
            "mlflow.source.type": os.getenv("APP_ENVIRONMENT", "development"),  # Override default LOCAL/NOTEBOOK
            # Add custom environment metadata
            "environment": "production",
            "app_version": os.getenv("APP_VERSION", "1.0.0"),
            "deployment_id": os.getenv("DEPLOYMENT_ID", "unknown"),
            "region": os.getenv("REGION", "us-east-1"),
            # Add any other custom metadata (these entries are metadata, not tags)
            "my_custom_tag": "custom tag value",
        }
    )

    # --- Your application logic for processing the chat message ---
    # For example, calling a language model with context
    # response_text = my_llm_call(
    #     message=chat_request.message,
    #     session_id=session_id,
    #     user_id=user_id
    # )
    response_text = f"Processed message: '{chat_request.message}'"
    # --- End of application logic ---

    # Return response
    return {
        "response": response_text
    }

# To run this example (requires uvicorn and fastapi):
# uvicorn your_file_name:app --reload
#
# Example curl request with context headers:
# curl -X POST "http://127.0.0.1:8000/chat" \
#      -H "Content-Type: application/json" \
#      -H "X-Request-ID: req-abc-123-xyz-789" \
#      -H "X-Session-ID: session-def-456-uvw-012" \
#      -H "X-User-ID: user-jane-doe-12345" \
#      -d '{"message": "What is my account balance?"}'

This example demonstrates a unified approach to context tracking, capturing:

Client Request ID: From the X-Request-ID header, logged using the client_request_id parameter.
User Information: From the X-User-ID header, logged as mlflow.trace.user metadata.
Session Information: From the X-Session-ID header, logged as mlflow.trace.session metadata.
Environment Context: From environment variables and automatic detection, with overrides for production settings.
Application Version: From the APP_VERSION environment variable.
Deployment Details: Custom metadata for deployment ID and region.
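The header handling above can be factored into a small pure function, which is easier to test than logic embedded in the route handler. This is a minimal sketch: `RequestContext` and `extract_context` are hypothetical names, and generating a request ID when the client sends none is an assumption about what your application wants.

```python
import uuid
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RequestContext:
    client_request_id: str
    session_id: Optional[str]
    user_id: Optional[str]


def extract_context(headers: Mapping[str, str]) -> RequestContext:
    """Read context headers, generating a request ID when the client sent none."""
    # Lower-case the names so a plain dict behaves like case-insensitive HTTP headers
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id = lowered.get("x-request-id") or f"req-{uuid.uuid4()}"
    return RequestContext(
        client_request_id=request_id,
        session_id=lowered.get("x-session-id") or None,
        user_id=lowered.get("x-user-id") or None,
    )


ctx = extract_context(
    {"X-Session-ID": "session-def-456-uvw-012", "X-User-ID": "user-jane-doe-12345"}
)
print(ctx)
```

In the handler, `extract_context(request.headers)` would supply the values for `client_request_id` and the user and session metadata; fields that come back as `None` can be left out of the metadata dictionary.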

Best practices

Consistent ID formats - Use standardized formats for user and session IDs across your application
Session boundaries - Define clear rules for when sessions start and end
Environment variables - Populate metadata from environment variables rather than hard-coding values
Combine context types - Track user, session, and environment context together for complete traceability
Regular analysis - Set up dashboards to monitor user behavior, session patterns, and version performance
Override defaults thoughtfully - Only override automatically populated metadata when necessary
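As one illustration of a session boundary rule, the sketch below starts a new session after a period of inactivity. Everything here is an assumption for illustration: the `SessionTracker` class, the 30-minute timeout, and the in-memory store, which only works within a single process. A real deployment would more likely keep this state in a shared store or derive the session ID from the client.

```python
import uuid
from typing import Dict, Tuple

SESSION_TIMEOUT_SECONDS = 30 * 60  # assumed inactivity limit; pick what fits your product


class SessionTracker:
    """Assign session IDs per user, rotating them after a period of inactivity."""

    def __init__(self, timeout_seconds: float = SESSION_TIMEOUT_SECONDS) -> None:
        self._timeout = timeout_seconds
        # user_id -> (session_id, timestamp of last activity)
        self._active: Dict[str, Tuple[str, float]] = {}

    def session_for(self, user_id: str, now: float) -> str:
        """Return the user's current session ID, starting a new one if the old one expired."""
        current = self._active.get(user_id)
        if current is not None and now - current[1] <= self._timeout:
            session_id = current[0]
        else:
            session_id = f"session-{uuid.uuid4()}"
        self._active[user_id] = (session_id, now)  # record this activity
        return session_id


tracker = SessionTracker()
first = tracker.session_for("user-123", now=0.0)
second = tracker.session_for("user-123", now=600.0)   # 10 minutes later: same session
third = tracker.session_for("user-123", now=3000.0)   # 40 more minutes: new session
print(first == second, second == third)
```

The timestamp is passed in explicitly so the rule can be tested without waiting; in an application you would pass `time.time()`. The resulting ID is what you would log as `mlflow.trace.session`.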

Next steps

Once you've added context metadata to your traces, you can:

Search and filter traces - Query traces using the context metadata you've added
Analyze user activity - Production-ready patterns for analyzing user behavior
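As a starting point for searching, the sketch below builds filter strings for the user and session metadata fields. It assumes the backtick-quoted `metadata.` filter syntax used by MLflow trace search; verify the exact syntax against the search documentation for your MLflow version. The helper name `metadata_filter` is hypothetical.

```python
def metadata_filter(key: str, value: str) -> str:
    """Build a filter such as: metadata.`mlflow.trace.user` = 'user-123'."""
    # Reject characters that would break the quoting instead of trying to escape them
    if "`" in key or "'" in value:
        raise ValueError(
            "key must not contain backticks and value must not contain single quotes"
        )
    return f"metadata.`{key}` = '{value}'"


user_filter = metadata_filter("mlflow.trace.user", "user-123")
session_filter = metadata_filter("mlflow.trace.session", "session-abc-456")
print(user_filter)
print(session_filter)
```

A string built this way would be passed as the `filter_string` argument of `mlflow.search_traces()`.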

Last updated on 2025-12-02