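Add context to traces

Adding context to traces lets you track execution details, analyze user behavior, debug issues across environments, and monitor application performance. MLflow provides standardized metadata fields for common context types, plus the flexibility to add custom, application-specific metadata.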

Prerequisites

Choose the installation method that matches your environment:

Production

For production deployments, install the mlflow-tracing package:

pip install --upgrade mlflow-tracing

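The mlflow-tracing package is optimized for production use, with minimal dependencies and better performance characteristics.

Development

For development environments, install the full MLflow package with the Databricks extras: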

pip install --upgrade "mlflow[databricks]>=3.1"

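The full mlflow[databricks] package includes all the features needed for local development and experimentation on Databricks.

Note

Context tracking requires MLflow 3. MLflow 2.x is not supported because of performance limitations and missing features that are essential for production use.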

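Types of context metadata

Production applications need to track several pieces of context at once. MLflow has standardized metadata fields to capture the important context information:

Context type	Use case	MLflow field
Client request ID	Link traces to specific client requests or API calls for end-to-end debugging	`TraceInfo` parameter `client_request_id`
User session ID	Group traces from multi-turn conversations so you can analyze the full conversation flow	`mlflow.trace.session` metadata
User ID	Associate traces with specific users for personalization, cohort analysis, and user-specific debugging	`mlflow.trace.user` metadata
Environment data	Track deployment context (environment, version, region) for operational insights and debugging across deployments	Automatic metadata and custom metadata
Custom metadata	Add application-specific metadata to organize, search, and filter traces	(your metadata keys)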
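
The mapping in the table can be sketched as a small helper that collects each context type and produces the arguments for `mlflow.update_current_trace()`. This is a minimal sketch, not part of MLflow: the `TraceContext` class and its `to_update_kwargs` method are hypothetical names introduced here for illustration, and the sketch assumes that metadata values are best kept as strings.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TraceContext:
    """Hypothetical container for the context types listed in the table."""

    client_request_id: Optional[str] = None
    session_id: Optional[str] = None
    user_id: Optional[str] = None
    custom: dict = field(default_factory=dict)  # application-specific metadata

    def to_update_kwargs(self) -> dict:
        """Build keyword arguments for mlflow.update_current_trace()."""
        # Custom metadata is stringified (assumption: string values are safest)
        metadata = {str(k): str(v) for k, v in self.custom.items()}
        # Standard fields are only added when a value is available
        if self.session_id is not None:
            metadata["mlflow.trace.session"] = self.session_id
        if self.user_id is not None:
            metadata["mlflow.trace.user"] = self.user_id

        kwargs = {"metadata": metadata}
        # The client request ID is a separate parameter, not a metadata key
        if self.client_request_id is not None:
            kwargs["client_request_id"] = self.client_request_id
        return kwargs


ctx = TraceContext(
    client_request_id="req-abc-123",
    session_id="session-abc-456",
    user_id="user-123",
    custom={"region": "us-east-1"},
)
update_kwargs = ctx.to_update_kwargs()
```

Inside a function decorated with `@mlflow.trace`, the result could then be applied with `mlflow.update_current_trace(**update_kwargs)`.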

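Track users and sessions

Tracking users and sessions in a GenAI application provides essential context for understanding user behavior, analyzing conversation flows, and improving personalization.

Why track users and sessions?

User and session tracking enables powerful analysis and improvements:

User behavior analysis - Understand how different users interact with your application
Conversation flow tracking - Analyze multi-turn conversations and context retention
Personalization insights - Identify patterns for improving user-specific experiences
Quality per user - Track performance metrics across different user segments
Session continuity - Maintain context across multiple interactions

Standard metadata fields for users and sessions

MLflow provides two standard metadata fields for session and user tracking:

mlflow.trace.user - Associates traces with a specific user
mlflow.trace.session - Groups traces that belong to a multi-turn conversation

When you use these standard metadata fields, MLflow automatically enables filtering and grouping in the UI. Unlike tags, metadata cannot be updated after the trace is logged, which makes it well suited to immutable identifiers such as user and session IDs.

Basic implementation

The following example shows how to add user and session tracking to your application: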

import mlflow

@mlflow.trace
def chat_completion(user_id: str, session_id: str, message: str):
    """Process a chat message with user and session tracking."""

    # Add user and session context to the current trace
    # The @mlflow.trace decorator ensures there's an active trace
    mlflow.update_current_trace(
        metadata={
            "mlflow.trace.user": user_id,      # Links this trace to a specific user
            "mlflow.trace.session": session_id, # Groups this trace with others in the same conversation
        }
    )

    # The trace will capture the execution time, inputs, outputs, and any errors
    # Your chat logic here
    response = generate_response(message)
    return response

# Example usage in a chat application
def handle_user_message(request):
    # Extract user and session IDs from your application's context
    # These IDs should be consistent across all interactions
    return chat_completion(
        user_id=request.user_id,        # e.g., "user-123" - unique identifier for the user
        session_id=request.session_id,   # e.g., "session-abc-456" - groups related messages
        message=request.message
    )

# Placeholder chat logic
def generate_response(message: str) -> str:
    """Your chat logic here"""
    return "Placeholder response"

# Run the chat completion with user and session context
result = chat_completion(
    user_id="user-123",
    session_id="session-abc-456",
    message="What is MLflow and how does it help with machine learning?"
)

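Key points:

The @mlflow.trace decorator automatically creates a trace for the function execution
mlflow.update_current_trace() adds the user ID and session ID to the active trace as metadata
Using metadata ensures that these identifiers are immutable once the trace is created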

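Track environments and versions

Tracking the execution environment and application version of your GenAI application lets you debug performance and quality issues in relation to the code. This metadata enables:

Environment-specific analysis across development, staging, and production
Performance and quality tracking and regression detection across application versions
Faster root cause analysis when issues occur

Note

For a comprehensive overview of how versioning works, see Version tracking.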

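Automatically populated metadata

These standard metadata fields are captured automatically by MLflow based on the execution environment.

Important

If the automatic capture logic does not meet your requirements, you can override these automatically populated metadata fields manually using mlflow.update_current_trace(metadata={"mlflow.source.name": "custom_name"}).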

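Category	Metadata field	Description	Automatic setting logic
Execution environment	`mlflow.source.name`	The entry point or script that generated the trace.	Automatically populated with the file name for Python scripts and the notebook name for Databricks/Jupyter notebooks.
	`mlflow.source.git.commit`	Git commit hash.	If run from a Git repository, the commit hash is automatically detected and populated.
	`mlflow.source.git.branch`	Git branch.	If run from a Git repository, the current branch name is automatically detected and populated.
	`mlflow.source.git.repoURL`	Git repository URL.	If run from a Git repository, the repository URL is automatically detected and populated.
	`mlflow.source.type`	Captures the execution environment.	Automatically set to `NOTEBOOK` when running in a Jupyter or Databricks notebook, `LOCAL` when running a local Python script, and `UNKNOWN` otherwise (detected automatically). In deployed apps, we recommend updating this value based on the environment (for example, `PRODUCTION`, `STAGING`).
Application version	`metadata.mlflow.modelId`	ID of the MLflow LoggedModel.	Automatically set to the model ID in the `MLFLOW_ACTIVE_MODEL_ID` environment variable or the model ID set through the `mlflow.set_active_model()` function.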

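Customize automatically populated metadata

You can override any of the automatically populated metadata fields using mlflow.update_current_trace(). This is useful when the automatic detection does not meet your requirements or when you want to add additional context: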

import mlflow
import os

# We suggest populating metadata from environment variables rather than hard coding the values

@mlflow.trace
def my_app(user_question: str) -> dict:
    # Override automatically populated metadata and add custom context
    mlflow.update_current_trace(
        metadata={
            # Use any of the keys from above
            "mlflow.source.type": os.getenv("APP_ENVIRONMENT", "development"),  # Override default LOCAL/NOTEBOOK
        }
    )

    # Application logic

    return {"response": user_question + "!!"}

my_app("test")
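
Because the override above passes the raw `APP_ENVIRONMENT` value through, spelling variants such as "prod" and "Production" would end up as different `mlflow.source.type` values. A minimal sketch of normalizing the value first is shown below. The `resolve_source_type` helper, the alias table, and the `DEVELOPMENT` label are assumptions made for illustration; only `PRODUCTION` and `STAGING` are mentioned in the table above.

```python
import os

# Hypothetical aliases mapped to uppercase labels (an assumption, not an MLflow list)
_ENVIRONMENT_LABELS = {
    "prod": "PRODUCTION",
    "production": "PRODUCTION",
    "stage": "STAGING",
    "staging": "STAGING",
    "dev": "DEVELOPMENT",
    "development": "DEVELOPMENT",
}


def resolve_source_type(env=None, default="DEVELOPMENT"):
    """Return a normalized label to use as the mlflow.source.type override."""
    env = os.environ if env is None else env
    # Trim whitespace and ignore case so "Prod" and "prod " map to the same label
    raw = env.get("APP_ENVIRONMENT", "").strip().lower()
    return _ENVIRONMENT_LABELS.get(raw, default)


label = resolve_source_type({"APP_ENVIRONMENT": " Prod "})
```

The returned label could replace the `os.getenv("APP_ENVIRONMENT", "development")` expression in the metadata dictionary.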

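Add custom metadata

You can attach custom metadata to capture any application-specific context. For example, you might want to attach information such as:

app_version: for example, "1.0.0" (from the APP_VERSION environment variable)
deployment_id: for example, "deploy-abc-123" (from the DEPLOYMENT_ID environment variable)
region: for example, "us-east-1" (from the REGION environment variable)
(You can also add other custom metadata, such as feature flags)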

import mlflow
import os

# We suggest populating metadata from environment variables rather than hard coding the values

@mlflow.trace
def my_app(user_question: str) -> dict:
    # Add custom context
    mlflow.update_current_trace(
        metadata={
            # Use any key
            "app_version": os.getenv("APP_VERSION", "1.0.0"),
            "deployment_id": os.getenv("DEPLOYMENT_ID", "unknown"),
            "region": os.getenv("REGION", "us-east-1")
        }
    )

    # Application logic

    return {"response": user_question + "!!"}

my_app("test")
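
Feature flags, mentioned in the list above, can be attached the same way. The sketch below flattens a dictionary of flags into metadata entries. The `feature_flag_metadata` helper and the `feature_flag.` key prefix are hypothetical choices made here, and the sketch assumes that metadata values should be stored as strings.

```python
def feature_flag_metadata(flags, prefix="feature_flag."):
    """Flatten feature flags into string-valued metadata entries."""
    metadata = {}
    for name, value in sorted(flags.items()):
        if isinstance(value, bool):
            # Use lowercase booleans so values are easy to filter on later
            text = "true" if value else "false"
        else:
            text = str(value)
        metadata[f"{prefix}{name}"] = text
    return metadata


flag_metadata = feature_flag_metadata(
    {"new_retriever": True, "prompt_variant": "b"}
)
```

The resulting dictionary can be merged into the `metadata` argument alongside keys such as `app_version` and `region`.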

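Production web application example

In a production application, you typically track user, session, and environment context at the same time. The following FastAPI example shows how to capture all context types together: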

import mlflow
import os
from fastapi import FastAPI, Request
from pydantic import BaseModel

# Initialize FastAPI app
app = FastAPI()

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")  # FastAPI decorator should be outermost so the traced function is what gets registered
@mlflow.trace  # Ensure @mlflow.trace is the inner decorator
def handle_chat(request: Request, chat_request: ChatRequest):
    # Retrieve all context from request headers
    client_request_id = request.headers.get("X-Request-ID")
    session_id = request.headers.get("X-Session-ID")
    user_id = request.headers.get("X-User-ID")

    # Update the current trace with all context and environment metadata
    # The @mlflow.trace decorator ensures an active trace is available
    mlflow.update_current_trace(
        client_request_id=client_request_id,
        metadata={
            # Session context - groups traces from multi-turn conversations
            "mlflow.trace.session": session_id,
            # User context - associates traces with specific users
            "mlflow.trace.user": user_id,
            # Override automatically populated environment metadata
            "mlflow.source.type": os.getenv("APP_ENVIRONMENT", "development"),  # Override default LOCAL/NOTEBOOK
            # Add custom environment metadata
            "environment": "production",
            "app_version": os.getenv("APP_VERSION", "1.0.0"),
            "deployment_id": os.getenv("DEPLOYMENT_ID", "unknown"),
            "region": os.getenv("REGION", "us-east-1"),
            # Add any other custom metadata (stored as metadata, not as tags)
            "my_custom_tag": "custom tag value",
        }
    )

    # --- Your application logic for processing the chat message ---
    # For example, calling a language model with context
    # response_text = my_llm_call(
    #     message=chat_request.message,
    #     session_id=session_id,
    #     user_id=user_id
    # )
    response_text = f"Processed message: '{chat_request.message}'"
    # --- End of application logic ---

    # Return response
    return {
        "response": response_text
    }

# To run this example (requires uvicorn and fastapi):
# uvicorn your_file_name:app --reload
#
# Example curl request with context headers:
# curl -X POST "http://127.0.0.1:8000/chat" \
#      -H "Content-Type: application/json" \
#      -H "X-Request-ID: req-abc-123-xyz-789" \
#      -H "X-Session-ID: session-def-456-uvw-012" \
#      -H "X-User-ID: user-jane-doe-12345" \
#      -d '{"message": "What is my account balance?"}'

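This example demonstrates a unified approach to context tracking and captures:

Client request ID: Taken from the X-Request-ID header and logged using the client_request_id parameter.
User information: Taken from the X-User-ID header and logged as mlflow.trace.user metadata.
Session information: Taken from the X-Session-ID header and logged as mlflow.trace.session metadata.
Environment context: Taken from environment variables and automatic detection, and overridden for production settings where needed.
Application version: Taken from the APP_VERSION environment variable.
Deployment details: Custom metadata for the deployment ID and region.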
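
In the FastAPI example, `request.headers.get(...)` returns `None` when a header is missing. A minimal sketch of extracting the context defensively is shown below. The `context_from_headers` helper is hypothetical, and generating a fallback request ID is an assumption about what an application might want, not MLflow behavior.

```python
import uuid

# Header names follow the FastAPI example; keys are lowercased for lookup
HEADER_TO_METADATA_KEY = {
    "x-session-id": "mlflow.trace.session",
    "x-user-id": "mlflow.trace.user",
}


def context_from_headers(headers):
    """Return (client_request_id, metadata) built from request headers."""
    normalized = {key.lower(): value for key, value in headers.items()}

    # Assumption: generate a request ID when the client did not send one
    client_request_id = normalized.get("x-request-id") or f"req-{uuid.uuid4().hex}"

    # Only include standard metadata fields whose header is present and non-empty
    metadata = {
        metadata_key: normalized[header]
        for header, metadata_key in HEADER_TO_METADATA_KEY.items()
        if normalized.get(header)
    }
    return client_request_id, metadata


request_id, header_metadata = context_from_headers(
    {"X-Request-ID": "req-abc-123", "X-User-ID": "user-jane-doe-12345"}
)
```

Inside the handler, the two values could be passed as `client_request_id=request_id` and merged into the `metadata` dictionary.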

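Best practices

Consistent ID formats - Use standardized formats for user and session IDs across your application
Session boundaries - Define clear rules for when sessions start and end
Environment variables - Populate metadata from environment variables instead of hard-coding values
Combine context types - Track user, session, and environment context together for complete traceability
Regular analysis - Set up dashboards to monitor user behavior, session patterns, and version performance
Override defaults carefully - Modify automatically populated metadata only when necessary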
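
One possible rule for session boundaries is an idle timeout: a user keeps the same session ID until they have been inactive for longer than a threshold. The sketch below illustrates that rule only. The `SessionTracker` class, the 30-minute default, and the `session-` ID format are assumptions, and a real deployment would likely keep this state in a shared store rather than in memory.

```python
import uuid


class SessionTracker:
    """Hypothetical in-memory tracker: one session per user until idle too long."""

    def __init__(self, idle_timeout_seconds=1800.0):
        self.idle_timeout_seconds = idle_timeout_seconds
        self._sessions = {}  # user_id -> (session_id, last_seen_timestamp)

    def session_for(self, user_id, now):
        """Return the active session ID for a user, starting a new one if idle."""
        current = self._sessions.get(user_id)
        if current is not None and now - current[1] <= self.idle_timeout_seconds:
            session_id = current[0]  # still within the idle window
        else:
            session_id = f"session-{uuid.uuid4().hex}"  # consistent ID format
        self._sessions[user_id] = (session_id, now)
        return session_id


tracker = SessionTracker(idle_timeout_seconds=1800)
first = tracker.session_for("user-123", now=0.0)
second = tracker.session_for("user-123", now=600.0)    # 10 minutes later
third = tracker.session_for("user-123", now=2401.0)    # idle for more than 30 minutes
```

Here `first` and `second` share a session ID, while `third` starts a new session. The returned ID is what would be recorded as `mlflow.trace.session`.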

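Next steps

After you add user and session metadata, you can:

Search and filter traces - Query traces using the metadata from the context you added
Analyze user activity - Production-ready patterns for analyzing user behavior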
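
As a starting point for searching, the sketch below builds filter strings for the standard metadata fields. It assumes the trace search filter syntax of the form metadata.`key` = 'value' combined with AND; check the search documentation for the exact grammar supported by your MLflow version. The `metadata_filter` helper is hypothetical.

```python
def metadata_filter(**conditions):
    """Build a filter string that matches traces on metadata key/value pairs."""
    clauses = []
    for key, value in conditions.items():
        if "'" in value or "`" in key:
            # Quoting rules are not covered here, so reject these characters
            raise ValueError(f"Unsupported character in filter condition: {key}")
        clauses.append(f"metadata.`{key}` = '{value}'")
    return " AND ".join(clauses)


user_session_filter = metadata_filter(
    **{
        "mlflow.trace.user": "user-123",
        "mlflow.trace.session": "session-abc-456",
    }
)
```

The resulting string could be passed as the `filter_string` argument of `mlflow.search_traces()`.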

Last updated on 2025-12-03