MLflow Tracing integrations

MLflow Tracing integrates with a wide range of popular generative AI libraries and frameworks, and offers a one-line automatic tracing experience for each of them. This gives you immediate observability into your GenAI applications with minimal setup.

This broad support means you can gain observability without significant code changes, using the tools you already rely on. For custom components or unsupported libraries, MLflow also provides manual tracing APIs.

Automatic tracing captures your application's logic and intermediate steps, such as LLM calls, tool usage, and agent interactions, based on how the specific library or SDK is implemented.

Note:

On serverless compute clusters, autologging for GenAI tracing frameworks is not enabled automatically. You must explicitly enable it by calling the appropriate mlflow.<library>.autolog() function for each integration you want to trace.

Top integrations at a glance

The following quick-start examples cover some of the most commonly used integrations, each showing basic usage. For detailed prerequisites and more advanced scenarios, see the dedicated integration guide referenced after each example.

OpenAI

import mlflow
import openai

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/openai-tracing-demo")

openai_client = openai.OpenAI()

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?",
    }
]

response = openai_client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.1,
    max_tokens=100,
)
# View trace in MLflow UI

Full OpenAI integration guide

LangChain

import mlflow
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.langchain.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langchain-tracing-demo")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)
prompt = PromptTemplate.from_template("Tell me a joke about {topic}.")
chain = prompt | llm | StrOutputParser()

chain.invoke({"topic": "artificial intelligence"})
# View trace in MLflow UI

Full LangChain integration guide

LangGraph

import mlflow
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.langchain.autolog() # LangGraph uses LangChain's autolog

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/langgraph-tracing-demo")

@tool
def get_weather(city: str):
    """Use this to get weather information."""
    return f"It might be cloudy in {city}"

llm = ChatOpenAI(model="gpt-4o-mini")
graph = create_react_agent(llm, [get_weather])
result = graph.invoke({"messages": [("user", "what is the weather in sf?")]})
# View trace in MLflow UI

Full LangGraph integration guide

Anthropic

import mlflow
import anthropic
import os

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.anthropic.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/anthropic-tracing-demo")

client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)
# View trace in MLflow UI

Full Anthropic integration guide

DSPy

import mlflow
import dspy

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.dspy.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/dspy-tracing-demo")

lm = dspy.LM("openai/gpt-4o-mini") # Assumes OPENAI_API_KEY is set
dspy.configure(lm=lm)

class SimpleSignature(dspy.Signature):
    input_text: str = dspy.InputField()
    output_text: str = dspy.OutputField()

program = dspy.Predict(SimpleSignature)
result = program(input_text="Summarize MLflow Tracing.")
# View trace in MLflow UI

Full DSPy integration guide

Databricks

import mlflow
import os
from openai import OpenAI # Databricks FMAPI uses OpenAI client

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.openai.autolog() # Traces Databricks FMAPI using OpenAI client

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/databricks-fmapi-tracing")

client = OpenAI(
    api_key=os.environ.get("DATABRICKS_TOKEN"),
    base_url=f"{os.environ.get('DATABRICKS_HOST')}/serving-endpoints"
)
response = client.chat.completions.create(
    model="databricks-llama-4-maverick",
    messages=[{"role": "user", "content": "Key features of MLflow?"}],
)
# View trace in MLflow UI

Full Databricks integration guide

Bedrock

import mlflow
import boto3

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# import os
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.bedrock.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/bedrock-tracing-demo")

bedrock = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-east-1" # Replace with your region
)
response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    # The Converse API expects each message's content as a list of content blocks
    messages=[{"role": "user", "content": [{"text": "Hello World in one line."}]}]
)
# View trace in MLflow UI

Full Bedrock integration guide

AutoGen

import mlflow
from autogen import ConversableAgent
import os

# If running this code outside of a Databricks notebook (e.g., locally),
# uncomment and set the following environment variables to point to your Databricks workspace:
# os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
# os.environ["DATABRICKS_TOKEN"] = "your-personal-access-token"

mlflow.autogen.autolog()

mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/autogen-tracing-demo")

config_list = [{"model": "gpt-4o-mini", "api_key": os.environ.get("OPENAI_API_KEY")}]
assistant = ConversableAgent("assistant", llm_config={"config_list": config_list})
user_proxy = ConversableAgent("user_proxy", human_input_mode="NEVER", code_execution_config=False)

user_proxy.initiate_chat(assistant, message="What is 2+2?")
# View trace in MLflow UI

Full AutoGen integration guide

Secure API key management

For production environments, Databricks recommends using AI Gateway or Databricks secrets to manage API keys. AI Gateway is the preferred method and offers additional governance features.

Warning

Never store API keys directly in your code or notebooks. Always use AI Gateway or Databricks secrets for sensitive credentials.
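
One way to keep keys out of source files is to read them from the environment and fail early when they are missing. The following is a minimal sketch. The helper name `require_env` and the variable name `DEMO_API_KEY` are hypothetical and used only for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Provide it through your secret manager instead of hardcoding it."
        )
    return value


# Demonstration only: in practice the value comes from your secret store,
# never from a literal in the code.
os.environ["DEMO_API_KEY"] = "placeholder-value"
api_key = require_env("DEMO_API_KEY")
```

Failing at startup avoids a confusing authentication error later, for example when a client silently receives an empty key.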

AI Gateway (recommended)

Databricks recommends Mosaic AI Gateway for governing and monitoring access to generative AI models.

Create a Foundation Model endpoint configured with AI Gateway:

In your Databricks workspace, go to Serving > Create new endpoint.
Choose an endpoint type and a provider.
Configure the endpoint with your API key.
While configuring the endpoint, enable AI Gateway and set up rate limiting, fallbacks, and guardrails as needed.
You can get autogenerated code to start querying the endpoint quickly. Go to Serving > your endpoint > Use > Query. Make sure to add the tracing code:

import mlflow
from openai import OpenAI
import os

# How to get your Databricks token: https://docs.databricks.com/en/dev-tools/auth/pat.html
# DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')
# Alternatively in a Databricks notebook you can use this:
DATABRICKS_TOKEN = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

# Enable auto-tracing for OpenAI
mlflow.openai.autolog()

# Set up MLflow tracking (if running outside Databricks)
# If running in a Databricks notebook, these are not needed.
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Shared/my-genai-app")

client = OpenAI(
  api_key=DATABRICKS_TOKEN,
  base_url="<YOUR_HOST_URL>/serving-endpoints"
)

chat_completion = client.chat.completions.create(
  messages=[
  {
    "role": "system",
    "content": "You are an AI assistant"
  },
  {
    "role": "user",
    "content": "What is MLflow?"
  }
  ],
  model="<YOUR_ENDPOINT_NAME>",
  max_tokens=256
)

print(chat_completion.choices[0].message.content)
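
The `base_url` in the example above is the workspace host followed by `/serving-endpoints`. The following is a small sketch of a helper that builds it defensively. The function name `serving_base_url` is hypothetical, and rejecting non-HTTPS hosts is an assumption made for this example rather than a documented requirement.

```python
def serving_base_url(host: str) -> str:
    """Build the OpenAI-compatible base URL for a Databricks workspace host."""
    cleaned = host.strip().rstrip("/")  # avoid a double slash in the result
    if not cleaned.startswith("https://"):
        # Assumption for this sketch: workspace hosts are always HTTPS URLs.
        raise ValueError(f"Expected an https:// workspace URL, got {host!r}")
    return f"{cleaned}/serving-endpoints"


base_url = serving_base_url("https://your-workspace.cloud.databricks.com/")
```

Validating the host up front catches an unset or malformed value before it is passed to the client.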

Databricks secrets

Use Databricks secrets to manage API keys:

Create a secret scope and store your API key:

from databricks.sdk import WorkspaceClient

# Set your secret scope and key names
secret_scope_name = "llm-secrets"  # Choose an appropriate scope name
secret_key_name = "api-key"        # Choose an appropriate key name

# Create the secret scope and store your API key
w = WorkspaceClient()
w.secrets.create_scope(scope=secret_scope_name)
w.secrets.put_secret(
    scope=secret_scope_name,
    key=secret_key_name,
    string_value="your-api-key-here"  # Replace with your actual API key
)

Retrieve and use the secret in your code:

import mlflow
import openai
import os

# Configure your secret scope and key names
secret_scope_name = "llm-secrets"
secret_key_name = "api-key"

# Retrieve the API key from Databricks secrets
os.environ["OPENAI_API_KEY"] = dbutils.secrets.get(
    scope=secret_scope_name,
    key=secret_key_name
)

# Enable automatic tracing
mlflow.openai.autolog()

# Use OpenAI client with securely managed API key
client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain MLflow Tracing"}],
    max_tokens=100
)

Enabling multiple auto-tracing integrations

Because GenAI applications often combine several libraries, MLflow Tracing lets you enable auto-tracing for multiple integrations at the same time, giving you a unified tracing experience.

For example, to enable tracing for both LangChain and direct OpenAI calls:

import mlflow

# Enable MLflow Tracing for both LangChain and OpenAI
mlflow.langchain.autolog()
mlflow.openai.autolog()

# Your code using both LangChain and OpenAI directly...
# ... an example can be found on the Automatic Tracing page ...

MLflow generates a single, cohesive trace that combines the steps from LangChain and from direct OpenAI LLM calls, so you can inspect the complete flow. More examples of combining integrations are available on the Automatic Tracing page.
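
When an application uses several integrations, the individual autolog calls can be driven from a list. The following is a minimal sketch, assuming each integration is exposed as a module named `mlflow.<library>` with an `autolog()` function. The helper `enable_autolog` is hypothetical and not part of MLflow.

```python
import importlib


def enable_autolog(flavors: list[str]) -> dict[str, bool]:
    """Call mlflow.<flavor>.autolog() for each name and report what succeeded."""
    results: dict[str, bool] = {}
    for flavor in flavors:
        try:
            module = importlib.import_module(f"mlflow.{flavor}")
            module.autolog()
            results[flavor] = True
        except Exception as exc:  # unknown flavor, missing package, etc.
            print(f"Could not enable autolog for {flavor!r}: {exc}")
            results[flavor] = False
    return results


# Example usage (requires mlflow plus the listed libraries to be installed):
# status = enable_autolog(["langchain", "openai"])
```

Reporting failures instead of raising keeps one missing optional library from blocking tracing for the others. Whether that is the right trade-off depends on how strict your setup needs to be.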

Disabling auto-tracing

Auto-tracing for a specific library can be disabled by calling mlflow.<library>.autolog(disable=True). To disable all autologging integrations at once, use mlflow.autolog(disable=True).

import mlflow

# Disable for a specific library
mlflow.openai.autolog(disable=True)

# Disable all autologging
mlflow.autolog(disable=True)
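
To turn tracing off for only part of a program, the disable call can be wrapped in a context manager. The following is a sketch rather than an MLflow feature: the helper `autolog_paused` is hypothetical, and it re-enables autologging with default arguments, so any custom options passed to the original `autolog()` call are not restored.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def autolog_paused(autolog_fn: Callable[..., object]) -> Iterator[None]:
    """Disable an autolog integration inside a `with` block, then re-enable it."""
    autolog_fn(disable=True)
    try:
        yield
    finally:
        # Re-enable with defaults, even if the block raised an exception.
        autolog_fn()


# Example usage (requires mlflow and openai to be installed):
# import mlflow
# with autolog_paused(mlflow.openai.autolog):
#     ...  # calls made here are not traced
```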

Last updated on 2025-12-03