Use ai.analyze_sentiment with PySpark

The ai.analyze_sentiment function uses generative AI to detect the emotional state of input text with a single line of code. By default, it classifies each input as positive, negative, mixed, or neutral. It can also classify input against a set of labels that you specify. If the function can't determine the sentiment, the output for that row is null.

Note

This article covers using ai.analyze_sentiment with PySpark. To use ai.analyze_sentiment with pandas, see this article.
See other AI functions in this overview article.
Learn how to customize the configuration of AI functions.

Overview

The ai.analyze_sentiment function is available for Spark DataFrames. You must specify the name of an existing input column as a parameter.

The function returns a new DataFrame, with sentiment labels for each input text row stored in an output column.

Syntax

# Default sentiment labels
df.ai.analyze_sentiment(input_col="input", output_col="sentiment")

# Custom sentiment labels
df.ai.analyze_sentiment(input_col="input", output_col="sentiment", labels=["happy", "angry", "indifferent"])

Parameters

Name	Description
`input_col` Required	A string that contains the name of an existing column with input text values to analyze for sentiment.
`output_col` Optional	A string that contains the name of a new column to store the sentiment label for each row of input text. If you don't set this parameter, a default name is generated for the output column.
`labels` Optional	One or more strings that represent the set of sentiment labels to match to input text values.
`error_col` Optional	A string that contains the name of a new column to store any OpenAI errors that result from processing each row of input text. If you don't set this parameter, a default name is generated for the error column. If an input row has no errors, the value in this column is `null`.
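
As a minimal sketch of how these parameters combine, the helper below passes them in a single call. The helper name `analyze_reviews`, the column names `reviews`, `sentiment`, and `sentiment_error`, and the optional label list are illustrative assumptions, not part of the function's API. It also assumes that `df` is a Spark DataFrame in a notebook where the `ai` accessor is available.

```python
def analyze_reviews(df, labels=None):
    """Call ai.analyze_sentiment with explicit output and error column names.

    Hypothetical helper: `df` is assumed to be a Spark DataFrame with a text
    column named "reviews", in an environment that provides the `ai` accessor.
    """
    kwargs = {
        "input_col": "reviews",          # existing column to analyze
        "output_col": "sentiment",       # new column for the sentiment label
        "error_col": "sentiment_error",  # new column for per-row errors
    }
    if labels is not None:
        # Custom labels are used instead of the default label set.
        kwargs["labels"] = labels
    return df.ai.analyze_sentiment(**kwargs)
```

Naming the output and error columns explicitly, instead of relying on the generated defaults, makes later steps that refer to those columns easier to read.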

Returns

The function returns a Spark DataFrame with a new column that contains a sentiment label for each row of text in the input column. The default sentiment labels are positive, negative, neutral, and mixed. If you specify custom labels, those labels are used instead. If a sentiment can't be determined, the return value is null.
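
Because an undetermined sentiment comes back as null, it can be useful to separate those rows before aggregating the results. The sketch below shows one way to do that with the standard Spark `DataFrame.filter` method and SQL expression strings. The helper name `split_labeled` and the default column name `sentiment` are assumptions for illustration.

```python
def split_labeled(result_df, output_col="sentiment"):
    """Split a result DataFrame into labeled rows and rows with a null label.

    Hypothetical helper: `result_df` is assumed to be the Spark DataFrame
    returned by ai.analyze_sentiment, and `output_col` the name of its
    sentiment column (a simple identifier without spaces).
    """
    # Rows where a sentiment label was assigned
    labeled = result_df.filter(f"{output_col} IS NOT NULL")
    # Rows where the sentiment couldn't be determined
    unlabeled = result_df.filter(f"{output_col} IS NULL")
    return labeled, unlabeled
```

From there, a call such as `labeled.groupBy("sentiment").count()` gives a per-label tally, and `unlabeled` can be reviewed separately.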

Example

# This code uses AI. Always review output for mistakes.

df = spark.createDataFrame([
    ("The cleaning spray permanently stained my beautiful kitchen counter. Never again!",),
    ("I used this sunscreen on my vacation to Florida, and I didn't get burned at all. Would recommend.",),
    ("I'm torn about this speaker system. The sound was high quality, though it didn't connect to my roommate's phone.",),
    ("The umbrella is OK, I guess.",)
], ["reviews"])

sentiment = df.ai.analyze_sentiment(input_col="reviews", output_col="sentiment")
display(sentiment)

This example code cell displays a DataFrame that contains the original reviews column and a new sentiment column with one label for each row.

Related content

Use ai.analyze_sentiment with pandas.
Categorize text with ai.classify.
Generate vector embeddings with ai.embed.
Extract entities with ai.extract.
Fix grammar with ai.fix_grammar.
Answer custom user prompts with ai.generate_response.
Calculate similarity with ai.similarity.
Summarize text with ai.summarize.
Translate text with ai.translate.
Learn more about the full set of AI functions.
Customize the configuration of AI functions.
Did we miss a feature you need? Suggest it on the Fabric Ideas forum.


Last updated on 2025-11-21