Use ai.fix_grammar with PySpark

The ai.fix_grammar function uses generative AI to correct the spelling, grammar, and punctuation of input text, with a single line of code.

Overview

The ai.fix_grammar function is available for Spark DataFrames. You must specify the name of an existing input column as a parameter.

The function returns a new DataFrame that stores the corrected text for each row of input text in an output column.

Syntax

df.ai.fix_grammar(input_col="input", output_col="corrections")

Parameters

Name (requirement): Description

input_col (Required): A string that contains the name of an existing column with input text values to correct for spelling, grammar, and punctuation.

output_col (Optional): A string that contains the name of a new column to store corrected text for each row of input text. If you don't set this parameter, a default name is generated for the output column.

error_col (Optional): A string that contains the name of a new column to store any OpenAI errors that result from processing each row of input text. If you don't set this parameter, a default name is generated for the error column. If there are no errors for a row of input, the value in this column is null.
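
The error_col parameter makes it possible to separate rows that processed cleanly from rows that failed. The following is a minimal sketch of that idea, not part of the documented API: the helper name fix_grammar_split and the default column names are hypothetical, and the code assumes df is a Spark DataFrame in an environment where the ai accessor is available.

```python
def fix_grammar_split(df, input_col, output_col="corrections",
                      error_col="correction_errors"):
    """Run ai.fix_grammar and split the result by error status.

    Hypothetical helper: the function name and default column names are
    illustrative assumptions, not part of the library.
    """
    results = df.ai.fix_grammar(
        input_col=input_col,
        output_col=output_col,
        error_col=error_col,
    )
    # The error column is null for rows that produced no error, so a
    # null check separates the two groups. Backticks guard against
    # column names that contain spaces or special characters.
    succeeded = results.filter(f"`{error_col}` IS NULL")
    failed = results.filter(f"`{error_col}` IS NOT NULL")
    return succeeded, failed
```

Because Spark evaluates lazily, actions on the two returned DataFrames may each re-run the underlying AI call, so caching the intermediate result before filtering may be worth considering for larger inputs.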

Returns

The function returns a Spark DataFrame that includes a new column that contains corrected text for each row of text in the input column. If the input text is null, the result is null.

Example

# This code uses AI. Always review output for mistakes.

df = spark.createDataFrame([
    ("There are an error here.",),
    ("She and me go weigh back. We used to hang out every weeks.",),
    ("The big picture are right, but you're details is all wrong.",)
], ["text"])

results = df.ai.fix_grammar(input_col="text", output_col="corrections")
display(results)

This example code cell provides the following output:

Screenshot showing a DataFrame with a 'text' column and a 'corrections' column, which contains the text from the 'text' column with corrected grammar.
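
Since AI-generated corrections should be reviewed, it can help to list only the rows whose text actually changed. The sketch below is one possible way to do that on a small result set. The helper name changed_rows is hypothetical, and it assumes the rows have been brought to the driver, for example with results.collect(), as objects that can be indexed by column name.

```python
def changed_rows(rows, input_col="text", output_col="corrections"):
    """Return (original, corrected) pairs where the text was modified.

    Hypothetical helper: `rows` is any iterable of mappings keyed by
    column name, such as dicts or collected Spark Row objects.
    """
    pairs = []
    for row in rows:
        original, corrected = row[input_col], row[output_col]
        # Skip null values and rows that came back unchanged.
        if original is None or corrected is None:
            continue
        if original != corrected:
            pairs.append((original, corrected))
    return pairs
```

For example, changed_rows(results.collect()) would yield the pairs to review by hand. Collecting is only appropriate for DataFrames small enough to fit in driver memory.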