PreTokenizer.CreateWhiteSpace(IReadOnlyDictionary<String,Int32>) Method

Definition

Creates a new instance of the PreTokenizer class that splits the text at white space.

public static Microsoft.ML.Tokenizers.PreTokenizer CreateWhiteSpace(System.Collections.Generic.IReadOnlyDictionary<string,int>? specialTokens = default);
static member CreateWhiteSpace : System.Collections.Generic.IReadOnlyDictionary<string, int> -> Microsoft.ML.Tokenizers.PreTokenizer
Public Shared Function CreateWhiteSpace (Optional specialTokens As IReadOnlyDictionary(Of String, Integer) = Nothing) As PreTokenizer

Parameters

specialTokens
IReadOnlyDictionary<String,Int32>

The dictionary containing the special tokens and their corresponding IDs. This parameter is optional and defaults to null.

Returns

The pre-tokenizer that splits the text at white space.

Remarks

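This pre-tokenizer uses the regex pattern "\S+" to split the text into tokens. The pattern matches each maximal run of non-white-space characters, so every such run becomes one token and the white space between runs is discarded.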
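
Example

The following is a minimal sketch of what the "\S+" pattern from the Remarks yields on a sample string. It deliberately uses System.Text.RegularExpressions.Regex directly instead of calling Microsoft.ML.Tokenizers, so the class name WhiteSpaceSplitSketch and the helper Split are hypothetical, and any handling of specialTokens by the real pre-tokenizer is not modeled here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Illustration only: this does not call Microsoft.ML.Tokenizers.
public static class WhiteSpaceSplitSketch
{
    // The pattern quoted in the Remarks: one or more non-white-space characters.
    private static readonly Regex NonWhiteSpaceRun = new Regex(@"\S+", RegexOptions.Compiled);

    // Hypothetical helper: returns the (offset, length) of every match in the text.
    public static List<(int Offset, int Length)> Split(string text)
    {
        var splits = new List<(int Offset, int Length)>();
        foreach (Match match in NonWhiteSpaceRun.Matches(text))
        {
            splits.Add((match.Index, match.Length));
        }
        return splits;
    }

    public static void Main()
    {
        // Two spaces and a tab separate the three runs.
        const string text = "Hello,  world!\tfoo";
        foreach (var (offset, length) in Split(text))
        {
            Console.WriteLine($"{offset} {length} '{text.Substring(offset, length)}'");
        }
    }
}
```

For the sample string, the pattern produces three spans: "Hello,", "world!", and "foo". Punctuation stays attached to the neighboring characters because "\S" matches any character that is not white space, and consecutive white-space characters never produce an empty token.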
