TiktokenTokenizer.CreateForEncoding Method

Definition

Creates a tokenizer for the specified encoding name.

public static Microsoft.ML.Tokenizers.TiktokenTokenizer CreateForEncoding(string encodingName, System.Collections.Generic.IReadOnlyDictionary<string,int>? extraSpecialTokens = default, Microsoft.ML.Tokenizers.Normalizer? normalizer = default);
static member CreateForEncoding : string * System.Collections.Generic.IReadOnlyDictionary<string, int> * Microsoft.ML.Tokenizers.Normalizer -> Microsoft.ML.Tokenizers.TiktokenTokenizer
Public Shared Function CreateForEncoding (encodingName As String, Optional extraSpecialTokens As IReadOnlyDictionary(Of String, Integer) = Nothing, Optional normalizer As Normalizer = Nothing) As TiktokenTokenizer

Parameters

encodingName
String

The name of the Tiktoken encoding to create the tokenizer for.

extraSpecialTokens
IReadOnlyDictionary<String,Int32>

Additional special tokens, mapped to their token IDs, to use alongside the encoding's built-in special tokens.

normalizer
Normalizer

The normalizer to apply to the text before tokenization.

Returns

TiktokenTokenizer

The tokenizer created for the specified encoding.
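
Example

The following is a minimal sketch of how this method might be called, not an official sample. It assumes that "cl100k_base" is an encoding name the method accepts and that the vocabulary data for that encoding is available to the application at run time. The extra special token "<|my_marker|>" and its ID 100300 are hypothetical values chosen for illustration; the ID is assumed not to collide with any token already defined by the encoding.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML.Tokenizers;

// Assumption: "cl100k_base" is an encoding name this method accepts, and its
// vocabulary data is available to the application at run time.
const string encodingName = "cl100k_base";

// Hypothetical extra special token. The ID is an illustrative value that is
// assumed not to collide with the encoding's regular or built-in special tokens.
var extraSpecialTokens = new Dictionary<string, int>
{
    ["<|my_marker|>"] = 100300,
};

// The normalizer argument is omitted, so it keeps its default value.
TiktokenTokenizer tokenizer =
    TiktokenTokenizer.CreateForEncoding(encodingName, extraSpecialTokens);

// Encode plain text to token IDs, then decode the IDs back to text.
string text = "Hello, world!";
IReadOnlyList<int> ids = tokenizer.EncodeToIds(text);
var roundTrip = tokenizer.Decode(ids);

Console.WriteLine($"Token count: {ids.Count}");
Console.WriteLine($"Decoded text: {roundTrip}");
```

Both optional parameters can be left out entirely, in which case only the encoding's built-in special tokens are used.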

Applies to