BertTokenizer.EncodeToIds Method

Definition

Overloads

EncodeToIds(String, Int32, Boolean, String, Int32, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(string text, int maxTokenCount, bool addSpecialTokens, out string? normalizedText, out int charsConsumed, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : string * int * bool * string * int * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As String, maxTokenCount As Integer, addSpecialTokens As Boolean, ByRef normalizedText As String, ByRef charsConsumed As Integer, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
String

The text to encode.

maxTokenCount
Int32

The maximum number of tokens to return.

addSpecialTokens
Boolean

Indicates whether to add special tokens to the encoded Ids.

normalizedText
String

When this method returns, contains the normalized text. The value can be null.

charsConsumed
Int32

When this method returns, contains the number of characters consumed from the input text.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.
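
The following is a minimal sketch of calling this overload, not an official sample. It assumes the Microsoft.ML.Tokenizers package is referenced and that a tokenizer can be built with `BertTokenizer.Create` from a vocabulary file; the toy vocabulary and the file name `toy-bert-vocab.txt` are hypothetical and stand in for a real model's vocabulary.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML.Tokenizers;

// Hypothetical toy vocabulary, one token per line. A real model ships its own vocab file.
string vocabPath = Path.Combine(Path.GetTempPath(), "toy-bert-vocab.txt");
File.WriteAllLines(vocabPath, new[]
{
    "[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]",
    "!", ",", "?", "hello", "world", "how", "are", "you"
});

BertTokenizer tokenizer = BertTokenizer.Create(vocabPath);

string text = "Hello, how are you?";

// Ask for at most 5 Ids, including special tokens.
// The two out arguments report the normalized text and how much input was consumed.
IReadOnlyList<int> ids = tokenizer.EncodeToIds(
    text,
    5,      // maxTokenCount
    true,   // addSpecialTokens
    out string? normalizedText,
    out int charsConsumed);

string shownNormalizedText = normalizedText ?? "(null)";
Console.WriteLine($"Ids: {string.Join(", ", ids)}");
Console.WriteLine($"Normalized text: {shownNormalizedText}");
Console.WriteLine($"Characters consumed: {charsConsumed}");
```

If `charsConsumed` is smaller than the length of the input, the token limit was reached before the whole text was encoded.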

EncodeToIds(ReadOnlySpan<Char>, Int32, Boolean, String, Int32, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(ReadOnlySpan<char> text, int maxTokenCount, bool addSpecialTokens, out string? normalizedText, out int charsConsumed, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : ReadOnlySpan<char> * int * bool * string * int * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As ReadOnlySpan(Of Char), maxTokenCount As Integer, addSpecialTokens As Boolean, ByRef normalizedText As String, ByRef charsConsumed As Integer, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
ReadOnlySpan<Char>

The text to encode.

maxTokenCount
Int32

The maximum number of tokens to return.

addSpecialTokens
Boolean

Indicates whether to add special tokens to the encoded Ids.

normalizedText
String

When this method returns, contains the normalized text. The value can be null.

charsConsumed
Int32

When this method returns, contains the number of characters consumed from the input text.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.

EncodeToIds(String, Int32, String, Int32, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(string text, int maxTokenCount, out string? normalizedText, out int charsConsumed, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : string * int * string * int * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As String, maxTokenCount As Integer, ByRef normalizedText As String, ByRef charsConsumed As Integer, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
String

The text to encode.

maxTokenCount
Int32

The maximum number of tokens to return.

normalizedText
String

When this method returns, contains the normalized text. The value can be null.

charsConsumed
Int32

When this method returns, contains the number of characters consumed from the input text.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.

EncodeToIds(ReadOnlySpan<Char>, Int32, String, Int32, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(ReadOnlySpan<char> text, int maxTokenCount, out string? normalizedText, out int charsConsumed, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : ReadOnlySpan<char> * int * string * int * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As ReadOnlySpan(Of Char), maxTokenCount As Integer, ByRef normalizedText As String, ByRef charsConsumed As Integer, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
ReadOnlySpan<Char>

The text to encode.

maxTokenCount
Int32

The maximum number of tokens to return.

normalizedText
String

When this method returns, contains the normalized text. The value can be null.

charsConsumed
Int32

When this method returns, contains the number of characters consumed from the input text.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.

EncodeToIds(ReadOnlySpan<Char>, Boolean, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(ReadOnlySpan<char> text, bool addSpecialTokens, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : ReadOnlySpan<char> * bool * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As ReadOnlySpan(Of Char), addSpecialTokens As Boolean, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
ReadOnlySpan<Char>

The text to encode.

addSpecialTokens
Boolean

Indicates whether to add special tokens to the encoded Ids.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.
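
As an illustration only, the sketch below encodes a slice of a larger string through the span overload, once with and once without special tokens. It assumes the Microsoft.ML.Tokenizers package is referenced and uses `BertTokenizer.Create` with a hypothetical toy vocabulary file. `addSpecialTokens` is passed as a named argument because a positional `bool` would also match the `EncodeToIds(ReadOnlySpan<Char>, Boolean, Boolean)` overload.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML.Tokenizers;

// Hypothetical toy vocabulary, one token per line. A real model ships its own vocab file.
string vocabPath = Path.Combine(Path.GetTempPath(), "toy-bert-vocab-span.txt");
File.WriteAllLines(vocabPath, new[]
{
    "[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]",
    "!", ",", "?", "hello", "world", "how", "are", "you"
});

BertTokenizer tokenizer = BertTokenizer.Create(vocabPath);

// Encode only the part after the colon, without allocating a substring.
string line = "greeting: hello world!";
ReadOnlySpan<char> payload = line.AsSpan(line.IndexOf(':') + 1).Trim();

// The named argument selects the overload that has an addSpecialTokens parameter.
IReadOnlyList<int> withoutSpecial = tokenizer.EncodeToIds(payload, addSpecialTokens: false);
IReadOnlyList<int> withSpecial = tokenizer.EncodeToIds(payload, addSpecialTokens: true);

Console.WriteLine($"Without special tokens: {string.Join(", ", withoutSpecial)}");
Console.WriteLine($"With special tokens:    {string.Join(", ", withSpecial)}");
```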

EncodeToIds(String, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(string text, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : string * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As String, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
String

The text to encode.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.
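
A short sketch of the simplest overload follows, comparing the default call with one that skips normalization. It assumes the Microsoft.ML.Tokenizers package is referenced and uses `BertTokenizer.Create` with a hypothetical toy vocabulary file. How the two results differ depends on the tokenizer's configuration, so the code prints both rather than assuming a particular output.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML.Tokenizers;

// Hypothetical toy vocabulary, one token per line. A real model ships its own vocab file.
string vocabPath = Path.Combine(Path.GetTempPath(), "toy-bert-vocab-simple.txt");
File.WriteAllLines(vocabPath, new[]
{
    "[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]",
    "!", ",", "?", "hello", "world", "how", "are", "you"
});

BertTokenizer tokenizer = BertTokenizer.Create(vocabPath);

string text = "Hello, how are you?";

// Both optional parameters keep their default value of true.
IReadOnlyList<int> defaultIds = tokenizer.EncodeToIds(text);

// Skip the normalization step; the vocabulary lookup then sees the text as written.
IReadOnlyList<int> rawIds = tokenizer.EncodeToIds(text, considerNormalization: false);

Console.WriteLine($"Default:               {string.Join(", ", defaultIds)}");
Console.WriteLine($"Without normalization: {string.Join(", ", rawIds)}");
```

With a lowercase-only vocabulary such as this one, skipping normalization can cause capitalized words to fall back to the unknown token, which is one reason to leave `considerNormalization` at its default unless the input is already normalized.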

EncodeToIds(ReadOnlySpan<Char>, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(ReadOnlySpan<char> text, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : ReadOnlySpan<char> * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As ReadOnlySpan(Of Char), Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
ReadOnlySpan<Char>

The text to encode.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.

EncodeToIds(String, Boolean, Boolean, Boolean)

Source:
BertTokenizer.cs

Encodes input text to token Ids.

public System.Collections.Generic.IReadOnlyList<int> EncodeToIds(string text, bool addSpecialTokens, bool considerPreTokenization = true, bool considerNormalization = true);
override this.EncodeToIds : string * bool * bool * bool -> System.Collections.Generic.IReadOnlyList<int>
Public Function EncodeToIds (text As String, addSpecialTokens As Boolean, Optional considerPreTokenization As Boolean = true, Optional considerNormalization As Boolean = true) As IReadOnlyList(Of Integer)

Parameters

text
String

The text to encode.

addSpecialTokens
Boolean

Indicates whether to add special tokens to the encoded Ids.

considerPreTokenization
Boolean

Indicates whether to apply pre-tokenization before tokenization. The default is true.

considerNormalization
Boolean

Indicates whether to apply normalization before tokenization. The default is true.

Returns

The list of encoded Ids.
