WordPieceTokenizer.Create Method

Definition

Overloads

Create(Stream, WordPieceOptions)

Create a new instance of the WordPieceTokenizer class.

Create(String, WordPieceOptions)

Create a new instance of the WordPieceTokenizer class.

Create(Stream, WordPieceOptions)

Source:
WordPieceTokenizer.cs

Create a new instance of the WordPieceTokenizer class.

public static Microsoft.ML.Tokenizers.WordPieceTokenizer Create(System.IO.Stream vocabStream, Microsoft.ML.Tokenizers.WordPieceOptions? options = default);
static member Create : System.IO.Stream * Microsoft.ML.Tokenizers.WordPieceOptions -> Microsoft.ML.Tokenizers.WordPieceTokenizer
Public Shared Function Create (vocabStream As Stream, Optional options As WordPieceOptions = Nothing) As WordPieceTokenizer

Parameters

vocabStream
Stream

The stream containing the WordPiece vocab file.

options
WordPieceOptions

The options to use for the WordPiece tokenizer.

Returns

A new instance of the WordPieceTokenizer class.

Remarks

If the PreTokenizer is null, the whitespace pre-tokenizer will be used. When creating the tokenizer, ensure that the vocabulary stream is sourced from a trusted provider.
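
Examples

A minimal usage sketch for this overload. The file name vocab.txt is a placeholder for your own WordPiece vocabulary file (for example, the vocab.txt that ships with a BERT checkpoint), and the EncodeToIds and Decode calls assume the members inherited from the Tokenizer base class.

using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.ML.Tokenizers;

// Open the vocabulary as a stream and create the tokenizer.
// The stream should come from a trusted source.
using Stream vocabStream = File.OpenRead("vocab.txt");
WordPieceTokenizer tokenizer = WordPieceTokenizer.Create(vocabStream);

// Encode text to token IDs, then decode the IDs back to text.
IReadOnlyList<int> ids = tokenizer.EncodeToIds("Hello, world!");
Console.WriteLine(string.Join(", ", ids));
Console.WriteLine(tokenizer.Decode(ids));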

Applies to

Create(String, WordPieceOptions)

Source:
WordPieceTokenizer.cs

Create a new instance of the WordPieceTokenizer class.

public static Microsoft.ML.Tokenizers.WordPieceTokenizer Create(string vocabFilePath, Microsoft.ML.Tokenizers.WordPieceOptions? options = default);
static member Create : string * Microsoft.ML.Tokenizers.WordPieceOptions -> Microsoft.ML.Tokenizers.WordPieceTokenizer
Public Shared Function Create (vocabFilePath As String, Optional options As WordPieceOptions = Nothing) As WordPieceTokenizer

Parameters

vocabFilePath
String

The path to the WordPiece vocab file.

options
WordPieceOptions

The options to use for the WordPiece tokenizer.

Returns

A new instance of the WordPieceTokenizer class.

Remarks

If the PreTokenizer is null, the whitespace pre-tokenizer will be used. When creating the tokenizer, ensure that the vocabulary file is sourced from a trusted provider.
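
Examples

A minimal sketch of creating a tokenizer directly from a vocabulary file path. The file name vocab.txt is a placeholder, the WordPieceOptions property names shown (UnknownToken, ContinuingSubwordPrefix) are assumptions based on the usual BERT-style defaults, and CountTokens is inherited from the Tokenizer base class; check the WordPieceOptions reference for the exact members and their defaults.

using System;
using Microsoft.ML.Tokenizers;

// Options are optional; the values below mirror the assumed BERT-style defaults.
var options = new WordPieceOptions
{
    UnknownToken = "[UNK]",           // assumed default unknown-token text
    ContinuingSubwordPrefix = "##"    // assumed default subword prefix
};

// Create the tokenizer from a vocabulary file on disk (trusted source only).
WordPieceTokenizer tokenizer = WordPieceTokenizer.Create("vocab.txt", options);

// Count the tokens produced for a piece of text.
int count = tokenizer.CountTokens("unaffable");
Console.WriteLine(count);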

Applies to