SentencePieceNormalizer Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Normalizes a string according to SentencePiece normalization rules.
public sealed class SentencePieceNormalizer : Microsoft.ML.Tokenizers.Normalizer
type SentencePieceNormalizer = class
    inherit Normalizer
Public NotInheritable Class SentencePieceNormalizer
Inherits Normalizer
Inheritance: Normalizer → SentencePieceNormalizer
Constructors
| SentencePieceNormalizer(Boolean, Boolean, Boolean, Boolean, IReadOnlyDictionary<String,Int32>) | Creates a SentencePieceNormalizer object. |
Properties
| AddDummyPrefix | Gets a value indicating whether the dummy prefix character U+2581 is emitted at the beginning of the sentence during encoding. |
| EscapeWhiteSpaces | Gets a value indicating whether white spaces are escaped with the dummy prefix character U+2581. |
| RemoveExtraWhiteSpaces | Gets a value indicating whether extra white spaces are removed from the original string during normalization. |
| SpecialTokens | Gets the added tokens. |
| TreatWhitespaceAsSuffix | Gets a value indicating whether white space is treated as a suffix. |
Methods
| Normalize(ReadOnlySpan<Char>) | Normalizes the original character span according to SentencePiece normalization. |
| Normalize(String) | Normalizes the original string according to SentencePiece normalization. |
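Example
The following is a minimal, self-contained sketch of how the four Boolean options listed under Properties might combine during normalization. It does not call Microsoft.ML.Tokenizers and is not the library's implementation. The class name SentencePieceSketch, its Normalize helper, and the default argument values are hypothetical. The sketch also assumes that "extra white spaces" means leading, trailing, and repeated space characters, and that the dummy prefix is moved to the end of the text when white space is treated as a suffix.

```csharp
using System;
using System.Text;

// Hypothetical illustration only; not part of Microsoft.ML.Tokenizers.
public static class SentencePieceSketch
{
    // U+2581 (LOWER ONE EIGHTH BLOCK), the dummy prefix character.
    private const char DummyPrefix = '\u2581';

    public static string Normalize(
        string text,
        bool removeExtraWhiteSpaces = true,
        bool addDummyPrefix = true,
        bool escapeWhiteSpaces = true,
        bool treatWhitespaceAsSuffix = false)
    {
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }

        string body = text;
        if (removeExtraWhiteSpaces)
        {
            // Assumption: drop leading/trailing spaces and collapse runs of spaces.
            var collapsed = new StringBuilder(text.Length);
            bool previousWasSpace = false;
            foreach (char c in text.Trim(' '))
            {
                if (c == ' ')
                {
                    if (!previousWasSpace)
                    {
                        collapsed.Append(c);
                    }
                    previousWasSpace = true;
                }
                else
                {
                    collapsed.Append(c);
                    previousWasSpace = false;
                }
            }
            body = collapsed.ToString();
        }

        if (body.Length == 0)
        {
            return string.Empty;
        }

        // When escaping is enabled, every space is written as U+2581.
        char space = escapeWhiteSpaces ? DummyPrefix : ' ';
        var result = new StringBuilder(body.Length + 1);

        // Assumption: the dummy prefix goes first unless white space is a suffix.
        if (addDummyPrefix && !treatWhitespaceAsSuffix)
        {
            result.Append(space);
        }

        foreach (char c in body)
        {
            result.Append(c == ' ' ? space : c);
        }

        if (addDummyPrefix && treatWhitespaceAsSuffix)
        {
            result.Append(space);
        }

        return result.ToString();
    }
}
```

Under these assumptions, SentencePieceSketch.Normalize("  Hello   world ") returns "▁Hello▁world", and passing treatWhitespaceAsSuffix: true returns "Hello▁world▁". The actual SentencePieceNormalizer may differ in details such as special token handling, which this sketch omits.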