VoiceLiveModelFactory Class
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
A factory class for creating instances of the models for mocking.
public static class VoiceLiveModelFactory
type VoiceLiveModelFactory = class
Public Class VoiceLiveModelFactory
Inheritance
Object → VoiceLiveModelFactory
Methods
| Name | Description |
|---|---|
| AnimationOptions(String, IEnumerable<AnimationOutputType>) |
Configuration for animation outputs including blendshapes and visemes metadata. |
| AssistantMessageItem(String, IEnumerable<MessageContentPart>, Nullable<ItemParamStatus>) |
An assistant message item within a conversation. |
| AudioEchoCancellation() |
Echo cancellation configuration for server-side audio processing. |
| AudioInputTranscriptionOptions(AudioInputTranscriptionOptionsModel, String, IDictionary<String,String>, IEnumerable<String>) |
Configuration for input audio transcription. |
| AudioNoiseReduction(AudioNoiseReductionType) |
Configuration for input audio noise reduction. |
| AvatarConfiguration(IEnumerable<IceServer>, String, String, Boolean, VideoParams) |
Configuration for avatar streaming and behavior during the session. |
| AzureCustomVoice(String, String, Nullable<Single>, String, IEnumerable<String>, String, String, String, String, String) |
Azure custom voice configuration. |
| AzurePersonalVoice(String, Nullable<Single>, PersonalVoiceModels) |
Azure personal voice configuration. |
| AzureSemanticEouDetection(Nullable<EouThresholdLevel>, Nullable<Single>) |
Azure semantic end-of-utterance detection (default). |
| AzureSemanticEouDetectionEn(Nullable<EouThresholdLevel>, Nullable<Single>) |
Azure semantic end-of-utterance detection (English-optimized). |
| AzureSemanticEouDetectionMultilingual(Nullable<EouThresholdLevel>, Nullable<Single>) |
Azure semantic end-of-utterance detection (multilingual). |
| AzureSemanticVadTurnDetection(Nullable<Single>, Nullable<Int32>, Nullable<Int32>, EouDetection, Nullable<Int32>, Nullable<Boolean>, IEnumerable<String>, Nullable<Boolean>, Nullable<Boolean>, Nullable<Boolean>) |
Server Speech Detection (Azure semantic VAD, default variant). |
| AzureSemanticVadTurnDetectionEn(Nullable<Single>, Nullable<Int32>, Nullable<Int32>, EouDetection, Nullable<Int32>, Nullable<Boolean>, Nullable<Boolean>, Nullable<Boolean>, Nullable<Boolean>) |
Server Speech Detection (Azure semantic VAD, English-only). |
| AzureSemanticVadTurnDetectionMultilingual(Nullable<Single>, Nullable<Int32>, Nullable<Int32>, EouDetection, Nullable<Int32>, Nullable<Boolean>, IEnumerable<String>, Nullable<Boolean>, Nullable<Boolean>, Nullable<Boolean>) |
Server Speech Detection (Azure semantic VAD, multilingual). |
| AzureStandardVoice(String, Nullable<Single>, String, IEnumerable<String>, String, String, String, String, String) |
Azure standard voice configuration. |
| AzureVoice(String) |
Base for Azure voice configurations. Please note this is the abstract base class. The derived classes available for instantiation are: AzureCustomVoice, AzureStandardVoice, and AzurePersonalVoice. |
| CachedTokenDetails(Int32, Int32) |
Details of cached token usage. |
| ConversationRequestItem(String, String) |
Base for any response item; discriminated by |
| EouDetection(String) |
Top-level union for end-of-utterance (EOU) semantic detection configuration. Please note this is the abstract base class. The derived classes available for instantiation are: AzureSemanticEouDetection, AzureSemanticEouDetectionEn, and AzureSemanticEouDetectionMultilingual. |
| FunctionCallItem(String, String, String, String, Nullable<ItemParamStatus>) |
A function call item within a conversation. |
| FunctionCallOutputItem(String, String, String, Nullable<ItemParamStatus>) |
A function call output item within a conversation. |
| IceServer(IEnumerable<Uri>, String, String) |
ICE server configuration for WebRTC connection negotiation. |
| InputAudioContentPart(String, String) |
Input audio content part. |
| InputTextContentPart(String) |
Input text content part. |
| InputTokenDetails(Int32, Int32, Int32, CachedTokenDetails) |
Details of input token usage. |
| LogProbProperties(String, Single, BinaryData) |
A single log probability entry for a token. |
| MessageContentPart(String) |
Base for any message content part; discriminated by |
| MessageItem(String, IEnumerable<MessageContentPart>, Nullable<ItemParamStatus>) |
A message item within a conversation. |
| OpenAIVoice(OAIVoice) |
OpenAI voice configuration with an explicit type field. This provides a unified interface for OpenAI voices, complementing the existing string-based OAIVoice for backward compatibility. |
| OutputTextContentPart(String) |
Output text content part. |
| OutputTokenDetails(Int32, Int32) |
Details of output token usage. |
| RequestAudioContentPart(String) |
An audio content part for a request. |
| RequestTextContentPart(String) |
A text content part for a request. |
| ResponseAudioContentPart(String) |
An audio content part for a response. |
| ResponseCancelledDetails(ResponseCancelledDetailsReason) |
Details for a cancelled response. |
| ResponseFailedDetails(BinaryData) |
Details for a failed response. |
| ResponseFunctionCallItem(String, String, String, String, String, SessionResponseItemStatus) |
A function call item within a conversation. |
| ResponseFunctionCallOutputItem(String, String, String, String) |
A function call output item within a conversation. |
| ResponseIncompleteDetails(ResponseIncompleteDetailsReason) |
Details for an incomplete response. |
| ResponseStatusDetails(String) |
Base for all non-success response details. Please note this is the abstract base class. The derived classes available for instantiation are: ResponseCancelledDetails, ResponseIncompleteDetails, and ResponseFailedDetails. |
| ResponseTextContentPart(String) |
A text content part for a response. |
| ResponseTokenStatistics(Int32, Int32, Int32, InputTokenDetails, OutputTokenDetails) |
Overall usage statistics for a response. |
| ServerVadTurnDetection(Nullable<Single>, Nullable<Int32>, Nullable<Int32>, EouDetection, Nullable<Boolean>, Nullable<Boolean>, Nullable<Boolean>) |
Base model for VAD-based turn detection. |
| SessionResponse(String, String, Nullable<SessionResponseStatus>, ResponseStatusDetails, IEnumerable<SessionResponseItem>, ResponseTokenStatistics, String, VoiceProvider, IEnumerable<InteractionModality>, Nullable<OutputAudioFormat>, Nullable<Single>, MaxResponseOutputTokensOption) |
The response resource. |
| SessionResponseItem(String, String, String) |
Base for any response item; discriminated by |
| SessionResponseMessageItem(String, String, ResponseMessageRole, IEnumerable<VoiceLiveContentPart>, SessionResponseItemStatus) |
Base type for a message item within a conversation. |
| SessionUpdate(String, String) |
A voicelive server event. Please note this is the abstract base class. The derived classes available for instantiation are: SessionUpdateError, SessionUpdateSessionCreated, SessionUpdateSessionUpdated, SessionUpdateAvatarConnecting, SessionUpdateInputAudioBufferCommitted, SessionUpdateInputAudioBufferCleared, SessionUpdateInputAudioBufferSpeechStarted, SessionUpdateInputAudioBufferSpeechStopped, SessionUpdateConversationItemCreated, SessionUpdateConversationItemInputAudioTranscriptionCompleted, SessionUpdateConversationItemInputAudioTranscriptionFailed, SessionUpdateConversationItemTruncated, SessionUpdateConversationItemDeleted, SessionUpdateResponseCreated, SessionUpdateResponseDone, SessionUpdateResponseOutputItemAdded, SessionUpdateResponseOutputItemDone, SessionUpdateResponseContentPartAdded, SessionUpdateResponseContentPartDone, SessionUpdateResponseTextDelta, SessionUpdateResponseTextDone, SessionUpdateResponseAudioTranscriptDelta, SessionUpdateResponseAudioTranscriptDone, SessionUpdateResponseAudioDelta, SessionUpdateResponseAudioDone, SessionUpdateResponseAnimationBlendshapeDelta, SessionUpdateResponseAnimationBlendshapeDone, SessionUpdateResponseAudioTimestampDelta, SessionUpdateResponseAudioTimestampDone, SessionUpdateResponseAnimationVisemeDelta, SessionUpdateResponseAnimationVisemeDone, SessionUpdateConversationItemInputAudioTranscriptionDelta, SessionUpdateConversationItemRetrieved, SessionUpdateResponseFunctionCallArgumentsDelta, and SessionUpdateResponseFunctionCallArgumentsDone. |
| SessionUpdateAvatarConnecting(String, String) |
Sent when the server is in the process of establishing an avatar media connection and provides its SDP answer. |
| SessionUpdateConversationItemCreated(String, String, SessionResponseItem) |
Returned when a conversation item is created. There are several scenarios that produce this event:
|
| SessionUpdateConversationItemDeleted(String, String) |
Returned when an item in the conversation is deleted by the client with a
|
| SessionUpdateConversationItemInputAudioTranscriptionCompleted(String, String, Int32, String) |
This event is the output of audio transcription for user audio written to the
user audio buffer. Transcription begins when the input audio buffer is
committed by the client or server (in |
| SessionUpdateConversationItemInputAudioTranscriptionDelta(String, String, Nullable<Int32>, String, IEnumerable<LogProbProperties>) |
Returned when the text value of an input audio transcription content part is updated. |
| SessionUpdateConversationItemInputAudioTranscriptionFailed(String, String, Int32, VoiceLiveErrorDetails) |
Returned when input audio transcription is configured, and a transcription
request for a user message failed. These events are separate from other
|
| SessionUpdateConversationItemRetrieved(SessionResponseItem, String) |
Returned when a conversation item is retrieved with |
| SessionUpdateConversationItemTruncated(String, Int32, Int32, String) |
Returned when an earlier assistant audio message item is truncated by the
client with a |
| SessionUpdateError(String, SessionUpdateErrorDetails) |
Returned when an error occurs, which could be a client problem or a server problem. Most errors are recoverable and the session stays open. Implementors should monitor and log error messages by default. |
| SessionUpdateErrorDetails(String, String, String, String, String) |
Details of the error. |
| SessionUpdateInputAudioBufferCleared(String) |
Returned when the input audio buffer is cleared by the client with a
|
| SessionUpdateInputAudioBufferCommitted(String, String, String) |
Returned when an input audio buffer is committed, either by the client or
automatically in server VAD mode. The |
| SessionUpdateInputAudioBufferSpeechStarted(String, Int32, String) |
Sent by the server when in |
| SessionUpdateInputAudioBufferSpeechStopped(String, Int32, String) |
Returned in |
| SessionUpdateResponseAnimationBlendshapeDelta(String, String, String, Int32, Int32, BinaryData, Int32) |
Represents a delta update of blendshape animation frames for a specific output of a response. |
| SessionUpdateResponseAnimationBlendshapeDone(String, String, String, Int32) |
Indicates the completion of blendshape animation processing for a specific output of a response. |
| SessionUpdateResponseAnimationVisemeDelta(String, String, String, Int32, Int32, Int32, Int32) |
Represents a viseme ID delta update for animation based on audio. |
| SessionUpdateResponseAnimationVisemeDone(String, String, String, Int32, Int32) |
Indicates completion of viseme animation delivery for a response. |
| SessionUpdateResponseAudioDelta(String, String, String, Int32, Int32, BinaryData) |
Returned when the model-generated audio is updated. |
| SessionUpdateResponseAudioDone(String, String, String, Int32, Int32) |
Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseAudioTimestampDelta(String, String, String, Int32, Int32, Int32, Int32, String) |
Represents a word-level audio timestamp delta for a response. |
| SessionUpdateResponseAudioTimestampDone(String, String, String, Int32, Int32) |
Indicates completion of audio timestamp delivery for a response. |
| SessionUpdateResponseAudioTranscriptDelta(String, String, String, Int32, Int32, String) |
Returned when the model-generated transcription of audio output is updated. |
| SessionUpdateResponseAudioTranscriptDone(String, String, String, Int32, Int32, String) |
Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseContentPartAdded(String, String, String, Int32, Int32, VoiceLiveContentPart) |
Returned when a new content part is added to an assistant message item during response generation. |
| SessionUpdateResponseContentPartDone(String, String, String, Int32, Int32, VoiceLiveContentPart) |
Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseCreated(String, SessionResponse) |
Returned when a new Response is created. The first event of response creation,
where the response is in an initial state of |
| SessionUpdateResponseDone(String, SessionResponse) |
Returned when a Response is done streaming. Always emitted, no matter the
final state. The Response object included in the |
| SessionUpdateResponseFunctionCallArgumentsDelta(String, String, String, Int32, String, String) |
Returned when the model-generated function call arguments are updated. |
| SessionUpdateResponseFunctionCallArgumentsDone(String, String, String, Int32, String, String, String) |
Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseOutputItemAdded(String, String, Int32, SessionResponseItem) |
Returned when a new Item is created during Response generation. |
| SessionUpdateResponseOutputItemDone(String, String, Int32, SessionResponseItem) |
Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateResponseTextDelta(String, String, String, Int32, Int32, String) |
Returned when the text value of a "text" content part is updated. |
| SessionUpdateResponseTextDone(String, String, String, Int32, Int32, String) |
Returned when the text value of a "text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. |
| SessionUpdateSessionCreated(String, VoiceLiveSessionResponse) |
Returned when a Session is created. Emitted automatically when a new connection is established as the first server event. This event will contain the default Session configuration. |
| SessionUpdateSessionUpdated(String, VoiceLiveSessionResponse) |
Returned when a session is updated with a |
| SystemMessageItem(String, IEnumerable<MessageContentPart>, Nullable<ItemParamStatus>) |
A system message item within a conversation. |
| TurnDetection(String) |
Top-level union for turn detection configuration. Please note this is the abstract base class. The derived classes available for instantiation are: ServerVadTurnDetection, AzureSemanticVadTurnDetection, AzureSemanticVadTurnDetectionEn, and AzureSemanticVadTurnDetectionMultilingual. |
| UserMessageItem(String, IEnumerable<MessageContentPart>, Nullable<ItemParamStatus>) |
A user message item within a conversation. |
| VideoBackground(String, String) |
Defines a video background, either a solid color or an image URL (mutually exclusive). |
| VideoCrop(IEnumerable<Int32>, IEnumerable<Int32>) |
Defines a video crop rectangle using top-left and bottom-right coordinates. |
| VideoParams(Nullable<Int32>, String, VideoCrop, VideoResolution, VideoBackground, Nullable<Int32>) |
Video streaming parameters for avatar. |
| VideoResolution(Int32, Int32) |
Resolution of the video feed in pixels. |
| VoiceLiveContentPart(String) |
Base for any content part; discriminated by |
| VoiceLiveErrorDetails(String, String, String, String, String) |
Error object returned in case of API failure. |
| VoiceLiveFunctionDefinition(String, String, BinaryData) |
The definition of a function tool as used by the voicelive endpoint. |
| VoiceLiveSessionOptions(String, IEnumerable<InteractionModality>, AnimationOptions, VoiceProvider, String, Nullable<Int32>, Nullable<InputAudioFormat>, Nullable<OutputAudioFormat>, AudioNoiseReduction, AudioEchoCancellation, AvatarConfiguration, AudioInputTranscriptionOptions, IEnumerable<AudioTimestampType>, IEnumerable<VoiceLiveToolDefinition>, ToolChoiceOption, Nullable<Single>, MaxResponseOutputTokensOption, BinaryData) |
Base for session configuration shared between request and response. |
| VoiceLiveToolDefinition(String) |
The base representation of a voicelive tool definition. Please note this is the abstract base class. The derived class available for instantiation is VoiceLiveFunctionDefinition. |
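
The factory methods listed above can be used in unit tests to build model instances without calling the service. The following is a minimal sketch, not a definitive usage pattern. It assumes the factory lives in the `Azure.AI.VoiceLive` namespace, that each method returns a type with the same name as the method, that the integer arguments carry the meanings noted in the comments (the table gives only parameter types, not names), and that `ResponseTokenStatistics` exposes a `TotalTokens` property.

```csharp
using System;
using Azure.AI.VoiceLive; // assumed namespace; not stated in the listing above

// Argument meanings are assumptions: the listing gives only parameter types.
CachedTokenDetails cached = VoiceLiveModelFactory.CachedTokenDetails(0, 0);
InputTokenDetails inputDetails = VoiceLiveModelFactory.InputTokenDetails(0, 60, 10, cached);
OutputTokenDetails outputDetails = VoiceLiveModelFactory.OutputTokenDetails(40, 10);

// Assumed order: total, input, output, then the two detail objects.
ResponseTokenStatistics usage = VoiceLiveModelFactory.ResponseTokenStatistics(
    120, 70, 50, inputDetails, outputDetails);

// TotalTokens is an assumed property name on ResponseTokenStatistics.
// In a test, pass `usage` to the code under test in place of a service response.
Console.WriteLine(usage.TotalTokens);
```

Passing arguments positionally keeps the sketch tied to the parameter types shown in the table. Once the parameter names are confirmed against the package, named arguments would make the intent of each value clearer.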