Sanskrit text to Sanskrit speech conversion using Azure text to speech services

Murthy 20 Reputation points
2025-11-09T06:18:13.95+00:00

Hello,

I have a program that converts Telugu text captured in a Word document into speech using the Azure Speech service. It works fine. However, the Telugu text is actually Sanskrit text transliterated into Telugu script using conversion tools. I had to do it this way to get Sanskrit speech output because I could not find any voice name for Sanskrit, such as one starting with sa-IN-. I looked for one in westus, eastus, and centralindia but didn't find it. However, I heard that Azure Speech supports Sanskrit text-to-speech conversion and that it has a Sanskrit voice called "sa-IN-MadhurNeural". Can anyone please clarify whether that support exists? And if it does, which region should I look in?

If I can use sa-IN directly, I can use the Devanagari text itself, because all the documents I want to convert to speech are in Devanagari only. That way I can avoid the additional step of transliterating into Telugu text first, and also avoid the pronunciation issues that arise when a Telugu voice is used to speak Sanskrit.

Appreciate the help.

regards

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Answer accepted by question author
  1. Nikhil Jha (Accenture International Limited) 4,150 Reputation points Microsoft External Staff Moderator
    2025-11-14T07:39:40.1366667+00:00

    Hello Murthy,

    Apologies for the delay in response. I understand you've been using a workaround by transliterating Sanskrit text into Telugu script to generate speech output, and you're looking for a direct Sanskrit voice to avoid pronunciation issues and streamline your workflow with Devanagari text.

    After thorough verification against the official Microsoft Azure Speech service documentation, I must inform you that there is currently NO Sanskrit-specific neural voice available in Azure Text-to-Speech services. Specifically, the voice "sa-IN-MadhurNeural" does not exist in the Azure Speech portfolio.
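
    If you would like to confirm this yourself for each region, the following is a minimal sketch that queries the voices list endpoint of the Speech REST API and filters the result by locale. It uses only the Python standard library. The helper names (fetch_voices, voices_for_locale, report) are my own and not part of any SDK, and the subscription key is a placeholder, so please treat it as an illustration rather than a definitive implementation.

```python
import json
import urllib.request

# Voices list endpoint of the Speech REST API; {region} is filled in per call.
VOICES_URL = "https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"


def fetch_voices(region, subscription_key):
    """Download the voice list for one region (makes a network request)."""
    request = urllib.request.Request(
        VOICES_URL.format(region=region),
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def voices_for_locale(voices, locale_prefix):
    """Return the ShortName of every voice whose Locale starts with locale_prefix."""
    prefix = locale_prefix.lower()
    return sorted(
        voice["ShortName"]
        for voice in voices
        if voice.get("Locale", "").lower().startswith(prefix)
    )


def report(regions, subscription_key, locale_prefix="sa-"):
    """Print the matching voices (if any) for each region."""
    for region in regions:
        names = voices_for_locale(fetch_voices(region, subscription_key), locale_prefix)
        print(region, names or "no matching voices")
```

    For example, report(["westus", "eastus", "centralindia"], "YourSubscriptionKey") prints, for each region, the voices whose locale starts with "sa-". Passing "hi-" instead lists the Hindi voices available in that region.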

    Current Sanskrit Support Status in Azure AI Services

    While Sanskrit (language code: sa) is supported in some Azure AI services, its availability varies significantly across features:

    Sanskrit IS supported in:

    Sanskrit is NOT supported in:

    • Azure Text-to-Speech (TTS) - No neural voices available
    • Azure Speech-to-Text (STT) - No transcription models available

    Recommended Workarounds
    1: Use a Hindi Voice (I would suggest this as the most practical approach)

    The hi-IN-MadhurNeural voice you may have heard about can process Devanagari script text, as Hindi uses the same script. However, pronunciation will follow Hindi phonetic rules rather than authentic Sanskrit pronunciation patterns.

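    import azure.cognitiveservices.speech as speechsdk

    # Configure the Speech service
    speech_config = speechsdk.SpeechConfig(
        subscription="YourSubscriptionKey",
        region="centralindia"  # or eastus, westus, etc.
    )

    # Use a Hindi voice for Devanagari text
    speech_config.speech_synthesis_voice_name = "hi-IN-MadhurNeural"

    # Write the synthesized audio to a WAV file
    audio_config = speechsdk.audio.AudioOutputConfig(filename="sanskrit_output.wav")
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config,
        audio_config=audio_config
    )

    # Your Sanskrit text in Devanagari
    sanskrit_text = "नमस्ते भारत"

    result = synthesizer.speak_text_async(sanskrit_text).get()

    # Check whether synthesis succeeded or was canceled
    if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
        print("Audio saved to sanskrit_output.wav")
    elif result.reason == speechsdk.ResultReason.Canceled:
        details = result.cancellation_details
        print(f"Synthesis canceled: {details.reason}")
        if details.reason == speechsdk.CancellationReason.Error:
            print(f"Error details: {details.error_details}")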
    

    However, this has a few limitations:

    • Pronunciation follows Hindi phonetics, not classical Sanskrit.
    • Sandhi rules and Vedic accent patterns won't be accurate.
    • May mispronounce Sanskrit-specific conjuncts and combinations.

    Note: This is a code sample based on the available documentation along with a few custom adjustments. Since environments and requirements may vary, I would kindly recommend reviewing and validating the code in a safe or test environment before applying it to production.
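
    If the default pace of the Hindi voice is too fast for recitation, one option is to send SSML instead of plain text. The sketch below builds an SSML document that selects the voice and slows the speaking rate. The helper name build_ssml and the "-10%" rate are assumptions chosen for illustration. Slowing the rate only changes pacing; the pronunciation will still follow Hindi phonetic rules.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="hi-IN-MadhurNeural", lang="hi-IN", rate="-10%"):
    """Wrap text in SSML that selects a voice and adjusts the speaking rate."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        # escape() protects characters such as < and & inside the text
        f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
        "</voice></speak>"
    )


ssml = build_ssml("नमस्ते भारत")
print(ssml)
```

    The resulting string can be passed to synthesizer.speak_ssml_async(ssml).get() in place of the speak_text_async call. The voice name set on speech_config is not needed in that case, because the SSML names the voice itself.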

    2: Custom Neural Voice (Enterprise Solution)

    For production-grade Sanskrit TTS with authentic pronunciation, consider creating a Custom Neural Voice. https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/azure-ai-speech-text-to-speech-feb-2025-updates-new-hd-voices-and-more/4387263

    This requires:

    • High-quality Sanskrit speech recordings (10-50 hours minimum)
    • Professional Sanskrit voice talent.
    • Azure Custom Neural Voice access (requires application approval).
    • Significant investment in time and resources.

    I would also suggest staying informed:


    I hope this helps you find a way forward.
    Please accept the answer and upvote it so that it can help other community members.
