aura-tts model | Clarifai - The World's AI

aura-tts

The Deepgram Aura Text-to-Speech Model offers rapid, high-quality, and efficient speech synthesis, enabling lifelike voices for AI agents across various applications.

Notes

Introduction

The Aura Text-to-Speech Model, developed by Deepgram, offers a unique blend of speed, quality, and efficiency for generating natural-sounding speech. This model is engineered to empower AI agents with lifelike voices, enabling seamless interactions in conversational AI applications.

Aura Text-to-Speech Model

The Aura Text-to-Speech Model is optimized for real-time voicebots and conversational AI applications. It provides lightning-fast response times and high-quality speech synthesis, enhancing user experience and engagement.

Features:

Speed: Industry-leading latency of less than 250 ms ensures rapid responses in conversational contexts.
Quality: Human-like tone, rhythm, and emotion enrich interactions, fostering natural dialogues.
Scale: Cost-efficient and scalable architecture suitable for high-throughput applications.
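
A latency figure like the one above is easiest to interpret when measured from your own environment, since network round trips add to the synthesis time. The sketch below times any callable with time.perf_counter. The names measure_latency_ms and fake_synthesize are hypothetical and used only for illustration; the stand-in function would be replaced by a real request, and an end-to-end time measured this way may not match the vendor's figure.

```python
import time
from typing import Any, Callable, Tuple


def measure_latency_ms(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_synthesize(text: str) -> bytes:
    """Hypothetical stand-in for a real text-to-speech request."""
    time.sleep(0.01)  # simulate a short round trip
    return text.encode()


audio, latency_ms = measure_latency_ms(fake_synthesize, "Hello")
print(f"latency: {latency_ms:.1f} ms")
```

Averaging over several calls would give a steadier estimate than a single measurement.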

Run Deepgram Text-to-speech Model with an API

Running the API with Clarifai's Python SDK

You can run the Deepgram Text-to-speech Model API using Clarifai’s Python SDK.

Export your personal access token (PAT) as an environment variable, then import and initialize the API client.

You can find your PAT in your security settings.

export CLARIFAI_PAT={your personal access token}

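from clarifai.client.model import Model

# Text to synthesize (named `text` to avoid shadowing the builtin `input`).
text = "I love your product very much"

# Your Deepgram API key; replace the placeholder with your own key.
api_key = "YOUR_DEEPGRAM_API_KEY"

# The voice and the vendor API key are passed as inference parameters.
inference_params = dict(model="aura-asteria-en", api_key=api_key)

# Model Predict
model_prediction = Model("https://clarifai.com/deepgram/tts/models/aura-tts").predict_by_bytes(
    text.encode(), input_type="text", inference_params=inference_params
)

# The synthesized audio is returned in the audio.base64 field of the first output.
audio_bytes = model_prediction.outputs[0].data.audio.base64

with open('audio_file.wav', 'wb') as f:
    f.write(audio_bytes)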
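
The example above saves the result with a .wav extension. As a hedged aside, if you are unsure which container the returned bytes use, a look at the leading bytes can usually tell WAV and MP3 apart. This is a heuristic sketch, not part of the Clarifai SDK: guess_audio_extension and save_audio are hypothetical helpers introduced here, and they only recognize these two formats.

```python
def guess_audio_extension(data: bytes) -> str:
    """Guess a file extension from the leading bytes of an audio payload."""
    # WAV files are RIFF containers: "RIFF", a 4-byte size, then "WAVE".
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return ".wav"
    # MP3 files often start with an ID3v2 tag...
    if data[:3] == b"ID3":
        return ".mp3"
    # ...or directly with an MPEG frame sync (11 set bits). Heuristic only.
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return ".mp3"
    return ".bin"  # unknown: keep the bytes without claiming a format


def save_audio(data: bytes, stem: str = "audio_file") -> str:
    """Write the bytes to <stem><guessed extension> and return the path."""
    path = stem + guess_audio_extension(data)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With the prediction from the example above, the call would be save_audio(model_prediction.outputs[0].data.audio.base64).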

You can also run the Deepgram text-to-speech (TTS) API using other Clarifai client libraries, such as Java, cURL, NodeJS, and PHP.

Voices:

Aura initially offers 12 English-speaking voices (7 male, 5 female), trained on high-quality conversational datasets. Each voice is identified by a unique model name in the format [modelname]-[voicename]-[language].

To select a voice, set the model name in the inference parameters, for example model="aura-asteria-en".
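
The naming format above can be captured in a couple of small helpers. This is a minimal sketch: build_voice_model and parse_voice_model are hypothetical names introduced here for illustration, not part of the Clarifai SDK or the Deepgram API, and the code only checks the three-part shape of the name, not whether the voice actually exists.

```python
from typing import NamedTuple


class VoiceModel(NamedTuple):
    model: str     # e.g. "aura"
    voice: str     # e.g. "asteria"
    language: str  # e.g. "en"


def build_voice_model(voice: str, model: str = "aura", language: str = "en") -> str:
    """Join the parts into the [modelname]-[voicename]-[language] format."""
    return f"{model}-{voice}-{language}"


def parse_voice_model(name: str) -> VoiceModel:
    """Split a model name into its three parts, rejecting malformed names."""
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected [modelname]-[voicename]-[language], got {name!r}")
    return VoiceModel(*parts)


# The resulting string is what goes into the inference parameters.
inference_params = dict(model=build_voice_model("asteria"))
```

Validating the name locally in this way would catch typos such as a missing language suffix before a request is sent.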

Use Cases

The Aura Text-to-Speech Model finds application in various domains where natural-sounding voice synthesis is crucial, including:

Conversational AI platforms
Virtual assistants
Interactive voice response (IVR) systems
Voice-enabled applications and devices

Advantages

Speed: Aura is the quickest among premium options and excels in fast response times, enhancing user satisfaction and engagement.
Quality: With capabilities to replicate authentic human dialogues, including natural cadences, pauses, audible breaths, and hesitation sounds, Aura delivers superior quality speech synthesis.
Efficiency: Optimized for efficiency, Aura follows Deepgram’s standard usage-based pricing scheme, ensuring cost-effectiveness for businesses of all sizes.

Disclaimer

Please be advised that this model utilizes wrapped Artificial Intelligence (AI) provided by Deepgram (the "Vendor"). These AI models may collect, process, and store data as part of their operations. By using our website and accessing these AI models, you hereby consent to the data practices of the Vendor. We do not have control over the data collection, processing, and storage practices of the Vendor. Therefore, we cannot be held responsible or liable for any data handling practices, data loss, or breaches that may occur. It is your responsibility to review the privacy policies and terms of service of the Vendor to understand their data practices.

You can access the Vendor's privacy policy and terms of service at https://deepgram.com/privacy. We disclaim all liability with respect to the actions or omissions of the Vendor, and we encourage you to exercise caution and to ensure that you are comfortable with these practices before utilizing the AI models hosted on our site.

Model Type ID: Text To Audio
Input Type: text
Output Type: audio
Last Updated: Mar 20, 2024
Privacy: PUBLIC