Text-To-Speech (TTS)

Aculab Cloud supports the Polly and Cepstral Text-To-Speech (TTS) engines.

Selecting a voice in the REST API

In the REST API Play action, the text_to_say property supports Speech Synthesis Markup Language (SSML), allowing you to change the way your text is spoken. However, SSML cannot be used here to select the voice that says your text; by default, the voice configured in your service is used. To choose a different voice, set tts_voice to a Selector from the voice tables below. For example, to select the English US Female Polly Kimberly voice, use the following setting for tts_voice:

"tts_voice" : "English US Female Polly Kimberly"

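As a rough illustration, the sketch below builds a play action in Python and serialises it to JSON. The text_to_say and tts_voice properties are the ones described above; the surrounding structure (the "type", "play" and "play_list" keys) and the helper name make_play_action are assumptions made for this example, so check the REST API Play action reference for the exact shape.

```python
import json


def make_play_action(text, voice=None):
    """Build a dict describing a play action that says `text`.

    The nesting used here ("type", "play", "play_list") is an assumption
    for illustration; only text_to_say and tts_voice come from the text above.
    """
    item = {"text_to_say": text}
    if voice is not None:
        # Leave tts_voice out to fall back to the voice configured in the service.
        item["tts_voice"] = voice
    return {"type": "play", "play": {"play_list": [item]}}


action = make_play_action(
    "I have something to say.",
    voice="English US Female Polly Kimberly",
)
print(json.dumps(action, indent=2))
```

Passing voice=None omits the tts_voice property altogether, which leaves the choice of voice to the service configuration.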
Selecting a voice in the UAS API

In the UAS API, the Say methods support Speech Synthesis Markup Language (SSML), allowing you to change the way your text is spoken. For example, you can choose the voice with the voice tag. You can also choose the TTS engine with the optional acu-engine tag which, if provided, must be the outermost tag in the string. If you don't provide these tags, your account's Default TTS voice will be used. For example, to select the English US Female Polly Kimberly voice, use the following SSML:

channel.FilePlayer.Say("<acu-engine name='Polly'><voice name='Kimberly'>I have something to say.</voice></acu-engine>");

The preset default for your account will usually be a Polly voice.
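The nesting rule above (acu-engine outermost, voice inside it) can be sketched as a small Python helper that assembles the string to pass to a Say method. The helper name build_say_text is hypothetical and not part of the UAS API; it treats its input as plain text and escapes the SSML reserved characters before adding the tags.

```python
from xml.sax.saxutils import escape, quoteattr


def build_say_text(text, voice=None, engine=None):
    """Wrap plain `text` in optional voice and acu-engine tags.

    Hypothetical helper for illustration; not part of the UAS API.
    """
    ssml = escape(text)  # replaces &, < and > with their entities
    if voice is not None:
        ssml = "<voice name={}>{}</voice>".format(quoteattr(voice), ssml)
    if engine is not None:
        # acu-engine, if present, must be the outermost tag.
        ssml = "<acu-engine name={}>{}</acu-engine>".format(quoteattr(engine), ssml)
    return ssml


print(build_say_text("I have something to say.", voice="Kimberly", engine="Polly"))
```

The helper emits double-quoted attribute values, which are equivalent in XML to the single-quoted form used in the example above. The resulting string would then be passed to a Say method.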

Polly

Polly's website has a demo which allows you to select a voice and immediately hear how different text will sound - see Polly demos.

Polly TTS supports a subset of SSML, which can optionally be embedded within the text you supply to the say function. For a summary of the SSML tags which may be used, see Common SSML tags below. For more detailed information, go to the W3C SSML 1.1 recommendation.

We support the following Polly voices:

Name       Selector
Kimberly   English US Female Polly Kimberly
Naja       Danish Denmark Female Polly Naja
Mads       Danish Denmark Male Polly Mads
Lotte      Dutch Netherlands Female Polly Lotte
Ruben      Dutch Netherlands Male Polly Ruben
Emma       English UK Female Polly Emma
Amy        English UK Female Polly Amy
Brian      English UK Male Polly Brian
Geraint    English Wales Male Polly Geraint
Gwyneth    Welsh Wales Female Polly Gwyneth
Nicole     English Australia Female Polly Nicole
Russell    English Australia Male Polly Russell
Raveena    English India Female Polly Raveena
Salli      English US Female Polly Salli
Ivy        English US Female Polly Ivy
Kendra     English US Female Polly Kendra
Joanna     English US Female Polly Joanna
Joey       English US Male Polly Joey
Justin     English US Male Polly Justin
Celine     French France Female Polly Celine
Mathieu    French France Male Polly Mathieu
Chantal    French Canada Female Polly Chantal
Marlene    German Germany Female Polly Marlene
Hans       German Germany Male Polly Hans
Vicki      German Germany Female Polly Vicki
Dora       Icelandic Iceland Female Polly Dora
Karl       Icelandic Iceland Male Polly Karl
Giorgio    Italian Italy Male Polly Giorgio
Carla      Italian Italy Female Polly Carla
Liv        Norwegian Norway Female Polly Liv
Maja       Polish Poland Female Polly Maja
Ewa        Polish Poland Female Polly Ewa
Jacek      Polish Poland Male Polly Jacek
Jan        Polish Poland Male Polly Jan
Ricardo    Portuguese Brazil Male Polly Ricardo
Vitoria    Portuguese Brazil Female Polly Vitoria
Ines       Portuguese Portugal Female Polly Ines
Cristiano  Portuguese Portugal Male Polly Cristiano
Carmen     Romanian Romania Female Polly Carmen
Tatyana    Russian Russia Female Polly Tatyana
Maxim      Russian Russia Male Polly Maxim
Conchita   Spanish Castile Female Polly Conchita
Enrique    Spanish Castile Male Polly Enrique
Penelope   Spanish US Female Polly Penelope
Miguel     Spanish US Male Polly Miguel
Astrid     Swedish Sweden Female Polly Astrid
Filiz      Turkish Turkey Female Polly Filiz

Cepstral

Cepstral's website has a demo which allows you to select a voice and immediately hear how different text will sound - see Cepstral demos.

Cepstral TTS supports a subset of the Speech Synthesis Markup Language (SSML), which can optionally be embedded within the text you supply to the say function. For a summary of the SSML tags which may be used, see Common SSML tags below. For more detailed information, go to the Cepstral SSML FAQ and scroll down to the 'Common Usage Examples'. With reference to that page, please bear in mind the unsupported features listed after the voice table below.

We support the following Cepstral voices:

Name                   Selector
Callie-8kHz (default)  English US Female Cepstral Callie
Marta-8kHz             Spanish US Female Cepstral Marta
Vittoria               Italian Italy Female Cepstral Vittoria

We don't support:

  • Inserting recorded audio files (our APIs' play functions already allow file replay)
  • Applying Cepstral special effects
  • Inserting bookmarks

Reserved characters

Some characters are reserved for use in SSML so, if the text you need to say contains any of these, replace them as shown:

Reserved Character  Replace With
<                   &lt;
>                   &gt;
&                   &amp;

For example, "Bill & Ben played in the garden" would become "Bill &amp; Ben played in the garden".
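In Python, the standard library already performs exactly these three replacements, so a minimal sketch of the escaping step could look like the following. The function name escape_for_tts is only illustrative; apply it to plain text before adding any SSML tags, not to a string that already contains tags.

```python
from xml.sax.saxutils import escape


def escape_for_tts(text):
    """Replace the reserved characters &, < and > with their entities."""
    # escape() handles & first, so the & in "&lt;" is not escaped twice.
    return escape(text)


print(escape_for_tts("Bill & Ben played in the garden"))
# prints: Bill &amp; Ben played in the garden
```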

Common SSML tags

Polly and Cepstral both support a subset of SSML. Details of common tags can be found below. We strongly recommend testing your application again before deploying it with a different TTS engine.

Tag  Description
break

Inserts a break or pause in the speech.

Optional arguments are time and strength.

time sets an absolute value for the pause. For example, <break time="3s"/> and <break time="3ms"/> set the break time to three seconds and three milliseconds respectively. A break may last up to 10 seconds.

strength sets the relative value of the pause. Options are: none, x-weak, weak, medium, strong and x-strong.

Examples:

This is a <break /> sentence break.
This is a <break time="2s"/> two second break.
This is a dramatic <break strength="x-strong"/> break.
voice

Allows the user to change the voice used. The name parameter is required and specifies the voice to use. The supported voices for each TTS engine are listed above.

This SSML feature is supported in the UAS API only. For the REST API, please use the tts_voice setting.

Examples:

<acu-engine name='Polly'><voice name='Amy'>I'm using Amy instead of the default voice.</voice></acu-engine>
                
prosody

Allows the user to change the pitch, speed and volume of a segment of speech.

Common optional parameters are: pitch, rate and volume.

pitch sets the pitch of speech. Options are: x-low, low, medium, high and x-high, a relative change (measured in Hz), e.g. +50Hz, or a percentage change, e.g. +50%.

rate sets the rate of speech. Options are: x-slow, slow, medium, fast and x-fast, or a percentage change, e.g. +50%.

volume sets the volume of speech. Options are: silent, x-soft, soft, medium, loud and x-loud, or a percentage change, e.g. +50%.

Examples:

<prosody rate="x-fast">I'm using a very fast rate.</prosody>
This is normal volume. <prosody volume="soft">This is a soft volume.</prosody>
I can talk very <prosody rate="slow" pitch="low">deeply and slowly.</prosody>
Today's date is the <prosody rate="-50%">15th April, 2012.</prosody>
emphasis

Reads the enclosed text with emphasis.

Required parameter: level. Options are: reduced, moderate and strong.

Examples:

This is a <emphasis level="strong">level of emphasis</emphasis>, which can be used to highlight important information.
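When SSML is generated in code rather than typed by hand, it can help to validate values against the limits in the table above before sending the text to the TTS engine. The following Python sketch does this for the break and prosody tags. The helper names break_tag and prosody_tag are hypothetical, and the checks only mirror the constraints stated on this page (break strength names, the 10 second maximum, and the pitch, rate and volume parameters).

```python
from xml.sax.saxutils import escape, quoteattr

BREAK_STRENGTHS = {"none", "x-weak", "weak", "medium", "strong", "x-strong"}
MAX_BREAK_MS = 10000  # a break may last up to 10 seconds
PROSODY_PARAMS = ("pitch", "rate", "volume")


def break_tag(time_ms=None, strength=None):
    """Return a break tag, optionally with a time (in ms) and/or a strength."""
    attrs = []
    if time_ms is not None:
        if not 0 <= time_ms <= MAX_BREAK_MS:
            raise ValueError("break time must be between 0 and 10000 ms")
        attrs.append('time="{}ms"'.format(int(time_ms)))
    if strength is not None:
        if strength not in BREAK_STRENGTHS:
            raise ValueError("unknown break strength: {}".format(strength))
        attrs.append('strength="{}"'.format(strength))
    return "<break {}/>".format(" ".join(attrs)) if attrs else "<break />"


def prosody_tag(text, **params):
    """Wrap plain `text` in a prosody tag with pitch, rate and/or volume."""
    unknown = set(params) - set(PROSODY_PARAMS)
    if unknown:
        raise ValueError("unsupported prosody parameters: {}".format(sorted(unknown)))
    # Emit attributes in a fixed order so the output is predictable.
    attrs = "".join(
        " {}={}".format(name, quoteattr(str(params[name])))
        for name in PROSODY_PARAMS
        if name in params
    )
    return "<prosody{}>{}</prosody>".format(attrs, escape(text))


print("This is a " + break_tag(time_ms=2000) + " two second break.")
print("I can talk very " + prosody_tag("deeply and slowly.", rate="slow", pitch="low"))
```

The helpers do not check the prosody values themselves, since the accepted values can differ between engines.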