


When connecting two parties who speak different languages using the Connect action, Aculab Cloud can provide accurate live audio translation between the two calls.

When enabled, each utterance made on either the primary or secondary call is heard by the far end. At the same time, it is transcribed using natural language speech recognition and translated into a different language using neural machine translation; once the original utterance is complete, the translation is played to both parties using Text To Speech (TTS). You specify the language spoken on each call and control how the translated audio is played back to the other call.
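The per-utterance flow described above can be sketched as follows. This is a minimal illustration of the recognise-translate-synthesise sequence only; the function names and languages are hypothetical placeholders, not Aculab Cloud APIs.

```python
# Sketch of the per-utterance translation pipeline (hypothetical functions,
# not Aculab APIs). The raw utterance has already been heard by the far end
# before this runs; the pipeline produces the TTS audio played to both calls.

def recognise(audio: bytes, language: str) -> str:
    """Natural language speech recognition on the completed utterance (placeholder)."""
    return "hello"

def translate(text: str, source: str, target: str) -> str:
    """Neural machine translation of the transcription (placeholder)."""
    return "bonjour"

def synthesise(text: str, voice: str) -> bytes:
    """Text To Speech on the translated text (placeholder)."""
    return text.encode()

def on_utterance_complete(audio: bytes) -> bytes:
    # Transcribe, translate, then synthesise the playback audio.
    text = recognise(audio, language="en-GB")
    translated = translate(text, source="en", target="fr")
    return synthesise(translated, voice="example-french-voice")

print(on_utterance_complete(b"...").decode())  # bonjour
```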

[Diagram: audio-to-audio translation]

As the speech from either party is translated and played back, transcriptions of both the recognised speech and the translated speech are produced and sent to a specified page.

Speech Recognition Models

Google Speech-to-Text defines a number of models that have been trained from millions of examples of audio from specific sources, for example phone calls or videos. Recognition accuracy can be improved by using the specialized model that relates to the kind of audio data being analysed.

For example, the phone_call model used on audio data recorded from a phone call will produce more accurate transcription results than the default, command_and_search, or video models.
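The choice described above can be expressed as a simple lookup from audio source to model. The model identifiers (phone_call, video, command_and_search, default) come from Google Speech-to-Text; the helper itself is just an illustrative sketch.

```python
# Map each kind of audio source to the Google Speech-to-Text model that
# is specialised for it; fall back to "default" for anything else.
# The model names are Google's; the helper is illustrative only.
MODELS = {
    "phone": "phone_call",          # audio recorded from a phone call
    "video": "video",               # audio extracted from video
    "short_query": "command_and_search",  # short voice commands / searches
}

def pick_model(audio_source: str) -> str:
    return MODELS.get(audio_source, "default")

print(pick_model("phone"))  # phone_call
print(pick_model("music"))  # default
```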

Premium models

Google have made premium models available for some languages, for specific use cases (e.g. medical_conversation). These models have been optimized to more accurately recognise audio data from those use cases. See Speech Recognition Languages to find out which premium models are available for your language.


Currently, our Translation supports over 130 languages. For the up-to-date list, see Translation Languages.

Depending on the languages spoken on each call, you may only need to set the speech recognition language, the target language of the translation, and the TTS voice used to say the translated text. Where a language has different variants for speech recognition and translation, you may need to override all the defaults.
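The settings involved can be grouped as below for an English-to-French connect. Note that these key names are purely illustrative assumptions for this sketch, not the actual Connect action property names; consult the Connect action reference for the real parameters.

```python
# Hypothetical settings sketch for translation between an English-speaking
# primary call and a French-speaking secondary call. All key names are
# illustrative, NOT real Aculab Connect-action properties.
translation_settings = {
    # Language spoken on each call (speech recognition)
    "primary_recognition_language": "en-GB",
    "secondary_recognition_language": "fr-FR",
    # Translation targets; override these where a language has different
    # variants for speech recognition and translation
    "primary_translation_target": "fr",
    "secondary_translation_target": "en",
    # TTS voices used to say the translated text on the other call
    "primary_tts_voice": "fr-FR-example-voice",
    "secondary_tts_voice": "en-GB-example-voice",
}

for key, value in sorted(translation_settings.items()):
    print(f"{key} = {value}")
```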


On a trial account you can start using Connect with translation straight away.

Our Translation is charged per minute with 15-second granularity. So, for example:

  • A connect which provides translations between two calls for 49 seconds will be charged for 60 seconds, for each side of the connect.
  • A connect which translates between two calls for 3 minutes 20 seconds will be charged for 3 minutes and 30 seconds for each side of the connect.

These charges apply irrespective of how much is said on each call.
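The rounding in the examples above is a straightforward round-up to the next 15-second increment, which can be sketched as:

```python
import math

def billed_seconds(duration_seconds: int, granularity: int = 15) -> int:
    """Round a connect duration up to the next billing increment.

    Charged per side of the connect, irrespective of how much is said.
    """
    return math.ceil(duration_seconds / granularity) * granularity

# The two examples from the text:
print(billed_seconds(49))   # 60  (49 s -> charged as 60 s)
print(billed_seconds(200))  # 210 (3 min 20 s -> charged as 3 min 30 s)
```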

You can obtain detailed charge information for a specific call using the Application Status web service. You can obtain detailed charge information for calls over a period of time using the Managing Reports web services. Note that there will be two entries in the Feature Data Record (FDR), one for each side of the connect.