
Multi-lingual speech recognition now supported on Aculab Cloud

We’re listening – what would you like to do today?

A generation of people have grown up trying to avoid ringing a contact centre – not because they didn’t like talking to the cheery people who work in such places, but because they first had to get past the IVR system put in place to direct the call. Press 1 for support, 2 for sales, 3 if you know the extension of the person you wish to speak to… and so on, and so on. We quickly realised that many of these systems would let us bypass the IVR menu and get to a real person if we pressed ‘0’.

So, the IVR system was often bypassed and everyone became disillusioned with the experience, including the IT manager who had just managed to persuade the company to invest in the system.


Fast forward 20 or more years to the present day, and what follows is more like the experience you will get when you ring a contact centre:

“Hello, how can we help you today? Please explain briefly why you are calling.”

And more often than not, the caller is more than happy to talk to this automated interface.

image for April 2020 ASR blog post

As you are likely aware, the market for text-to-speech and speech recognition has exploded in recent years. The engineering efforts undertaken by the giants of the tech world (Apple, Google and Amazon, with Siri, Google Home and Alexa respectively) have rapidly increased the sophistication, accuracy, and ease of use of such systems. Many of us now have our own personal voice recognition system at home or in the car, and consumer acceptance of such systems has increased substantially. This in turn is driving greater use of voice-driven systems in enterprise application areas such as the call centre. For example, both Alexa and Amazon Connect (the AWS contact centre offering) have dialogs driven by Amazon Lex, with Transcribe and Polly under the hood to convert between speech and text.

Our TTS and speech recognition partners

When we decided to offer our media processing capabilities as a cloud-based service (CPaaS), we scoured the market for partners to help us offer the best possible service to developers wishing to build their own communications applications. We were not limited in our choices to a single favoured supplier – each one had to be a best-of-breed partner. We chose to host our service on Amazon Web Services (AWS) infrastructure, with separate clouds for the US and Europe so that customers can keep their data where they wish, and we chose voice carrier and SMS messaging partners to give us the highest quality worldwide connectivity options for calls and messaging.

As the market progressed, we evolved the Aculab Cloud platform to keep up with these developments – the first step in that evolution was the integration of HIPAA-compliant text-to-speech (TTS) voices from Amazon Polly.

TTS support is a key feature for systems sending outbound voice messages such as appointment reminders. Rather than recording and storing a voice message before sending it to customers, you can use TTS to deliver clear, natural-sounding, bespoke voice messages in multiple languages.

To complement the multi-lingual TTS support, we needed a speech recognition system – and for that we again sought a best-of-breed partner, choosing Google Speech Recognition, one of the Google Cloud AI building blocks.

Speech recognition in 120 languages

If you want to localise your communications system for a new region, then it's likely we can support you with that requirement.

The Aculab Cloud speech recognition feature enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. More than 120 languages and variants can be recognised.

Since our platform is predominantly for conversational communications over the phone, we have focused on that use case: speech recognition is integrated cleanly into our REST API v2, making it easy to implement speech-driven conversational dialogs. We've also provided real-time transcription of calls, allowing you, for example, to augment the agent's screen based on what's said on the call, or to perform sentiment analysis that shows the contact centre manager where the hot spots are. In addition, our API allows for voice-command interrupts. These could be used, for example, in voicemail systems, where a voice command such as ‘repeat’ spoken during message playback prompts the system to replay the message from the beginning, with other words being ignored. And of course you can feed our call recordings to a speech recogniser for offline transcription, so that calls can be searched later.
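
To make the voice-command interrupt idea concrete, here is a minimal sketch in Python of the decision logic only. It is not the Aculab Cloud API: the `COMMANDS` table, the `action_for_interrupt` function and the action name `replay_from_start` are hypothetical names chosen for illustration, and the sketch assumes the recogniser has already delivered the words heard during playback as plain strings.

```python
from typing import Iterable, Optional

# Hypothetical command vocabulary: maps a spoken command word to an
# application-defined action name. Both are illustrative assumptions.
COMMANDS = {"repeat": "replay_from_start"}


def action_for_interrupt(recognised_words: Iterable[str]) -> Optional[str]:
    """Return the action for the first recognised command word, or None.

    Words that are not in the command vocabulary are ignored, so ordinary
    speech during playback does not interrupt the message.
    """
    for word in recognised_words:
        key = word.strip().lower()
        if key in COMMANDS:
            return COMMANDS[key]
    return None


if __name__ == "__main__":
    # 'Repeat' is a command word, so the message would be replayed.
    print(action_for_interrupt(["um", "Repeat", "please"]))
    # No command word was heard, so playback would simply continue.
    print(action_for_interrupt(["hello", "there"]))
```

In a real application the returned action would be mapped onto whatever playback control the telephony platform provides; keeping the vocabulary in a single table makes it straightforward to add further commands or localised variants.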

Conversational dialogs on Aculab Cloud

Armed with our high-quality TTS and natural language speech recognition, you can now use conversational dialog services such as Amazon Lex and Google Dialogflow to drive your call flows on Aculab Cloud. As well as providing a high-quality customer experience, this means you can use these same services across other channels such as chat and messaging, providing a consistent user experience.

Further information about the feature can be found in our documentation area.

