Inbound With Speech

UAS API - Inbound With Speech Sample Application

Filename:

samples/inbound_with_speech.py

Description:

An inbound application that prompts the caller to say a number between sixteen and ninety-nine. After ringing and answering an inbound call, the application primes the speech recogniser to start recognising speech as soon as the next play has ended. It selects a predefined grammar, SixteenToNinetyNine, for the recognition job. The application then uses TTS to ask for the caller's age, which should be between sixteen and ninety-nine.

The caller can respond by pressing digits instead of speaking. If this happens, the application collects the two DTMF digits that were pressed. Finally, the application uses TTS to speak back to the caller what was said or pressed.
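The choice between spoken words and keypad digits can be sketched in isolation from the telephony calls. The helper below is a minimal illustration only: build_reply and its parameters are hypothetical names, not part of the UAS API, and it assumes the caller's input has already been collected as plain strings.

```python
def build_reply(barged_in, recognised_words, dtmf_digits):
    """Return the sentence to speak back to the caller.

    barged_in        -- True if the caller pressed a key instead of speaking
    recognised_words -- words returned by the recogniser, as one string
    dtmf_digits      -- keypad digits collected after a barge-in
    """
    # Use the digits when the caller pressed keys, otherwise the spoken words.
    age = dtmf_digits if barged_in else recognised_words
    if not age:
        # Nothing was recognised or pressed.
        return "Sorry, I did not hear that."
    return "Your age is {0}.".format(age)


if __name__ == "__main__":
    print(build_reply(False, "forty two", ""))
    print(build_reply(True, "", "42"))
    print(build_reply(False, "", ""))
```

Keeping this decision in a plain function makes it possible to check the wording of the reply without placing a call.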

Code:

# -*- coding: utf-8 -*-
"""
A simple application that answers an inbound call, speaks some TTS, 
detects some speech and plays back the speech that was detected.

    Actions:
        - check the channel state
        - ring and answer
        - play some tts
        - detect a spoken response
        - play what was detected
        - hang up

The speech detection grammar used is pre-defined.

"""

from prosody.uas import Hangup, Error

__uas_version__  = "0.0.1"
__uas_identify__ = "application"

def main(channel, application_instance_id, file_man, my_log, application_parameters):
    return_code = 0
    try:
        # check the incoming channel state
        state = channel.state()

        if state == channel.State.CALL_INCOMING:
            state = channel.ring()   # this can raise a Hangup exception
            if state == channel.State.RING_INCOMING:
                state = channel.answer() # this can raise a Hangup exception
        else:
            raise Hangup('No inbound call, state is {0}'.format(state))
        if state != channel.State.ANSWERED:
            raise Hangup('Failed to answer inbound call, state is {0}'.format(state))
        
        my_log.info("Answered an inbound call") # log at info level

        # record the call to a WAV file named after this application instance
        channel.FileRecorder.start("speech_test_{0}.wav".format(application_instance_id))

        # prepare the speech detector, speech detection will begin as soon
        # as the prompt has finished playing
        my_grammar = channel.SpeechDetector.Grammar()
        my_grammar.create_from_predefined('SixteenToNinetyNine')
        if channel.SpeechDetector.prime(my_grammar, channel.SpeechDetector.SpeechDetectorTrigger.ONPLAYEND) is False:
            raise Error("Speech detector failed to start")

        # Say the TTS prompt, the text will ask the caller their age, a value from sixteen to ninety nine.
        cause = channel.FilePlayer.say("<acu-engine name='Polly'><voice name='Amy'>How old are you?</voice></acu-engine>")
        if cause != channel.FilePlayer.Cause.NORMAL:
            raise Error("Say prompt failed: cause is {0}".format(cause))

        # Now get the recognised speech, can be words or DTMF digits
        response = channel.SpeechDetector.get_recognised_speech()
        if channel.SpeechDetector.cause() == channel.SpeechDetector.Cause.BARGEIN:
            # age is 16 to 99, so we want two digits
            age = channel.DTMFDetector.get_digits(count=2)
        else:
            age = response.get_recognised_words_as_string()
        if not age:
            to_say = "Sorry, I did not hear that."
        else:
            to_say = "Your age is {0}.".format(age)
        # Say the recognised speech.
        cause = channel.FilePlayer.say("<acu-engine name='Polly'><voice name='Amy'>{0}</voice></acu-engine>".format(to_say))

        if cause != channel.FilePlayer.Cause.NORMAL:
            raise Error("TTS player returned {0}: expected {1}".format(cause, channel.FilePlayer.Cause.NORMAL))
        
        # Bye bye.
        cause = channel.FilePlayer.say("Bye bye.")
        if cause != channel.FilePlayer.Cause.NORMAL:
            raise Error("Say bye bye failed: cause is {0}".format(cause))
        
    except Hangup as exc:
        my_log.info("Hangup exception reports: {0}".format(exc))
        # in this app a hangup is not an error, return a positive value
        return_code = 100
        
    except Error as exc:
        # for error conditions return a negative value
        my_log.error("Error exception reports: {0}".format(exc))
        return_code = -101
        
    except Exception as exc:
        # an unexpected exception, return a negative value
        my_log.exception("Unexpected exception reports: {0}".format(exc))
        return_code = -102
        
    finally:
        channel.FileRecorder.stop()
        if channel.state() != channel.State.IDLE:
            channel.hang_up()

    return return_code
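
The comments in the samples describe a return code convention: zero or any positive integer indicates success, -1 to -99 are reserved, and the applications use -100 and below for their own failures. A minimal sketch of that convention follows; classify_return_code is a hypothetical helper for illustration, not part of the UAS API.

```python
def classify_return_code(code):
    """Classify an application return code using the samples' convention."""
    if code >= 0:
        # Zero or any positive integer is treated as success.
        return "success"
    if code >= -99:
        # -1 to -99 are reserved and not used by the applications themselves.
        return "reserved"
    # -100 and below are application-defined failure codes.
    return "application failure"


if __name__ == "__main__":
    for code in (0, 100, -1, -100, -102):
        print(code, classify_return_code(code))
```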

Filename:

Samples\C#\InboundWithSpeech\InboundWithSpeech.cs

Description:

First we ring and answer the call. Then we create a Grammar object that will determine the words that the SpeechDetector on the call will try to recognise. This is passed to the Prime method which readies the SpeechDetector to start recognising speech input after the next play has ended. We give the SpeechDetector 10 seconds to recognise some speech input.

Having primed the SpeechDetector we play a prompt to the caller and then wait for any speech input to be recognised. If the Cause is Normal we have a valid recognised result and play back the recognised word to the caller.

Note: if the SpeechDetector didn't recognise one of the expected words or was interrupted by a press on the caller's telephone keypad we could then also use the DTMFDetector to identify the keys pressed. This would provide an alternative means of number input if the caller's environment is too loud or disturbed for accurate recognition.
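The fallback described in the note could be sketched as follows. This is an illustration under stated assumptions, not the sample's behaviour: resolve_choice is a hypothetical helper, and the mapping of keys 1 to 4 onto the four alternatives is an assumption that the prompt would have to announce to the caller. The inputs are plain Python values, so no UAS API calls are involved.

```python
OPTIONS = ["apricots", "apples", "pears", "pomegranates"]


def resolve_choice(recognised_words, dtmf_digits, options=OPTIONS):
    """Return the chosen option, or None if neither input identifies one.

    recognised_words -- list of words returned by the recogniser (may be empty)
    dtmf_digits      -- string of keys pressed on the keypad (may be empty)
    """
    # Prefer a spoken word that matches one of the grammar alternatives.
    for word in recognised_words:
        if word.lower() in options:
            return word.lower()
    # Otherwise treat the first key pressed as a 1-based menu position.
    if dtmf_digits and dtmf_digits[0] in "0123456789":
        index = int(dtmf_digits[0]) - 1
        if 0 <= index < len(options):
            return options[index]
    return None


if __name__ == "__main__":
    print(resolve_choice(["pears"], ""))
    print(resolve_choice([], "2"))
    print(resolve_choice([], "9"))
```

Speech takes priority here, so a key press is only consulted when no recognised word matches one of the alternatives.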

Code:

using System;
using AMSClassLibrary;
using UASAppAPI;

// An inbound application that prompts the caller to say some text 
// then reads the recognised text back to the caller and hangs up.
//
// Requires:
// -
namespace InboundWithSpeech
{
    // The application class.
    // This must have the same name as the assembly and must inherit from either 
    // UASInboundApplication or UASOutboundApplication.
    // It must override the Run method.
    public class InboundWithSpeech : UASInboundApplication
    {
        enum ReturnCode
        {
            // Success Codes:
            Success = 0,
            // ... any positive integer

            // Fail Codes:
            // -1 to -99 reserved
            ExceptionThrown = -100,
            PlayInterrupted = -101,
            NoResponse = -102
        }

        // This is the entry point for the application
        public override int Run(UASCallChannel channel, string applicationParameters)
        {
            this.Trace.TraceInfo("Start - appParms [{0}]", applicationParameters);
            ReturnCode reply = ReturnCode.Success;

            try
            {
                // Ring for 2 seconds
                channel.Ring(2);

                // Answer the call
                CallState state = channel.Answer();
                if (state == CallState.Answered)
                {
                    this.Trace.TraceInfo("Answered");

                    // Prime the speech recognition to listen when the prompt has finished playing
                    Grammar listenFor = Grammar.CreateFromAlternatives(new String[] { "apricots", "apples", "pears", "pomegranates" });
                    if (!channel.SpeechDetector.Prime(SpeechDetectorTrigger.OnPlayEnd, listenFor, 10))
                    {
                        this.Trace.TraceError("Failed to prime speech recognition");
                        reply = ReturnCode.PlayInterrupted;
                    }

                    // Prompt the caller to make a selection by saying one of the following
                    FilePlayerCause playCause = channel.FilePlayer.Say(
                        "Hello. You've reached the Inbound With Speech sample. Please say one of the following: apricots, apples, pears, pomegranates");
                    if (FilePlayerCause.Normal != playCause)
                    {
                        this.Trace.TraceError("Say failed or was interrupted");
                        reply = ReturnCode.PlayInterrupted;
                    }
                    else
                    {
                        // Now wait for speech recognition
                        SpeechDetectorResult result;
                        SpeechDetectorCause speechCause = channel.SpeechDetector.GetRecognisedSpeech(out result);
                        if (speechCause == SpeechDetectorCause.Normal)
                        {
                            this.Trace.TraceInfo("Got speech result: {0}", result.ToString());

                            // Now say the recognised text using TTS.
                            playCause = channel.FilePlayer.Say(String.Join(" ", result.RecognisedWords.ToArray()));
                            if (FilePlayerCause.Normal != playCause)
                            {
                                this.Trace.TraceError("Say failed or was interrupted");
                                reply = ReturnCode.PlayInterrupted;
                            }
                        }
                        else
                        {
                            this.Trace.TraceInfo("GetRecognisedSpeech returned {0}", speechCause.ToString());
                            this.Trace.TraceError("No response was recognised");
                            reply = ReturnCode.NoResponse;
                        }
                    }

                    // Say Goodbye using the default voice
                    channel.FilePlayer.Say("Goodbye.");

                    // Ensure the call is hung up.
                    channel.HangUp();
                }
            }
            catch (Exception e)
            {
                this.Trace.TraceError("Exception caught: {0}", e.Message);
                reply = ReturnCode.ExceptionThrown;
            }

            this.Trace.TraceInfo("Completed with return code {0}", reply);
            return (int)reply;
        }
    }
}

Filename:

Samples\VB\InboundWithSpeech\InboundWithSpeech.vb

Description:

First we ring and answer the call. Then we create a Grammar object that will determine the words that the SpeechDetector on the call will try to recognise. This is passed to the Prime method which readies the SpeechDetector to start recognising speech input after the next play has ended. We give the SpeechDetector 10 seconds to recognise some speech input.

Having primed the SpeechDetector we play a prompt to the caller and then wait for any speech input to be recognised. If the Cause is Normal we have a valid recognised result and play back the recognised word to the caller.

Note: if the SpeechDetector didn't recognise one of the expected words or was interrupted by a press on the caller's telephone keypad we could then also use the DTMFDetector to identify the keys pressed. This would provide an alternative means of number input if the caller's environment is too loud or disturbed for accurate recognition.

Code:

Imports AMSClassLibrary
Imports UASAppAPI

' An inbound application that prompts the caller to say some text 
' then reads the recognised text back to the caller and hangs up.
'
' Requires:
' -
Namespace InboundWithSpeech

    ' The application class.
    ' This must have the same name as the assembly and must inherit from either 
    ' UASInboundApplication or UASOutboundApplication.
    ' It must override the Run method.
    Public Class InboundWithSpeech
        Inherits UASInboundApplication

        ' Possible return codes
        Enum ReturnCode
            ' Success Codes:
            Success = 0
            ' ... any positive integer

            ' Fail Codes:
            ' -1 to -99 reserved
            ExceptionThrown = -100
            PlayInterrupted = -101
            NoResponse = -102
        End Enum

        ' This is the entry point for the application
        Overrides Function Run(ByVal channel As UASCallChannel,
                               ByVal applicationParameters As String) _
                               As Integer

            Me.Trace.TraceInfo("Start - appParms [{0}]", applicationParameters)
            Dim reply As ReturnCode = ReturnCode.Success

            Try
                ' Ring for 2 seconds
                channel.Ring(2)

                ' Answer the call
                Dim state As CallState
                state = channel.Answer()

                If state = CallState.Answered Then

                    Me.Trace.TraceInfo("Call answered")

                    ' Prime the speech recognition to listen when the prompt has finished playing
                    Dim options As String() = {"apricots", "apples", "pears", "pomegranates"}
                    Dim listenFor As Grammar = Grammar.CreateFromAlternatives(options)
                    If Not channel.SpeechDetector.Prime(SpeechDetectorTrigger.OnPlayEnd, listenFor, 10) Then
                        Me.Trace.TraceError("Failed to prime speech recognition")
                        reply = ReturnCode.PlayInterrupted
                    End If

                    ' Prompt the caller to make a selection by saying one of the following
                    Dim playCause As FilePlayerCause = channel.FilePlayer.Say(
                        "Hello. You've reached the Inbound With Speech sample. Please say one of the following: " _
                        + String.Join(", ", options))
                    If FilePlayerCause.Normal <> playCause Then
                        Me.Trace.TraceError("Say failed or was interrupted")
                        reply = ReturnCode.PlayInterrupted
                    Else
                        ' Now wait for speech recognition
                        Dim result As SpeechDetectorResult = Nothing
                        Dim speechCause As SpeechDetectorCause = channel.SpeechDetector.GetRecognisedSpeech(result)
                        If speechCause = SpeechDetectorCause.Normal Then
                            Me.Trace.TraceInfo("Got speech result: {0}", result.ToString())

                            ' Now say the recognised text using TTS.
                            playCause = channel.FilePlayer.Say(String.Join(" ", result.RecognisedWords.ToArray()))
                            If FilePlayerCause.Normal <> playCause Then
                                Me.Trace.TraceError("Say failed or was interrupted")
                                reply = ReturnCode.PlayInterrupted
                            End If
                        Else
                            Me.Trace.TraceInfo("GetRecognisedSpeech returned {0}", speechCause.ToString())
                            Me.Trace.TraceError("No response was recognised")
                            reply = ReturnCode.NoResponse
                        End If
                    End If

                    ' Say Goodbye using the default voice
                    channel.FilePlayer.Say("Goodbye.")

                    ' Ensure the call is hung up.
                    channel.HangUp()
                End If

            Catch ex As Exception
                Me.Trace.TraceError("Exception thrown {0}", ex.Message)
                reply = ReturnCode.ExceptionThrown
            End Try

            Me.Trace.TraceInfo("Completed with return code {0}", reply)
            Return CInt(reply)

        End Function

    End Class

End Namespace