The WebSocket Protocol enables two-way communication between a client and a server. When Aculab Cloud communicates over a WebSocket, it takes the client role and connects to your server.
Aculab Cloud specifies a sub-protocol of "v1.cloud.aculab.com". This sub-protocol uses text frames to carry information and control messages encoded as JSON strings, and binary frames to carry raw audio data.
Every message contains a JSON object. Each object has a "type" property, to identify the format of the message.
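As a minimal sketch of how a server might handle incoming text frames, the following parses the JSON object and dispatches on its "type" property. The function names, the handler table and the "example" type value are hypothetical and used only for illustration; the actual type values are those defined for each message.

```python
import json


def parse_text_frame(payload: str):
    """Decode a text frame and return (type, message).

    Raises ValueError if the payload is not a JSON object with a
    string "type" property.
    """
    message = json.loads(payload)
    if not isinstance(message, dict):
        raise ValueError("expected a JSON object")
    msg_type = message.get("type")
    if not isinstance(msg_type, str):
        raise ValueError('message has no string "type" property')
    return msg_type, message


def dispatch(payload: str, handlers: dict):
    """Call the handler registered for the message type, if any."""
    msg_type, message = parse_text_frame(payload)
    handler = handlers.get(msg_type)
    if handler is None:
        return None  # unknown type: ignored in this sketch
    return handler(message)


# "example" is a placeholder type, not one defined by the sub-protocol.
handlers = {"example": lambda msg: msg["value"] * 2}
result = dispatch('{"type": "example", "value": 21}', handlers)
```

Binary frames carry raw audio rather than JSON, so they would bypass this path entirely.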
Calls to WebSockets
Applications running in Aculab Cloud can make outbound calls to WebSockets. For example, a REST application can use the Connect action with the destination set to a secure WebSocket URL.
The initial WebSocket connection uses an HTTP request. If that request returns a status other than "101 Switching Protocols", then the HTTP response status code will be passed to the application as the call raw cause. For example, if the server returns "404 Not Found" then the call raw cause will be 404.
If there is a problem connecting to the URL specified, one of the following call raw cause values may be passed to the application:
|Call raw cause||Description|
|900||An error that was not classified.|
|901||The URL could not be parsed.|
|902||The DNS name from the URL could not be resolved.|
|903||A connection could not be made to the server.|
|904||The transport layer security (TLS) handshake failed. This is usually a certificate problem.|
|905||The WebSocket handshake failed without returning a valid HTTP response status code.|
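An application that logs connection failures could translate these values into readable text. The sketch below is one way to do that; the function name and the wording of the fallback messages are assumptions, and it only covers failures that occur while connecting, not causes reported after the WebSocket is established.

```python
# Descriptions taken from the table above.
CONNECT_RAW_CAUSES = {
    900: "An error that was not classified.",
    901: "The URL could not be parsed.",
    902: "The DNS name from the URL could not be resolved.",
    903: "A connection could not be made to the server.",
    904: "The transport layer security (TLS) handshake failed.",
    905: "The WebSocket handshake failed without returning a valid "
         "HTTP response status code.",
}


def describe_connect_raw_cause(raw_cause: int) -> str:
    """Return a readable description of a connection-time raw cause."""
    if raw_cause in CONNECT_RAW_CAUSES:
        return CONNECT_RAW_CAUSES[raw_cause]
    if 100 <= raw_cause <= 599:
        # The server answered the HTTP request with a status other
        # than "101 Switching Protocols".
        return f"The server returned HTTP status {raw_cause}."
    return f"Unrecognised raw cause {raw_cause}."
```

For example, `describe_connect_raw_cause(404)` describes the "404 Not Found" case mentioned earlier.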
Once the WebSocket is connected, the server can end the call by sending a call hangup message. In this case the "raw_cause" specified in that message will be passed to the application. If the WebSocket is closed with a WebSocket Close frame, then the status code in that frame will be passed to the application.
When the application ends the WebSocket call, the call raw cause passed to the application will be 0.
Messages from Aculab Cloud
This is the first message sent when an outbound call is made to a WebSocket.
|from||The caller's number or SIP address.|
Provides details of the audio data that will follow. Each binary frame from Aculab Cloud will normally contain 20ms of audio data.
|format||The encoding used for the audio data. One of:
|sample_rate||The sample rate of the audio data. Currently always 8000.|
|channels||The number of channels in the audio data. This is either 1 or 2. For calls to WebSockets, this is 1.|
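To make the frame sizes concrete, the following sketch computes how many bytes a binary frame of a given duration holds, and the reverse. It assumes "16bit_PCM" audio at 2 bytes per sample; other formats may use a different sample size, and because frames only normally contain 20ms, a receiver should derive the duration from the payload length rather than rely on a fixed size. The function names are illustrative.

```python
def bytes_per_frame(sample_rate=8000, channels=1, bytes_per_sample=2,
                    frame_ms=20):
    """Payload size in bytes of frame_ms milliseconds of audio."""
    samples = sample_rate * frame_ms // 1000
    return samples * channels * bytes_per_sample


def frame_duration_ms(payload_len, sample_rate=8000, channels=1,
                      bytes_per_sample=2):
    """Duration in milliseconds of a received binary frame."""
    samples = payload_len // (channels * bytes_per_sample)
    return samples * 1000 / sample_rate


size = bytes_per_frame()            # 8000 Hz, mono, 16-bit, 20ms
duration = frame_duration_ms(size)
```

With these assumptions a 20ms mono frame is 320 bytes.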
Indicates that the last of the audio data has been sent.
This message is sent when a call ends. The WebSocket connection is closed after this message.
|cause||The call completion cause. One of:
|raw_cause||The cause code used to end the call. This is normally 0 for calls ended by the application.|
Audio play finished
Indicates that a play initiated by an audio play start has finished. This is always sent when a play finishes, whether it ended normally, was aborted, or ended because the call ended.
|id||The id from the corresponding audio play start.|
Reports that a message was received that was not understood or could not be processed at that point.
|reason||A description of the problem with the received frame.|
Audio overflow warning
Indicates that Aculab Cloud is dropping audio data because the send queue is full. This means that the WebSocket has been unable to send frames, usually because the server is not reading them fast enough.
Messages to Aculab Cloud
When a call is made to a WebSocket, the server may send messages to Aculab Cloud to play audio through the WebSocket or to disconnect the call.
All frames sent, whether text or binary, may have a payload of up to 1600 bytes. Frames that exceed this limit will cause the connection to be closed.
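Because an oversized frame closes the connection, a server may want to check payload sizes before sending. The sketch below is one way to do so; the helper names are hypothetical. It measures text frames by their UTF-8 encoded length, since WebSocket text frames are transmitted as UTF-8.

```python
MAX_PAYLOAD_BYTES = 1600  # limit stated above


def payload_size(frame) -> int:
    """Size in bytes of a frame payload (str for text, bytes for binary)."""
    if isinstance(frame, str):
        return len(frame.encode("utf-8"))
    return len(frame)


def check_frame(frame) -> int:
    """Return the payload size, or raise ValueError if it is too large."""
    size = payload_size(frame)
    if size > MAX_PAYLOAD_BYTES:
        raise ValueError(
            f"payload is {size} bytes; the limit is {MAX_PAYLOAD_BYTES}"
        )
    return size
```

Calling `check_frame` immediately before each send keeps the check in one place for both frame kinds.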
Audio play start
Starts a new audio play and provides details of the audio data that will follow. Each binary frame must contain a whole number of samples. It is strongly recommended that each frame contains a multiple of 10ms of data. When frames are sent faster than real-time, Aculab Cloud will queue up to 250 audio data frames for each audio play. The audio data frames should be paced so that the queue size is not exceeded.
|format||"16bit_PCM"||The encoding used for the audio data. One of:
|sample_rate||8000||The sample rate of the audio data. Currently, only 8000 is supported.|
|channels||1||The number of channels in the audio data. This is either 1 or 2.|
|channel||0||Which channel of the audio to play. This is either 0, 1 or 2.
For mono audio data, either 0 or 1 can be used.
For 2 channel audio, 0 will mix the two channels together (only supported with "16bit_PCM" data).
|initial_buffer_ms||100||The amount of audio data to buffer before starting to play. When sending the audio in real-time, this can reduce the risk of delays causing audio gaps.|
|id||""||An identifier. This will be included in the audio play finished message when the play ends.|
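The framing recommendations for an audio play can be sketched as follows: split the audio into frames that each hold a multiple of 10ms and fit the 1600 byte payload limit, then space the sends at real-time intervals. This assumes "16bit_PCM" data at 2 bytes per sample; the function names and the 20ms default are illustrative choices, not requirements.

```python
def split_audio(pcm, sample_rate=8000, channels=1, bytes_per_sample=2,
                frame_ms=20, max_payload=1600):
    """Split raw audio into binary frame payloads of frame_ms each.

    The final frame may be shorter, but still holds whole samples.
    """
    if frame_ms <= 0 or frame_ms % 10 != 0:
        raise ValueError("frame_ms should be a positive multiple of 10")
    sample_bytes = channels * bytes_per_sample
    if len(pcm) % sample_bytes != 0:
        raise ValueError("audio must contain a whole number of samples")
    frame_bytes = sample_rate * frame_ms // 1000 * sample_bytes
    if frame_bytes > max_payload:
        raise ValueError("frame would exceed the payload limit")
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]


def send_offsets(frame_count, frame_ms=20):
    """Seconds after the first send at which each frame is due."""
    return [i * frame_ms / 1000 for i in range(frame_count)]


frames = split_audio(bytes(1000))      # 1000 bytes of silence
offsets = send_offsets(len(frames))
```

Sending each frame no earlier than its offset keeps the sender at real-time pace, so the 250 frame queue is never filled. A sender that wants to run slightly ahead of real-time could subtract a small constant from each offset, within the queue limit.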
Audio play end
Indicates that the last of the audio play data has been sent. A new audio play can be started after this message without waiting for the audio play finished to arrive.
Aculab Cloud allows up to 4 audio plays to be queued at a time.
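A server that starts plays back to back may want to track how many are outstanding. The sketch below assumes that a play counts towards the limit from the moment its audio play start is sent until the matching audio play finished arrives; that interpretation, and the class itself, are assumptions rather than documented behaviour.

```python
class PlayTracker:
    """Counts audio plays that have been started but not yet finished."""

    def __init__(self, limit=4):
        self.limit = limit
        self.outstanding = 0

    def can_start(self) -> bool:
        return self.outstanding < self.limit

    def started(self):
        """Record that an audio play start has been sent."""
        if not self.can_start():
            raise RuntimeError("too many audio plays outstanding")
        self.outstanding += 1

    def finished(self):
        """Record that an audio play finished has been received."""
        if self.outstanding > 0:
            self.outstanding -= 1


tracker = PlayTracker()
for _ in range(4):
    tracker.started()
blocked = not tracker.can_start()   # a fifth play should wait
tracker.finished()
```

A counter is used instead of a set of ids because the id defaults to an empty string and so may not be unique.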
Audio play abort
Aborts all active and queued audio plays. All audio data sent before this message, and not yet played, is discarded. A new audio play can be started after this message without waiting for audio play finished messages to arrive.
Tells Aculab Cloud to hang up the call.
|cause||"NORMAL"||The call completion cause to pass to the application. One of:
|raw_cause||0||The raw cause code to pass to the application.|