Voice Biometrics Users

This describes the web services that support verification of a user's identity by analysing their voice.

A user must first register in the system, supplying initial audio data of their voice at the same time. Registration constructs a voice model for the user, which should then be updated to refine it. Thereafter, the model can be updated at regular intervals to cater for natural ageing, changes in environment, etc.

Authorisation

This API employs basic authentication, using your cloud account username (email), but unlike other web service APIs, requires a suitable User Group Key rather than your account's API Access Key. The format for the authentication username and password is as follows:

Username : cloudID/username (e.g. 1-2-0/bob@example.com)
Password : One of your User Group Keys

For example:

$ curl --user 1-2-0/bob@example.com:ak-378b7602-5a91-47f8-9f6f-ef4bf3e234e9 'https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/register?user_id=Bob&filename=bobsaudio1.wav'
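The same credentials can be assembled programmatically. The following is a minimal sketch of how the basic authentication header might be built; the function name basic_auth_header is hypothetical and the key shown is the example key from the curl command above.

```python
import base64


def basic_auth_header(cloud_id: str, username: str, user_group_key: str) -> str:
    """Build the value of an HTTP Authorization header for basic authentication."""
    # The authentication username is "cloudID/username", e.g. "1-2-0/bob@example.com".
    credentials = f"{cloud_id}/{username}:{user_group_key}"
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header(
    "1-2-0", "bob@example.com", "ak-378b7602-5a91-47f8-9f6f-ef4bf3e234e9"
)
```

The resulting string is what an HTTP client would send as the Authorization header; most clients, like curl's --user option, can also build it for you.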

Audio data format

The audio data supplied to register, update and verify can be up to 60 seconds long and must be in mono WAV format with a minimum sample rate of 8 kHz. Supported formats are 16-bit PCM, A-law and mu-law.
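It can be useful to check a file locally before uploading it. The sketch below uses Python's standard wave module and therefore only covers the 16 bit PCM case: that module typically raises wave.Error for A-law and mu-law files, so those would need a different reader. The names check_pcm_wav and make_silent_wav are hypothetical helpers, not part of this API.

```python
import io
import wave

MAX_SECONDS = 60.0
MIN_SAMPLE_RATE_HZ = 8000


def check_pcm_wav(data: bytes) -> list[str]:
    """Return a list of problems found in a PCM WAV file; empty if none."""
    problems = []
    with wave.open(io.BytesIO(data), "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("audio must be mono")
        if wav.getsampwidth() != 2:
            problems.append("PCM audio must be 16 bit")
        if wav.getframerate() < MIN_SAMPLE_RATE_HZ:
            problems.append("sample rate must be at least 8 kHz")
        if wav.getnframes() / wav.getframerate() > MAX_SECONDS:
            problems.append("audio must be no longer than 60 seconds")
    return problems


def make_silent_wav(seconds: float, rate: int = 8000, channels: int = 1) -> bytes:
    """Create a silent 16 bit PCM WAV in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(seconds * rate))
    return buffer.getvalue()


problems = check_pcm_wav(make_silent_wav(1.0))
```

An empty list means the file meets the constraints stated above; it says nothing about whether the audio is a good representation of the speaker.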

Streaming data on a websocket

When training, updating or verifying a user, calling GET returns a url to a websocket which accepts streamed audio data and returns different responses to POST and PUT. In this initial release the audio data needs to be of known length (e.g. a file). Subsequent releases will provide additional stream types including a means of streaming in real time.

When the websocket API encounters an error it returns a reject event and one of the errors listed here.

Text dependent mode

By default, verification runs in text dependent mode. In this mode the same phrase must be used to register, update and verify and the voice biometric engine makes use of this to improve the reliability of the user's model. If different phrases are likely to be supplied to update and verify then text dependent mode should be disabled when calling verify.

Date formats

All dates are of the format YYYY-MM-DD_hh:mm:ss and are in Coordinated Universal Time (UTC).
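As an illustration, dates in this format could be converted to and from timezone-aware Python datetime objects as follows. This is a sketch; the helper names are hypothetical.

```python
from datetime import datetime, timezone

DATE_FORMAT = "%Y-%m-%d_%H:%M:%S"


def parse_api_datetime(text: str) -> datetime:
    """Parse a YYYY-MM-DD_hh:mm:ss string and mark it as UTC."""
    return datetime.strptime(text, DATE_FORMAT).replace(tzinfo=timezone.utc)


def format_api_datetime(moment: datetime) -> str:
    """Convert an aware datetime to UTC and format it as YYYY-MM-DD_hh:mm:ss."""
    return moment.astimezone(timezone.utc).strftime(DATE_FORMAT)


parsed = parse_api_datetime("2021-01-15_08:56:24")
```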

Response content

All web services in this API return response content of type "application/json".

Charging information

The cost of each call to register or verify can be obtained using the application_status web service, passing it the application_instance_id returned. Cost information will not be available immediately. The costs_valid field will indicate when it is available.
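Because cost information is not available immediately, a client may need to poll for it. The sketch below shows one possible polling loop. It does not call application_status itself: fetch_status is a hypothetical caller-supplied function that performs that request and returns the decoded JSON, and the sketch assumes costs_valid appears as a top-level boolean in that response.

```python
import time
from typing import Callable, Optional


def wait_for_costs(
    fetch_status: Callable[[], dict],
    attempts: int = 5,
    delay_seconds: float = 2.0,
) -> Optional[dict]:
    """Poll until the status reports costs_valid, or give up after a number of attempts."""
    for attempt in range(attempts):
        status = fetch_status()
        # Assumption: costs_valid is a top-level boolean in the status response.
        if status.get("costs_valid"):
            return status
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return None
```

The attempt count and delay are arbitrary illustrative values; choose them to suit your application.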

This is a low level API. For information on higher level APIs see the Web Services Language Wrappers.
  • Register

    You need to supply your account username and a user group key in the basic authorisation string.

    This registers a user as part of a user group and creates an initial voice model for that user from the supplied audio data.

    The quality of the voice model is dependent on the supplied audio. It must be a good representation of the user's normal voice. It is recommended that at least ten seconds of audio is provided and that once registered, a user's voice model is successfully verified and updated at least twice.

    Url : https://ws.aculabcloud.net/voice_biometrics/v1/user/register
    Methods : PUT, POST, GET
    Username : cloudID/username (e.g. 1-2-0/bob@example.com)
    Password : user group access key

    • PUT

      Parameter | Required/Optional | Default | Description
      user_id | required | - | a user-defined Id that uniquely identifies the user. Characters a-z, A-Z, 0-9, "." and "-" can all be used.

      When calling register using PUT, you supply a single wav file in the request body.
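A minimal sketch of how such a PUT request might be assembled with Python's standard library follows. The request is built but not sent. The function name build_register_put, the placeholder audio bytes and the key "my-user-group-key" are hypothetical, and the Content-Type value is an assumption because this document does not state which one the service expects.

```python
import base64
import re
import urllib.parse
import urllib.request

# Characters permitted in user_id: a-z, A-Z, 0-9, "." and "-".
USER_ID_PATTERN = re.compile(r"[A-Za-z0-9.-]+")


def build_register_put(base_url, user_id, wav_bytes, auth_username, user_group_key):
    """Build (but do not send) a PUT request that registers user_id with one wav file."""
    if not USER_ID_PATTERN.fullmatch(user_id):
        raise ValueError("user_id may only contain a-z, A-Z, 0-9, '.' and '-'")
    query = urllib.parse.urlencode({"user_id": user_id})
    url = f"{base_url}/voice_biometrics/v1/user/register?{query}"
    token = base64.b64encode(f"{auth_username}:{user_group_key}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        # Assumption: the expected Content-Type is not documented here.
        "Content-Type": "audio/wav",
    }
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="PUT")


request = build_register_put(
    "https://ws-1-2-0.aculabcloud.net",
    "Bob",
    b"RIFF...",  # placeholder, not real audio
    "1-2-0/bob@example.com",
    "my-user-group-key",
)
# urllib.request.urlopen(request) would send it; that step is omitted here.
```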

      Returns on success:

      A JSON object containing the following parameters:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      transaction_id | string | always | a transaction Id identifying the registration.

      Example:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/register?user_id=Bob

      Response:

      If successful you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "57c3c968570f11eb854a02e848f54949"
      }

      Or on error an HTTP error containing a JSON response. For example:

      {
      	"error": {
      		"code": "HTTP 400",
      		"text": "Bad Request: user_id 'Bob' is already present", 
      		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=register",
      		"datetime": "2021-01-15_08:56:24"
      	},
      	"request": {
      		"url": "/voice_biometrics/v1/user/register",
      		"datetime": "2021-01-15_08:56:23"
      	}
      }

      Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.
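The distinction between the two kinds of code could be handled along these lines. This is a sketch that assumes HTTP codes arrive as a JSON string such as "HTTP 400"; how engine codes are typed is not stated here, so they are simply converted to text. The function name describe_error is hypothetical.

```python
import json


def describe_error(body: str) -> dict:
    """Split an error response into its kind ("http" or "engine"), code and text."""
    error = json.loads(body)["error"]
    code = str(error["code"])
    if code.startswith("HTTP"):
        # HTTP errors carry the status after the prefix, e.g. "HTTP 400".
        status = int(code[len("HTTP"):].strip())
        return {"kind": "http", "status": status, "text": error["text"]}
    # Engine errors have no prefix; the text field describes them.
    return {"kind": "engine", "code": code, "text": error["text"]}


example = describe_error(json.dumps({"error": {"code": "HTTP 400", "text": "Bad Request"}}))
```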

    • POST

      Parameter | Required/Optional | Default | Description
      user_id | required | - | a user-defined Id that uniquely identifies the user. Characters a-z, A-Z, 0-9, "." and "-" can all be used.
      wav_file_url | optional | - | a url for a wav file to download. It can be repeated in the query string to add additional file urls. Omit if supplying wav file(s) in the request body.

      Remarks

      When calling register using POST, you supply one or more wav files in the request body as multipart/form-data or one or more urls that each point to a wav file to download. Multiple files can be different formats.

      When supplying wav files as multipart form-data, the part Name must be Source1, Source2, ... and each part Filename should be the name of the file without path.
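The part naming rule can be illustrated with a small hand-rolled encoder. This is a sketch only: the function name encode_wav_parts is hypothetical, and the per-part Content-Type of audio/wav is an assumption, since this document does not specify one. Many HTTP client libraries can build multipart bodies for you instead.

```python
import os
import uuid


def encode_wav_parts(files: list[tuple[str, bytes]]) -> tuple[str, bytes]:
    """Encode (path, data) pairs as multipart/form-data parts named Source1, Source2, ..."""
    boundary = uuid.uuid4().hex
    chunks = []
    for index, (path, data) in enumerate(files, start=1):
        # Each part Filename is the name of the file without its path.
        filename = os.path.basename(path)
        chunks.append(f"--{boundary}\r\n".encode("ascii"))
        chunks.append(
            f'Content-Disposition: form-data; name="Source{index}"; '
            f'filename="{filename}"\r\n'.encode("utf-8")
        )
        # Assumption: the per-part Content-Type is not documented here.
        chunks.append(b"Content-Type: audio/wav\r\n\r\n")
        chunks.append(data)
        chunks.append(b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(chunks)


content_type, body = encode_wav_parts(
    [("audio/Bob1.wav", b"first"), ("audio/Bob2.wav", b"second")]
)
```

The returned content_type goes in the request's Content-Type header and body is the request body.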

      Returns on success:

      A JSON object containing the following parameters:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      transaction_id | string | always | a transaction Id identifying the registration.
      sources | array of objects | always | an array of source objects indicating whether each file was processed successfully.

      Each source object contains:

      Parameter | Type | Availability | Description
      source | string | always | if wav files were supplied in multipart/form-data, this contains the part Name (Source1, Source2, ...); if files were supplied in wav_file_url, this contains the original url supplied.
      accepted | bool | always | whether the audio source was accepted by the voice biometric analysis.
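Since each source reports its own accepted flag, a client may want to list the sources that were not accepted. A minimal sketch, with a hypothetical function name and an invented sample response for illustration:

```python
def rejected_sources(response: dict) -> list[str]:
    """Return the source identifiers that the voice biometric analysis did not accept."""
    return [
        entry["source"]
        for entry in response.get("sources", [])
        if not entry["accepted"]
    ]


sample = {
    "sources": [
        {"source": "Source1", "accepted": True},
        {"source": "Source2", "accepted": False},
    ]
}
not_accepted = rejected_sources(sample)
```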

      Example:

      Supplying audio file(s) in the request body:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/register?user_id=Bob
      
      1st part: Name: "Source1", Filename: "Bob1.wav"
      2nd part: Name: "Source2", Filename: "Bob2.wav"

      Supplying audio file urls in the query string:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/register?user_id=Bob&wav_file_url=my.wav.files.com%2Fget_wav%3Ffilename%3Dbob123.wav&wav_file_url=my.wav.files.com%2Fget_wav%3Ffilename%3Dbob456.wav
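The percent-encoding and repetition of wav_file_url shown above can be produced with a standard query-string encoder. In this sketch the function name register_url is hypothetical; urlencode is given a list of pairs so that the same key can appear more than once.

```python
from urllib.parse import urlencode


def register_url(base_url: str, user_id: str, wav_file_urls: list[str]) -> str:
    """Build the register URL with one wav_file_url parameter per file url."""
    params = [("user_id", user_id)]
    params += [("wav_file_url", file_url) for file_url in wav_file_urls]
    return f"{base_url}/voice_biometrics/v1/user/register?{urlencode(params)}"


url = register_url(
    "https://ws-1-2-0.aculabcloud.net",
    "Bob",
    [
        "my.wav.files.com/get_wav?filename=bob123.wav",
        "my.wav.files.com/get_wav?filename=bob456.wav",
    ],
)
```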

      Response:

      If successful for audio file(s) supplied in the request body you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "57c3c968570f11eb854a02e848f54949", 
      	"sources": [
      		{
      			"source": "Source1",
      			"accepted": true
      		},
      		{
      			"source": "Source2",
      			"accepted": true
      		}
      	]
      }

      Or if successful for audio file(s) supplied in wav_file_url you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "57c3c968570f11eb854a02e848f54949", 
      	"sources": [
      		{
      			"source": "my.wav.files.com/get_wav?filename=bob123.wav",
      			"accepted": true
      		},
      		{
      			"source": "my.wav.files.com/get_wav?filename=bob456.wav",
      			"accepted": true
      		}
      	]
      }

      Or on error the example JSON response:

      {
      	"error": {
      		"code": "HTTP 400",
      		"text": "Bad Request: user_id 'Bob' is already present", 
      		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=register",
      		"datetime": "2021-01-15_08:56:24"
      	},
      	"request": {
      		"url": "/voice_biometrics/v1/user/register",
      		"datetime": "2021-01-15_08:56:23"
      	}
      }

      Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.

    • GET

      Parameter | Required/Optional | Default | Description
      user_id | required | - | a user-defined Id that uniquely identifies the user. Characters a-z, A-Z, 0-9, "." and "-" can all be used.
      stream_type * | optional | file | a string indicating the type of stream that is to be sent on the websocket. Currently this only supports "file" type where the streamed audio data is of known length. In future this will support additional types such as realtime streaming.
      format * | optional | 16bit_PCM | a string indicating the audio data format. One of "16bit_PCM", "alaw", "mulaw".
      sample_rate * | optional | 8000 | an integer indicating the audio data sample rate in Hz. Minimum is 8000.

      * format and sample_rate properties are present for realtime streaming when it becomes available, but are ignored when stream_type is file. In this case the audio format and sample rate are obtained from the header of the supplied wav file data.

      Calling register using GET will return a url on which a websocket can be opened and the audio data streamed.

      Returns:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      url | string | always | a url to a websocket on which to send your audio data.

      You open a websocket on the returned url and send your audio data followed by a JSON message containing the following parameters:

      Parameter | Required/Optional | Default | Description
      event | required | - | must be "audio_sent".

      You will receive a JSON response containing the following:

      Parameter | Type | Availability | Description
      event | string | always | "register" if the user was successfully registered or "reject" if not, in which case the reason field will indicate why the registration failed.
      transaction_id | string | only if event is "register" | a transaction Id identifying the registration.
      user_id | string | only if event is "register" | the user-defined Id identifying the user.
      reason | string | only if event is "reject" | a description of the reason for the rejection.
      message | string | only if event is "reject" | further information about the rejection if available.
      code | integer | only if event is "reject" | one of the websocket API error codes.

      Example:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/register?user_id=Bob

      Returns:

      {	
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"url": "wss://voisentry-2.aculabcloud.net/wss/register?auth_token=PFCoTsxSHqxqYz_P1XMSXKz1pDJPHBWMWQNJhbmoL..."
      }

      Open a websocket on this url and send the audio data.

      Then send the message:

      {
      	"event": "audio_sent"
      }

      If successful you will receive the example JSON response:

      {
      	"event": "register",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027"
      }

      Or on error the example JSON response:

      {
      	"event": "reject", 
      	"reason": "Bad Request",
      	"message": "Bad API request, register: user_id Bob is already present.",
      	"code": 50
      }
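The message handling for this websocket exchange might be sketched as below. Opening the websocket and sending the audio requires a websocket client library, which is outside this sketch; only the JSON messages are shown. The names RejectedError and handle_register_reply are hypothetical.

```python
import json

# Sent on the websocket after all of the audio data has been sent.
AUDIO_SENT_MESSAGE = json.dumps({"event": "audio_sent"})


class RejectedError(Exception):
    """Raised when the service replies with a "reject" event."""


def handle_register_reply(text: str) -> str:
    """Return the transaction_id from a "register" reply, or raise on "reject"."""
    reply = json.loads(text)
    if reply["event"] == "register":
        return reply["transaction_id"]
    if reply["event"] == "reject":
        raise RejectedError(reply.get("code"), reply.get("reason"), reply.get("message"))
    raise ValueError(f"unexpected event: {reply['event']!r}")


transaction_id = handle_register_reply(
    '{"event": "register", "transaction_id": "e1352c5a44fe11eb91ba021ae2cff027"}'
)
```

Raising an exception on "reject" is a design choice for this sketch; returning the code and reason to the caller would work equally well.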
  • Update

    You need to supply your account username and a user group key in the basic authorisation string.

    This updates a user's voice model that has been created by calling register. It should only be called with audio data that is guaranteed (confirmed by other means) to be authentic user voice data.

    The quality of the voice model is dependent on the supplied audio. It must be a good representation of the user's normal voice. It is recommended that at least ten seconds of audio is provided. A user's voice model can continue to be updated at regular intervals and when there are changes in environment etc.

    You can supply a single wav using PUT, or multiple wav files using POST. Alternatively you can stream the user's audio data on a websocket opened on the url returned by GET.

    Url : https://ws.aculabcloud.net/voice_biometrics/v1/user/update
    Methods : PUT, POST, GET
    Username : cloudID/username (e.g. 1-2-0/bob@example.com)
    Password : user group access key

    • PUT

      Parameter | Required/Optional | Default | Description
      user_id | required | - | the user Id of a user that has already been registered.

      When calling update using PUT, you supply a single wav file in the request body.

      Returns on success:

      A JSON object containing the following parameters:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      transaction_id | string | always | a transaction Id identifying the update.

      Example:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/update?user_id=Bob

      Response:

      If successful you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "57c3c968570f11eb854a02e848f54949"
      }

      Or on error the example JSON response:

      {
      	"error": {
      		"code": "HTTP 400",
      		"text": "Bad Request: user_id 'Bob' not found", 
      		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=update",
      		"datetime": "2021-01-15_09:24:16"
      	},
      	"request": {
      		"url": "/voice_biometrics/v1/user/update",
      		"datetime": "2021-01-15_09:24:16"
      	}
      }

      Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.

    • POST

      Parameter | Required/Optional | Default | Description
      user_id | required | - | the user Id of a user that has already been registered.
      wav_file_url | optional | - | a url for a wav file to download. It can be repeated in the query string to add additional file urls. Omit if supplying wav file(s) in the request body.

      When calling update using POST, you supply one or more wav files in the request body as multipart/form-data or one or more urls that each point to a wav file to download. Multiple files can be different formats.

      When supplying wav files as multipart form-data, the part Name must be Source1, Source2, ... and each part Filename should be the name of the file without path.

      Returns on success:

      A JSON object containing the following parameters:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      transaction_id | string | always | a transaction Id identifying the update.
      sources | array of objects | always | an array of source objects indicating whether each file was processed successfully.

      Each source object contains:

      Parameter | Type | Availability | Description
      source | string | always | if wav files were supplied in multipart/form-data, this contains the part Name (Source1, Source2, ...); if files were supplied in wav_file_url, this contains the original url supplied.
      accepted | bool | always | whether the audio source was accepted by the voice biometric analysis.

      Example:

      Supplying audio file(s) in the request body:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/update?user_id=Bob
      
      1st part: Name: "Source1", Filename: "Bob1.wav"
      2nd part: Name: "Source2", Filename: "Bob2.wav"

      Supplying audio file urls in the query string:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/update?user_id=Bob&wav_file_url=my.wav.files.com%2Fget_wav%3Ffilename%3Dbob123.wav&wav_file_url=my.wav.files.com%2Fget_wav%3Ffilename%3Dbob456.wav

      Response:

      If successful for audio file(s) supplied in the request body you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "57c3c968570f11eb854a02e848f54949", 
      	"sources": [
      		{
      			"source": "Source1",
      			"accepted": true
      		},
      		{
      			"source": "Source2",
      			"accepted": true
      		}
      	]
      }

      Or if successful for audio file(s) supplied in wav_file_url you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "57c3c968570f11eb854a02e848f54949", 
      	"sources": [
      		{
      			"source": "my.wav.files.com/get_wav?filename=bob123.wav",
      			"accepted": true
      		},
      		{
      			"source": "my.wav.files.com/get_wav?filename=bob456.wav",
      			"accepted": true
      		}
      	]
      }

      Or on error the example JSON response:

      {
      	"error": {
      		"code": "HTTP 400",
      		"text": "Bad Request: user_id 'Bob' not found", 
      		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=update",
      		"datetime": "2021-01-15_09:24:16"
      	},
      	"request": {
      		"url": "/voice_biometrics/v1/user/update",
      		"datetime": "2021-01-15_09:24:16"
      	}
      }

      Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.

    • GET

      Parameter | Required/Optional | Default | Description
      user_id | required | - | the user Id of a user that has already been registered.
      stream_type * | optional | file | a string indicating the type of stream that is to be sent on the websocket. Currently this only supports "file" type where the streamed audio data is of known length. In future this will support additional types such as realtime streaming.
      format * | optional | 16bit_PCM | a string indicating the audio data format. One of "16bit_PCM", "alaw", "mulaw".
      sample_rate * | optional | 8000 | an integer indicating the audio data sample rate in Hz. Minimum is 8000.

      * format and sample_rate properties are present for realtime streaming when it becomes available, but are ignored when stream_type is file. In this case the audio format and sample rate are obtained from the header of the supplied wav file data.

      Calling update using GET will return a url on which a websocket can be opened and the audio data streamed.

      Returns:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      url | string | always | a url to a websocket on which to send your audio data.

      You open a websocket on the returned url and send your audio data followed by a JSON message containing the following parameters:

      Parameter | Required/Optional | Default | Description
      event | required | - | must be "audio_sent".

      You will receive a JSON response containing the following:

      Parameter | Type | Availability | Description
      event | string | always | "update" if the user's voice model was successfully updated or "reject" if not, in which case the reason field will indicate why the update failed.
      transaction_id | string | only if event is "update" | a transaction Id identifying the update.
      user_id | string | only if event is "update" | the user-defined Id identifying the user.
      reason | string | only if event is "reject" | a description of the reason for the rejection.
      message | string | only if event is "reject" | further information about the rejection if available.
      code | integer | only if event is "reject" | one of the websocket API error codes.

      Example:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/update?user_id=Bob

      Returns:

      {	
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"url": "wss://voisentry-2.aculabcloud.net/wss/update?auth_token=PFCoTsxSHqxqYz_P1XMSXKz1pDJPHBWMWQNJhbmoL..."
      }

      Open a websocket on the returned url and send the audio data.

      Then send the message:

      {
      	"event": "audio_sent"
      }

      If successful you will receive the example JSON response:

      {
      	"event": "update",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027"
      }

      Or on error the example JSON response:

      {
      	"event": "reject", 
      	"reason": "Bad Request",
      	"message": "Bad API request, update: user_id Bob not found.",
      	"code": 50
      }
  • Verify

    You need to supply your account username and a user group key in the basic authorisation string.

    This verifies that the supplied audio matches the specified registered user.

    The audio data is analysed against the specified user's voice model and this produces a confidence value that the user has been matched. A value greater than or equal to 1.0 indicates a match.

    You can enable Presentation Attack Detection (PAD, also known as spoof detection) for each verification. This tries to detect synthesized audio or audio that has been clandestinely recorded and replayed. It also detects audio that has been used previously.

    You can supply a single wav file using PUT. Using POST, you can supply only a single wav file in text dependent mode, or multiple files if text_dependent is false. Alternatively you can stream the user's audio data on a websocket opened on the url returned by GET.

    Url : https://ws.aculabcloud.net/voice_biometrics/v1/user/verify
    Methods : PUT, POST, GET
    Username : cloudID/username (e.g. 1-2-0/bob@example.com)
    Password : user group access key

    • PUT

      Parameter | Required/Optional | Default | Description
      user_id | required | - | a user-defined Id of the user to be verified.
      sensitivity | optional | 0.0 | a float that determines the sensitivity of the biometric analysis. Allowable range is -10.0 to 10.0. Positive values decrease confidence (reducing the likelihood that an impostor may be verified). Negative values increase confidence (reducing the likelihood that a real speaker may not be verified).
      text_dependent | optional | true | "true" to enable text dependence else "false". When this is enabled all audio data supplied to register, update and verify must contain the same spoken phrase. The verification algorithm makes use of this to optimise its analysis.
      enable_pad | optional | false | "true" to enable Presentation Attack Detection (PAD) else "false". Presentation attacks may be encountered when voice data is supplied that has been covertly obtained from the user or faked or used previously. The verified and confidence results are not affected by PAD, but if an attack is detected then a pad_detected key is included in the response that details the type of attack.

      When calling verify using PUT, you supply a single wav file in the request body.

      Returns on success:

      A JSON object containing the following parameters:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      transaction_id | string | if verification occurs without error | a transaction Id identifying the verification.
      verified | bool | if verification occurs without error | whether the supplied audio has been verified against the user. If PAD was enabled you should also check the response for pad_detected.
      confidence | float | if verification occurs without error | a measure of confidence that the audio has been matched against the user. A value of greater than or equal to 1.0 is considered to have been verified.
      pad_detected | a PADDetected object | if verification occurs without error and a Presentation Attack was detected | present if PAD was enabled and a Presentation Attack was detected during verification. The contents indicate the type of attack detected.

      Where the PADDetected object contains the following:

      Parameter | Type | Availability | Description
      type | string | always | one of "TypeA", "TypeB" or "TypeC":
      • A: duplicate audio - audio that has been played previously.
      • B: replayed recording or an imitation attempt.
      • C: synthetic speech.
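Because the verified and confidence values are not affected by PAD, a client has to combine them with pad_detected itself. The sketch below shows one possible policy, rejecting any result for which an attack was reported; that policy is an application decision rather than something this API mandates, and the names is_trusted and PAD_TYPES are hypothetical.

```python
# Meanings of the PADDetected type values, as listed above.
PAD_TYPES = {
    "TypeA": "duplicate audio",
    "TypeB": "replayed recording or imitation attempt",
    "TypeC": "synthetic speech",
}


def is_trusted(result: dict) -> bool:
    """Accept only when the speaker is verified and no presentation attack was reported."""
    return bool(result.get("verified")) and "pad_detected" not in result


clean = {"verified": True, "confidence": 1.25}
attacked = {"verified": True, "confidence": 1.25, "pad_detected": {"type": "TypeA"}}
attack_description = PAD_TYPES[attacked["pad_detected"]["type"]]
```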

      Remarks:

      The algorithm that calculates the confidence adapts as the speaker’s model is updated and the scaling of the returned confidence value may vary. Consequently different confidence values cannot be reliably compared (e.g. today's score of 1.21 for a user is not necessarily better than yesterday's score of 1.20).

      A user may often be verified with a confidence score that is just above 1.0. This is normal and should not be considered a weak result. If it becomes apparent that the error rates are unacceptable for the application, the sensitivity value can be altered to make it more or less sensitive. Setting a positive sensitivity, 1.0 for example, will reduce the likelihood that an impostor may pass. Setting a negative sensitivity, -1.0 for example, will reduce the likelihood that a real speaker may fail verification.
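As an illustration of how the optional verify parameters might be passed, the sketch below builds a query string and checks the sensitivity range locally. The function name verify_query is hypothetical; the parameter names and defaults are those documented for verify.

```python
from urllib.parse import urlencode


def verify_query(user_id, sensitivity=0.0, text_dependent=True, enable_pad=False):
    """Build the query string for a verify call, validating the sensitivity range."""
    if not -10.0 <= sensitivity <= 10.0:
        raise ValueError("sensitivity must be between -10.0 and 10.0")
    return urlencode(
        {
            "user_id": user_id,
            "sensitivity": sensitivity,
            # Boolean options are passed as the strings "true" and "false".
            "text_dependent": "true" if text_dependent else "false",
            "enable_pad": "true" if enable_pad else "false",
        }
    )


# A positive sensitivity makes it less likely that an impostor passes.
query = verify_query("Bob", sensitivity=1.0, enable_pad=True)
```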

      Example:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/verify?user_id=Bob

      Response:

      If successful you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027",
      	"verified": true,
      	"confidence": 1.25
      }

      If the verification is carried out, but a Presentation Attack is detected you will receive the example JSON response:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027",
      	"verified": true,
      	"confidence": 1.25,
      	"pad_detected": {
      		"type": "TypeA"
      	}
      }

      Or on error the example JSON response:

      {
      	"error": {
      		"code": "HTTP 400",
      		"text": "Bad Request: user_id 'Bob' not found", 
      		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=verify",
      		"datetime": "2021-01-15_09:29:36"
      	},
      	"request": {
      		"url": "/voice_biometrics/v1/user/verify",
      		"datetime": "2021-01-15_09:29:36"
      	}
      }

      Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.

    • POST

      Parameter | Required/Optional | Default | Description
      user_id | required | - | a user-defined Id of the user to be verified.
      wav_file_url | optional | - | a url for a wav file to download. When text_dependent is true, only one url is supported. When text_dependent is false, you can provide additional urls in the query string to specify multiple audio sources. Omit if supplying wav file(s) in the request body.
      sensitivity | optional | 0.0 | a float that determines the sensitivity of the biometric analysis. Allowable range is -10.0 to 10.0. Positive values decrease confidence (reducing the likelihood that an impostor may be verified). Negative values increase confidence (reducing the likelihood that a real speaker may not be verified).
      text_dependent | optional | true | "true" to enable text dependence else "false". When this is enabled all audio data supplied to register, update and verify must contain the same spoken phrase. The verification algorithm makes use of this to optimise its analysis.
      enable_pad | optional | false | "true" to enable Presentation Attack Detection (PAD) else "false". Presentation attacks may be encountered when voice data is supplied that has been covertly obtained from the user or faked or used previously. The verified and confidence results are not affected by PAD, but if an attack is detected then a pad_detected key is included in the response that details the type of attack.

      When calling verify using POST with text_dependent enabled, you supply a wav file in the request body as multipart/form-data or a url that points to a wav file to download.

      When calling verify using POST with text_dependent disabled, you supply one or more wav files in the request body as multipart/form-data or one or more urls that each point to a wav file to download. Multiple files can be different formats.

      When supplying wav files as multipart form-data, the part Name must be Source1, Source2, ... and each part Filename should be the name of the file without path.
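The rule on how many audio sources a POST verify call may carry can be checked before the request is made. A minimal sketch, with a hypothetical function name:

```python
def check_verify_sources(source_count: int, text_dependent: bool = True) -> None:
    """Raise ValueError if the number of audio sources is not allowed for this mode."""
    if source_count < 1:
        raise ValueError("at least one wav file or wav_file_url is required")
    if text_dependent and source_count > 1:
        raise ValueError("only one audio source is supported when text_dependent is true")


# Several sources are acceptable only with text dependence disabled.
check_verify_sources(3, text_dependent=False)
```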

      Returns on success:

      A JSON object containing the following parameters:

      Parameter | Type | Availability | Description
      application_instance_id | string | always | an application instance Id that identifies the voice biometric call and can be used to call application_status.
      transaction_id | string | if verification occurs without error | a transaction Id identifying the verification.
      verified | bool | if verification occurs without error | whether the supplied audio has been verified against the user. If PAD was enabled you should also check the response for pad_detected.
      confidence | float | if verification occurs without error | a measure of confidence that the audio has been matched against the user. A value of greater than or equal to 1.0 is considered to have been verified.
      sources | array of objects | only if text_dependent is false | an array of source objects indicating whether each file was processed successfully.
      pad_detected | a PADDetected object | if verification occurs without error and a Presentation Attack was detected | present if PAD was enabled and a Presentation Attack was detected during verification. The contents indicate the type of attack detected.

      Each source object contains:

      Parameter | Type | Availability | Description
      source | string | always | if wav files were supplied in multipart/form-data, this contains the part Name (Source1, Source2, ...); if files were supplied in wav_file_url, this contains the original url supplied.
      accepted | bool | always | whether the audio source was accepted by the voice biometric analysis.

      Where the PADDetected object contains the following:

      Parameter | Type | Availability | Description
      type | string | always | one of "TypeA", "TypeB" or "TypeC":
      • A: duplicate audio - audio that has been played previously.
      • B: replayed recording or an imitation attempt.
      • C: synthetic speech.

      Remarks:

      The algorithm that calculates the confidence adapts as the speaker’s model is updated and the scaling of the returned confidence value may vary. Consequently different confidence values cannot be reliably compared (e.g. today's score of 1.21 for a user is not necessarily better than yesterday's score of 1.20).

      A user may often be verified with a confidence score that is just above 1.0. This is normal and should not be considered a weak result. If it becomes apparent that the error rates are unacceptable for the application, the sensitivity value can be altered to make it more or less sensitive. Setting a positive sensitivity, 1.0 for example, will reduce the likelihood that an impostor may pass. Setting a negative sensitivity, -1.0 for example, will reduce the likelihood that a real speaker may fail verification.

      Example:

      Supplying an audio file in the request body:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/verify?user_id=Bob

      Supplying audio file url in the query string:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/verify?user_id=Bob&wav_file_url=my.wav.files.com%2Fget_wav%3Ffilename%3Dbob123.wav

      Response:

      If the verification is carried out successfully you will receive a JSON response such as:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027",
      	"verified": true,
      	"confidence": 1.25
      }

      If the verification is carried out but a Presentation Attack is detected, you will receive a JSON response such as:

      {
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027",
      	"verified": true,
      	"confidence": 1.25,
      	"pad_detected": {
      		"type": "TypeA"
      	}
      }

      Or, on error, a JSON response such as:

      {
      	"error": {
      		"code": HTTP 400,
      		"text": "Bad Request: user_id 'Bob' not found", 
      		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=verify",
      		"datetime": "2021-01-15_09:29:36"
      	},
      	"request": {
      		"url": "/voice_biometrics/v1/user/verify",
      		"datetime": "2021-01-15_09:29:36"
      	}
      }

      Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.
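As a rough illustration of how a client might act on the response fields described above, the following Python sketch accepts a verification only when verified is true and no pad_detected key is present. The function name accept_verification is hypothetical and not part of the API, and treating a detected Presentation Attack as a failure is an assumed application policy, since the API itself leaves verified unchanged.

```python
import json

def accept_verification(response: dict) -> bool:
    """Return True only for a verified result with no Presentation Attack reported."""
    if "error" in response:
        # An error object means the verification was not carried out.
        return False
    if not response.get("verified", False):
        return False
    # pad_detected is only present when PAD was enabled and an attack was detected.
    return "pad_detected" not in response

ok = json.loads('{"verified": true, "confidence": 1.25}')
attack = json.loads(
    '{"verified": true, "confidence": 1.25, "pad_detected": {"type": "TypeA"}}'
)

print(accept_verification(ok))
print(accept_verification(attack))
```

An application with different requirements could instead log the attack type and apply its own policy.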

    • Parameter Required/Optional Default Description
      user_id required a user-defined Id of the user to be verified.
      sensitivity optional 0.0 a float that determines the sensitivity of the biometric analysis. Allowable range is -10.0 to 10.0. Positive values decrease confidence (reducing the likelihood that an impostor may be verified). Negative values increase confidence (reducing the likelihood that a real speaker may not be verified).
      text_dependent optional true "true" to enable text dependence else "false". When this is enabled all audio data supplied to register, update and verify must contain the same spoken phrase. The verification algorithm makes use of this to optimise its analysis.
      enable_pad optional false "true" to enable Presentation Attack Detection (PAD) else "false". Presentation attacks may be encountered when voice data is supplied that has been covertly obtained from the user, faked, or used previously. The verified and confidence results are not affected by PAD, but if an attack is detected then a pad_detected key is included in the response that details the type of attack.
      stream_type * optional file a string indicating the type of stream that is to be sent on the websocket. Currently this only supports "file" type where the streamed audio data is of known length. In future this will support additional types such as realtime streaming.
      format * optional 16bit_PCM a string indicating the audio data format. One of "16bit_PCM", "alaw", "mulaw".
      sample_rate * optional 8000 an integer indicating the audio data sample rate in Hz. Minimum is 8000.

      * format and sample_rate properties are present for realtime streaming when it becomes available, but are ignored when stream_type is file. In this case the audio format and sample rate are obtained from the header of the supplied wav file data.
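The query parameters listed above could be assembled as in the following Python sketch. It is only an illustration: build_verify_url is a hypothetical helper, the base URL is copied from the examples in this document, and the range check simply mirrors the documented -10.0 to 10.0 limits rather than any server-side behaviour.

```python
from urllib.parse import urlencode

# Base URL taken from the examples in this document (cloud 1-2-0).
BASE_URL = "https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/verify"

def build_verify_url(user_id, sensitivity=0.0, text_dependent=True,
                     enable_pad=False, base_url=BASE_URL):
    """Build a verify URL with the documented query parameters."""
    if not -10.0 <= sensitivity <= 10.0:
        raise ValueError("sensitivity must be between -10.0 and 10.0")
    params = {
        "user_id": user_id,
        "sensitivity": sensitivity,
        # Boolean options are passed as the strings "true" or "false".
        "text_dependent": "true" if text_dependent else "false",
        "enable_pad": "true" if enable_pad else "false",
    }
    return base_url + "?" + urlencode(params)

print(build_verify_url("Bob", sensitivity=1.0, enable_pad=True))
```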

      Calling verify using GET will return a url on which a websocket can be opened and the audio data streamed.

      Returns:

      Parameter Type Availability Description
      application_instance_id string always an application instance Id identifying the voice biometric call and that can be used to call application_status.
      url string always a url to a websocket on which to send your audio data.

      You open a websocket on the returned url and send your audio data followed by a JSON message containing the following parameters:

      Parameter Required/Optional Default Description
      event required Must be "audio_sent"

      You will receive a JSON response containing the following:

      Parameter Type Availability Description
      event string always "verify", or "reject" if the verification process failed, in which case the reason field indicates why.
      transaction_id string only if event is "verify" a transaction Id identifying the verification.
      user_id string only if event is "verify" the user-defined Id identifying the user.
      verified bool only if event is "verify" whether the supplied audio has been verified against the user. If PAD was enabled you should also check the response for pad_detected.
      confidence float only if event is "verify" a measure of confidence that the audio matches the user. A value greater than or equal to 1.0 is considered verified.
      pad_detected a PADDetected object only if event is "verify" and a Presentation Attack was detected present only if PAD was enabled and a Presentation Attack was detected during verification. The contents indicate the type of attack detected.
      reason string only if event is "reject" a description of the reason for the rejection.
      message string only if event is "reject" further information about the rejection if available.
      code integer only if event is "reject" one of the websocket API error codes.

      Where the PADDetected object contains the following:

      Parameter Type Availability Description
      type string always one of "TypeA", "TypeB" or "TypeC".
      • A: duplicate audio - audio that has been played previously.
      • B: replayed recording or an imitation attempt.
      • C: synthetic speech.

      Remarks:

      The algorithm that calculates the confidence adapts as the speaker’s model is updated and the scaling of the returned confidence value may vary. Consequently different confidence values cannot be reliably compared (e.g. today's score of 1.21 for a user is not necessarily better than yesterday's score of 1.20).

      A user may often be verified with a confidence score that is just above 1.0. This is normal and should not be considered a weak result. If the error rates prove unacceptable for the application, the sensitivity value can be altered to make verification more or less strict. Setting a positive sensitivity, 1.0 for example, will reduce the likelihood that an impostor passes. Setting a negative sensitivity, -1.0 for example, will reduce the likelihood that a real speaker fails verification.

      Example:

      https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/verify?user_id=Bob

      Returns:

      {	
      	"application_instance_id": "vb-2236d625010060f83617.000001",
      	"url": "wss://voisentry-2.aculabcloud.net/wss/verify?auth_token=PFCoTsxSHqxqYz_P1XMSXKz1pDJPHBWMWQNJhbmoL..."
      }

      Open a websocket on the returned url and send the audio data.

      Then send the message:

      {
      	"event": "audio_sent"
      }

      If the verification is carried out successfully you will receive a JSON response such as:

      {
      	"event": "verify",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027",
      	"verified": true,
      	"confidence": 1.25
      }

      If the verification is carried out but a Presentation Attack is detected, you will receive a JSON response such as:

      {
      	"event": "verify",
      	"transaction_id": "e1352c5a44fe11eb91ba021ae2cff027",
      	"verified": true,
      	"confidence": 1.25,
      	"pad_detected": {
      		"type": "TypeA"
      	}
      }

      Or, on error, a JSON response such as:

      {
      	"event": "reject", 
      	"reason": "Bad Request",
      	"message": "Bad API request, update: user_id Bob not found.",
      	"code": 50
      }
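The messages exchanged on the websocket can be handled as in the following Python sketch, which builds the audio_sent message and interprets the "verify" and "reject" events shown above. It does not open a websocket (that would need a websocket client library of your choice). The names AUDIO_SENT_MESSAGE and summarise_event are hypothetical, and the summary strings are purely illustrative.

```python
import json

# The message sent after all audio data has been streamed.
AUDIO_SENT_MESSAGE = json.dumps({"event": "audio_sent"})

def summarise_event(raw_message: str) -> str:
    """Turn a JSON message received on the websocket into a short summary."""
    msg = json.loads(raw_message)
    event = msg.get("event")
    if event == "verify":
        # pad_detected is only present when an attack was detected.
        if "pad_detected" in msg:
            return "presentation attack: " + msg["pad_detected"]["type"]
        return "verified" if msg["verified"] else "not verified"
    if event == "reject":
        return "rejected ({}): {}".format(msg.get("code"), msg.get("reason"))
    return "unexpected event"

print(summarise_event('{"event": "verify", "verified": true, "confidence": 1.25}'))
print(summarise_event('{"event": "reject", "reason": "Bad Request", "code": 50}'))
```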
  • Stats

    This obtains statistics relating to a specified registered user.

    You need to supply your account username and a user group key in the basic authorisation string.

    Url : https://ws.aculabcloud.net/voice_biometrics/v1/user/stats
    Methods : GET, POST
    Username : cloudID/username (e.g. 1-2-0/bob@example.com)
    Password : user group access key
    Parameter Required/Optional Default Description
    user_id required the Id of the user.

    Returns on success:

    A JSON object containing the following parameters:

    Parameter Type Availability Description
    user_group_name string always the name of the user group in which this user is registered.
    transaction_id string always a transaction Id identifying the check.
    registration_date string always the registration date and time of the user.
    update_date string always the date and time of last update for the user.
    verification_date string always the date and time of the last verification for the user.
    verification_attempts integer always the number of verifications that were attempted.
    verifications_passed integer always the number of verifications that passed.
    verifications_failed integer always the number of verifications that failed.

    Example:

    https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/stats?user_id=Bob

    If successful you will receive the JSON response:

    {
    	"user_group_name": "BobsCompany", 
    	"transaction_id": "c2ae5d2455ac11ebbdef02e848f54949", 
    	"registration_date": "2021-01-13_14:30:33", 
    	"update_date": "2021-01-13_14:31:28", 
    	"verification_date": "2021-01-13_14:34:24", 
    	"verification_attempts": 1, 
    	"verifications_passed": 1, 
    	"verifications_failed": 0
    }

    Or on error the JSON response:

    {
    	"error": {
    		"code": HTTP 404,
    		"text": "Not found: user Id Bill is not a registered user.", 
    		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=stats",
    		"datetime": "2021-01-15_08:56:24"
    	},
    	"request": {
    		"url": "/voice_biometrics/v1/user/stats",
    		"datetime": "2021-01-15_08:56:23"
    	}
    }

    Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.
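The stats response lends itself to simple derived figures. The Python sketch below computes a pass rate and parses the last verification date. The helper name summarise_stats is hypothetical, and the date format string is inferred from the example values in this document (such as 2021-01-13_14:34:24) rather than from a stated specification.

```python
from datetime import datetime

# Assumed from the example responses: YYYY-MM-DD_HH:MM:SS.
DATE_FORMAT = "%Y-%m-%d_%H:%M:%S"

def summarise_stats(stats: dict):
    """Return (pass_rate, last_verification) from a stats response."""
    attempts = stats["verification_attempts"]
    # Avoid dividing by zero for a user who has never been verified.
    pass_rate = stats["verifications_passed"] / attempts if attempts else None
    last_verification = datetime.strptime(stats["verification_date"], DATE_FORMAT)
    return pass_rate, last_verification

example = {
    "verification_date": "2021-01-13_14:34:24",
    "verification_attempts": 1,
    "verifications_passed": 1,
    "verifications_failed": 0,
}
print(summarise_stats(example))
```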

  • Exists

    This determines whether a specified user is currently registered.

    You need to supply your account username and a user group key in the basic authorisation string.

    Url : https://ws.aculabcloud.net/voice_biometrics/v1/user/exists
    Methods : GET, POST
    Username : cloudID/username (e.g. 1-2-0/bob@example.com)
    Password : user group access key
    Parameter Required/Optional Default Description
    user_id required the Id of the user.

    Returns on success:

    A JSON object containing the following parameters:

    Parameter Type Availability Description
    user_id string always the Id of the user.
    transaction_id string always a transaction Id identifying the check.
    exists bool always whether the user registration exists.

    Example:

    https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/exists?user_id=Bob

    If successful you will receive the JSON response:

    {
    	"user_id": "Bob", 
    	"transaction_id": "c2ae5d2455ac11ebbdef02e848f54949", 
    	"exists": true 
    }

    Or on error the JSON response:

    {
    	"error": {
    		"code": HTTP 404,
    		"text": "Not found: user Id Bill is not a registered user.", 
    		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=exists",
    		"datetime": "2021-01-15_08:57:18"
    	},
    	"request": {
    		"url": "/voice_biometrics/v1/user/exists",
    		"datetime": "2021-01-15_08:57:17"
    	}
    }

    Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.
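To show how the basic authorisation string is put together for a call such as exists, the following Python sketch prepares a request without sending it. The helper names are hypothetical, the key "my-user-group-key" is a placeholder, and the ws-<cloudID> host name pattern is an assumption drawn from the example URLs in this document.

```python
import base64
import urllib.request
from urllib.parse import urlencode

def basic_auth_header(cloud_id, username, user_group_key):
    """Build the Authorization value for cloudID/username and a User Group Key."""
    credentials = "{}/{}:{}".format(cloud_id, username, user_group_key)
    return "Basic " + base64.b64encode(credentials.encode("utf-8")).decode("ascii")

def build_exists_request(cloud_id, username, user_group_key, user_id):
    """Prepare (but do not send) a GET request for the exists call."""
    # Assumption: the host name follows the ws-<cloudID> pattern seen in the examples.
    url = "https://ws-{}.aculabcloud.net/voice_biometrics/v1/user/exists?{}".format(
        cloud_id, urlencode({"user_id": user_id}))
    request = urllib.request.Request(url, method="GET")
    request.add_header(
        "Authorization", basic_auth_header(cloud_id, username, user_group_key))
    return request

request = build_exists_request("1-2-0", "bob@example.com", "my-user-group-key", "Bob")
print(request.full_url)
```

Passing the prepared request to urllib.request.urlopen would perform the call. The same header applies to stats and delete.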

  • Delete

    This deletes a specified registered user and removes any trained audio voice model data.

    You need to supply your account username and a user group key in the basic authorisation string.

    Url : https://ws.aculabcloud.net/voice_biometrics/v1/user/delete
    Methods : GET, POST
    Username : cloudID/username (e.g. 1-2-0/bob@example.com)
    Password : user group access key
    Parameter Required/Optional Default Description
    user_id required the Id of the user.

    Returns on success:

    A JSON object containing the following parameters:

    Parameter Type Availability Description
    user_id string always the Id of the deleted user.
    transaction_id string always a transaction Id identifying the delete.
    deleted string always the deletion date and time of the user.

    Example:

    https://ws-1-2-0.aculabcloud.net/voice_biometrics/v1/user/delete?user_id=Bob

    If successful you will receive the JSON response:

    {
    	"user_id": "Bob", 
    	"transaction_id": "c2ae5d2455ac11ebbdef02e848f54949", 
    	"deleted": "2021-01-13_14:30:33"
    }

    Or on error the JSON response:

    {
    	"error": {
    		"code": HTTP 404,
    		"text": "Not found: user Id Bill is not a registered user.", 
    		"link": "https://www.aculab.com/cloud/web-services/voice-biometrics/users?target=service_action_tabs&tab-id=delete",
    		"datetime": "2021-01-15_08:57:46"
    	},
    	"request": {
    		"url": "/voice_biometrics/v1/user/delete",
    		"datetime": "2021-01-15_08:57:45"
    	}
    }

    Note: the code returned may be an HTTP error code (prefixed by "HTTP") or an underlying Voice Biometrics engine error that has no prefix. For these non-HTTP errors, see the text for a description of the error.
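The distinction drawn in the note on error codes can be sketched in Python as follows. This is an assumption-laden illustration: classify_error_code is a hypothetical helper, and it assumes the code value reaches the client as a string such as "HTTP 404" or as a bare engine code. The exact wire representation is not specified in this document.

```python
def classify_error_code(code):
    """Split an error code into ("http", status) or ("engine", code_text)."""
    text = str(code).strip()
    if text.upper().startswith("HTTP"):
        # Codes prefixed by "HTTP" carry a standard HTTP status number.
        return ("http", int(text[4:].strip()))
    # Anything else is treated as a Voice Biometrics engine error;
    # the accompanying text field describes it.
    return ("engine", text)

print(classify_error_code("HTTP 404"))
print(classify_error_code("50"))
```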