The question of accuracy in voice biometrics

When it comes to any form of identity verification, accuracy is a core measure of success. So, with respect to the accuracy of voice biometrics, what are people saying? A cursory review of vendor messaging reveals some strikingly similar claims:

  • "We offer accuracy of up to 99%"
  • "Typical accuracy rates are in the region of 98-99%"
  • "We deliver a 99.9% success rate in production"
  • "Our software offers error rates below 0.1%"
  • "The system is regularly tuned to 96-99% accuracy levels"

Taken at face value, these statements seem impressive. However, once we try to add some context, they become less meaningful. For example, 99% of what? If a new system delivers up to a 50% improvement in accuracy over previous performance, what does that actually mean? If the old system was 99% accurate, is the new system 149% accurate? Of course not. Is it then 99.5% accurate? Would you interpret 90% accuracy as meaning 10% of a result is inaccurate? Probably not. What people really mean by 99% accuracy is that, 99 times out of 100, the result will be correct.

For voice biometrics, this is a bit of a diversionary tactic. What's more meaningful is to speak of error rates. For any security or identity verification application, the important factors are the false acceptance rate (FAR) and the false rejection rate (FRR). Accuracy, therefore, should reflect a combination of false positives and false negatives. False acceptance refers to the percentage of times the system erroneously admits an impostor. False rejection refers to the percentage of times the system rejects a legitimate user. The existence of both false positives and false negatives belies any claim that a system can be 100% accurate.

As these measures are rates, it's correct to refer to them in percentage terms. For example, a 0.8% FRR means that, under test conditions, eight results in 1,000 attempts were incorrect. The nature of the technology means that both FAR and FRR have to be considered, since tightening up on one relaxes the other. That being the case, the best measure of accuracy is the equal error rate (EER): the point of optimal performance at which you will get no more false positives than false negatives.

"For a measure to be valid, testing must be conducted with a statistically relevant number of verification attempts."
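To make those definitions concrete, here is a minimal Python sketch. It uses made-up score distributions rather than real biometric data, computes FAR and FRR across a range of acceptance thresholds, and locates the EER where the two rates cross:

```python
import numpy as np

# Hypothetical score distributions, for illustration only; a real test
# would use scores from a statistically relevant number of attempts.
rng = np.random.default_rng(42)
genuine_scores = rng.beta(8, 2, size=10_000)    # legitimate users, clustered near 1
impostor_scores = rng.beta(2, 8, size=10_000)   # impostors, clustered near 0

def error_rates(threshold):
    """FAR and FRR at a given acceptance threshold."""
    far = np.mean(impostor_scores >= threshold)  # impostors wrongly accepted
    frr = np.mean(genuine_scores < threshold)    # legitimate users wrongly rejected
    return far, frr

# Sweep the threshold; the EER is where the two curves cross.
thresholds = np.linspace(0.0, 1.0, 1001)
rates = np.array([error_rates(t) for t in thresholds])
eer_index = np.argmin(np.abs(rates[:, 0] - rates[:, 1]))
print(f"EER of roughly {rates[eer_index].mean():.2%} at threshold {thresholds[eer_index]:.2f}")
```

With these synthetic distributions the crossing point falls at a single, clean threshold; in practice, the EER would be estimated from scores gathered under realistic test conditions.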

But what about results?

Voice biometrics is a technology based on probability and is dependent upon statistical algorithms. The probability of me being me is 1. The probability of you being me is 0. When a system returns a result, the closer it is to 1, the greater the confidence that the caller is who they claim to be. If the result is 0.98, the person is highly likely to be who they say they are. However, a result of 0.65 doesn't equate to 65%, and neither figure is a measure of accuracy. Some systems present results on a scale of 0-100, which unfortunately leads to them being misrepresented as percentages. Others present results in relation to a pre-determined benchmark. Some solutions simply present call-handlers with a yes/no, red/green, or go/no-go indication.

The benefit of benchmarking or threshold setting is that it enables the security-conscious organisation to determine what level of risk is acceptable. On the assumption that there will be false positives, setting the acceptance threshold to a higher value will minimise or even eliminate those. Obviously, this will come at the expense of a higher incidence of false negatives (and vice versa with a lower threshold). Real-world applications typically operate between a false acceptance rate of below 0.5% and a false rejection rate of less than 5%. The problem with "accuracy" as a measure is that if those were your published tolerances, and the measured performance was within that range, you could claim 100% accuracy. The only viable method of determining the reliability (or acceptability) of a system is to test it in the environment in which it will be deployed, with data derived from the real-world user population.
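To illustrate that trade-off, here is a short follow-on sketch, again with hypothetical score distributions rather than real data. It finds the lowest threshold that meets a target FAR of below 0.5% and reports the FRR you pay in return:

```python
import numpy as np

# Same illustrative (made-up) score distributions as in the sketch above.
rng = np.random.default_rng(42)
genuine_scores = rng.beta(8, 2, size=10_000)
impostor_scores = rng.beta(2, 8, size=10_000)

def frr_at_target_far(target_far):
    """Lowest threshold whose FAR meets the target, and the FRR it costs."""
    for threshold in np.linspace(0.0, 1.0, 1001):
        far = np.mean(impostor_scores >= threshold)
        if far <= target_far:
            frr = np.mean(genuine_scores < threshold)
            return threshold, far, frr
    return 1.0, 0.0, 1.0  # reject everyone if no threshold meets the target

# Tighten FAR to below 0.5% and observe the FRR incurred in return.
threshold, far, frr = frr_at_target_far(0.005)
print(f"threshold={threshold:.2f}, FAR={far:.2%}, FRR={frr:.2%}")
```

Tightening the FAR target pushes the threshold up, and the FRR with it, which is exactly why both rates, rather than a single "accuracy" figure, have to be quoted.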
