Non-white Americans are less likely to be understood by virtual assistants and other applications that use voice recognition, even when English is their native language. And although many virtual assistants are themselves female, they are less likely to recognise women's voices. Not a big problem if you want Alexa to play a song, but considerably trickier when she needs to call the emergency line.
Several studies have found that voice recognition systems recognise male voices more accurately than female voices. Miriam Vogel, president of EqualAI, explains that these devices respond less well to certain tones inherent in the female voice. Some progress may have been made: a 2017 study by Dr. Rachael Tatman and Conner Kasten found no significant difference in accuracy by gender. However, the same study shows a clear difference in accuracy by race across several voice recognition systems. YouTube's automatic captions, for instance, are about 5% less accurate for African American and mixed-race speakers than for white Americans.
The inaccuracy of voice recognition systems is not without consequence. Voice recognition is used to pre-screen candidates in hiring processes, for instance, and some countries use it to assess the English fluency of people applying for a residence permit. Poor recognition may thus further reduce access to employment and legal residency for marginalised groups.
Do you want to contribute to accurate voice recognition for all? Help Mozilla’s Common Voice project by sending in your voice sample! Go to: https://commonvoice.mozilla.org/en