A Technology Project Proposes That Voice Assistants Like Siri Speak More Languages



Neither Apple's Siri, nor Alexa, nor Google Assistant works in African languages. To make this technology more inclusive, the Common Voice initiative is asking thousands of people to donate their voices to an open database that algorithms can use to learn to speak other languages.

Technology has changed the lives of millions of people around the world. There is no doubt that it helps improve education and provides instant access to information and communication. The problem is that millions of people are being sidelined and forgotten by the big companies leading change in the global technology ecosystem. One of the clearest examples is the voice assistants in our phones, smart screens, watches and computers.

In Africa there are more than a thousand native languages. Each of them has its own accents, speech patterns and structures. Yet neither Alexa, nor Siri, nor Google Assistant, the three best-known voice assistants on the market, speaks any of those languages. This automatically leaves millions of people unable to use a technology that, curiously, has been among the fastest-growing in recent years.

Giants like Google, Amazon and Apple pay little attention to a segment of the voice assistant market that may seem small but represents millions of people. Other companies, however, are addressing the problem. Mozilla, the company behind the Firefox browser, is among those making the most progress.

“Companies appear to have followed the dominant language business model, which often leaves behind the diversity of African languages, among others. They focus on extrapolating developments and 'plugging' them into the African context rather than adapting them and that is not going to work successfully,” Chenai Chair, special advisor for innovation in Africa at the Mozilla Foundation, says via email.

Mozilla is managing to meet the needs of millions of people on the African continent thanks to the development of a collaborative technology called Common Voice. It is an ambitious open source initiative aimed at democratizing and diversifying voice technology. "It's an approach to changing the status quo," adds Chair.

To understand Common Voice, you first have to understand how machine learning algorithms work. These algorithms learn on their own, but only if they are given a large amount of data. In the case of voices and languages, that data is what lets them pick up the different phrases, tones and structures of speech.

How Common Voice works is simple: people donate their voices to a free, publicly available database that companies, researchers and developers can use to train voice-enabled applications, products and services.

The need it fills is so great that Common Voice has become the world's largest multilingual public domain voice data set. Since 2017, when the initiative was born, it has gathered more than 12,000 hours of voice data in 75 different languages, ranging from Swahili to Mandarin and Welsh.

Why is this public database so important? Because most databases of this type are owned by large for-profit corporations, which use them to train their own machine learning algorithms. Developers, researchers and smaller companies, which lack the money to buy such databases or build their own, find it practically impossible to get involved in developing new, more inclusive speech recognition technologies.

The pandemic changed everything in the project. Before, events were held, especially in schools, where people not only met to contribute to Common Voice but also held round-table discussions on better ways to reach more people. Since the arrival of COVID-19, this process has gone virtual. While that saves a lot of work in organizing meetings, it has also created problems: many people do not have good connections, or they upload recordings that are too noisy for the artificial intelligence to learn from. The quality of the information is as important as the information itself.

The experience of donating your voice to Common Voice is straightforward. On the project page you can see two large buttons: Speak and Listen. The first lets anyone who wishes donate their voice by reading a series of phrases that the system shows them. The second lets all users validate the accuracy of the recordings that others have donated.
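The Speak and Listen flow described above can be pictured as a simple vote count on each recording. The following is only an illustrative sketch, not Common Voice's actual code: the `Clip` class, the `status` function and the `VOTES_NEEDED` threshold of two are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many matching votes settle a clip's fate.
# This value is an assumption for illustration only.
VOTES_NEEDED = 2


@dataclass
class Clip:
    """A donated recording of one prompted sentence (hypothetical model)."""
    sentence: str
    up_votes: int = 0
    down_votes: int = 0

    def vote(self, matches: bool) -> None:
        """Record one listener's judgment: does the audio match the sentence?"""
        if matches:
            self.up_votes += 1
        else:
            self.down_votes += 1


def status(clip: Clip, votes_needed: int = VOTES_NEEDED) -> str:
    """Classify a clip as validated, rejected or still pending."""
    if clip.up_votes >= votes_needed and clip.up_votes > clip.down_votes:
        return "validated"
    if clip.down_votes >= votes_needed and clip.down_votes >= clip.up_votes:
        return "rejected"
    return "pending"


if __name__ == "__main__":
    clip = Clip("Habari ya asubuhi")
    clip.vote(True)
    print(status(clip))
    clip.vote(True)
    print(status(clip))
```

In this sketch a clip only enters the usable data set once enough listeners agree that it matches its sentence, which reflects the article's point that the quality of the data matters as much as its quantity.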

The platform's simple, intuitive design is no accident. The user experience has to work for people who not only speak different languages but also have very different levels of familiarity with technology.

The project, to which 400,000 people from all over the world have already contributed, is a success. Common Voice reached an agreement with the company NVIDIA, from which it received $1.5 million as well as staff and technology to improve its systems. It also obtained more than $3 million as part of a joint donation from the Bill & Melinda Gates Foundation, German Development Cooperation and the UK Foreign and Commonwealth Office. The money is being used primarily to hire people to grow the database for Swahili, a language spoken mostly in Tanzania and Kenya by an estimated 45 million people.

"We plan to make conversational technology available in most languages," says Sid Sharma, Head of Product Marketing at NVIDIA.

The Internet, and therefore everything that surrounds the technological world, was and is built and developed in English. If you take into account that only 20% of the world speaks this language and only 5% of people are native speakers, you can imagine how many people face a barrier to using technology. If we want it to be more inclusive, we have to start thinking about a future that speaks not just one language but all possible languages, so as not to leave so many citizens out of the digital world. Projects like Common Voice, little by little and with the help of citizens themselves, are achieving it.
