Why Google Assistant supports so many more languages than Siri, Alexa, Bixby, and Cortana – VentureBeat

Google Assistant, Apple’s Siri, Amazon’s Alexa, and Microsoft’s Cortana recognize only a narrow slice of the world’s most widely spoken languages. It wasn’t until fall 2018 that Samsung’s Bixby gained support for German, French, Italian, and Spanish — languages spoken by over 600 million people worldwide. And it took years for Cortana to become fluent in Spanish, French, and Portuguese.

But Google — which was already ahead of the competition a year ago with respect to the number of languages its assistant supported — pulled far ahead this year. With the addition of more than 20 new languages in January 2019, and more recently several Indic languages, Google Assistant cemented its lead with over 40 languages in well over 80 countries, up from eight languages and 14 countries in 2017. (Despite repeated requests, Google would not provide an exact number of languages for Google Assistant.) That’s compared with Siri’s 21 supported languages, Alexa’s and Bixby’s seven languages, and Cortana’s eight languages.

So why has Google Assistant pulled so far ahead? Naturally, some of the techniques underpinning Google’s natural language processing (NLP) remain closely guarded trade secrets. But the Mountain View company’s publicly available research sheds some — albeit not much — light on why rivals like Amazon and Apple have yet to match its linguistic prowess.

Supporting a new language is hard

Adding language support to a voice assistant is a multi-pronged process that requires considerable research into speech recognition and voice synthesis.

Most modern speech recognition systems incorporate deep neural networks that predict phonemes, the perceptually distinct units of sound (for example, p, b, and d in the English words pad, pat, and bad). Unlike older techniques, which relied on hand-tuned statistical models that calculated the probability of particular word combinations occurring in a phrase, neural nets derive characters directly from representations of audio frequencies called mel-scale spectrograms. This reduces error rates while partially eliminating the need for human supervision.
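
To make the front end of that pipeline concrete, here is a minimal sketch of computing a mel-scale spectrogram, the input most neural acoustic models consume. It assumes the librosa package and a hypothetical 16 kHz recording named utterance.wav.

```python
import librosa

# Load a hypothetical recording and resample it to 16 kHz.
audio, sr = librosa.load("utterance.wav", sr=16000)

# 80 mel bands over 25 ms windows with a 10 ms hop -- typical settings
# for neural acoustic models.
mel = librosa.feature.melspectrogram(
    y=audio, sr=sr, n_fft=400, hop_length=160, n_mels=80
)
log_mel = librosa.power_to_db(mel)  # log compression, as acoustic models expect
print(log_mel.shape)                # (80 mel bands, number of frames)
```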

Speech recognition has advanced significantly, particularly in the past year or so. In a paper, Google researchers detailed techniques that employ spelling correction to reduce errors by 29%, and in another study they applied AI to sound wave visuals to achieve state-of-the-art recognition performance without the use of a language model.

Parallel efforts include SpecAugment, which achieves impressively low word error rates by applying data augmentation techniques borrowed from computer vision to mel-scale spectrograms. In production, devices like the Pixel 4 and Pixel 4 XL (in the U.S., U.K., Canada, Ireland, Singapore, and Australia) feature an improved Google Assistant English language model that works offline and processes speech at “nearly zero” latency, delivering answers up to 10 times faster than on previous-generation devices.
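
As an illustration only (a simplified reimplementation of the general idea, not Google's code), SpecAugment-style augmentation boils down to masking random frequency bands and time spans of a log-mel spectrogram so the model cannot over-rely on any single region:

```python
import numpy as np

def spec_augment(log_mel, max_freq_mask=8, max_time_mask=20, rng=None):
    """Mask a random band of mel channels and a random span of frames."""
    if rng is None:
        rng = np.random.default_rng()
    aug = log_mel.copy()
    n_mels, n_frames = aug.shape

    f = int(rng.integers(0, max_freq_mask + 1))        # width of the frequency mask
    f0 = int(rng.integers(0, max(1, n_mels - f + 1)))
    aug[f0:f0 + f, :] = aug.mean()                     # blank out those mel channels

    t = int(rng.integers(0, max_time_mask + 1))        # width of the time mask
    t0 = int(rng.integers(0, max(1, n_frames - t + 1)))
    aug[:, t0:t0 + t] = aug.mean()                     # blank out those frames
    return aug
```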

Of course, baseline language understanding isn’t enough. Without localization, voice assistants can’t pick up on cultural idiosyncrasies, or, worse, they run the risk of misappropriation. It takes an estimated 30 to 90 days to build a query-understanding module for a new language, depending on how many intents it needs to cover. And even market-leading smart speakers from the likes of Google and Amazon have trouble understanding certain accents.

Google’s increasingly creative approaches promise to close the gap, however. In September, scientists at the company proposed a speech parser that learns to transcribe multiple languages while at the same time demonstrating “dramatic” improvements in quality, and in October they detailed a “universal” machine translation system trained on over 25 billion samples that’s capable of handling 103 languages.

This work no doubt informed Google Assistant’s multilingual mode, which, like Alexa’s multilingual mode, recognizes up to two languages simultaneously.

Speech synthesis

Generating speech is just as challenging as comprehension, if not more so.

While cutting-edge text-to-speech (TTS) systems like Google’s Tacotron 2 (which builds voice synthesis models based on spectrograms) and WaveNet (which builds models based on raw waveforms) learn languages more or less from speech alone, conventional systems tap a database of phones — distinct speech sounds or gestures — strung together to verbalize words. Concatenation, as it’s called, requires capturing the complementary diphones (units of speech comprising two connected halves of phones) and triphones (phones with half of a preceding phone at the beginning and a succeeding phone at the end) in lengthy recording sessions. The number of speech units can easily exceed a thousand.
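
A toy sketch of the concatenative idea (the diphone inventory and lookup below are entirely hypothetical, and a real system would smooth the joins rather than butt units together):

```python
import numpy as np

# Hypothetical inventory: each diphone maps to a short pre-recorded waveform.
# Real inventories hold a thousand or more such units per voice.
diphone_db = {
    "#-p": np.zeros(1600),
    "p-a": np.zeros(1600),
    "a-d": np.zeros(1600),
    "d-#": np.zeros(1600),
}

def synthesize(diphones, db=diphone_db):
    units = [db[d] for d in diphones]   # look up each recorded unit
    return np.concatenate(units)        # naive join; real systems blend the seams

waveform = synthesize(["#-p", "p-a", "a-d", "d-#"])  # crude rendering of "pad"
```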

Another technique — parametric TTS — taps mathematical models to recreate sounds that are then assembled into words and sentences. The data required to generate those sounds is stored in the parameters (variables), and the speech itself is created using a vocoder, which is a voice codec (a coder-decoder) that analyzes and synthesizes the output signals.
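
A rough, classical stand-in for that parameters-to-waveform step (assuming librosa and the same hypothetical utterance.wav as above) is Griffin-Lim inversion of a mel spectrogram; learned vocoders replace this final step with a neural network:

```python
import librosa

audio, sr = librosa.load("utterance.wav", sr=16000)

# The mel spectrogram stands in for the compact "parameters" of the speech.
mel = librosa.feature.melspectrogram(
    y=audio, sr=sr, n_fft=400, hop_length=160, n_mels=80
)

# Griffin-Lim estimates a waveform consistent with those parameters, playing
# the role the vocoder plays in a parametric TTS pipeline.
reconstructed = librosa.feature.inverse.mel_to_audio(
    mel, sr=sr, n_fft=400, hop_length=160
)
```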

Still, TTS is an easier problem to tackle than language comprehension, particularly with deep neural networks like WaveNet at speech engineers’ disposal. Translatotron, which was demoed last May, can translate a person’s voice into another language while retaining their tone and tenor. And in August, Google AI researchers showed that they could drastically improve the quality of synthesized speech using audio data sets from both native and non-native English speakers with neurodegenerative diseases, along with techniques from Parrotron, an AI tool built for people with speech impediments.

In a related development, Google researchers recently revealed ways to make machine-generated speech sound more natural in a pair of papers. A study coauthored by Tacotron co-creator Yuxuan Wang transferred characteristics like stress level by embedding the style of a recorded clip of human speech, while the method described in the second paper identified vocal patterns to imitate speech styles such as those produced by anger and tiredness.

How language support might improve in the future

Clearly, Google Assistant has progressed furthest on the assistant language front. So what might it take to get others on the same footing?

Improving assistants’ language support will likely require innovations in speech recognition, as well as NLP. With a “true” neural network stack — one that doesn’t rely heavily on language libraries, keywords, or dictionaries — the emphasis shifts from grammar structures to word embeddings and the relational patterns within word embeddings. Then it becomes possible to train a voice recognition system on virtually any language.
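
The shift toward word embeddings can be illustrated with a few made-up vectors (the numbers below are invented for the example): words become points in a vector space, and relationships show up as geometry rather than grammar rules.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented three-dimensional embeddings; real ones have hundreds of dimensions
# learned from text rather than hand-picked values.
embeddings = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.70, 0.15]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

print(cosine(embeddings["king"], embeddings["queen"]))  # high: related words
print(cosine(embeddings["king"], embeddings["apple"]))  # low: unrelated words
```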

Amazon appears to be progressing toward this with Alexa. Researchers at the company managed to cut down on recognition flubs by 20% to 22% using methods that combined human and machine data labeling, and by a further 15% using a novel noise-isolating AI and machine learning technique. Separately, they proposed an approach involving “teaching” language models new tongues by adapting those trained on one language to others, in the process reducing the data requirement for new languages by up to 50%.
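
A heavily simplified sketch of the adaptation idea (the model, data, and dimensions below are placeholders, not Amazon's actual setup): take a network trained on a high-resource language and fine-tune it briefly on a much smaller set of target-language examples.

```python
import torch
from torch import nn, optim

# Stand-in acoustic model, imagined as already trained on language A:
# 80 mel features in, 40 phoneme classes out.
model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 40))

optimizer = optim.Adam(model.parameters(), lr=1e-4)  # small learning rate for adaptation
loss_fn = nn.CrossEntropyLoss()

# Placeholder mini-batch standing in for the limited language-B data.
features = torch.randn(32, 80)
labels = torch.randint(0, 40, (32,))

for _ in range(10):  # brief fine-tuning loop
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
```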

Separately, on the TTS side of the equation, Amazon recently rolled out neural TTS tech in Alexa that improves speech quality by increasing naturalness and expressiveness. Not to be outdone, the latest version of Apple’s iOS mobile operating system, iOS 13, introduces a WaveNet-like TTS technology that makes synthesized voices sound more natural. And last December Microsoft demoed a system — FastSpeech — that speeds up realistic voice generation by eliminating errors like word skipping.

Microsoft, for its part, recently open-sourced a version of Google’s popular BERT model that enables developers to deploy BERT at scale. This arrived after researchers at the Seattle company created an AI model — a Multi-Task Deep Neural Network (MT-DNN) — that incorporates BERT to achieve state-of-the-art results, and after a team of applied scientists at Microsoft proposed a baseline-besting architecture for language generation tasks.
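
For context, loading a publicly released BERT checkpoint takes only a few lines with the Hugging Face transformers package (generic usage shown here, not Microsoft's optimized serving stack):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Voice assistants keep learning new languages.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, tokens, hidden size)
```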

Undoubtedly, Google, Apple, Microsoft, Amazon, Samsung, and others are already using techniques beyond those described above to bring new languages to their respective voice assistants. But some had a head start, and others have to contend with legacy systems. That’s why it will likely take time before they’re all speaking the same languages.


After Heated Battle, Genshin Impact Wins Player's Voice at The 2022 Game Awards – IGN

Genshin Impact has won the Player’s Voice award at The Game Awards 2022, following an intense battle against Elden Ring and Sonic Frontiers.

Unlike other awards bestowed at The Game Awards, which are primarily determined by members of the press and other influential individuals in the industry, the Player’s Voice category is 100% fan-voted. Earlier this week, ahead of the show, the results projected that Genshin Impact would edge out both Sonic Frontiers and Elden Ring. Other nominees in the Player’s Voice category included God of War: Ragnarok and Stray.

Although Elden Ring did not take the Player’s Voice award, FromSoftware’s latest project did go home with several others, including Best Art Direction and Best Game Direction. For more on the winners from this year’s Game Awards, check out our roundup that features the nominees and winners of each category.

Taylor is the Associate Tech Editor at IGN. You can follow her on Twitter @TayNixster.


Video: Super Mario Bros. Movie "Mushroom Kingdom" Official Reveal – Nintendo Life

When he’s not paying off a loan to Tom Nook, Liam likes to report on the latest Nintendo news and admire his library of video games. His favourite Nintendo character used to be a guitar-playing dog, but nowadays he prefers to hang out with Judd the cat.


Quebec judge authorizes class-action lawsuit against ‘addictive’ Fortnite video game

A Superior Court judge has authorized a lawsuit brought by Quebec parents who allege their children became addicted to the popular online video game Fortnite.

Justice Sylvain Lussier issued the ruling on Wednesday after hearing arguments in July regarding the class-action request from three parents who described how their children had symptoms of severe dependence after playing the game.

“The court concludes that there is a serious issue to be argued, supported by sufficient and specific allegations as to the existence of risks or even dangers arising from the use of Fortnite,” the judge ruled, noting that the action “does not appear frivolous or manifestly ill-founded.”

The firm that brought the suit, Montreal-based Calex Legal, has drawn parallels to a landmark civil suit mounted against the tobacco industry in Quebec, which alleged an intent to create something addictive without proper warning.

“Our motion was heavily inspired by the tobacco motion just in terms of what we were alleging,” lawyer Alessandra Esposito Chartrand said in an interview. The manufacturer’s legal responsibility is “basically the same,” she added.

The parents alleged the game was deliberately made highly addictive and has had a lasting effect on their children, but the court did not go that far.

“The court finds that there is no evidence for these allegations of the deliberate creation of an addictive game,” the judge noted. “This does not exclude the possibility that the game is in fact addictive and that its designer and distributor are presumed to know it.”

One of the parents, identified by initials in the filings, said their son had played 6,923 games and got angry when his parents tried to limit his game time, including by putting a lock on the computer. Another child played more than 7,700 games in two years, spending a minimum of three hours a day on the game. All reported behavioural issues.

The judge authorized the lawsuit on behalf of any players residing in Quebec since Sept. 1, 2017, who have become addicted after playing Fortnite Battle Royale, made by U.S.-based Epic Games Inc., and who have suffered repercussions in their family, social, educational or professional lives.

There is no dollar figure attached to the lawsuit, with any potential compensation to be determined by the court.

A second category in the class action will look at in-game purchases, with the court declaring purchasers under the age of 18 could be eligible for restitution and a refund of their money.

As of Wednesday, Esposito Chartrand said 200 people have come forward.

Epic Games did not respond to a message seeking comment on Thursday. The defendants have 30 days to seek permission to appeal.

The company’s lawyers had argued before the court that the evidence provided was insufficient and that video game dependence is not a recognized condition in Quebec, adding that the American Psychiatric Association says there is insufficient evidence to classify it as a unique mental disorder.

The judge said those issues would be argued on the merits, but noted the World Health Organization in 2018 declared video game addiction, or “gaming disorder,” a disease.

“The fact that American psychiatrists have requested more research or that this diagnosis has not yet been officially recognized in Quebec does not make the claims in question ‘frivolous’ or ‘unfounded,’” Lussier wrote.

“The harmful effect of tobacco was not recognized or admitted overnight,” he added.
