Tech

Why Google Assistant supports so many more languages than Siri, Alexa, Bixby, and Cortana – VentureBeat

Published

January 29, 2020

Google Assistant, Apple’s Siri, Amazon’s Alexa, and Microsoft’s Cortana recognize only a narrow slice of the world’s most widely spoken languages. It wasn’t until fall 2018 that Samsung’s Bixby gained support for German, French, Italian, and Spanish — languages spoken by over 600 million people worldwide. And it took years for Cortana to become fluent in Spanish, French, and Portuguese.

But Google — which was already ahead of the competition a year ago with respect to the number of languages its assistant supported — pulled far ahead this year. With the addition of more than 20 new languages in January 2019, and more recently several Indic languages, Google Assistant cemented its lead with over 40 languages in well over 80 countries, up from eight languages and 14 countries in 2017. (Despite repeated requests, Google would not provide an exact number of languages for Google Assistant.) That’s compared with Siri’s 21 supported languages, Alexa’s and Bixby’s seven languages, and Cortana’s eight languages.

So why has Google Assistant pulled so far ahead? Naturally, some of the techniques underpinning Google’s natural language processing (NLP) remain closely guarded trade secrets. But the Mountain View company’s publicly available research sheds some — albeit not much — light on why rivals like Amazon and Apple have yet to match its linguistic prowess.

Supporting a new language is hard

Adding language support to a voice assistant is a multi-pronged process that requires considerable research into speech recognition and voice synthesis.

Most modern speech recognition systems incorporate deep neural networks that predict the phonemes, or perceptually distinct units of sound (for example, p, b, and d in the English words pad, pat, and bad). Unlike older techniques, which relied on hand-tuned statistical models that calculated probabilities for combinations of words to occur in a phrase, neural nets derive characters from representations of audio frequencies called mel-scale spectrograms. This reduces error rates while partially eliminating the need for human supervision.
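To give a sense of what "mel-scale" means here, the following is a minimal sketch and not anything Google has published about its own pipeline. It assumes the widely used conversion formula mel = 2595 * log10(1 + f / 700); the function names and the band-edge helper are hypothetical and exist only for illustration.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in hertz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(low_hz: float, high_hz: float, n_bands: int) -> list:
    """Return n_bands + 2 band edges in hertz, evenly spaced on the mel scale.

    Even spacing in mels means the bands get wider in hertz as frequency
    rises, which roughly mirrors how human hearing resolves pitch.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_bands + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_bands + 2)]


if __name__ == "__main__":
    for edge in mel_band_edges(0.0, 8000.0, 10):
        print(round(edge, 1))
```

A mel-scale spectrogram is built by summing the energy of an ordinary spectrogram inside bands like these, one column per short slice of audio.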

Speech recognition has advanced significantly, particularly in the past year or so. In a paper, Google researchers detailed techniques that employ spelling correction to reduce errors by 29%, and in another study they applied AI to sound wave visuals to achieve state-of-the-art recognition performance without the use of a language model.

Parallel efforts include SpecAugment, which achieves impressively low word error rates by applying visual analysis data augmentation to mel-scale spectrograms. In production, devices like the Pixel 4 and Pixel 4 XL (in the U.S., U.K., Canada, Ireland, Singapore, and Australia) feature an improved Google Assistant English language model that works offline and processes speech at “nearly zero” latency, delivering answers up to 10 times faster than on previous-generation devices.
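The masking idea behind SpecAugment can be sketched in a few lines. This is a simplified illustration and not the researchers' implementation: it assumes the spectrogram is a plain list of rows (one per mel band) and columns (one per time frame), it fills masked regions with zeros, and it leaves out the time-warping step. The function and parameter names are hypothetical.

```python
import random


def spec_augment(spec, max_freq_mask, max_time_mask, rng):
    """Return a copy of spec with one frequency mask and one time mask applied.

    spec is a list of rows (mel bands), each a list of floats (time frames).
    Mask widths are drawn uniformly from 0 up to the given maximum.
    """
    n_mels = len(spec)
    n_frames = len(spec[0])
    out = [row[:] for row in spec]  # copy so the input is left untouched

    # Frequency mask: blank out a band of consecutive mel rows.
    f = rng.randint(0, min(max_freq_mask, n_mels))
    f0 = rng.randint(0, n_mels - f)
    for m in range(f0, f0 + f):
        out[m] = [0.0] * n_frames

    # Time mask: blank out a run of consecutive frames in every row.
    t = rng.randint(0, min(max_time_mask, n_frames))
    t0 = rng.randint(0, n_frames - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0

    return out


if __name__ == "__main__":
    toy = [[1.0] * 20 for _ in range(8)]  # toy input, not real audio
    masked = spec_augment(toy, max_freq_mask=3, max_time_mask=5,
                          rng=random.Random(0))
    for row in masked:
        print("".join("#" if v else "." for v in row))
```

Training on many randomly masked copies of the same recording pushes a recognizer to cope with missing pieces of the signal, which is the sense in which the technique treats audio like an image.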

Of course, baseline language understanding isn’t enough. Without localization, voice assistants can’t pick up on cultural idiosyncrasies or, worse, they run the risk of misappropriation. It takes an estimated 30 to 90 days to build a query-understanding module for a new language, depending on how many intents it needs to cover. And even market-leading smart speakers from the likes of Google and Amazon have trouble understanding certain accents.

Google’s increasingly creative approaches promise to close the gap, however. In September, scientists at the company proposed a speech parser that learns to transcribe multiple languages while at the same time demonstrating “dramatic” improvements in quality, and in October they detailed a “universal” machine translation system trained on over 25 billion samples that’s capable of handling 103 languages.

This work no doubt informed Google Assistant’s multilingual mode, which, like Alexa’s multilingual mode, recognizes up to two languages simultaneously.

Speech synthesis

Generating speech is just as challenging as comprehension, if not more so.

While cutting-edge text to speech (TTS) systems like Google’s Tacotron 2 (which builds voice synthesis models based on spectrograms) and WaveNet 2 (which builds models based on waveforms) learn languages more or less from speech alone, conventional systems tap a database of phones — distinct speech sounds or gestures — strung together to verbalize words. Concatenation, as it’s called, requires capturing the complementary diphones (units of speech comprising two connected halves of phones) and triphones (phones with half of a preceding phone at the beginning and a succeeding phone at the end) in lengthy recording sessions. The number of speech units can easily exceed a thousand.
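A rough sketch shows why the inventory for concatenation grows so quickly. It follows the loose definitions given above: a diphone is treated as a pair of adjacent phones and a triphone as a phone with its left and right neighbors. The "#" silence marker, the function names, and the simplified one-letter phone symbols are assumptions made for illustration, not a real phonetic transcription.

```python
SILENCE = "#"  # hypothetical marker for the silence before and after a word


def diphones(phones):
    """Return the adjacent-phone pairs needed to speak one word."""
    padded = [SILENCE] + list(phones) + [SILENCE]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def triphones(phones):
    """Return each phone together with its left and right context."""
    padded = [SILENCE] + list(phones) + [SILENCE]
    return [f"{left}-{center}+{right}"
            for left, center, right in zip(padded, padded[1:], padded[2:])]


def unit_inventory(pronunciations):
    """Collect the distinct diphones a recording session would have to cover."""
    units = set()
    for phones in pronunciations:
        units.update(diphones(phones))
    return units


if __name__ == "__main__":
    # Simplified spellings of "pad", "pat" and "bad", used as stand-in phones.
    words = [["p", "a", "d"], ["p", "a", "t"], ["b", "a", "d"]]
    print(diphones(words[0]))           # ['#-p', 'p-a', 'a-d', 'd-#']
    print(triphones(words[0]))          # ['#-p+a', 'p-a+d', 'a-d+#']
    print(len(unit_inventory(words)))   # 8
```

Three short words that share only four phones already call for eight distinct diphones, so a full vocabulary pushes the count far higher.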

Another technique — parametric TTS — taps mathematical models to recreate sounds that are then assembled into words and sentences. The data required to generate those sounds is stored in the parameters (variables), and the speech itself is created using a vocoder, which is a voice codec (a coder-decoder) that analyzes and synthesizes the output signals.

Still, TTS is an easier problem to tackle than language comprehension — particularly with deep neural networks like WaveNet 2 at speech engineers’ disposal. Translatotron, which was demoed last May, can translate a person’s voice into another language while retaining their tone and tenor. And in August, Google AI researchers showed that they could drastically improve the quality of speech synthesis and generation using audio data sets from both native and non-native English speakers who have neurodegenerative diseases and techniques from Parrotron, an AI tool for people with impediments.

In a related development, Google researchers recently revealed in a pair of papers ways to make machine-generated speech sound more natural. In a study coauthored by Tacotron co-creator Yuxuan Wang, the transfer of qualities like stress level was achieved by embedding style from a recorded clip of human speech. The method described in the second paper identified vocal patterns to imitate speech styles like those resulting from anger and tiredness.

How language support might improve in the future

Clearly, Google Assistant has progressed furthest on the assistant language front. So what might it take to get others on the same footing?

Improving assistants’ language support will likely require innovations in speech recognition, as well as NLP. With a “true” neural network stack — one that doesn’t rely heavily on language libraries, keywords, or dictionaries — the emphasis shifts from grammar structures to word embeddings and the relational patterns within word embeddings. Then it becomes possible to train a voice recognition system on virtually any language.

Amazon appears to be progressing toward this with Alexa. Researchers at the company managed to cut down on recognition flubs by 20% to 22% using methods that combined human and machine data labeling, and by a further 15% using a novel noise-isolating AI and machine learning technique. Separately, they proposed an approach involving “teaching” language models new tongues by adapting those trained on one language to others, in the process reducing the data requirement for new languages by up to 50%.

Separately, on the TTS side of the equation, Amazon recently rolled out neural TTS tech in Alexa that improves speech quality by increasing naturalness and expressiveness. Not to be outdone, the latest version of Apple’s iOS mobile operating system, iOS 13, introduces a WaveNet-like TTS technology that makes synthesized voices sound more natural. And last December Microsoft demoed a system — FastSpeech — that speeds up realistic voice generation by eliminating errors like word skipping.

Separately, Microsoft recently open-sourced a version of Google’s popular BERT model that enables developers to deploy BERT at scale. This arrived after researchers at the Seattle company created an AI model — a Multi-Task Deep Neural Network (MT-DNN) — that incorporates BERT to achieve state-of-the-art results, and after a team of applied scientists at Microsoft proposed a baseline-besting architecture for language generation tasks.

Undoubtedly, Google, Apple, Microsoft, Amazon, Samsung, and others are already using techniques beyond those described above to bring new languages to their respective voice assistants. But some had a head start, and others have to contend with legacy systems. That’s why it will likely take time before they’re all speaking the same languages.

Tech

Ottawa orders TikTok’s Canadian arm to be dissolved

Published

November 6, 2024

Anne Joseph

The federal government is ordering the dissolution of TikTok’s Canadian business after a national security review of the Chinese company behind the social media platform, but is stopping short of ordering people to stay off the app.

Industry Minister François-Philippe Champagne announced the government’s “wind up” demand Wednesday, saying it is meant to address “risks” related to ByteDance Ltd.’s establishment of TikTok Technology Canada Inc.

“The decision was based on the information and evidence collected over the course of the review and on the advice of Canada’s security and intelligence community and other government partners,” he said in a statement.

The announcement added that the government is not blocking Canadians’ access to the TikTok application or their ability to create content.

However, it urged people to “adopt good cybersecurity practices and assess the possible risks of using social media platforms and applications, including how their information is likely to be protected, managed, used and shared by foreign actors, as well as to be aware of which country’s laws apply.”

Champagne’s office did not immediately respond to a request for comment seeking details about what evidence led to the government’s dissolution demand, how long ByteDance has to comply and why the app is not being banned.

A TikTok spokesperson said in a statement that the shutdown of its Canadian offices will mean the loss of hundreds of well-paying local jobs.

“We will challenge this order in court,” the spokesperson said.

“The TikTok platform will remain available for creators to find an audience, explore new interests and for businesses to thrive.”

The federal Liberals ordered a national security review of TikTok in September 2023, but the review was not public knowledge until The Canadian Press reported in March that the government was investigating the company.

At the time, it said the review was based on the expansion of a business, which it said constituted the establishment of a new Canadian entity. It declined to provide any further details about what expansion it was reviewing.

A government database showed a notification of new business from TikTok in June 2023. It said Network Sense Ventures Ltd. in Toronto and Vancouver would engage in “marketing, advertising, and content/creator development activities in relation to the use of the TikTok app in Canada.”

Even before the review, ByteDance and TikTok were lightning rods for privacy and safety concerns because Chinese national security laws compel organizations in the country to assist with intelligence gathering.

Such concerns led the U.S. House of Representatives to pass a bill in March designed to ban TikTok unless its China-based owner sells its stake in the business.

Champagne’s office has maintained Canada’s review was not related to the U.S. bill, which has yet to pass.

Canada’s review was carried out through the Investment Canada Act, which allows the government to investigate any foreign investment with the potential to harm national security.

While cabinet can make investors sell parts of the business or shares, Champagne has said the act doesn’t allow him to disclose details of the review.

Wednesday’s dissolution order was made in accordance with the act.

The federal government banned TikTok from its mobile devices in February 2023 following the launch of an investigation into the company by federal and provincial privacy commissioners.

— With files from Anja Karadeglija in Ottawa

This report by The Canadian Press was first published Nov. 6, 2024.

Health

Here is how to prepare your online accounts for when you die

Published

October 24, 2024

Anne Joseph

LONDON (AP) — Most people have accumulated a pile of data — selfies, emails, videos and more — on their social media and digital accounts over their lifetimes. What happens to it when we die?

It’s wise to draft a will spelling out who inherits your physical assets after you’re gone, but don’t forget to take care of your digital estate too. Friends and family might treasure files and posts you’ve left behind, but they could get lost in digital purgatory after you pass away unless you take some simple steps.

Here’s how you can prepare your digital life for your survivors:

Apple

The iPhone maker lets you nominate a “legacy contact” who can access your Apple account’s data after you die. The company says it’s a secure way to give trusted people access to photos, files and messages. To set it up you’ll need an Apple device with a fairly recent operating system: iPhones and iPads need iOS or iPadOS 15.2 and MacBooks need macOS Monterey 12.1.

For iPhones, go to settings, tap Sign-in & Security and then Legacy Contact. You can name one or more people, and they don’t need an Apple ID or device.

You’ll have to share an access key with your contact. It can be a digital version sent electronically, or you can print a copy or save it as a screenshot or PDF.

Take note that there are some types of files you won’t be able to pass on — including digital rights-protected music, movies and passwords stored in Apple’s password manager. Legacy contacts can only access a deceased user’s account for three years before Apple deletes the account.

Google

Google takes a different approach with its Inactive Account Manager, which allows you to share your data with someone if it notices that you’ve stopped using your account.

When setting it up, you need to decide how long Google should wait — from three to 18 months — before considering your account inactive. Once that time is up, Google can notify up to 10 people.

You can write a message informing them you’ve stopped using the account, and, optionally, include a link to download your data. You can choose what types of data they can access — including emails, photos, calendar entries and YouTube videos.

There’s also an option to automatically delete your account after three months of inactivity, so your contacts will have to download any data before that deadline.

Facebook and Instagram

Some social media platforms can preserve accounts for people who have died so that friends and family can honor their memories.

When users of Facebook or Instagram die, parent company Meta says it can memorialize the account if it gets a “valid request” from a friend or family member. Requests can be submitted through an online form.

The social media company strongly recommends Facebook users add a legacy contact to look after their memorial accounts. Legacy contacts can do things like respond to new friend requests and update pinned posts, but they can’t read private messages or remove or alter previous posts. You can only choose one person, who also has to have a Facebook account.

You can also ask Facebook or Instagram to delete a deceased user’s account if you’re a close family member or an executor. You’ll need to send in documents like a death certificate.

TikTok

The video-sharing platform says that if a user has died, people can submit a request to memorialize the account through the settings menu. Go to the Report a Problem section, then Account and profile, then Manage account, where you can report a deceased user.

Once an account has been memorialized, it will be labeled “Remembering.” No one will be able to log into the account, which prevents anyone from editing the profile or using the account to post new content or send messages.

X

It’s not possible to nominate a legacy contact on Elon Musk’s social media site. But family members or an authorized person can submit a request to deactivate a deceased user’s account.

Passwords

Besides the major online services, you’ll probably have dozens if not hundreds of other digital accounts that your survivors might need to access. You could just write all your login credentials down in a notebook and put it somewhere safe. But making a physical copy presents its own vulnerabilities. What if you lose track of it? What if someone finds it?

Instead, consider a password manager that has an emergency access feature. Password managers are digital vaults that you can use to store all your credentials. Some, like Keeper, Bitwarden and NordPass, allow users to nominate one or more trusted contacts who can access their keys in case of an emergency such as a death.

But there are a few catches: Those contacts also need to use the same password manager and you might have to pay for the service.

___

Is there a tech challenge you need help figuring out? Write to us at onetechtip@ap.org with your questions.

Tech

Google’s partnership with AI startup Anthropic faces a UK competition investigation

Published

October 24, 2024

Anne Joseph

LONDON (AP) — Britain’s competition watchdog said Thursday it’s opening a formal investigation into Google’s partnership with artificial intelligence startup Anthropic.

The Competition and Markets Authority said it has “sufficient information” to launch an initial probe after it sought input earlier this year on whether the deal would stifle competition.

The CMA has until Dec. 19 to decide whether to approve the deal or escalate its investigation.

“Google is committed to building the most open and innovative AI ecosystem in the world,” the company said. “Anthropic is free to use multiple cloud providers and does, and we don’t demand exclusive tech rights.”

San Francisco-based Anthropic was founded in 2021 by siblings Dario and Daniela Amodei, who previously worked at ChatGPT maker OpenAI. The company has focused on increasing the safety and reliability of AI models. Google reportedly agreed last year to make a multibillion-dollar investment in Anthropic, which has a popular chatbot named Claude.

Anthropic said it’s cooperating with the regulator and will provide “the complete picture about Google’s investment and our commercial collaboration.”

“We are an independent company and none of our strategic partnerships or investor relationships diminish the independence of our corporate governance or our freedom to partner with others,” it said in a statement.

The U.K. regulator has been scrutinizing a raft of AI deals as investment money floods into the industry to capitalize on the artificial intelligence boom. Last month it cleared Anthropic’s $4 billion deal with Amazon and it has also signed off on Microsoft’s deals with two other AI startups, Inflection and Mistral.
