YouTube now does AI noise cancellation for Stories on iOS

You may notice a marked improvement in the audio quality of some YouTube Stories going forward, thanks to a new speech enhancement feature Google has rolled out. A couple of years ago, the tech giant debuted “Looking to Listen,” an AI technology that can pick out voices in a crowd. Now, it’s making the technology available to creators recording YouTube Stories on iOS devices.

Google taught Looking to Listen the correlations between speech and visual signals, such as the speaker’s mouth movements and facial expressions, by training it on a large collection of online videos. To ensure that it works for everyone and doesn’t show bias, Google conducted a series of tests exploring its performance across various visual and auditory attributes. Those attributes include the speaker’s age, skin tone, spoken language, voice pitch, visibility of their face, head pose, facial hair, presence of glasses and the level of background noise. The tests showed, for instance, that the technology’s ability to enhance speech remains fairly consistent across speakers’ languages. Facial hair doesn’t seem to have a big effect on it either, though it works best on faces with no facial hair and those with a close shave.

In its announcement post, Google also explained how it has improved the technology over the past couple of years. To start with, the developers made sure that it can do all the processing on the device itself, so it doesn’t need to send anything to a remote server. They also used a technique that allows the feature to quickly extract thumbnails with faces from videos for analysis. That allows the technology to start speech enhancement while the video is still being recorded. Those improvements shrank the feature’s size from 120MB to 6MB, making it easier to deploy. Google says they also “reduced [Looking to Listen’s] running time from 10x real-time on a desktop using the original formulation… to 0.5x real-time performance using only an iPhone CPU.” In fact, it’ll only take the technology a couple of seconds to process a 15-second Story.
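To put those figures in perspective, here is a rough back-of-the-envelope sketch. It assumes that "Nx real-time" means processing time divided by clip duration, which is the usual reading of a real-time factor but is not spelled out in the announcement; the function name is purely illustrative.

```python
def processing_seconds(clip_seconds: float, real_time_factor: float) -> float:
    """Estimated compute time for a clip.

    Assumption: real_time_factor = processing time / clip duration,
    so 0.5x means the clip is processed in half its own length.
    """
    return clip_seconds * real_time_factor


STORY_SECONDS = 15.0  # the Story length used as an example in the article

# Original formulation on a desktop (10x real-time).
desktop_seconds = processing_seconds(STORY_SECONDS, 10.0)

# On-device version on an iPhone CPU (0.5x real-time).
iphone_seconds = processing_seconds(STORY_SECONDS, 0.5)

# Size reduction quoted in the article: 120MB down to 6MB.
size_reduction = 120 / 6

print(f"Desktop, original: {desktop_seconds:.1f} s of compute")
print(f"iPhone, optimized: {iphone_seconds:.1f} s of compute")
print(f"Size reduction:    {size_reduction:.0f}x smaller")
```

Under that assumption, total compute for a 15-second Story drops from about two and a half minutes to roughly 7.5 seconds. Because enhancement begins while the video is still recording, much of that work can overlap with the recording itself, which would fit with only a short wait once recording ends.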

To activate the feature, creators only have to toggle on “Enhance speech” in volume controls on iOS.


Author: Mariella Moon, @mariella_moon
1h ago

Source: Engadget