
How AI powers the Pixel 4

Google’s Pixel 4 and Pixel 4 XL go on sale this week after making their debut at a hardware event in New York last week, and as with previous versions of Google’s flagship smartphone, artificial intelligence powers many of the exclusive new features and improvements.

On-device machine learning is a main theme for the latest Made by Google hardware. The Pixel 4 uses the Pixel Neural Core, a TPU-based upgrade from the Pixel 3’s Visual Core chip. The Pixel 4 comes with plenty of pre-existing AI-enabled features like Now Playing for song detection, but there are four major changes: speech recognition, the next-gen Google Assistant, a range of new camera features, and facial recognition to unlock the phone and make payments.

Camera

The makers of smartphones no longer sell phones. They sell cameras. That’s why Google spent about as much time talking about the Pixel 4 camera at its unveiling last week as it did talking about the rest of the phone.

AI recommends things like Night Sight in low light settings and powers depth prediction for Portrait Mode images.

Portrait Mode in the Pixel 4 is as sharp as ever.

Depth prediction for Portrait Mode shots with the Pixel 4 seems stronger than results with previous Pixel phones.

There are also improvements to Night Sight. If you’re taking a handheld shot, Night Sight can still deliver impressive results of the kind that have led some to call it mad science.

If you can prop up a Pixel 4 so it stays still, or use some sort of tripod, Night Sight exposures can last three minutes or longer, delivering not just crisper low-light imagery but actual photos of stars in the sky. Our initial tests found this is no exaggeration, but a practical fact.

Photos of the night sky are available on the Pixel 3, 3a, and 4.

Above: Night Sight with astrophotography on Pixel 4 XL

Another big difference compared to the camera on other Pixel phones is that you can shoot 4K video, and a tap and hold of the camera button now records video. Previously, a tap and hold of the camera button took dozens of photos. A swipe down for extended controls gives you the ability to enable things like Frequent Faces.

Machine-learning-based white balance was first introduced for Pixel 3 and continues with Pixel 4 to deliver pictures with accurate color temperature.

Above: Super Res Zoom shot of Oakland Tribune Tower from 4 blocks away in Oakland, California | Image Credit: Khari Johnson / VentureBeat

Super Res Zoom is another major feature of the Pixel 4. It puts a new telephoto lens to use for up to 8x zoom, with improved results compared to previous Pixels’ purely digital zoom.

Frequent Faces records data about the people you photograph regularly in order to recognize their faces and shape Top Shot photo recommendations. The data Frequent Faces stores about people you photograph often remains on your Pixel 4.

Facial recognition

Facial recognition powers a number of features in the Pixel 4, like Face Unlock to open the phone or make a payment, and Frequent Faces for recognizing the people you take pictures of the most.

With Motion Sense radar that wakes the phone when it senses movement, Google claims Face Unlock is faster than Apple’s iPhone Face ID. Facial recognition on the Pixel 4 launches with the ability to verify Google Pay transactions with a face scan, something that was not available at launch for Apple Pay users, but there have been some major growing pains for Google’s first facial recognition system for smartphones.
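For app developers, Android exposes device biometrics through the standard androidx.biometric library rather than a Pixel-specific interface. As a rough, hypothetical sketch (not Google Pay’s actual implementation), an app might gate a payment confirmation behind the device’s strong biometric roughly like this:

```kotlin
import androidx.appcompat.app.AppCompatActivity
import androidx.biometric.BiometricManager.Authenticators.BIOMETRIC_STRONG
import androidx.biometric.BiometricPrompt
import androidx.core.content.ContextCompat

// Hypothetical helper: ask the OS to verify the user with the device's
// strongest biometric (a face scan on a Pixel 4, a fingerprint on older
// Pixels) before carrying out a sensitive action such as a payment.
fun confirmWithBiometrics(activity: AppCompatActivity, onConfirmed: () -> Unit) {
    val executor = ContextCompat.getMainExecutor(activity)

    val prompt = BiometricPrompt(activity, executor,
        object : BiometricPrompt.AuthenticationCallback() {
            override fun onAuthenticationSucceeded(result: BiometricPrompt.AuthenticationResult) {
                onConfirmed() // biometric check passed; proceed with the action
            }

            override fun onAuthenticationError(errorCode: Int, errString: CharSequence) {
                // user canceled or hardware is unavailable; fall back to a PIN, etc.
            }
        })

    val promptInfo = BiometricPrompt.PromptInfo.Builder()
        .setTitle("Confirm purchase")
        .setSubtitle("Verify it's you to complete this payment")
        .setAllowedAuthenticators(BIOMETRIC_STRONG)
        .setNegativeButtonText("Use PIN instead")
        .build()

    prompt.authenticate(promptInfo)
}
```

The same call works across devices: the system decides which strong biometric to present, so an app does not have to special-case face versus fingerprint hardware.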

Sleuthing by the BBC last week found that Face Unlock still works on people whose eyes are closed, a fact that concerned some early users.

The fact is, even in a Touch ID world of fingerprint scans, a person with bad intent can force someone to open their phone, but it may be easier to scan a face than a finger.

The most likely misuse of a design flaw like this is probably a spouse unlocking their partner’s phone, but it’s easy to imagine more malicious scenarios when it’s widely known that the Pixel 4 will unlock on the faces of people who are unconscious, sleeping, or dead.

Google initially planned to make no changes, but on Sunday a company spokesperson told VentureBeat in an email that a fix requiring a user’s eyes to be open will be released as part of a software update in the coming months.

Another area with room for improvement is the performance of Google’s facial recognition on people with dark skin.

Weeks before the release of the Pixel 4, the New York Daily News reported that contractors working for Google used questionable tactics to improve its facial recognition’s ability to recognize people with dark skin, such as being less than upfront about how the face scans would be used or referring to the scanning as a game. The contractor Randstad apparently collected scans at the BET Awards in Los Angeles and targeted homeless people in Atlanta by handing out $5 Starbucks gift cards.

The revelation drew questions from the Atlanta city attorney and raised broader questions about what constitutes a fair price for an image of a person’s face.

Google and Randstad launched an investigation after the news reports emerged, and it is ongoing. While the investigation continues, the face scan collection program has been suspended.

Future updates may bring performance improvements. On my dark skin, the Pixel 4 was very consistent in ideal conditions with balanced lighting, but there were moments in reasonable lighting when Face Unlock repeatedly failed to recognize me. After failing to carry out Face Unlock multiple times in a row, the phone suggested that I delete my face profile and create a new one.

Re-enrollment helped somewhat. No formal count of Face Unlock successes and failures was recorded as part of this Pixel 4 review, but opening my phone was routinely a hit-and-miss exercise when scanning my face in bed in the morning, in a vehicle at night, under overhead lighting, or in other common scenarios with less-than-ideal lighting.

Above: A selfie taken after a failed Face Unlock with Pixel 4 XL | Image Credit: Khari Johnson / VentureBeat

Setting up facial recognition on the phone takes about 30 seconds of slowly pointing your nose in different directions to complete a face scan. That’s more extensive than the Face Match capture process on a Nest Hub Max smart display, likely because facial recognition replaces the fingerprint scanner that used to be the primary means of unlocking a Pixel phone, and because on the smart display facial recognition only needs to tell the difference between up to six people per household.

Poor performance of facial recognition on people with dark skin is an industry-wide problem, not just one for Google. Expect more audits and analysis of how Google’s facial recognition performs on people with light and dark skin tones as the phone becomes publicly available.

Despite my own experience of regularly encountering failed Face Unlock attempts, it’s far too early to call the feature a failure, as some journalists have chosen to do, because Google is just getting started with facial recognition.

If my own experience is any indication, replacing Touch ID fingerprint scans with Face Unlock comes with tradeoffs.

Next-gen Google Assistant

As Google showcased last week, the new Google Assistant can open apps, search the web, get directions, or send search results to contacts.

Next-gen Google Assistant uses Continued Conversation to enable multi-turn dialogue. That means that after you say the initial “OK, Google” wake word, Google Assistant will carry out the command then continue to listen for additional commands until you say “Stop” or “Thank you” to end the exchange.

Continued Conversation has been available for some time on smart displays and speakers, but on a smartphone it supplies a stream of cards and content. That makes for a different experience than a smart display, which changes the imagery on screen after each question. A stream you can scroll back and forth in, and complete actions from, shows you your own stream of consciousness.

In practical terms, this means you can very quickly go from asking Google a question about a given topic to diving deeper and learning more, then sharing that information with friends or acting on it. You can also interact with an app or website while Google Assistant runs in the background, a true multimodal experience.

There are still shortcomings. Tell Google Assistant to share something with a friend and it may only take a screenshot. On different occasions I asked the Assistant to share an email and a podcast episode with a friend, and it just took a picture. That works when the Assistant is sharing a weather report, for example, but not for things like a website or email, where a URL is almost always more helpful.

Also of note: The new Google Assistant uses an on-device language model and the Neural Core, which does make it faster than its counterpart on other smartphones, but that doesn’t mean the end of latency. The new Google Assistant can still encounter delays on a slow Wi-Fi or data connection.

And the new Google Assistant will not be available at launch for G Suite accounts. It seems odd that an assistant designed to make you as efficient as possible is unable to work with G Suite.

Finally, the new Google Assistant can interact with apps and does a better job of surfacing your Google Photos content, but it’s still not contextually aware.

The new Google Assistant also gets a slightly different look on the Pixel 4, appearing only as a glowing band of Google’s primary colors at the bottom of the screen. Google Assistant on the Pixel 4 also makes room for real-time transcription of your words. This helps you confirm that the assistant correctly heard your request, and the movement of words on the screen lets you know the assistant is listening and establishes a kind of rhythm to follow for each voice command.

Speech recognition

It’s been true for years that you can use conversational AI to turn speech into words faster than you can type with your thumbs on a smartphone. Speech-to-text transcription can be found in an increasing number of places, and with the Pixel 4 you get automated transcription of people speaking in videos.

Live Caption provides text transcription of audio in podcasts, audio messages, and video, from your camera roll to YouTube. There are occasional misses here, but this is a helpful feature when you can’t listen to audio but still want to enjoy content like video.

A simple tap and hold of the caption text that appears on the screen lets you move it around, and a double tap expands it to show more text.

The new Recorder app can also transcribe your voice recordings, a feature that allows you to search audio files for words and export text transcripts. The Recorder app uses real-time speech-to-text transcription, and sometimes it makes mistakes, which is in line with the experience you get from other speech transcription services. Recorder can also automatically identify keywords in transcripts and recommend recording titles based on them, and it can label sounds like music, applause, or speech.
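Similar real-time transcription is available to any Android app through the platform’s generic SpeechRecognizer API. The sketch below is a hypothetical illustration of that public API, not the code behind Google’s Recorder app; the EXTRA_PREFER_OFFLINE flag simply asks the system to favor on-device models when they are available, and the app still needs the RECORD_AUDIO permission.

```kotlin
import android.content.Context
import android.content.Intent
import android.os.Bundle
import android.speech.RecognitionListener
import android.speech.RecognizerIntent
import android.speech.SpeechRecognizer

// Hypothetical helper: live speech-to-text using Android's built-in recognizer.
fun startLiveTranscription(context: Context, onText: (String) -> Unit): SpeechRecognizer {
    val recognizer = SpeechRecognizer.createSpeechRecognizer(context)

    recognizer.setRecognitionListener(object : RecognitionListener {
        override fun onPartialResults(partialResults: Bundle) {
            // Stream interim hypotheses to the UI as the user speaks.
            partialResults.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()?.let(onText)
        }

        override fun onResults(results: Bundle) {
            // Final transcript for this utterance.
            results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION)
                ?.firstOrNull()?.let(onText)
        }

        override fun onError(error: Int) { /* e.g. no match or missing permission */ }

        // Remaining callbacks are unused in this sketch.
        override fun onReadyForSpeech(params: Bundle?) {}
        override fun onBeginningOfSpeech() {}
        override fun onRmsChanged(rmsdB: Float) {}
        override fun onBufferReceived(buffer: ByteArray?) {}
        override fun onEndOfSpeech() {}
        override fun onEvent(eventType: Int, params: Bundle?) {}
    })

    val intent = Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).apply {
        putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM)
        putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true)
        putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true) // favor on-device models when available
    }
    recognizer.startListening(intent)
    return recognizer
}
```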

One downside: The Recorder app does not do a great job of breaking up or labeling speakers in a conversation, so transcribed words can blend into one another from time to time. Software updates should probably also address the fact that Recorder adds no timestamps to the transcript text it exports.

Each of these new features uses the same natural language understanding technology that has been available for years in Gboard for writing in a Google Doc or sending a message.


Author: Khari Johnson
Source: VentureBeat
