
Google’s Gemini AI launch marred by questions over capabilities

Google unveiled its much-anticipated artificial intelligence system Gemini on Wednesday, touting benchmarks suggesting it could compete with OpenAI’s industry-leading GPT-4 model in reasoning abilities. But the launch has quickly been overshadowed by accusations that the tech giant overstated Gemini’s capabilities.

In a tightly choreographed video demonstration, Google showed Gemini interacting with visual data through a camera mounted above a desk, fielding questions and reasoning through problems as a human assistant manipulated objects. The slick presentation implied Gemini could serve as an intelligent digital assistant capable of sophisticated conversation and assistance with daily tasks. 

Yet tech experts digging into the underlying technology say Gemini may fail to live up to Google’s lofty aspirations. The company is rolling out Gemini in three versions: Gemini Ultra, Gemini Pro and Gemini Nano. But early reviews of the mid-range Pro version, which became publicly available on Wednesday, indicate it still struggles with tasks that should be routine for a state-of-the-art AI system.

“I’m extremely disappointed with Gemini Pro on Bard,” said Victor de Lucca, an early tester of the Bard update, in an X.com post showing that the AI system could not correctly list the 2023 Oscar winners. “It still gives very, very bad results to questions that shouldn’t be hard anymore with RAG. A simple question like this with a simple answer like this, and it still got it WRONG.”

Others pointed out discrepancies between the capabilities Google claimed in its benchmark testing and what appears possible with the publicly available Pro version. 

“Google Gemini Ultra [is] only 4% better…using different prompts versus GPT-4-0613, the five-month-old version?” asked developer Nick Dobos in a widely shared X.com post, pointing out that Ultra is not yet publicly available, only Gemini Pro, and questioning the way the benchmark comparison was measured.

The Gemini demo video also came under fire after a Google spokesperson confirmed to Bloomberg that the footage was pre-recorded and narrated after the fact, rather than representing a live conversational demo.

The controversy illustrates the challenges Google faces in marketing AI systems to consumers. While techies eagerly dissect benchmark numbers and academic papers, the general public responds more to inspirational videos promising a revolutionary future.

This disconnect has tripped up big tech companies before, perhaps most infamously in 2016, when Microsoft’s Tay chatbot was yanked offline after learning hate speech from Twitter users. It is also the second time the tech community has accused Google Bard of falling short of the company’s promises: in September, VentureBeat reported that Bard was still failing to deliver even after major updates.

Google is, of course, aiming to recover quickly, promising to make Gemini more widely available to developers and researchers who can fully put it through its paces. But the rocky start shows the tech giant still has work to do if it wants its AI assistant to measure up to the hype.



Author: Michael Nuñez
Source: VentureBeat
Reviewed By: Editorial Team
