Elon Musk just released an AI that’s smarter than ChatGPT — here’s why that matters

February 19, 2025

Elon Musk’s artificial intelligence startup xAI has unveiled Grok 3, its latest AI model that the company claims outperforms leading competitors across key technical benchmarks. The announcement marks a significant escalation in the race to develop more powerful AI systems.

The launch comes just days after Musk’s failed $97.4 billion bid to acquire OpenAI, the company he co-founded with Sam Altman in 2015. During a livestreamed demonstration on X, Musk characterized Grok 3 as “an order of magnitude more capable than Grok 2” and emphasized its ability to reason through complex problems.

Early testing appears to support some of xAI’s claims. The model topped the influential Chatbot Arena leaderboard, scoring higher than OpenAI’s GPT-4o, Google’s Gemini and DeepSeek’s V3 model in blind user testing. Published benchmarks show Grok 3 achieving superior scores in mathematics (AIME ’24), scientific reasoning (GPQA) and coding tasks.

Grok 3 leads the Chatbot Arena leaderboard with a score of approximately 1400, significantly outperforming other major AI models in blind user testing. (Source: xAI)

Inside Grok 3’s massive computing infrastructure: 200,000 GPUs and a new data center

“Grok 3 clearly has around state of the art thinking capabilities,” wrote former OpenAI researcher Andrej Karpathy in an X post after early-access testing. “Few models get this right reliably. The top OpenAI thinking models get it too, but all of DeepSeek-R1, Gemini 2.0 Flash Thinking, and Claude do not.”

The model’s development required massive computational resources. xAI doubled its GPU cluster to 200,000 Nvidia chips for training, housed in a new Memphis data center. This infrastructure investment highlights the increasing computational demands of advanced AI development, as companies race to build more capable systems.

I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.

Thinking
✅ First, Grok 3 clearly has an around state of the art thinking model (“Think” button) and did great out of the box on my Settler’s of Catan… pic.twitter.com/qIrUAN1IfD

— Andrej Karpathy (@karpathy) February 18, 2025

DeepSearch and advanced reasoning: how Grok 3 aims to outsmart ChatGPT and Google Gemini

A key innovation is Grok 3’s “DeepSearch” feature, which combines web searching with reasoning capabilities to analyze information from multiple sources. The system also includes specialized modes for complex problem-solving, including a “Think” function that shows its reasoning process and a “Big Brain” mode that allocates additional computing power to difficult tasks.

“The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other,” posted tech industry veteran Robert Scoble, citing a conversation with Siri co-founder Tom Gruber.

Grok 3 benchmarks.

The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other.

Who said that?

Apple Siri cofounder Tom Gruber. He told me at dinner a decade ago that that is the most important thing to pay attention to. pic.twitter.com/yWCiJsN9pU

— Robert Scoble (@Scobleizer) February 18, 2025

However, some limitations emerged during testing. Karpathy noted that the model sometimes fabricates citations and struggles with certain types of humor and ethical reasoning tasks. These challenges are common across current AI systems and highlight the ongoing difficulties in developing truly human-like artificial intelligence.

Scale AI CEO Alexandr Wang praised the release, tweeting: “Grok 3 is a new best model in the world from the @xai team!” He pointed to its first-place ranking on Chatbot Arena and its strong scores on pretraining and reasoning evaluations, and said he looked forward to further partnership with xAI.

Grok 3 is a new best model in the world from the @xai team!

Grok 3 ranks #1 on Chatbot Arena w/a big gap, and scores impressively on pretraining and reasoning evals.

congrats to @elonmusk @ibab @jimmybajimmyba @Yuhu_ai_

looking forward to more partnership on grok4 & beyond pic.twitter.com/BrPGz17P51

— Alexandr Wang (@alexandr_wang) February 18, 2025

AI industry competition heats up: what Grok 3’s launch means for OpenAI, DeepSeek and the future of artificial intelligence

The model will be available through X’s Premium+ subscription ($40/month) and a new standalone “SuperGrok” service ($30/month). Enterprise API access is planned for the coming weeks.

This launch intensifies competition in the AI industry, particularly as Chinese startup DeepSeek recently demonstrated comparable performance with reportedly lower computational requirements. The development also raises questions about the sustainability of the computational arms race in AI, as companies invest billions in increasingly powerful hardware infrastructure.

In key performance benchmarks, Grok 3 and its mini variant show superior scores across mathematics, science and coding tests compared to competing models from Google, OpenAI, Anthropic and DeepSeek. The full-size Grok 3 model (dark blue) achieved particularly strong results in scientific reasoning. (Source: xAI)

Musk emphasized that Grok 3 remains in beta, with improvements expected “almost every day.” The company plans to add voice interaction capabilities within weeks and will open-source its previous model, Grok 2, once the new version stabilizes.

Yet perhaps the most telling aspect of Grok 3’s debut isn’t its technical specifications or benchmark scores, but what it represents: the mounting tension between Musk and his former colleagues at OpenAI. Just days after his failed $97.4 billion bid to acquire OpenAI, Musk has unveiled a model that challenges its supremacy — suggesting that in the high-stakes race for AI dominance, even a rejected suitor can become a formidable rival.

Author: Michael Nuñez
Source: VentureBeat
Reviewed By: Editorial Team
