A new technique from Zhejiang University and Alibaba Group gives large language model (LLM) agents a dynamic memory, making them more efficient and effective at complex tasks. The technique, called Memp, provides agents with a “procedural memory” that is continuously updated as they gain experience, much like how humans learn from practice.
Memp creates a lifelong learning framework where…
MCP-Universe benchmark shows GPT-5 fails more than half of real-world orchestration tasks
August 26, 2025
The adoption of interoperability standards, such as the Model Context Protocol (MCP), can provide enterprises with insights into how agents and models function outside their walled confines. However, many benchmarks fail to capture real-life interactions with…
A new study from Arizona State University researchers suggests that the celebrated “Chain-of-Thought” (CoT) reasoning in large language models (LLMs) may be more of a “brittle mirage” than genuine intelligence. The research builds on a growing body of work…
When OpenAI launched GPT-5 about two weeks ago, CEO Sam Altman promised it would be the company’s “smartest, fastest, most useful model yet.” Instead, the launch triggered one of the most contentious user revolts in the brief history of consumer AI.
Now, a simple blind testing tool created by an anonymous developer is revealing the complex reality behind the backlash—and challenging…
Meta is partnering with Midjourney and will license its technology for ‘future models and products’
August 25, 2025
Even three years after its debut, with ever-increasing competition in the AI image and video generation space, Midjourney, the bootstrapped San Francisco startup, remains the “gold standard” for its 20 million users, including us here at VentureBeat, where we use it…
OpenCUA’s open source computer-use agents rival proprietary models from OpenAI and Anthropic
August 25, 2025
A new framework from researchers at The University of Hong Kong (HKU) and collaborating institutions provides an open source foundation for creating robust AI agents that can operate computers. The framework, called OpenCUA, includes the tools, data, and recipes for scaling…
Busted by the em dash — AI’s favorite punctuation mark, and how it’s blowing your cover
August 25, 2025
Let’s talk about the em dash. Not the little innocent hyphen, not its slightly more confident cousin, the en dash. No, I’m talking about the ‘EM dash,’ that long, dramatic line that AI looooooves to drop in your sentences like it’s getting paid per dash. Seriously, it’s the AI version of jazz hands.
You may not notice it, but most everyone else does. It’s the dead giveaway that…
The Chan Zuckerberg Initiative announced Thursday the launch of rBio, the first artificial intelligence model trained to reason about cellular biology using virtual simulations rather than requiring expensive laboratory experiments — a breakthrough that could dramatically…
The most widely cited statistic from a new MIT report has been deeply misunderstood. While headlines trumpet that “95% of generative AI pilots at companies are failing,” the report actually reveals something far more remarkable: the fastest and most successful enterprise…
Inside Walmart’s AI security stack: How a startup mentality is hardening enterprise-scale defense
August 22, 2025
VentureBeat recently sat down (virtually) with Jerry R. Geisler III, Executive Vice President and Chief Information Security Officer at Walmart Inc., to gain insights into the cybersecurity challenges the world’s largest retailer faces as AI becomes increasingly autonomous.
We talked about securing agentic AI systems, modernizing identity management and the critical lessons learned from building…