Browsing tag: LLMs

AI & Robotics News

‘Generative AI helps us bend time’: CrowdStrike, Nvidia embed real-time LLM defense, changing how enterprises secure AI

June 12, 2025

Generative AI adoption has surged by 187% over the past two years. But at the same time, enterprise security investments focused specifically on AI risks have grown by only 43%, creating a significant gap in preparedness as AI attack surfaces rapidly expand. More than 70% of enterprises experienced at least one AI-related breach in the past year alone, with generative models now the primary…

When your LLM calls the cops: Claude 4’s whistle-blow and the new agentic AI risk stack

June 2, 2025

The recent uproar surrounding Anthropic’s Claude 4 Opus model – specifically, its tested ability to proactively notify authorities and the media if it suspected nefarious user activity – is sending a cautionary ripple through the enterprise AI landscape. While…

The 3 biggest bombshells from this week’s AI extravaganza

May 27, 2025

Basketball has March Madness. Tech has the Consumer Electronics Show. AI has been waiting for its big moment—and this week may finally be it. With Microsoft’s Build and Google’s I/O developer conferences happening back-to-back, it was already primed to be a big week.

The battle to AI-enable the web: NLweb and what enterprises need to know

May 27, 2025

In the first generation of the web, back in the late 1990s, search was okay but not great, and it wasn’t easy to find things. That led to the rise of syndication protocols in the early 2000s, with Atom and RSS (Really Simple Syndication) providing a simplified way for website owners to make headlines and other content easily available and searchable. In the modern era of AI, a new group of…

Not everything needs an LLM: A framework for evaluating when AI makes sense

May 5, 2025

Question: What product should use machine learning (ML)? Project manager answer: Yes. Jokes aside, the advent of generative AI has upended our understanding of what use cases lend themselves best to ML. Historically, we have always leveraged ML for repeatable, predictive…

30 seconds vs. 3: The d1 reasoning framework that’s slashing AI response times

April 29, 2025

Researchers from UCLA and Meta AI have introduced d1, a novel framework using reinforcement learning (RL) to significantly enhance the reasoning capabilities of diffusion-based large language models (dLLMs). While most attention has focused on autoregressive models like GPT…

Is your AI product actually working? How to develop the right metric system

April 28, 2025

In my first stint as a machine learning (ML) product manager, a simple question inspired passionate debates across functions and leaders: How do we know if this product is actually working? The product I managed catered to both internal and external customers. The model enabled internal teams to identify the top issues faced by our customers so that they could prioritize the right…

Former DeepSeeker and collaborators release new method for training reliable AI agents: RAGEN

April 24, 2025

2025 was, by many expert accounts, supposed to be the year of AI agents — task-specific AI implementations powered by leading large language and multimodal models (LLMs) like the kinds offered by OpenAI, Anthropic, Google, and DeepSeek. But so far, most AI agents remain…

SWiRL: The business case for AI that thinks like your best problem-solvers

April 23, 2025

Researchers from Stanford University and Google DeepMind have unveiled Step-Wise Reinforcement Learning (SWiRL), a technique designed to enhance the ability of large language models (LLMs) to tackle complex tasks requiring multi-step reasoning and tool use. As the interest…

When AI reasoning goes wrong: Microsoft Research shows more tokens can mean more problems

April 16, 2025

In a nutshell: Microsoft Research finds that inference-time scaling methods for large language models don’t universally improve performance. Varying benefits, token inefficiency, and cost unpredictability challenge assumptions. Verification mechanisms enhance model accuracy. Brute-force scaling has limits; conventional models can match reasoning models on simpler tasks but struggle with…