AI & Robotics News

Windsurf: OpenAI’s potential $3B bet to drive the ‘vibe coding’ movement

‘Vibe coding’ is the term of the moment: it refers to the increasingly accepted use of AI and natural language prompts for basic code completion. OpenAI is reportedly looking to get in on the movement, and to own more of the full-stack coding experience, as it eyes a $3 billion acquisition of Windsurf (formerly Codeium). If the deal materializes, it would be OpenAI’s most expensive acquisition to…

When AI reasoning goes wrong: Microsoft Research shows more tokens can mean more problems

In a Nutshell: Microsoft Research finds that inference-time scaling methods for large language models don’t universally improve performance. Varying benefits, token inefficiency, and cost unpredictability challenge assumptions. Verification mechanisms enhance model accuracy. Brute-force scaling has limits; conventional models can match reasoning models on simpler tasks but struggle with…

Beyond ARC-AGI: GAIA and the search for a real intelligence benchmark

In a Nutshell: Intelligence measurement in AI is evolving beyond traditional benchmarks like MMLU, with new tests like ARC-AGI and Humanity’s Last Exam focusing on real-world reasoning. The GAIA benchmark assesses practical AI capabilities across web browsing, code execution, and complex reasoning, setting a new standard for evaluating AI performance. Intelligence is pervasive, yet its…

ChatGPT’s memory can now reference all past conversations, not just what you tell it to

OpenAI is slowly rolling out improved memory on ChatGPT, making it the default for ChatGPT to reference past conversations. This has raised fears that the platform is proactively “listening” to users, leaving some uncomfortable with how much it knows. ChatGPT already logs information from previous interactions through its Memory feature, ensuring preferences are saved and conversations…