Enterprises need to know whether the models that power their applications and agents work in real-life scenarios. This kind of evaluation can be complex because specific scenarios are hard to predict. A revamped version of the RewardBench benchmark aims to give organizations a better idea of a model’s real-life performance.
The Allen Institute for AI (Ai2) launched RewardBench 2, an…
OpenAI’s Sora is now available for FREE to all users through Microsoft Bing Video Creator on mobile
June 3, 2025
OpenAI’s Sora was one of the most hyped releases of the AI era, launching in December 2024, nearly 10 months after it was first previewed to awe-struck reactions due to its (at the time, at least) unprecedented level of realism, camera dynamism, and prompt adherence…
Google quietly launches AI Edge Gallery, letting Android phones run AI without the cloud
June 3, 2025
Google has quietly released an experimental Android application that enables users to run sophisticated artificial intelligence models directly on their smartphones without requiring an internet connection, marking a significant step in the company’s push toward edge…
Enterprise alert: PostgreSQL just became the database you can’t ignore for AI applications
June 3, 2025
The open-source PostgreSQL (sometimes also referred to as Postgres) is apparently a very hot commodity for big enterprise data platform vendors.
Snowflake is acquiring privately held PostgreSQL provider Crunchy Data in a deal reportedly valued at $250 million. The acquisition comes barely two weeks after Snowflake’s rival Databricks acquired serverless PostgreSQL vendor Neon. The pair…
When Salesforce CEO Marc Benioff recently announced that the company would not hire any more engineers in 2025, citing a “30% productivity increase on engineering” due to AI, it sent ripples through the tech industry. Headlines quickly framed this as the beginning of the…
The recent uproar surrounding Anthropic’s Claude 4 Opus model – specifically, its tested ability to proactively notify authorities and the media if it suspected nefarious user activity – is sending a cautionary ripple through the enterprise AI landscape. While…
In the past couple of years, as AI systems have become more capable of not just generating text but also taking actions, making decisions and integrating with enterprise systems, they have come with additional complexities. Each AI model has its own proprietary way of interfacing with other software. Anthropic’s Model Context Protocol (MCP) is one of the first attempts to fill this gap. It proposes a…
DeepSeek R1-0528 arrives in powerful open source challenge to OpenAI o3 and Google Gemini 2.5 Pro
May 30, 2025
The whale has returned.
After rocking the global AI and business community early this year with the January 20 initial release of its hit open source reasoning AI model R1, the Chinese startup DeepSeek — a spinoff of formerly only locally well-known Hong Kong quantitative…
New York-based AI startup Hume has unveiled its latest Empathic Voice Interface (EVI) conversational AI model, EVI 3 (pronounced “Evee” Three, like the Pokémon character), targeting everything from powering customer support systems and health coaching to immersive…
Black Forest Labs (BFL), the startup founded by the creators of the popular Stable Diffusion model, has launched a new image generation model called FLUX.1 Kontext. This model not only generates and edits photos, but also allows users to modify them with both text and other images.
The company also announced its new BFL Playground, where people can try out BFL’s models before letting them loose…