DeepSeek AI, a Chinese research lab gaining recognition for its powerful open-source language models such as DeepSeek-R1, has introduced a significant advancement in reward modeling for large language models (LLMs).
Their new technique, Self-Principled Critique Tuning (SPCT), aims to create generalist and scalable reward models (RMs). This could lead to more capable AI applications for…
The RAG reality check: New open-source framework lets enterprises scientifically measure AI performance
April 9, 2025
Enterprises are spending time and money building out retrieval-augmented generation (RAG) systems. The goal is to have an accurate enterprise AI system, but are those systems actually working?
The inability to objectively measure whether RAG systems are actually working is a…
Wells Fargo’s AI assistant just crossed 245 million interactions – no human handoffs, no sensitive data exposed
April 9, 2025
Wells Fargo has quietly accomplished what most enterprises are still dreaming about: building a large-scale, production-ready generative AI system that actually works. In 2024 alone, the bank’s AI-powered assistant, Fargo, handled 245.4 million interactions – more than…
New open source AI company Deep Cogito releases first models and they’re already topping the charts
April 9, 2025
Deep Cogito, a new AI research startup based in San Francisco, officially emerged from stealth today with Cogito v1, a new line of open source large language models (LLMs) fine-tuned from Meta’s Llama 3.2 and equipped with hybrid reasoning capabilities: the ability to answer immediately or to “self-reflect” like OpenAI’s “o” series and DeepSeek R1.
The company aims to push…
Google’s Gemini 2.5 Pro, which the company calls its most intelligent model ever, quietly took the developer world by storm.
After seeing strong developer interest, Google announced it would increase rate limits for Gemini 2.5 and offer the model at a lower price than many…
The vibe coding phenomenon—where developers increasingly rely on AI to generate and assist with code—has rapidly evolved from a niche concept to a mainstream development approach.
With tools like GitHub Copilot normalizing AI-assisted coding, the next battleground has…
The Stanford Institute for Human-Centered Artificial Intelligence (HAI) has released its 2025 AI Index Report, providing a data-driven analysis of AI’s global development. HAI has been developing a report on AI over the last several years, with its first benchmark coming in 2022. Needless to say, a lot has changed.
The 2025 report is loaded with statistics. Among the top findings:
The…
The general-purpose AI agent landscape is suddenly much more crowded and ambitious. This week, Palo Alto-based startup Genspark released what it calls Super Agent, a fast-moving autonomous system designed to handle real-world tasks across a wide range of domains…
DeepSeek jolts AI industry: Why AI’s next leap may not come from more data, but more compute at inference
April 7, 2025
In a Nutshell
DeepSeek’s cost-effective AI model disrupted the industry, impacting Nvidia stock. Data scarcity drives a shift towards “test-time compute” for improved AI performance. This trend affects hardware, cloud platforms, foundation models, and enterprise…
From MIPS to exaflops in mere decades: Compute power is exploding, and it will transform AI
April 7, 2025
At the recent Nvidia GTC conference, the company unveiled what it described as the first single-rack system of servers capable of one exaflop: one quintillion (a billion billion) floating-point operations per second (FLOPS). This breakthrough is based on the GB200 NVL72 system, which incorporates Nvidia’s latest Blackwell graphics processing units (GPUs). A standard computer rack is…