They Solved AI Hallucinations
Fine-tuned Qwen3 SLMs (0.6-8B) beat frontier LLMs on narrow tasks
We spent a while putting together a systematic comparison of small distilled Qwen3 models (0.6B to 8B) against frontier APIs — GPT-5 nano/mini/5.2, Gemini 2.5 Flash Lite/Flash, Claude Haiku 4.5/Sonnet 4.6/Opus 4.6, Grok 4.1 Fast/Grok 4 — across 9 datasets spanning classification, function calling, Q
karpathy / autoresearch
[https://x.com/karpathy/status/2030371219518931079](https://x.com/karpathy/status/2030371219518931079) *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "
Anthropic sues US government, with good reason
As I wrote yesterday, Dario Amodei is no saint, but I fully support his company’s new lawsuit against the US government.
Anthropic sues Defense Department over supply-chain risk designation
Anthropic has sued the US government over its designation as a supply-chain risk, the latest move in a weekslong battle between it and the Pentagon over the acceptable use cases for its military AI tech. The suit, filed in a California district court, accuses the Trump administration of illegally pu
Employees across OpenAI and Google support Anthropic’s lawsuit against the Pentagon
On Monday, Anthropic filed its lawsuit against the Department of Defense over being designated as a supply chain risk. Hours later, nearly 40 employees from OpenAI and Google - including Jeff Dean, Google's chief scientist and Gemini lead - filed an amicus brief in support of Anthropic's lawsuit, de
Anthropic launches code review tool to check flood of AI-generated code
Anthropic launched Code Review in Claude Code, a multi-agent system that automatically analyzes AI-generated code, flags logic errors, and helps enterprise developers manage the growing volume of code produced with AI.
OpenAI plans to acquire Promptfoo and bake AI security testing directly into its Frontier enterprise platform
OpenAI is acquiring AI security platform Promptfoo to build automated vulnerability testing, covering jailbreaks, prompt injections, and data leaks, directly into its Frontier enterprise platform.
Why Most Valuable AI Systems Are Still Tabular Models
Microsoft brings Anthropic's Claude Cowork into Copilot to run tasks across Outlook, Teams, and Excel
With Copilot Cowork, Microsoft taps Anthropic's Claude instead of OpenAI to let AI handle tasks across Outlook, Teams, and Excel autonomously.
800,000 human brain cells, in a dish, learned to play a video game
Figure robot autonomously cleaning living room
Link to tweet: https://x.com/adcock_brett/status/2031039203262501252?s=20
Link to website: https://www.figure.ai/news/helix-02-living-room-tidy
Anthropic Sues Pentagon, OpenAI IPO Investor Skeptics, New Groq Chip Reveal at Nvidia GTC — TITV [Video]
The Information
Anthropic Sues Pentagon Over ‘Supply Chain Risk’ Label
Is legal the same as legitimate: AI reimplementation and the erosion of copyleft
I am not saying it's Gemma 4, but maybe it's Gemma 4?
three different tweets combined (today, previous week, year ago)
Microsoft Announces New Office 365 Bundle With AI Copilot Included
The Information
AheadForm Robotics getting less uncanny - now only mildly unsettling...
Genuinely curious what doors the M5 Ultra will open
It seems memory bandwidth is catching up, making bigger models more and more usable.