GPT-5.5 slightly outperformed Mythos on a multi-step cyber-attack simulation. One challenge that took a human expert 12 hours took GPT-5.5 only 11 minutes, at a cost of $1.73
Link to tweets: https://x.com/deredleritt3r/status/2049890601236390098?s=20 https://x.com/AISecurityInst/status/2049868227740565890?s=20 Link to associated blogs: [https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities](https://www.aisi.gov.uk/blog/our-evaluation-of-op
How people ask Claude for personal guidance
Anthropic
Elon Musk confirms xAI used OpenAI’s models to train Grok
In a federal courtroom in California on Thursday, Elon Musk testified that his own AI startup, xAI, has used OpenAI's models to improve its own. The matter at question is model distillation, a common industry practice by which one larger AI model acts as a "teacher" of sorts to pass on knowledge to
Grok 4.3
Show HN: Perfect Bluetooth MIDI for Windows
[AINews] Agents for Everything Else: Codex for Knowledge Work, Claude for Creative Work
a quiet day lets us reflect on coding agents "breaking containment"
Shai-Hulud Themed Malware Found in the PyTorch Lightning AI Training Library
16x Spark Cluster (Build Update)
Build is done. 16 DGX Sparks on the fabric, all hitting line rate. Setup was time-consuming but honestly smoother than I expected. Each Spark runs Nvidia’s flavor of Ubuntu out of the box with almost everything preinstalled and ready to go. For setup I had to rack them, power on, create the same u
Qwen 3.6 27B vs Gemma 4 31B - making a Pac-Man game!
Gemma just crushed Qwen in a local LLM gamedev contest! Device: MacBook Pro M5 Max, 64GB RAM Qwen 3.6 27B: 32 tokens/sec · 18m 04s · 33,946 tokens. Gemma 4 31B: 27 tokens/sec · 3m 51s · 6,209 tokens. So what is more important: tokens per second, or the quality of the final answer? Qwen made a
AMD Halo Box (Ryzen 395 128GB) photos
Don't know if the date was released yet, but this was just said a few moments ago at AMD AI Dev Day. No word on price, but I think it's made by Lenovo based on the plug earlier in the presentation. Edit: They had a unit on a table and I just confirmed with an engineer it is just a 395 128GB with no
This is exactly what I feel whenever I need to explain the task over and over again
What in tarnation is going on with the cost of compute
Does anyone know? I can’t even find a server GPU below B200 on Vast, and for the first time I’ve ever seen on Mithril, H100/H200/B200 were all over $1k an hour at multiple points last week, for sustained periods! I don’t know why you wouldn’t just migrate to RunPod at that point, even the
Anthropic's Head of Product: "The timelines for a lot of our product features have gone down from six month to one month and sometimes to even one day"
In a recent episode of Lenny's Podcast with Anthropic's Head of Product ([summary here](https://www.podtyper.com/transcriptions/how-anthropic-s-product-team-moves-faster-than-anyone-else-c-4614?tab=insights)), she states that **"The timelines for a lot of our product features have gone down fr
U.S. Aims to Penalize Disabled Adults Who Live with Their Families
The More Young People Use AI, the More They Hate It
OpenAI announces new advanced security for ChatGPT accounts, including a partnership with Yubico
OpenAI is launching additional opt-in protections for ChatGPT accounts. The new security initiative includes a new partnership with security key provider Yubico.
Live updates from Elon Musk and Sam Altman’s court battle over the future of OpenAI
Sam Altman and Elon Musk are facing off in a high-stakes trial that could alter the future of OpenAI and its most well-known product, ChatGPT. In 2024, Musk filed a lawsuit accusing OpenAI of abandoning its founding mission of developing AI to benefit humanity and shifting focus to boosting profits
Gemini is rolling out to cars with Google built-in
Google is preparing to update vehicles that have Google built-in with its Gemini AI assistant. This will be an upgrade from the current Google Assistant according to Google's announcement, and promi
OpenAI talks about not talking about goblins
OpenAI is opening up about its goblin problem. After a report from Wired revealed instructions to OpenAI's coding model to "never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures," the AI startup published an explanation on its website, calling references