Security researchers successfully prompted the AI system behind a Utah prescription renewal pilot to reclassify meth as an “unrestricted therapeutic”, and more
Security researchers used relatively simple jailbreaking techniques to trick the AI system powering Utah's new prescription refill bot.
A survey of US teens: 57% use AI chatbots to search for info, 54% use them to do schoolwork, 47% for fun or entertainment, 12% for emotional support, and more
Just over half of U.S. teens say they have used chatbots for help with schoolwork, and 12% say they've gotten emotional support.
Anthropic details an experiment on whether AI coding tools shape developer skills: the biggest performance decline for developers occurred in debugging tasks
Research shows AI helps people do parts of their job faster. In an observational study of Claude.ai data, we found AI can speed up some tasks by 80%.
An analysis of 5,290 AI research papers at NeurIPS: 141, or ~3%, had US-China AI lab collaboration, vs. 134/4,497 in 2024; Llama featured in 106 Chinese papers
WIRED analyzed more than 5,000 papers from NeurIPS using OpenAI's Codex to understand the areas where the US and China actually work together on AI research.
A look at Confer, an open-source AI assistant project from Signal creator Moxie Marlinspike that is designed to provide end-to-end encryption for AI chats
Moxie Marlinspike—the pseudonym of an engineer who set a new standard for private messaging with the creation of the Signal Messenger …
A researcher details an LLM-based AI agent that “demonstrated a near-flawless ability” to bypass bot detection methods while answering online survey questions
We can no longer trust that survey responses are coming from real people. Online survey research …
A study of 311 AI-generated eighth-grade civics lesson plans in Massachusetts suggests they fall short of inspiring students or promoting critical thinking
When teachers rely on commonly used artificial intelligence chatbots to devise lesson plans, it does not result in more engaging …
The AI boom is driving memory and storage shortages that may last a decade; OpenAI's Stargate has deals for 900K DRAM wafers per month, or ~40% of global output
Once-cheap SSDs, DRAM, and HDD prices are climbing fast as AI demand and constrained supply converge to create the tightest market in years.
How inaccurate AI translations of Wikipedia pages, which AI models use for training, may cause a doom spiral that further marginalizes vulnerable languages
When Kenneth Wehr started managing the Greenlandic-language version of Wikipedia four years ago, his first act was to delete almost everything.
A look at AI-powered stuffed animals like Grem, Grok, and Gabbo, which are being promoted as an alternative to screen time for children as young as 3
Curio is a company that describes itself as “a magical workshop where toys come to life.” When I recently visited its cheery headquarters …
Google rolls out Gemini 2.5 Deep Think, its most advanced reasoning model, which considers multiple ideas simultaneously, to its $250/month Ultra subscription
Google DeepMind is rolling out Gemini 2.5 Deep Think, which, the company says, is its most advanced AI reasoning model …
Tests reveal that Grok 4 seems to search for Elon Musk's views online when asked about sensitive topics, and its answers tend to align with Musk's opinions
During xAI's launch of Grok 4 on Wednesday night, Elon Musk said — while live-streaming the event on his social media platform …
VA used a DOGE AI tool by Gumroad founder Sahil Lavingia that hallucinated contract sizes to cancel 24+ deals; Lavingia says “mistakes were made”
We obtained records showing how a Department of Government Efficiency staffer with no medical experience used artificial intelligence to identify which VA contracts to kill.
A study finds that asking LLMs to be concise in their answers, particularly on ambiguous topics, can negatively affect factuality and worsen hallucinations
Turns out, telling an AI chatbot to be concise could make it hallucinate more than it otherwise would have.
OpenAI says its new o3 and o4-mini AI models hallucinate more often than its previous reasoning and traditional models, and the company doesn't know why
OpenAI's internal tests show o3 hallucinated on 33% of person-related questions, double the rate of previous models. Even worse, o4-mini hit 48%.
A comparison of OpenAI's o3, o4-mini, and GPT-4.1; Aaron Levie says o3 nailed a multi-step financial modeling task; Scale AI CEO says o3 is “a big breakthrough”
Our take on what's powerful, what's practical, and what's still TBD … If you've been following AI news this week …
OpenAI says its new o3 and o4-mini AI models hallucinate more often than its previous reasoning and traditional models, and the company doesn't know why
OpenAI's recently launched o3 and o4-mini AI models are state-of-the-art in many respects. However, the new models still hallucinate …
Interviews with Dario Amodei, Daniela Amodei, and other executives about Anthropic's origin, Claude, why DeepSeek isn't a threat, reaching AGI safely, more
The brother goes on vision quests. The sister is a former English major. Together, they defected from OpenAI, started Anthropic …