Andrej Karpathy unveils nanochat, a full-stack training and inference implementation of an LLM in a single, dependency-minimal codebase, deployable in 4 hours
It provides a full ChatGPT-style LLM, including training, inference and a web UI … X: Clem / @clementdelangue : Am I wrong in sensing a paradigm shift in AI? Feels like we're moving from a world obses...
OpenAI releases gpt-oss-120b and gpt-oss-20b, its first open-weight models since GPT-2; the smaller gpt-oss-20b can run locally on a device with 16GB+ of RAM
gpt-oss-120b and gpt-oss-20b push the frontier of open-weight reasoning models Simon Willison / Simon Willison's Weblog : OpenAI's new open weight (Apache 2) models are really good OpenAI on GitHub : ...
Alibaba debuts the Qwen3-Coder model for agentic coding, including a 480B-parameter MoE variant, and open sources Qwen Code, a CLI tool adapted from Gemini CLI
Coco Feng / South China Morning Post : Alibaba upgrades flagship Qwen3 model to outperform OpenAI, DeepSeek in maths, c...
OpenAI releases o1, the first of its rumored reasoning-focused Strawberry models, in preview, alongside a smaller o1-mini, for ChatGPT Plus and Team subscribers
Advancing cost-efficient reasoning. — Contributions Sabrina Ortiz / ZDNET : OpenAI trained its new o1 AI models to think before they speak - how to access them Ethan Mollick / One Useful Thing : Som...
In a spat on X, Meta's Chief AI Scientist Yann LeCun calls out Elon Musk for saying xAI will pursue “truth” as Musk spreads “crazy-ass conspiracy theories” on X
But It Does Have Privacy Issues Emergent Behavior : 2024-05-29: Memorial Day Shenanigans Siddharth Jindal / AIM : Yann LeCun Delays Elon Musk's AGI Plans James Farrell / SiliconANGLE : Elon Musk and M...
Hugging Face, which is “profitable, or close to profitable”, commits $10M in free shared GPUs to help small developers, academics, and others create AI apps
Hugging Face, one of the biggest names in machine learning, is committing $10 million in free shared GPUs to help developers create new AI technologies. X: @osanseviero , @clementdelangue , and @brigi...
Microsoft releases PyRIT, a tool that the company's AI Red Team has been using to more efficiently check for risks in its generative AI systems, such as Copilot
https://www.microsoft.com/... I know, let's pretend that LLM security can be bolted on later after we have created a foundation model based on data scraped from the Internet that is FULL of poison, g...
DeepMind CEO Demis Hassabis pushes back on claims by Meta's Yann LeCun that he, Sam Altman, and Dario Amodei are fearmongering to achieve AI regulatory capture
Ng states that the idea that artificial intelligence could lead to the extinction of humanity is a lie being spread by big tech in the hope of triggering heavy regulation that would shut down competit...
Stanford unveils the Foundation Model Transparency Index, featuring 100 indicators; Llama 2 led at 54%, GPT-4 placed third at 48%, and PaLM 2 took fifth at 40%
https://www.nytimes.com/... Mark Coggins / @coggins@mastodon.social : This is the kind of needed AI regulation—requiring model makers to reveal how they trained their language models—that #Ma...
Kuo: iPhone 13 Pro lineup will gain a 1TB storage option; Apple to announce AirPods 3 at September event, but will continue to sell current generation AirPods
I don't want it to be faster. I don't want a better camera. Gordon Kelly / Forbes : New Apple Leak Reveals iPhone 13 Release Shock Patrick O'Rourke / MobileSyrup : Apple's iPhone 13 could offer 1TB s...