Andrej Karpathy unveils nanochat, a full-stack training and inference implementation of an LLM in a single, dependency-minimal codebase, trainable end to end in about 4 hours
It provides a full ChatGPT-style LLM, including training, inference, and a web UI … X: Clem / @clementdelangue : Am I wrong in sensing a paradigm shift in AI? Feels like we're moving from a world obses...
Google DeepMind says Gemini Diffusion, an experimental text diffusion model demoed at Google I/O and available by waitlist, generates 1,000-2,000 tokens/second
Our state-of-the-art, experimental text diffusion model Jose Antonio Lanz / Decrypt : Google Doubles Down on AI: Veo 3, Imagen 4 and Gemini Diffusion Push Creative Boundaries Matthias Bastian / The De...
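Gemini Diffusion's internals aren't public, but the speed claim comes from how text diffusion models decode: instead of emitting one token per forward pass, they start from a fully masked sequence and refine all positions in parallel over a few denoising steps. A minimal sketch of that generic masked-diffusion decoding loop, with a toy stand-in for the model (the `toy_denoise_step` scoring is invented for illustration, not Gemini's):

```python
import random

MASK = "_"

def toy_denoise_step(tokens, vocab, rng):
    # Stand-in for the denoising model: propose a token and a
    # confidence score for every masked position (hypothetical scoring).
    return [(t, 1.0) if t != MASK else (rng.choice(vocab), rng.random())
            for t in tokens]

def diffusion_decode(length, vocab, steps=4, seed=0):
    # Start fully masked; each step commits the most confident
    # proposals, refining the whole sequence in parallel rather
    # than generating one token at a time autoregressively.
    rng = random.Random(seed)
    tokens = [MASK] * length
    for step in range(steps):
        proposals = toy_denoise_step(tokens, vocab, rng)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Commit a growing share of the remaining masks each step.
        keep = max(1, len(masked) // (steps - step))
        for i in sorted(masked, key=lambda i: -proposals[i][1])[:keep]:
            tokens[i] = proposals[i][0]
    # Final pass fills any positions still masked.
    return [p[0] for p in toy_denoise_step(tokens, vocab, rng)]
```

Because every denoising step touches all positions at once, the tokens-per-second figure scales with sequence length per step, which is where the 1,000-2,000 tokens/second headline numbers come from.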
Sam Altman says OpenAI plans to “release a powerful new open-weight language model with reasoning in the coming months”, its first open-weight model since GPT-2
just look at the “T” in ChatGPT, which comes from the Transformer architecture openly shared by Google. Then came Garry Tan / @garrytan : Open weights 🚀 Alexander Doria / @dorialexander : Ok, this one...
Q&A with Google Gemini co-leads Jeff Dean and Noam Shazeer on Google's path to AGI, the future of Moore's Law, TPUs, inference scaling, open research, and more
“as we scale up [training], there may be a push to have a bit more asynchrony in our systems than we do now” 👀 Haider / @slow_developer : Google Chief Scientist, Jeff Dean “AI now generates 25% of Goo...
Google AI claims PaLM, its 540B-parameter dense decoder-only Transformer model, shows breakthrough capabilities in tasks like language, reasoning, and coding
In recent years, large neural networks trained for language understanding and generation have achieved impressive results across a wide range of tasks.
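The "dense decoder-only Transformer" label shared by PaLM and the GPT family refers to self-attention with a causal mask, so each position attends only to earlier positions. A minimal NumPy sketch of a single such attention head (single-head, no layer norm or projection, for illustration only):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    # x: (seq_len, d_model). Decoder-only models mask out future
    # positions so generation can proceed left to right.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: positions above the diagonal (future tokens)
    # get -inf-like scores and thus ~zero attention weight.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores[future] = -1e9
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

A direct consequence of the mask is that perturbing a later token leaves every earlier output row unchanged, which is what makes autoregressive training and sampling consistent.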