Gemini co-lead Oriol Vinyals says Gemini 3's gains come from better pre-training and post-training, pushing back on the idea that returns from pre-training are diminishing
which we discussed in our NeurIPS '25 talk with @ilyasut and @quocleix—the team delivered a drastic jump. The delta between 2.5 and 3.0 is ... Andrej Karpathy / @karpathy : I played with Gemini 3 ...
Andrej Karpathy unveils nanochat, a full-stack training and inference implementation of an LLM in a single, dependency-minimal codebase, trainable and deployable end to end in about 4 hours
It provides a full ChatGPT-style LLM, including training, inference, and a web UI … Clem / @clementdelangue : Am I wrong in sensing a paradigm shift in AI? Feels like we're moving from a world obses...
Google announces Gemma 3 270M, a compact model designed for task-specific fine-tuning with strong capabilities in instruction following and text structuring
ai.google.dev/gemma/docs/c... Tim Duffy / @timfduffy.com : Google just released a 270M parameter Gemma model. As a tiny model lover I'm excited. Models in this size class are usually barely coherent...
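To make "task-specific fine-tuning" concrete, here is a minimal sketch of adapting a small Gemma-class model to an extraction task with Hugging Face transformers. The model id "google/gemma-3-270m", the toy instruction pairs, and all hyperparameters are illustrative assumptions, not details from the announcement.

```python
# Minimal fine-tuning sketch for a ~270M-parameter Gemma-class model.
# Assumptions: the Hugging Face id "google/gemma-3-270m" and the toy data below.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_ID = "google/gemma-3-270m"  # assumed model id for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Toy instruction-following pairs; a real run would use a task-specific corpus.
pairs = [
    ("Extract the city from: 'Meet me in Paris on Friday.'", "Paris"),
    ("Extract the city from: 'Flights to Tokyo are delayed.'", "Tokyo"),
]

class InstructionDataset(Dataset):
    """Tokenizes prompt/answer pairs into fixed-length causal-LM examples."""
    def __init__(self, pairs, tokenizer, max_len=128):
        self.examples = []
        for prompt, answer in pairs:
            text = f"Instruction: {prompt}\nAnswer: {answer}{tokenizer.eos_token}"
            enc = tokenizer(text, truncation=True, max_length=max_len,
                            padding="max_length", return_tensors="pt")
            input_ids = enc.input_ids.squeeze(0)
            labels = input_ids.clone()
            labels[enc.attention_mask.squeeze(0) == 0] = -100  # mask padding
            self.examples.append({"input_ids": input_ids,
                                  "attention_mask": enc.attention_mask.squeeze(0),
                                  "labels": labels})
    def __len__(self):
        return len(self.examples)
    def __getitem__(self, idx):
        return self.examples[idx]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gemma-270m-ft",
                           per_device_train_batch_size=2,
                           num_train_epochs=3, learning_rate=5e-5,
                           logging_steps=1),
    train_dataset=InstructionDataset(pairs, tokenizer),
)
trainer.train()
```

The appeal of this size class is exactly this workflow: the whole model fits easily on a single consumer GPU, so full-parameter fine-tuning on a narrow task is cheap.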
A study by Meta researchers suggests that training LLMs to predict multiple tokens at once, instead of just the next token, results in better and faster models
Related: KAN: Kolmogorov-Arnold Networks — "promising alternatives to Multi-Layer Perceptrons". Ethan / @ethan_smith_20 : it was only briefly touched upon, but is ...
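As a sketch of what multi-token prediction looks like: a shared trunk produces hidden states, one output head per future offset predicts the token that many steps ahead, and the per-offset cross-entropies are summed. The toy GRU trunk, vocabulary size, and dimensions below are assumptions for illustration; the paper attaches its heads to a full transformer trunk.

```python
# Multi-token prediction sketch: shared trunk + one head per future offset,
# trained with summed cross-entropy. GRU trunk and all sizes are toy choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTokenLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128, n_future=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Unidirectional GRU stands in for a causal transformer trunk.
        self.trunk = nn.GRU(d_model, d_model, batch_first=True)
        # Head i predicts the token (i + 1) positions ahead.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(n_future))

    def forward(self, tokens):                    # tokens: (batch, seq)
        h, _ = self.trunk(self.embed(tokens))     # (batch, seq, d_model)
        return [head(h) for head in self.heads]   # n_future logit tensors

def multi_token_loss(model, tokens):
    # Score head i against the token (i + 1) steps ahead; positions without
    # a valid target at that offset are dropped.
    loss = 0.0
    for i, logits in enumerate(model(tokens)):
        offset = i + 1
        pred = logits[:, :-offset]                # (batch, seq-offset, vocab)
        target = tokens[:, offset:]               # (batch, seq-offset)
        loss = loss + F.cross_entropy(pred.reshape(-1, pred.size(-1)),
                                      target.reshape(-1))
    return loss

tokens = torch.randint(0, 1000, (2, 32))          # toy batch of token ids
print(multi_token_loss(MultiTokenLM(), tokens).item())
```

At inference the extra heads can be discarded, keeping ordinary next-token generation, or used to draft several tokens at once for self-speculative decoding, which is where the reported speedups come from.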