/
Navigation
Chronicles
Browse all articles
Explore
Semantic exploration
Research
Entity momentum
Nexus
Correlations & relationships
Story Arc
Topic evolution
Drift Map
Semantic trajectory animation
Posts
Analysis & commentary
Pulse API
Tech news intelligence API
Browse
Entities
Companies, people, products, technologies
Domains
Browse by publication source
Handles
Browse by social media handle
Detection
Concept Search
Semantic similarity search
High Impact Stories
Top coverage by position
Sentiment Analysis
Positive/negative coverage
Anomaly Detection
Unusual coverage patterns
Analysis
Rivalry Report
Compare two entities head-to-head
Semantic Pivots
Narrative discontinuities
Crisis Response
Event recovery patterns
Connected
Search: /
Command: ⌘K
Embeddings: large
TEXXR

Chronicles

The story behind the story

days · browse · Enter similar · o open

Google releases Multi-Token Prediction drafters for its Gemma 4 models, which use a form of speculative decoding to guess future tokens for faster inference

Google launched its Gemma 4 open models this spring, promising a new level of power and performance for local AI.

Ars Technica Ryan Whitwam

Discussion

  • @googlegemma @googlegemma on x
    Gemma 4 just got even faster! We're releasing Multi-Token Prediction (MTP) drafters that deliver up to a 3x speedup, without any degradation in output quality or reasoning logic. [image]
  • @googledevs @googledevs on x
    Gemma 4: Now up to 3x Faster. ⚡ Same quality, way more speed. Our new MTP drafters allow Gemma 4 to predict multiple tokens at once, effectively tripling your output speed without compromising intelligence. [image]
  • @zhijianliu_ Zhijian Liu on x
    DFlash for Gemma 4: Up to 6x Faster. ⚡⚡ Great to see MTP land natively in Gemma 4 today. If you want to push it further, try DFlash — open source, same quality, more speed!! https://github.com/... [video]
  • @nummanali Numman Ali on x
    MTP is increasing speeds of on device inference by 2x - 3x That's going from 20tps to 60tps ! Gemma 4 models now have official support
  • @bnjmn_marie Benjamin Marie on x
    Gemma 4 was very slow compared to Qwen3.6. Now, it's probably much faster! I'll publish my own numbers tomorrow