gm8xx8 · TEXXR

KIMI K2.5 VISUAL AGENTIC INTELLIGENCE AT 1T SCALE
Kimi K2.5 is Moonshot's open-source successor to K2: ~15T mixed vision+text continual pretraining plus a real scale-out agent stack (Agent Swarm) across Instant / Thinking / Agent modes. KIMI K2.5: MoE backbone (1T total / 32B ...

2026-01-27

Kimi

Moonshot says Kimi K2.5 builds on K2 with “pretraining over ~15T mixed visual and text tokens” and “can self-direct an agent swarm with up to 100 sub-agents”

Today, we are introducing Kimi K2.5, the most powerful open-source model to date.

Bloomberg

Chinese startup Moonshot releases Kimi K2.5, saying the model can process text, images, and videos simultaneously and beats its open-source peers in some tests

Alibaba Group Holding Ltd.-backed Moonshot AI released an upgrade of its flagship model, heating up a domestic arms race ahead …

Thinking Machines is moving with clarity and speed. I'm more bullish on them by the day. @soumithchintala is a rare kind of hire. The kind that quietly raises the ceiling of what a team can execute. Another strong move, IMO.

2026-01-15

Wired

A source close to Thinking Machines Lab alleges ex-CTO Barret Zoph, who is returning to OpenAI, had shared confidential company information with competitors

The departures are a blow for Thinking Machines Lab. Two narratives are already emerging about why they happened.

@kyliebytes

Thinking Machines Lab parts ways with CTO Barret Zoph, with Soumith Chintala taking over the role; sources say the termination is due to “unethical conduct”

BREAKING: Thinking Machines has terminated its CTO, Barret Zoph, due to unethical conduct according to two sources familiar with the matter. CEO Mira Murati announced the news at a...

RNJ-1 (8B BASE / INSTRUCT)
Open-weight 8B dense Transformer trained from scratch by Essential AI, with fully specified architecture and training dynamics. Global self-attention with YaRN extending context to 32k tokens. Pre-training-first, compression-driven approach targeting ...

2025-12-07

Essential AI

Essential AI, whose CEO co-wrote Google's Attention Is All You Need paper, unveils Rnj-1, an 8B-parameter open model with SWE-bench performance close to GPT-4o

The long-term advancement and equitable diffusion of AI technologies crucially depend on their development in the Open.

GLM-4.6 is here. Zhipu extends the 4.5 line with:
- Context: 128K → 200K tokens for more complex agent workflows
- Coding: stronger results on benchmarks + real-world apps (Claude Code, Cline, Roo, Kilo), incl. polished front-end generation
- Reasoning: better inference-time ...

2025-10-01

Z.ai

Z.ai releases GLM-4.6, an open-weights model with a context window of up to 200K tokens, claiming near parity with Claude Sonnet 4 on coding and reasoning tasks

Today, we are releasing the latest version of our flagship model: GLM-4.6. Compared with GLM-4.5, this generation brings several key improvements:

Xiaomi MiMo-7B: a 7B reasoning model series trained from scratch, outperforms 32B+ baselines on math and code via dense RL
- pretrained on 25T tokens w/ multi-token prediction
- RL rewards from rule-verifiable math/code tasks
- cold-start RL model (MiMo-7B-RL-Zero) hits 93.6% ...

2025-04-30

Bloomberg

Xiaomi unveils open-source AI reasoning model MiMo, joining other Chinese tech leaders hoping to make a splash in the burgeoning AI field endorsed by Beijing

Xiaomi debuted MiMo a day after Alibaba unveiled the latest version of its own flagship model, amplifying a race between China's tech players …

Gemma 3 is a family of open models (1B, 4B, 12B, 27B) designed for efficient, on-device use. They support 140 languages, text and visual reasoning, 128k-token context, function calling, and structured outputs. Quantized versions reduce compute and memory with minimal performance ...

2025-03-12

9to5Google

Google unveils Gemma 3, the “world's best single-accelerator model”, running on a single GPU, in 1B, 4B, 12B, and 27B sizes, and says it outperforms Llama-405B

Following version 1 in February 2024 and 2 in May, Google today announced Gemma 3 as its latest open model for developers.

Qwen2.5-VL is a highly capable vision-language model designed for a wide range of tasks. It excels in visual understanding, device interaction, and long-form video analysis.
Key Features:
- Visual Understanding: Handles everything from simple objects to complex data ...

2025-01-28

TechCrunch

Alibaba's Qwen team releases Qwen2.5-VL, a new series of AI models that can control PCs and phones, as well as perform a number of text and image analysis tasks

Anusuya Lahiri / Benzinga: Not Just DeepSeek - Alibaba Unveils AI Model To Rival OpenAI's Operator
Markus Kasanmascheff / WinBuzze...

Gemini 2.0 Flash Thinking Experimental
> Multimodal Understanding: Handles tasks involving multiple data types.
> Reasoning Capabilities: Excels at solving complex problems.
> Coding: Tackles difficult code and math challenges.
> Visible Thinking Process: Shows the model's ...

2024-12-20

TechCrunch

Google releases Gemini 2.0 Flash Thinking, an experimental “reasoning” model that “explicitly shows its thoughts” and can use them to strengthen its reasoning

Quick: what sort of prompts should you run against GPT-4o vs Gemini 1.5 Flash vs o1 vs o1-pro vs gemini-2.0-flash-thinking-exp?
X: Jeff Dean / @jeffdean: Introducing Gemini 2.0 Fl...
