Moonshot says Kimi K2.5 builds on K2 with “pretraining over ~15T mixed visual and text tokens” and “can self-direct an agent swarm with up to 100 sub-agents”
Today, we are introducing Kimi K2.5, the most powerful open-source model to date.
Kimi
Related Coverage
- model_name = 'deepseek-ai/DeepSeek-OCR-2' — tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True) Hugging Face (expanded into a runnable sketch after this list)
- Kimi Released Kimi K2.5, Open-Source Visual SOTA-Agentic Model Hacker News
- China's Moonshot AI Releases New Open-Source Model Kimi K2.5 The Information · Juro Osawa
- Moonshot's Kimi K2.5 introduces agent swarm, highlights open source model momentum Constellation Research · Larry Dignan
- China's AI Labs Sprint Ahead of DeepSeek's Shadow Implicator.ai · Harkaram Grewal
- Kimi K2.5: Visual Agentic Intelligence (via) Kimi K2 landed in July as a 1 trillion parameter open weight LLM. … Simon Willison's Weblog · Simon Willison
- China's Moonshot releases a new open-source model Kimi K2.5 and a coding agent TechCrunch · Ivan Mehta
- How China Caught Up on AI—and May Now Win the Future Time · Charlie Campbell
- Moonshot AI launches Kimi K2.5 with swarm of 100 parallel agents TestingCatalog · Erin
- Moonshot AI Unveils Kimi K2.5 Open-Source Model, Claiming to Beat Claude Opus 4.5 on Agentic Benchmarks WinBuzzer · Markus Kasanmascheff
- Trend: models absorbing the orchestration layer is the direction labs are heading. — I've been following Moonshot AI for a while now, mostly to pick up directional trends. … Sathish Gangichetty
- One year after the DeepSeek freak, the AI industry has adjusted and roared back Sherwood News · Jon Keegan
- DeepSought — A year ago, the world changed. On January 20, 2025, DeepSeek released … Spyglass · M.G. Siegler
- Kimi adds to China's open-source wins The Deep View · Nat Rubio-Licht
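
The Hugging Face entry above quotes a truncated loading snippet from the DeepSeek-OCR-2 model card. Below is a minimal runnable sketch of that same load, assuming only the repo id shown in the excerpt and the standard transformers AutoTokenizer API; the model card's full example may differ.

```python
# Minimal sketch, assuming the repo id and trust_remote_code requirement shown
# in the excerpt above; requires network access to the Hugging Face Hub.
from transformers import AutoTokenizer

model_name = 'deepseek-ai/DeepSeek-OCR-2'
# trust_remote_code=True lets transformers execute the custom tokenizer code
# shipped in the repo, as the excerpt requires.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Quick smoke test: encode and decode a short string.
ids = tokenizer("hello world")["input_ids"]
print(tokenizer.decode(ids))
```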
Discussion
-
@dedene
Peter Dedene
on x
The gap between closed-source and open-weight keeps closing. Fast.
-
@garyfung
@garyfung
on x
Kimi 2.5. In a class of its own among Chinese open weights, not even bothering to benchmark against other Chinese models anymore. Gunning straight for frontier models from the West. Kimi has been my favourite in SOTA creative writing. What would a coding + writing agent enable 🤔
-
@scaling01
@scaling01
on x
You sleep for 5 fucking hours and both DeepSeek and Kimi are dropping models without you 🥺 [image]
-
@chetaslua
@chetaslua
on x
I had access to this beast for the last 7 days. Many of you guessed it right: agent swarms, 100 sub-agents working in parallel, 1,500 tool calls. SOTA on HLE (50.2%) and BrowseComp (74.9%) [image]
-
@haoningtimothy
Wu Haoning
on x
https://www.kimi.com/... https://huggingface.co/... should be the most powerful ‘image-text-to-text’ on @huggingface now
-
@modelscope2022
@modelscope2022
on x
🚀 Meet Kimi K2.5! 🌙 This is Kimi's most intelligent and versatile model to date, achieving SOTA performance across coding, vision, and agentic workflows. Model: https://modelscope.cn/... Paper: https://www.kimi.com/... Highlights: ✅ Native Multimodal Architecture: Seamlessly [ima…
-
@gm8xx8
@gm8xx8
on x
KIMI K2.5 VISUAL AGENTIC INTELLIGENCE AT 1T SCALE Kimi K2.5 is Moonshot's open-source successor to K2: ~15T mixed vision+text continual pretraining plus a real scale-out agent stack (Agent Swarm) across Instant / Thinking / Agent modes. KIMI K2.5: MoE backbone (1T total / 32B [im…
-
@testingcatalog
@testingcatalog
on x
BREAKING 🚨: Kimi K2.5 open-source model is now live on Kimi Chat and APIs with a leading 50% score on the HLE benchmark! It comes with an Agentic Swarm feature, where up to 100 sub-agents work on a problem in parallel (available in beta for some customers) [video]
-
@kimi_moonshot
@kimi_moonshot
on x
🥝 Meet Kimi K2.5, Open-Source Visual Agentic Intelligence. 🔹 Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%) 🔹 Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%) 🔹 Code with Taste: turn chats, [image]
-
@youjiacheng
You Jiacheng
on x
Interesting price change. cc @zephyr_z9 [image]
-
@casper_hansen_
Casper Hansen
on x
1T parameter model and multimodal!! Honestly insane how much Kimi is pushing forward the frontier. Weights are also on Hugging Face, released with INT4 quantization
-
@zephyr_z9
@zephyr_z9
on x
Now this is really good [image]
-
@sungkim
Sung Kim
on bluesky
Moonshot AI, why are you making my life more complicated? — Now, I will need to revisit MQ, Kafka, and this stateless, client-side orchestration loop. — www.kimi.com/blog/kimi-k2... [image]
-
r/singularity
r
on reddit
Kimi K2.5 Released!!!
-
@scaling01
@scaling01
on x
I think this is the order in which I like to use the models (purely usability/usefulness): Kimi 2.5 >> GLM 4.7 > MiniMax M2.1 > DeepSeek V3.2 > Qwen3 235B. Qwen just feels very slop and last-gen by now. Both GLM and MiniMax absolutely destroy it. DeepSeek V3.2 is a strong model
-
@scaling01
@scaling01
on x
Kimi is still the most usable open-weights model. Moonshot is honestly the Anthropic of China: a focus on taste and agentic behaviour.
-
@theo
@theo
on x
K2's been my default model in T3 Chat for a while. Great writer. Only issue was the lack of image recognition. Did not expect this. Genuinely hyped to play with it.
-
@kimi_moonshot
@kimi_moonshot
on x
Here's a short video from our founder, Zhilin Yang. (It's his first time speaking on camera like this, and he really wanted to share Kimi K2.5 with you!) [video]
-
@haoningtimothy
Wu Haoning
on x
It has really taken us a long time to prove this: everyone is building Big Macs, but we bring you a kiwi 🥝 instead. With K2.5 you have multimodal everywhere: chat with visual tools, code with vision, generate aesthetic frontends with visual refs... and most basically, it is a SUPER
-
@eliebakouch
Elie
on x
Kimi K2.5 is NOT just a small iteration on top of K2, it now has fully multimodal understanding, INCLUDING video! [image]
-
@bindureddy
Bindu Reddy
on x
Pretty cool - Kimi 2.5 just dropped - ahead of the new DeepSeek model. Will be on LiveBench tomorrow alongside Qwen Max Thinking. Open source tsunami! 🌊 [image]
-
@kimi_moonshot
@kimi_moonshot
on x
Introducing Kimi Code, an open-source coding agent under the Apache 2.0 License. 🔹 Python-based, easy to extend. 🔹 Fully transparent — clear, safe, reliable. 🔹 Seamlessly integrates with VS Code, Cursor, JetBrains, Zed, and more. 🔹 Fully-featured & out-of-the-box ready. [image]
-
@chujiezheng
Chujie Zheng
on x
This is our best, best model so far (I love it so so much). We have integrated adaptive reasoning, search, and CI into it, and put massive effort into improving real-world user experience. Also, with the closure of Qwen3, it won't be long before the launch of Qwen3.5. Stay tuned!
-
@eliebakouch
Elie
on x
Very nice release by the Kimi team, benchmarks are on par with Opus 4.5, GPT 5.2 xhigh, and Gemini 3.0 Pro. There are also some nice details on the parallel RL part in the tech blog explaining how they built the K2.5 agent swarm [image]
-
@kimi_moonshot
@kimi_moonshot
on x
Kimi K2.5 has arrived! 🥝 Here are 2 things to know: Aesthetic Coding x Agent Swarm. [video]
-
r/LocalLLaMA
r
on reddit
Introducing Kimi K2.5, Open-Source Visual Agentic Intelligence
-
r/LocalLLaMA
r
on reddit
Kimi K2.5 Released !
-
@kimiproduct
Kimi Product
on x
One-shot “Video to code” result from Kimi K2.5. It not only clones a website, but also reproduces all the visual interactions and UX designs. No need to describe it in detail; all you need to do is take a screen recording and ask Kimi: “Clone this website with all the UX designs.” [video]
-
@fireworksai_hq
@fireworksai_hq
on x
Kimi K2.5 is now live on Fireworks with full parameter RL tuning support! This is @Kimi_Moonshot Moonshot AI's flagship agentic model, a new SOTA open VLM that unifies vision, text, thinking, and multi-agent execution. Kimi K2.5 demonstrates that open source models are now [image…
-
@mweinbach
Max Weinbach
on x
KIMI K2.5 WEIGHTS ARE LIVE. 1T total-parameter MoE, fully multimodal. It's competitive with Claude 4.5 Opus thinking https://huggingface.co/... [image]
-
@simonw
Simon Willison
on x
Notes and a pelican for the new Kimi K2.5 - a multi-modal (image input) model from Moonshot AI which also claims a “self-directed agent swarm paradigm” https://simonwillison.net/...
-
@shiri_shh
Shirish
on x
Kimi K2.5 just made video → code real. - screen record a site - ask it to clone - and Kimi ships the full code, UX + interactions included 🤯 [video]
-
@vectro
@vectro
on x
Kimi K2.5 has only been on Hugging Face for 7 minutes and already has almost 10K downloads 😯 [image]
-
@dzhulgakov
Dmytro Dzhulgakov
on x
🌕 Kimi K2.5 = open SOTA reasoning + vision + 256K context + agentic coding 🏎 200+ t/s on @FireworksAI_HQ (soon even faster) ✅ Nails @simonw's “pelican on a bike” test in both directions Try it now on Fireworks and hats off to @Kimi_Moonshot [image]
-
@enricoros
Enrico
on x
Kimi-K2.5 believes it's an AI assistant named Claude. 🤔 Identity crisis, or training set? 😀 [image]
-
@gnotuy
@gnotuy
on x
Introducing Kimi K2.5: open-source visual agentic intelligence 🚀State-of-the-art benchmarks: Humanity's Last Exam full set (50.2%), BrowseComp (74.9%) Vision is humanity's native language. When Kimi understands what you see, creation becomes instinctive. No coding, No [video]
-
@soso_fun_yt
@soso_fun_yt
on x
Wow, the new Kimi K2.5 is so good at generating web content! ✨ This is a very simple example, but it proves that Kimi is very productive. Gemini 3 Flash did a.. um, I should not show it to the public haha. Gemini 3 Pro did produce a correct result, but in comparison to Kimi K2.5 [i…
-
@teksedge
David Hendrickson
on x
Kimi K2.5 for your Clawdbot should be much more cost-effective than Claude 4.5 Opus and smarter than Gemini 3 Flash! [image]
-
@birdabo
Sui Dev
on x
KIMI K2.5 JUST MOGGED EVERYONE. > 1T PARAMETERS. > NATIVE MULTI MODAL. > BEATS CLAUDE 4.5 OPUS. > OPEN SOURCE. [video]
-
@eliamoharer
Elia
on x
Kimi K2.5 demonstrated something suuuper interesting. Using video inputs for generations like these is SICK. Imagine taking a video of your logic flow, describing it visually however you like, and letting your creativity appear in seconds. Great work to the team!! [video]
-
@teortaxestex
@teortaxestex
on x
> built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base ...It's essentially a totally new model with new abilities. 30T tokens @ Muon. «Kimi K2.5 represents a meaningful step toward AGI for the open-source community» wow o…
-
@teortaxestex
@teortaxestex
on x
Huh. Indeed. Kimi-Thinking has been quietly updated to 2.5 and it's multimodal. [image]
-
@kimmonismus
@kimmonismus
on x
Really impressive release by MoonshotAI: Kimi K2.5 is SOTA on HLE with 50%, and on the Agents Benchmark with 77%. It comes with an agent swarm mode and seems overall like a really, really impressive release. Going to check it out now [image]
-
@scaling01
@scaling01
on x
I've had this suspicion for a while. It makes me less bullish on Moonshot, but honestly I don't care all that much, as long as we get cheap and good open-weights models [image]
-
@natolambert
Nathan Lambert
on x
These behaviors of Chinese models thinking they're built by American companies have a very large policy impact. It reinforces the theory that Chinese models are only good because they distill from closed Western models. Distillation from API models definitely helps Chinese
-
@deepfates
@deepfates
on x
1T parameters with 32B active?? 😫
-
@dorialexander
Alexander Doria
on x
So DeepSeek is just gradually opening up older model artifacts (on a 1.5-year rolling basis?) and still hitting SOTA for its size range/inference speed. Good flex.
-
@garyfung
@garyfung
on x
> 3B-param DeepSeek-OCR 2 outperforms Gemini 3 Pro on benchmarks. wtf are the math geeks in China doing with LLM optimization? This is like LLM Ozempic, 10x
-
@unslothai
@unslothai
on x
DeepSeek releases DeepSeek-OCR 2. 🐋 The new 3B model achieves SOTA visual, document, and OCR understanding. DeepEncoder V2 is introduced, which enables the model to scan images in the same logical order as humans, boosting OCR accuracy. Instead of traditional vision LLMs, which read an [im…
-
@teortaxestex
@teortaxestex
on x
DeepSeek OCR 2. First reaction: it uses Qwen2-0.5B. Qwen 2 came out in May-Jun 2024. If they started this work ≤1 year ago, they'd have used 2.5 at least (July 2024). To me this confirms that the OCR series was done in the DeepSeek V2 era. It's uncanny. [image]
-
@teortaxestex
@teortaxestex
on x
> clearly they are leaking the model weights back to China Holy shit this means we can download Claude from HF
-
@basedtorba
Andrew Torba
on x
China just released Kimi K2.5 and like clockwork the performance is on par with American frontier AI models. Just tested it for the first time and it identifies as Claude with a simple “hi” prompt lol. American AI companies all hire foreigners and clearly they are leaking the [im…
-
@_simonsmith
Simon Smith
on x
I tried Kimi K2.5 in Agent Swarm mode today and can say that the benchmarks don't lie. This is a great model and I don't understand how they've made something as powerful and user-friendly as Agent Swarm ahead of the big US labs.
-
@artificialanlys
@artificialanlys
on x
Moonshot's Kimi K2.5 is the new leading open weights model, now closer than ever to the frontier - with only OpenAI, Anthropic and Google models ahead Key takeaways: ➤ Impressive performance on agentic tasks: @Kimi_Moonshot's Kimi K2.5 achieves an Elo of 1309 on our GDPval-AA [im…
-
@valsai
@valsai
on x
Kimi 2.5 Thinking is the new #1 open-weight model, taking the top spot on our index (both multimodal and text-only)🥇 The model even compares favorably to leading closed-source providers: it places in the top 10 among all models on both indices 🚀 Congrats @Kimi_Moonshot [image]
-
@shaoguang_mao
Shaoguang Mao
on x
Beyond setting new SOTA, what excites us most is that we're approaching AGI in our own way 🚀🚀🚀 I can't wait to share more about K2.5. We'll be releasing a technical report in the coming days, diving into the innovations behind K2.5—including deeper details on Pre-training
-
@artificialanlys
@artificialanlys
on x
Kimi K2.5 is slightly less token intensive than Kimi K2 Thinking. [image]
-
@artificialanlys
@artificialanlys
on x
Kimi K2.5 debuts with an Elo score of 1309 on the GDPval-AA Leaderboard, implying a win rate of 66% against GLM-4.7, the prior open weights leader. [image] (the Elo-to-win-rate arithmetic is sketched at the end of this section)
-
@artificialanlys
@artificialanlys
on x
Kimi K2.5 widens the gap between the US and China in open-weights model intelligence. The leading US open weights model remains OpenAI's gpt-oss-120b, which has now been eclipsed by an ever-growing list of open-weights releases from China. [image]
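
A note on the GDPval-AA figures quoted above: under the standard Elo expected-score formula, a 66% win rate corresponds to a rating gap of roughly 115 points, which would place the prior open-weights leader (GLM-4.7) near 1194 on the same scale. A minimal sketch, assuming the leaderboard uses the conventional 400-point Elo model; its actual methodology may differ.

```python
# Minimal sketch of the standard Elo expected-score formula, used only to
# sanity-check the quoted numbers (1309 Elo, ~66% implied win rate vs GLM-4.7).
# Assumption: the GDPval-AA leaderboard uses the conventional 400-point scale.
import math

def elo_win_prob(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def rating_gap_for_win_prob(p: float) -> float:
    """Rating difference that yields win probability p under the same model."""
    return 400.0 * math.log10(p / (1.0 - p))

gap = rating_gap_for_win_prob(0.66)              # ~115 Elo points
print(round(gap))                                # 115
print(round(elo_win_prob(1309, 1309 - gap), 2))  # 0.66, consistent with the quote
```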