Alibaba releases its open-weight Qwen3.5 Small Model Series in 0.8B, 2B, 4B, and 9B sizes, claiming the 9B model rivals OpenAI's gpt-oss-120b on some benchmarks
Earlier today, e-commerce giant Alibaba's Qwen Team of AI researchers, focused primarily on developing and releasing to the world …
VentureBeat Carl Franzen
Related Coverage
- Alibaba's New Qwen 3.5 Small AI Model Can Run Directly On an iPhone 17 TechEBlog · Jackson Chung
- 🗞️ Alibaba released their open-source Qwen3.5-9B model, and it holds its own against OpenAI's 120B param gpt-oss system. Rohan's Bytes · Rohan Paul
- Alibaba just released Qwen 3.5 Small models: a family of 0.8B to 9B parameters built for on-device applications MarkTechPost · Asif Razzaq
- Qwen/Qwen3.5-27B — Image-Text-to-Text model card (Transformers, Safetensors, conversational) Hugging Face
- Alibaba Unveils Compact Qwen 3.5 AI Models to Rival US Tech Giants The Hans India · Kahekashan
- Qwen3.5 9B, 4B, 2B & 0.8B: GPU Requirements, VRAM Usage & KV Cache Breakdown (262K Context) The Kaitchup · Benjamin Marie
- Supreme Court ducks AI copyright question The Rundown AI
- Latest open artifacts (#19): Qwen 3.5, GLM 5, MiniMax 2.5 — Chinese labs' latest push of the frontier Interconnects AI
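One of the links above breaks down GPU requirements and KV-cache usage at the models' 262K context window. A back-of-envelope sketch of how such numbers are derived — the layer count, KV-head count, and head dimension below are illustrative placeholders, not published Qwen3.5-9B hyperparameters:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV-cache size: 2 tensors (K and V) per layer, each of shape
    [context_len, n_kv_heads, head_dim], at the given element precision."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Placeholder hyperparameters for illustration only (not official Qwen3.5-9B values).
gib = kv_cache_bytes(n_layers=36, n_kv_heads=8, head_dim=128,
                     context_len=262_144, bytes_per_elem=2) / 2**30
print(f"{gib:.1f} GiB")  # → 36.0 GiB for a full-length FP16 cache
```

The cache grows linearly with context length, which is why long-context small models lean on grouped KV heads, cache quantization, or linear-attention layers to stay within consumer VRAM budgets.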
Discussion
- @alibaba_qwen on X: 🚀 Introducing the Qwen 3.5 Small Model Series: Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B ✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL: • 0.8B / 2B → tiny, fast, [image…
- Paul Couvert (@itspaulai) on X: How is this even possible?! Qwen has released 4 new models and the 4B version is almost as capable as the previous 80B A3B one 🤯 And the 9B is as good as GPT OSS 120B while being 13x smaller! They can run on any laptop; 0.8B and 2B for your phone; offline and open source.
- Karan (@karankendre) on X: Me realising I can run these models locally on my M1 MacBook Air for free [video]
- Adrien Grondin (@adrgrondin) on X: The new Qwen 3.5 by @Alibaba_Qwen running on-device on iPhone 17 Pro. Qwen 3.5 beats models 4 times its size, has strong visual understanding, and can toggle reasoning on or off. The 2B 6-bit model here is running with MLX optimized for Apple Silicon. [video]
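The "2B 6-bit" figure above determines the weight footprint that has to fit in a phone's memory. A rough calculation, treating "2B" as exactly 2 billion parameters and ignoring quantization metadata overhead:

```python
def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of a model's quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

print(f"{weight_footprint_gib(2e9, 6):.2f} GiB")  # 2B model at 6-bit
print(f"{weight_footprint_gib(9e9, 4):.2f} GiB")  # 9B model at 4-bit
```

Roughly 1.4 GiB for the 2B model at 6-bit, which is why it is plausible on current iPhones; the actual footprint also includes activations and the KV cache.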
- @ollama on X: The Qwen 3.5 small model series is now available: ollama run qwen3.5:9b, ollama run qwen3.5:4b, ollama run qwen3.5:2b, ollama run qwen3.5:0.8b. All models support native tool calling, thinking, and multimodal capabilities in Ollama.
- @jason on X: Holy shit, it's happening
- Junyang Lin (@justinlin610) on X: final shot. these are the small models that i told u!
- Numman Ali (@nummanali) on X: The new Qwen on-device models receiving praise from Elon. These are densely intelligent and replace the need for APIs for general tasks. Within a year these could be Opus level as further distillation and training techniques improve. They can even be embedded in web apps. [image]
- Rohan Paul (@rohanpaul_ai) on X: Big leap for on-device AI. Here Qwen 3.5 2B (6-bit) running on iPhone 17 Pro, MLX-optimized. Outperforms models 4X its size with strong visual intelligence. Powerful AI, now truly mobile. Opportunity for so many new products. [video]
- Elon Musk (@elonmusk) on X: @Alibaba_Qwen Impressive intelligence density
- Derya Unutmaz (@deryatr_) on X: Alibaba Qwen 3.5 AI models have become unbelievable for their size! Soon we will enter the age of intelligent devices, when eventually an Apple Watch will have the expertise of a PhD on any topic! Then imagine running your AI agents on Raspberry Pis with Qwen 3.5! Not far away!
- Alex Finn (@alexfinn) on X: Do you understand what this means? Are you aware how much the world just changed? You can now run frontier intelligence on a potato. Your $600 Mac Mini can now run unlimited super intelligence for free. No authoritarian AI companies can cut you off. Do this immediately, no
- Abhijit (@abhijitwt) on X: finally, we can run an LLM locally on a 1GB RAM old laptop for free [video]
- @cgtwts on X: Qwen 3.5 Small models: fully open source; beats models 4x its size; 9B model performs on par with GPT OSS 120B while being 13x smaller; outperforms Gemini 3 Flash and Claude Sonnet 4.5 on select benchmarks; runs on any laptop; even works on a phone; completely free. [vide…
- Chen Cheng (@cherry_cc12) on X: We just dropped the Qwen3.5 Small series — 0.8B / 2B / 4B / 9B 🚀 Small doesn't mean limited anymore. Would love to hear what you build with them 👀
- Haider (@slow_developer) on X: INCREDIBLE qwen just dropped 4 new qwen3.5 small models: 0.8b, 2b, 4b, and 9b. looking at the benchmarks, they're now matching gemini 3 flash and sonnet 4/4.5 in about half the tests, even on vision. this is a big deal, because you can now run them locally, and they're good [image…
- Udi Wertheimer (@udiwertheimer) on X: china just mogged @AlexFinn. they waited for him to spend his life savings on 317 mac studios with a combined 4.7 petabytes of vram, and then they released frontier models that are small enough to run on a single potato
- Lei Li (@_tobiaslee) on X: Qwen 3.5 small models (0.8B-9B) with base models released 🫡 A 4B multimodal model that runs on a single 4090 — this is what makes LLM research accessible to PhD students without GPU clusters. Open base models = real SFT/RL research, not just prompting chat models. Respect to
- Lior Alexander (@lioronai) on X: Alibaba shipped four Qwen 3.5 small models with a trick borrowed from their 397B model: Gated DeltaNet hybrid attention. Three layers of linear attention for every one layer of full attention. The linear layers handle routine computation with constant memory use. The full
- Aakash Gupta (@aakashgupta) on X: The Qwen 3.5 small model hype is getting ahead of itself. Yes, the 9B beats GPT-5 Nano by 13 points on MMMU-Pro (70.1 vs 57.2) and 30+ points on document understanding. Yes, it outperforms Qwen's own previous-gen 30B on most benchmarks at a third the size. The bar charts look
- Ahmad (@theahmadosman) on X: INCREDIBLE Qwen 3.5 smalls are here: 9B, 4B, 2B, 0.8B. All image-to-text & text-to-text, built to run locally. The models: Qwen 3.5 9B/9B-Base, 4B/4B-Base, 2B/2B-Base, 0.8B/0.8B-Base. Small. Dense. Open source. The future is local. Buy a GPU [image]
- Andrew Furman (@furman) on X: Ok the local models are here. Qwen 3.5 small models feel so slick in the Locally AI app on iPhone [image]