DeepSeek launches DeepSeek-OCR 2, an upgraded optical character recognition model that replaces the OpenAI-developed CLIP framework with Alibaba's Qwen2-0.5B
So DeepSeek is just gradually open-sourcing older model artifacts (on a 1.5-year rolling basis?) and still hitting SOTA for the size range and inference speed. Good flex.
> 3B params. DeepSeek-OCR 2 outperforms Gemini 3 Pro on benchmarks. WTF are math geeks in China doing with LLM optimization? This is like LLM Ozempic 10x.
DeepSeek releases DeepSeek-OCR 2. 🐋 The new 3B model achieves SOTA visual, document and OCR understanding. It introduces DeepEncoder V2, which enables the model to scan images in the same logical order as humans, boosting OCR accuracy. Instead of traditional vision LLMs which read an [im…
DeepSeek-OCR 2. First reaction: it uses Qwen2-0.5B. Qwen2 came out in June 2024. If they had started this work ≤1 year ago, they'd have used at least Qwen2.5 (September 2024). To me this confirms that the OCR series was done in the DeepSeek V2 era. It's uncanny. [image]