DeepSeek releases DeepSeek-OCR, a vision-language model designed for efficient vision-text compression, enabling longer contexts with less compute
The new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G), powered by vllm==0.8.5 for day-0 model su...
Google unveils benchmarking platform Kaggle Game Arena, where LLMs compete head-to-head in strategic games, starting with a chess tournament from August 5 to 7
Watch models compete in complex games, providing a verifiable and dynamic measure of their capabilities. Kaggle : Chess Text Input Leaderboard Nick Bild / Hackster : Shall We Play a Game? Maximilian Sc...
Baidu releases Ernie X1, an AI model that articulates its reasoning similarly to DeepSeek R1, and upgrades its flagship foundation model to Ernie 4.5, both free
And Costs Half As Much: ‘Excellent Multimodal Understanding Ability’ Nisha Gopalan / Yahoo Finance : China's Baidu Takes on DeepSeek With New AI Model Mike Wheatley / SiliconANGLE : Baidu debuts its f...
Yann LeCun says DeepSeek “profited from open research and open source” like Meta's Llama and is proof that open source models are surpassing proprietary ones
“Marc Andreessen, a co-inventor of the pioneering Mosaic web browser, co-founder of the Netscape browser company and current general partner at the famed Andreessen Horowitz (a16z) venture capital fir...
Industry insiders say DeepSeek's focus on research makes it a dangerous competitor as it's willing to share breakthroughs rather than protect them for profits
China is pulling the same trick. — www.ft.com/content/747a... Mastodon: Brian Kung / @briankung@hachyderm.io : “There's a pretty delicious, or maybe disconcerting irony to this, given OpenAI's found...
Alibaba releases 32.5B-parameter QwQ-32B-Preview under Apache 2.0 and claims the “reasoning” AI model beats OpenAI's o1-preview on the AIME and MATH tests
Introduction QwQ-32B-Preview is an experimental research model developed … Ananya Gairola / Benzinga : Alibaba's New AI Model Outperforms OpenAI's o1 In Specific Benchmarks, Now Available For Free Dow...
How Chinese hedge fund High-Flyer Capital Management developed its DeepSeek-V2 open-source LLM that costs less than rivals and ranks among the top in the world
1. Introduction Introducing DeepSeek LLM, an advanced language model comprising 67 billion parameters. DeepSeek on GitHub : DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language...