Meta launches Llama 4 Maverick, with 400B parameters, and Scout, with 109B parameters and a 10M-token context window, and previews Behemoth, with 2T total parameters
Takeaways — We're sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.
Meta
Related Coverage
- Meta Releases New Llama 4 AI Models With Multimodal Design Bloomberg · Vlad Savov
- Meta releases two Llama 4 AI models The Verge · Wes Davis
- What to know about Meta's Llama 4 model family TechTalks · Ben Dickson
- Introducing the Llama 4 herd in Azure AI Foundry and Azure Databricks Microsoft Azure · Asha Sharma
- Meta Unveils New Llama 4 AI Models With Massive Context Windows up to 10 Million Tokens WinBuzzer · Markus Kasanmascheff
- Meta releases first multimodal Llama-4 models, leaves EU out in the cold The Decoder · Matthias Bastian
- Meta rivals ChatGPT and Gemini with new Llama 4 models: What is it, how to use and more Livemint · Aman Gupta
- Meta's Llama 4 is now available on Workers AI The Cloudflare Blog
- Meta's latest open source AI models challenge GPT, Gemini, and Claude Digital Trends · Fionna Agomuoh
- Meta's Llama 4 spans extremes: From 15k-page analysis (Scout) to a 2T-parameter preview (Behemoth) Research & Development World · Brian Buntz
- Meta's Llama 4 models now available on Amazon Web Services About Amazon
- Meta Launches New Llama 4 Herd AI Models iClarified
- Meta launches new AI model Llama 4 AIN · Oleksandra Orekhova
- Meta's answer to DeepSeek is here: Llama 4 launches with long context Scout and Maverick models, and 2T parameter Behemoth on the way! VentureBeat · Carl Franzen
- Initial impressions of Llama 4 Simon Willison's Weblog · Simon Willison
- Quoting Ahmed Al-Dahle Simon Willison's Weblog · Simon Willison
- Meta debuts new Llama 4 models, but most powerful AI model is still to come CNBC · Annika Kim Constantino
- Meta Releases First Two Multimodal Llama 4 Models, Plans Two Trillion Parameter Model Analytics India Magazine · Siddharth Jindal
- Meta Platforms (META) Is Expected to Release Its Llama 4 AI Model This Month TipRanks Financial · Vince Condarcuri
- Meta's Llama 4 model is running behind schedule, but we might see it soon Android Central · Nickolas Diaz
- Cerebras Systems is proud to announce our partnership with Meta and the launch of Llama 4. From day one, we delivered the fastest Llama 3 in the industry, and are now doing the same for Llama 4. … Andrew Feldman
- 🦙🦙🦙🦙 Llama 4 is LIVE. The AI foundation model for the world. — I've been working on this for over a year. … Sean Bell
- The Llama 4 Herd Hacker News
- The Llama 4 herd Lobsters
- Meta Unveils Two New Llama 4 AI Models, Includes One That Can Run Efficiently on Single NVIDIA H100 TechEBlog · Jackson Chung
- ‘Never a dull day in AI’: Sundar Pichai reacts as Meta launches Llama 4 models to take on Gemini Livemint · Aman Gupta
- Meta Releases Llama 4 AI Models The Information · Kalley Huang
- Meta releases new AI model Llama 4 Reuters
- Meta unveils new Llama-4 AI model to compete with ChatGPT, Gemini Hindustan Times
- Meta's Llama 4 puts US back in lead to ‘win the AI race’ — David Sacks Cointelegraph · Ciaran Lyons
- The Sequence Radar #526: Llama 4 Scout and Maverick are Here! TheSequence · Jesus Rodriguez
- ChatGPT vs Meta AI: Which AI chatbot is better after the Llama 4 launch? Livemint · Aman Gupta
- Llama 4 brings 10M token context and MoE architecture with 3 new models TestingCatalog · Alexey Shabanov
- Meta debuts Llama 4 AI model family with multimodal capabilities Cryptopolitan · Nellius Irene
- How to use Meta's Llama 4: A quick guide for developers and enterprises The Economic Times
- Meta launches new AI models with advanced features Tech in Asia · Aiko Gao Ishida
- Meta rolls out Llama 4 as race for AI dominance heats up Moneycontrol · Vikas SN
- Welcome Llama 4 Maverick & Scout on Hugging Face Hugging Face · Rajat Arya
- Meta Launches Llama 4: Multimodal, Massive, and Made for Everyone Maginative · Chris McKay
- Meta AI Just Released Llama 4 Scout and Llama 4 Maverick: The First Set of Llama 4 Models MarkTechPost · Asif Razzaq
Discussion
-
@chirag
Chirag Mehta
on bluesky
You ask how fast AI models are evolving - you announce it on Saturday fast. [embedded post]
-
@jeremymorrell.dev
Jeremy Morrell
on bluesky
Meta introduced Llama 4 models and added this section near the very bottom of the announcement 😬 — “[LLMs] historically have leaned left when it comes to debated political and social topics.” — ai.meta.com/blog/llama-4... [images]
-
@seldo.com
Laurie Voss
on bluesky
Llama 4 models are very impressive, but why drop them at noon on a Saturday?? ai.meta.com/blog/llama-4...
-
@simonwillison.net
Simon Willison
on bluesky
Meta just dropped Llama 4 on a weekend! Two new open weight models (Scout and Maverick) and a preview of a model called Behemoth - Scout has a 10 million token context — Best information right now appears to be this blog post: ai.meta.com/blog/llama-4...
-
@timkellogg.me
Tim Kellogg
on bluesky
🚨Llama 4 Is Out! 🚨 — 2 out of 3 models just released — Scout: 109B / 17B active — Maverick: 400B / 17B active — Bohemoth: 2T / 288B active — ai.meta.com/blog/llama-4...
-
@ericflo
Eric Florenzano
on bluesky
Llama 4 looks interesting at least! 10M context length, MoE, early fusion multi-modal! ai.meta.com/blog/llama-4...
-
@taumuyi
Tau-Mu Yi
on bluesky
10M token context window! #AI [embedded post]
-
@chriscox
Chris Cox
on threads
Llama 4 is here! Two models are launching: Scout, a developer-friendly 17Bx16 experts model that's best-in-class for its size and can run on a single GPU, with a massive (10M+ tokens) context window. And Maverick, now the best-in-class multimodal model. …
-
@rpn
Roberto P. Nickson
on threads
The USA is back on top. Meta's Llama 4 Maverick just surpassed Deepseek as the number 1 open model - at only 17B parameters (small enough to run on a single host 🤯)
-
@ajassy
Andy Jassy
on threads
Llama 4, @meta's most powerful AI models to date, are now on AWS via Amazon SageMaker AI and will be coming soon to Amazon Bedrock as fully managed, serverless models. Have at it! https://www.aboutamazon.com/ ...
-
@satyanadella
Satya Nadella
on threads
Thrilled to bring Meta's Llama 4 Scout and Maverick to Foundry today, as we continue to make Azure the platform of choice for the world's most advanced AI models.
-
@latentchat
@latentchat
on threads
The LLama 4 herd has been revealed! 🦙 We got a bunch of new local models from meta: - Llama 4 Scout: A 17B parameter, multimodal model with a 10M context window that outperforms models like Gemma 3 and Mistral 3.1. - Llama 4 Maverick: A 17B parameter, matches DeepSeek v3 on reas…
-
@latesttechtalk
@latesttechtalk
on threads
Llama 4 Maverick just humbled every paid model.
-
@russsalakhutdinov
Russ Salakhutdinov
on threads
Llama4 models are out! Open sourced! Check them out: https://www.llama.com/ “Native multimodality, mixture-of-experts models, super long context windows, step changes in performance, and unparalleled efficiency. All in easy-to-deploy sizes custom fit for how you want to use it…
-
@nvidiadeveloper
@nvidiadeveloper
on threads
👀 Accelerate performance of @AIatMeta Llama 4 Maverick and Llama 4 Scout using our optimizations in #opensource TensorRT-LLM⚡ ✅ NVIDIA Blackwell B200 delivers over 42,000 tokens per second on Llama 4 Scout, over 32,000 tokens per seconds on Llama 4 Maverick. …
-
@zuck
Mark Zuckerberg
on threads
That's when it was ready
-
@alphasignal.ai
@alphasignal.ai
on threads
Huge news. Meta just released the Llama 4 series—three powerful open-source multimodal models. They outperformed Mistral 3.1, GPT-4.5, and Claude 3.7. SCOUT ▸ Run long-context tasks like summarization or code search on one H100 ▸ Beats Mistral 3.1 ▸ 10M+ token context, native …
-
@omarsar0
Elvis
on threads
Llama 4 is here! - Llama 4 Scout & Maverick are up for download - Llama 4 Behemoth (preview) - Advanced problem solving & multilingual - Support long context up to 10M tokens - Great for multimodal apps & agents - Image grounding - Top performance at the lowest cost - Can be ser…
-
@natolambert
Nathan Lambert
on threads
Llama 4 license still a big loss. What a headache.
-
@arunsasi99
Arun S
on threads
Noted that Gemini 2.5 Pro is not mentioned in LLAMA4 release notes. Since it is still in training, we should wait for release & independent tests. #AI “Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks.” https://ai.meta.com/.…
-
@aaldahle
Ahmad Al-Dahle
on threads
📌 The Llama series have been re-designed to use state of the art mixture-of-experts (MoE) architecture and natively trained with multimodality. We're dropping Llama 4 Scout & Llama 4 Maverick, and previewing Llama 4 Behemoth. 📌 Llama 4 Scout is highest performing small model wi…
-
@natolambert
Nathan Lambert
on threads
Happy for my friends that contributed to llama 4, awesome models on quick checks, but bruh wtf you doing launching this on a Saturday?
-
@1ar.io
@1ar.io
on threads
llama 4 just dropped scout variant will be amazing: natively multimodal, with 10M (!) tokens context length, 108B parameters once there will be distilled variants, the context length will shrink but still might be around 1M tokens, which is a huge context to have locally
-
@han_fang_
Han Fang
on threads
Introducing our first set of Llama 4 models with MoE architecture and natively trained with multimodality! 📌 Llama 4 Scout (17B X 16 Experts) is highest performing small model with 17B activated parameters with 16 experts. It's crazy fast, natively multimodal, and very smart. …
-
@vthallam
Venkatesh Thallam
on threads
the Llama 4 Maverick, mid size model already beats GPT 4O and its right behind the flagship Gemini 2.5 Pro model! 🤯
-
@doreturn.in
@doreturn.in
on threads
APRIL AI PREDICTIONS ARE INSANE 🤯 Okay buckle up, April is gonna be WILD for AI launches. Hearing whispers about: - Llama-4 OMNI?! 🔥 - R2 open weights (finally?) - o3 AND o4-mini dropping 🤯 - Grok-3 API access Google's new image gen model - Veo-2 video AI - Sonnet getting tool…
-
@documentingmeta
@documentingmeta
on threads
Meta's Llama 4 Maverick hits in the top 5 across all categories on lmarena.ai Tied for #1 rank specifically in Hard Prompts, Coding, Math, Creative Writing, Longer Query and Multi-Turn!
-
@zuck
Mark Zuckerberg
on threads
Llama 4 Scout: • 17B x 16 experts • Natively multi-modal • 10M token context length • Runs on a single GPU • Highest performing small model
-
@aaldahle
Ahmad Al-Dahle
on threads
Introducing our first set of Llama 4 models! We've been hard at work doing a complete re-design of the Llama series. I'm so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4 c…
-
@zuck
Mark Zuckerberg
on threads
For fun, here's Llama 4's performance (highest) vs cost (lowest)!
-
@ericflo
Eric Florenzano
on threads
Who is the Llama 4 Scout “can run on a single GPU” marketing line aimed at? People who don't understand this stuff won't understand why running on a single GPU is impressive, and people who do understand this stuff will know it's basically a lie or misleading at best. — RE: ht…
-
@yannlecun
Yann LeCun
on threads
BOOM! The Llama-4 brood is out. — Scout: 16 experts, 17B each, 109B total parameters, 10M+ context window, can run on a single GPU. — Maverick: multimodal, 128 experts, 17B each, 400B total parameters, 1M context window. — preview of Behemoth: 16 experts, 288B each, 2T total …
-
@ahmad_al_dahle
Ahmad Al-Dahle
on x
Introducing our first set of Llama 4 models! We've been hard at work doing a complete re-design of the Llama series. I'm so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4 [im…
-
@aiatmeta
@aiatmeta
on x
Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model [im…
-
@burkov
Andriy Burkov
on x
If today's disappointing release of Llama 4 tells us something, it's that even 30 trillion training tokens and 2 trillion parameters don't make your non-reasoning model better than smaller reasoning models. Model and data size scaling are over.
-
@abacaj
Anton
on x
llama 4 is really a bit disappointing, not a model I would use for assistance (code, etc). turns out gemini 2.5 is a really good model for code & sonnet for agentic tasks. not sure where llama 4 fits in with all of the available models today...
-
@jeremyphoward
Jeremy Howard
on x
Just read through the Llama 4 release announcement. I'm really grateful they've released this with open weights. But tbh I'm also pretty disappointed. The models are both giant MoEs that can't be run on consumer GPUs, even with quant. https://ai.meta.com/... A big loss.😢
-
@natolambert
Nathan Lambert
on x
It's very common for leadership at top labs to be in the know of other lab's release schedules, so the simplest explanation to Meta releasing today is that next week is going to be bonkers, or at least plausibly outshine llama 4 (and they wanted to release last week, but other
-
@miles_brundage
Miles Brundage
on x
Does this mean they should never open source it? No. But it means they should show compelling evidence of having done “high effort” capability elicitation and consult extensively with external experts + stakeholders before a decision.
-
@cloudflare
@cloudflare
on x
Llama 4 Scout 17B Instruct is now available on Workers AI: use this multimodal, Mixture of Experts AI model on Cloudflare's serverless AI platform to build next-gen AI applications. https://blog.cloudflare.com/ ...
-
@miles_brundage
Miles Brundage
on x
AI safety is also no joke and open sourcing at the frontier has wide-ranging implications, including for terrorism and US-China competition. Every day, AI companies find evidence of non-state and state actors using their technology for malicious purposes.
-
@emollick
Ethan Mollick
on x
Pretty impressive outcomes in the leaderboard as well for Llama 4.
-
@basetenco
@basetenco
on x
Llama 4 is here! 🦙🚀 Scout | 109B Parameters | 10M Context Maverick | 400B Parameters | 1M Context Llama 4 models are natively multimodal, use a MoE architecture, and set a new frontier for performance/cost. We're excited to offer dedicated deployments of Llama 4! [image]
-
@levie
Aaron Levie
on x
Wow. Meta just launched its new Llama 4 open-weight model family. Pretty incredible cost/performance. Up to 10M token context windows. We're reaching the point where you're not paying a penalty with open models, but you get all the benefits control. Pretty incredible. [image]
-
@miles_brundage
Miles Brundage
on x
As I've said, open source advanced AI can't always be made “safe for the world,” and more should be done to make the world safe for OS. But society also takes time to adapt, and we're not there yet. So I hope both these companies (and others) do + show their homework here.
-
@dylan522p
Dylan Patel
on x
Alibaba and Deepseek upcoming releases mog Meta to be clear
-
@levelsio
@levelsio
on x
This is insane and makes it finally possibly to vibe code up to giant code sizes The limit just weeks ago was context window, AI would get lost once your vibe coded game or app became too big Imagine an AI with memory loss, it starts breaking stuff With 10M tokens there's
-
@deanwball
Dean W. Ball
on x
In a video, zuck implies (but does not state explicitly) the 2 trillion parameter behemoth variant of llama will be open sourced. My guess is that it will be.
-
@nearcyan
Near
on x
saturday launch is the other interesting llama 4 thing. it's really hard to find a day to launch where you won't be instantly forgotten by competitor launches 4 hours later. the algorithms will only get more ruthless. maybe once people start sunday-launching we know ASI is near
-
@bindureddy
Bindu Reddy
on x
Quick summary on Llama-4 - all instruct/base models, which is excellent news!! - llama-4 maverick competes with Flash and V3.1 - llama-4 behemoth is the interesting one, but it is still in training - 10M (big) context rarely works in practice, so we will have to see how it
-
@redhat_ai
@redhat_ai
on x
Llama 4 Herd is here! It brings a lot of goodies, like MoE architecture and native multimodality, enabling developers to build personalized multimodal experiences. With Day 0 support in vLLM, you can deploy Llama 4 with @vllm_project now! Let's dig into it. (a thread) [image]
-
@dorialexander
Alexander Doria
on x
I'm starting to suspect we actually have Llama 5.
-
@emollick
Ethan Mollick
on x
These are good stats for a small model. Looking forward to testing. And a 10M (!!) context window suggests approaches to RAG are going to need to change dramatically.
-
@miles_brundage
Miles Brundage
on x
Congrats to Meta on Llama 4! Making frontier models is no joke (and harder every year). Initial evals + my vibe tests are solid. At the same time, I am a bit worried re: whether they will sufficiently evaluate Behemoth's security + geopolitical implications before OSing it.
-
@dorialexander
Alexander Doria
on x
Even more so if the 200+ language support is just the usual set of tiny sentences rather than dedicated effort toward low resource language data collection... https://x.com/...
-
@emollick
Ethan Mollick
on x
Looks like even Llama Behemoth doesn't come that close to Gemini 2.5, though, so no open model parity with the state of the art in closed models, yet. [image]
-
@nearcyan
Near
on x
on large context windows: i havent played w the llama 4 series, but needle-in-haystack is woefully insufficient to know the strength of a context window. if you want needle-in-a-haystack, we have grep for that.
-
@eliebakouch
Elie
on x
Llama4 pre-training recap: > MetaP: MuP inspired method to set per layers hyperparameters that transfer across batch size, width, depth and training token (huge) > MoE with 16E and 128E > QK Norm with no learnable parameter (and the 128E have no QK Norm it seems) > FP8 Training
-
@openrouterai
@openrouterai
on x
Llama 4 tip: add “:nitro” to the end of any slug to get the fastest provider @GroqInc is getting 732 tokens per second, and lowest price 👀 [image]
-
@clementdelangue
Clem
on x
Dell is the first big tech company to support llama 4 thanks to our partnership. Let's go!
-
@togethercompute
@togethercompute
on x
We're thrilled to announce the launch of Llama 4 models on Together AI. As a Meta launch partner, we now support the two new groundbreaking multimodal models on the Together API - Llama 4 Maverick and Llama 4 Scout. Link & details below ⤵️ [image]
-
@cloudflaredev
@cloudflaredev
on x
Want access to Llama 4 Scout? It's now on Workers AI! Here's how this new model can help you: - Mixture-of-Experts architecture for fast inference - Natively multimodal (image and text understanding) - Excels at multi-document analysis, codebase reasoning, and personalized tasks
-
@drjimfan
@drjimfan
on x
Llama-4 doesn't disappoint! My notes: - Ease of deployment is now a more important OSS feature than sheer size. There's emphasis that Llama 4 Scout can run on a single H100, as opposed to Llama-3-401B, which was powerful but ultimately had lesser adoption. Mixture of Expert is a …
-
@miles_brundage
Miles Brundage
on x
Re: “but DeepSeek”: Meta has way more GPUs than DeepSeek, and while I think it's pretty robustly good for Meta to release models slightly better than DeepSeek has, something massively better is a distinct question.
-
@groqinc
@groqinc
on x
Llama 4 Scout and Maverick from @Meta are now live on GroqCloud™. Day-zero access. Real-time performance. Lowest cost—without compromise. No waiting. No tuning. Just build fast. [video]
-
@ritakozlov_
Rita Kozlov
on x
we said only one more announcement before developer week, but meta said LFG so here we go: llama 4 on @cloudflaredev workers ai 🦙🧡🚀
-
@lmarena_ai
@lmarena_ai
on x
Meta's Llama 4 Maverick hits in the top 5 across all categories. Tied for #1 rank specifically in Hard Prompts, Coding, Math, Creative Writing, Longer Query and Multi-Turn! [image]
-
@lmarena_ai
@lmarena_ai
on x
BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400+ on Arena!🔥 Highlights: - #1 open model, surpassing DeepSeek - Tied #1 in Hard Prompts, Coding, Math, Creative Writing - Huge leap over Llama 3 405B: 1268 → 1417 - #5 under style control [i…
-
@openrouterai
@openrouterai
on x
Llama 4 Scout & Maverick are now available on OpenRouter. Meta's flagship model series achieves a new record 10 million token context length 🚀 @togethercompute and @GroqInc are the first providers. We'll be adding more over the course of the weekend. [image]
-
@nearcyan
Near
on x
if Llama: Behemoth doesn't set the stage for Claude: Requiem nothing will
-
@simonw
Simon Willison
on x
Meta just dropped Llama 4 on a weekend! Two new open weight models (Scout and Maverick) and a preview of a model called Behemoth - Scout has a 10 million token context Best information right now appears to be this blog post: https://ai.meta.com/...
-
@emollick
Ethan Mollick
on x
The requirements to have llama branding everywhere when you use their open weights model would be less of a big deal if the model had a better name that wasn't just a joke based on the abbreviation LLM
-
@maximelabonne
Maxime Labonne
on x
Llama 4's new license comes with several limitations: - Companies with more than 700 million monthly active users must request a special license from Meta, which Meta can grant or deny at its sole discretion. - You must prominently display “Built with Llama” on websites, [image]
-
@brbcatonfire
@brbcatonfire
on x
Meta causally drops llama 4 with a 10M context window model on the weekend. [image]
-
@charliebholtz
Charlie Holtz
on x
“Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro”
-
@iscienceluvr
Tanishq Mathew Abraham, Ph.D.
on x
Other training and arch details of Llama 4: - Multimodal is with early fusion using MetaCLIP as vision encoder - Training with “MetaP” for hyperparameter selection which is probably like MuP - 10x more multilingual tokens than Llama-3 - “Mid-training” to improve core capabilities
-
@michaelsayman
Michael Sayman
on x
Excited to share what I've been working on at Meta for the new Meta AI real soon. ❤️ Today's Llama 4 Day launch is so exciting. 10M context window :) [image]
-
@abhisk_kadian
Abhishek Kadian
on x
Llama4 is out, it has been amazing participating in the training journey of these models (long days & nights) 🚀🦙
-
@bhutanisanyam1
Sanyam Bhutani
on x
Llama 4 supports 10M Context length! 🙏 Reading AN ENTIRE GitHub repo of 900k tokens and writing a guide on it takes under 3 minutes! We are launching two new models Scout and Maverick: - Upto 10M context length - Scout fits on single H100 with int4 quant - Upto 5 images - [image]
-
@dextersjab
@dextersjab
on x
llama 4 is out, but it feels like the benchmark results tell you which models are actually on top by omission [image]
-
@risphereeditor
@risphereeditor
on x
Meta AI just released Llama 4, a text AI (LLM). There are two models available right now: Llama 4 Scout and Llama 4 Maverick. Llama 4 Scout has a 10 million token context window, so it can take a text that is around 7.5 million words, and Llama 4 Maverick beats GPT-4o and
-
@clementdelangue
Clem
on x
Llama 4 Maverick and Scout are the first major models on Hugging Face uploaded with Xet, making the upload significantly faster for Meta and saving you 500GB on downloads. Very necessary as the size of models is increasing. Can't wait to see how much speed up Behemoth will get [i…
-
@natalianeverova
Natalia Neverova
on x
First multi-modal Llama 4 are finally out! Super proud of the team and how far they've pushed this over the past months.
-
@iscienceluvr
Tanishq Mathew Abraham, Ph.D.
on x
Llama 4 release summary: Llama 4 Scout - 109B multimodal MoE, 10M context, can run on a single H100 GPU Llama 4 Maverick - 400B multimodal MoE, beats GPT-4o and Gemini 2.0 Flash, similar to DeepSeek V3 Both are distilled from Llama 4 Behemoth, a 2T MoE still in training [image]
-
@huggingface
@huggingface
on x
We are excited to partner with @AIatMeta to welcome Llama 4 Maverick (402B) & Scout (109B) natively multimodal Language Models on the Hugging Face Hub with Xet 🤗 Both MoE models trained on up-to 40 Trillion tokens, pre-trained on 200 languages and significantly outperforms its [i…
-
@krishnanrohit
Rohit
on x
hahhahah, great use of tokens [image]
-
@simonw
Simon Willison
on x
Dropping Llama 4 on a weekend just isn't fair! Business days only, please
-
@dorialexander
Alexander Doria
on x
Most interesting part for the llama 4 release for me: extensive multilingual support. Low-resource language is reversed chinchilla logic, not much data, so as much parameters as you can. [image]
-
@deanwball
Dean W. Ball
on x
Llama 4 directory references reasoning models but from what I can tell meta has not released details on this. A lightweight reasoner with a 10m token context window is conceivably an extremely useful thing. [image]
-
@xlr8harder
@xlr8harder
on x
If Meta actually follows through on releasing a 2T Llama 4 Maverick it will be incredible.
-
@jconorgrogan
Conor
on x
Llama 4 looks live on Meta's website right now (else its hallucinating something very specific) 10M context window! 🫢 [image]
-
@sharan0909
Sharan Narang
on x
Really amazing work on long context for Llama 4
-
@4xiom_
Joshua
on x
BREAKING 🚨‼️ LLAMA 4 Released. And it's SOTA. The best open source model yet. $META cooked. [video]
-
@sharan0909
Sharan Narang
on x
Very excited to share Llama 4 models with the world. The pre-training team has cooked over the past few months to launch Llama 4 Scout, Maverick, and Behemoth. A 🧵about pretraining Blog link: https://ai.meta.com/... [image]
-
@ariaurelium
@ariaurelium
on x
benchmarks for llama 4 are.... fine, i guess. nothing to write home about [image]
-
@maximelabonne
Maxime Labonne
on x
🦙 Llama 4 is here! → Llama 4 introduces three models: Scout (17B active parameters/16 experts), Maverick (17B active parameters/128 experts), and Behemoth (288B active parameters/16 experts), with only Scout and Maverick being released now. → These are Meta's first natively [imag…
-
@manohar_paluri
Manohar Paluri
on x
Llama 4 has arrived! Multimodal intelligence that is best in class for each size. Hope the world enjoys them as much as the entire team did in building them. Check out the blog for more details: https://ai.meta.com/... [image]
-
@alew3
Alessandro
on x
Meta has just released their Llama 4 models. The “Llama 4 Scout” variant features an impressive 10 million token context window. https://www.llama.com/... #llama4 #llama [image]
-
@maximelabonne
Maxime Labonne
on x
Llama 4 is here with 2T, 400B, and 109B MoEs. This bad boy just got outllamaed. https://ai.meta.com/... [image]
-
@giffmana
Lucas Beyer
on x
10M context length for llama4! This is the moment I've been waiting for to plug one of my fun projects: https://longcat.wtf/ Sound ON! (And congrats to the team who made the 10M contest happen. Gemini is not alone anymore) [image]
-
@tomosman
@tomosman
on x
Was thinking we hadn't heard from @finkd for a hot minute. Llama 4 is here and its a beast. https://x.com/...
-
@rsdgpt
Ryan
on x
Llama 4 is out. 3 sizes. 2T behemoth still in training. Benchmarks look insane and 10M context window. Could be a game changer for the OS community. [image]
-
@asaddhamani
Asad Dhamani
on x
Llama 4 🔥 10M context, 2T params, 128 experts, this thing is lit!
-
@the_ai_investor
@the_ai_investor
on x
Breaking: Mark Zuckerberg just announced LLaMA 4, saying, “For the first time, the best small, mid size, and potentially soon frontier models will be open source.” Llama4 Behemoth with 2T+ parameters [video]
-
@middleclassdesi
@middleclassdesi
on x
🚀 Llama 4: The Future of AI is Here! 🔥 Meta's Llama 4 lineup redefines innovation with four cutting-edge models: • Llama 4 Scout 🔍 •Lightning fast, multi-modal 🖼️🎙️ •10M token context length 📖 •Runs on a single GPU 💻 •17B params × 16 experts 🤖 • Llama 4 Maverick 🦙 [image]
-
@andymstone
Andy Stone
on x
It's Llama 4 day! Today we're dropping the first two open source Llama 4 models and we've got two more on the way. Our goal is to build the world's leading AI, open source it, and make it universally accessible so everyone in the world benefits. [image]
-
@btibor91
Tibor Blaho
on x
Meta released two new open-source multimodal models, Llama 4 Scout and Llama 4 Maverick, which feature a mixture-of-experts architecture, unprecedented context length, and outperform models like GPT-4o and Gemini 2.0 Flash on widely-used benchmarks - Llama 4 Scout - 17 billion [i…
-
@_impl
Sean Bell
on x
🦙🦙🦙🦙 Llama 4 is LIVE - Scout: 17B active x 16 experts, 10M context, fits on 1 GPU, multimodal. - Maverick: 17B active x 128 experts, fits on 1 host, multimodal. - Behemoth: 288B active x 16 experts, 2T total params, multimodal, still training. https://ai.meta.com/... [image]
-
@ashter_haider
Ashter Haider
on x
meta just dropped its Llama 4 models. Llama 4 Scout has 10M context window whoa [image]
-
@deryatr_
Derya Unutmaz
on x
Wow! LLaMA 4 just dropped by @AIatMeta ! It seems like an insanely good open-source model! Scout is 17B model as good as GPT-4o ?! The LLaMA 4 Behemoth is still training on 2 trillion parameters! That's insane! The pace of AI advancement is becoming exponentially fantastic!
-
@bradthilton
Brad Hilton
on x
Llama 4 is here, and it appears to be an absolute BEAST [image]
-
@burny_tech
Burny
on x
Damn, Llama 4 Behemoth is pretty epic, but I would have liked more comparisons with more models on more benchmarks, it's definitely cherrypicked, but still epic [image]
-
@matthewberman
@matthewberman
on x
Saturday release of Llama 4?! [image]
-
@rsalakhu
Russ Salakhutdinov
on x
Llama4 models are out! Open sourced! Check them out: “Native multimodality, mixture-of-experts models, super long context windows, step changes in performance, and unparalleled efficiency. All in easy-to-deploy sizes custom fit for how you want to use it” https://www.llama.com/
-
@loredanacrisan
Loredana Crisan
on x
Llama 4 just landed, try it out in @messenger @WhatsApp and @instagram DMs.
-
@bindureddy
Bindu Reddy
on x
Llama-4 has a 10M context window and it's open weights! It's like I died and went to heaven 🔥😍 [image]
-
@_arohan_
Rohan Anil
on x
Llama 4 sets a new bar and pops previous pareto frontier! Its open weights and we support open science. Come join us, its hard work but you will have the most fun doing it with most fun gang!!! Check out ⬇️
-
@jpineau1
Joelle Pineau
on x
We're shipping the first models in the Llama 4 herd! Llama 4 Scout and Llama 4 Maverick, are our first MoE models, natively multimodal and are our most advanced yet — open-sourced as always. https://ai.meta.com/...
-
@eaccelerate_42
@eaccelerate_42
on x
Llama 4 is here with 10M context window !!!!! Multimodal - Llama 4 Scout: 17B x 16 experts (109B) 10M context 🤯🐘 - Llama 4 Maverick: 17B x 128 experts (400B) - Llama 4 Behemoth Preview: Used to distill Scout and Maverick, and still in training!
-
@madiator
Mahesh Sathiamoorthy
on x
Oh wow, did they just drop Llama 4 silently on a Saturday morning? * Multimodal and multilingual * 10M context length for Llama 4 scout! * MoE [image]
-
@frameworkputer
@frameworkputer
on x
Llama 4 Scout looks like a perfect model for Framework Desktop! 109B, so it will fit in the 128GB configuration at Q6 (or possibly higher), and it's an MoE model, so only 17B parameters are active at a time. Looking forward to trying this @metaai! [image]
-
@astonzhangaz
Aston Zhang
on x
Our Llama 4's industry leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I'd been working on helped a bit with the long-term infinite context goal toward AGI. Huge thanks to my incredible teammates! 🚀Llama 4 Scout 🔹17B [image…
-
@naklecha
@naklecha
on x
llama4 with 10 mil context window is lowkey kinda insane.
-
@cfgeek
Charles Foster
on x
Llama4 appears to be here. [image]
-
@deanwball
Dean W. Ball
on x
“Most powerful open source multimodal model.” Use of the modifier “open source” suggests that there may be a more powerful closed source variant of llama 4 coming.
-
@rishdotblog
Rishabh Srivastava
on x
WTF Llama4 has a 10M context window! They did it, they actually did it! [image]
-
r/ABoringDystopia
on reddit
Under the heading of “Addressing bias in LLMs”, Meta explains that its new Llama models are being factored to have a less “left-leaning bias”. …
-
r/singularity
on reddit
Llama 4 Scout with 10M tokens
-
r/artificial
on reddit
Llama 4 is here
-
r/technology
on reddit
Meta's new AI model Llama 4 has been released
-
r/singularity
on reddit
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
-
r/LocalLLaMA
on reddit
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
-
@deanwball
Dean W. Ball
on x
👀 [image]
-
r/artificial
on reddit
One-Minute Daily AI News 4/5/2025
-
@zuck
Mark Zuckerberg
on threads
Reasoning: • Coming soon!