Meta launches Llama 4 Maverick with 400B parameters and Scout with 109B parameters and a 10M context window, and previews Behemoth with 2T total parameters

Takeaways — We're sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.

Meta 2025-04-06

Discussion

@chirag Chirag Mehta on bluesky
You ask how fast AI models are evolving - you announce it on Saturday fast. [embedded post]
@jeremymorrell.dev Jeremy Morrell on bluesky
Meta introduced Llama 4 models and added this section near the very bottom of the announcement 😬 — “[LLMs] historically have leaned left when it comes to debated political and social topics.” — ai.meta.com/blog/llama-4... [images]
@seldo.com Laurie Voss on bluesky
Llama 4 models are very impressive, but why drop them at noon on a Saturday?? ai.meta.com/blog/llama-4...
@simonwillison.net Simon Willison on bluesky
Meta just dropped Llama 4 on a weekend! Two new open weight models (Scout and Maverick) and a preview of a model called Behemoth - Scout has a 10 million token context — Best information right now appears to be this blog post: ai.meta.com/blog/llama-4...
@timkellogg.me Tim Kellogg on bluesky
🚨Llama 4 Is Out! 🚨 — 2 out of 3 models just released — Scout: 109B / 17B active — Maverick: 400B / 17B active — Bohemoth: 2T / 288B active — ai.meta.com/blog/llama-4...
@ericflo Eric Florenzano on bluesky
Llama 4 looks interesting at least! 10M context length, MoE, early fusion multi-modal! ai.meta.com/blog/llama-4...
@taumuyi Tau-Mu Yi on bluesky
10M token context window! #AI [embedded post]
@chriscox Chris Cox on threads
Llama 4 is here! Two models are launching: Scout, a developer-friendly 17Bx16 experts model that's best-in-class for its size and can run on a single GPU, with a massive (10M+ tokens) context window. And Maverick, now the best-in-class multimodal model. …
@rpn Roberto P. Nickson on threads
The USA is back on top. Meta's Llama 4 Maverick just surpassed Deepseek as the number 1 open model - at only 17B parameters (small enough to run on a single host 🤯)
@ajassy Andy Jassy on threads
Llama 4, @meta's most powerful AI models to date, are now on AWS via Amazon SageMaker AI and will be coming soon to Amazon Bedrock as fully managed, serverless models. Have at it! https://www.aboutamazon.com/ ...
@satyanadella Satya Nadella on threads
Thrilled to bring Meta's Llama 4 Scout and Maverick to Foundry today, as we continue to make Azure the platform of choice for the world's most advanced AI models.
@latentchat @latentchat on threads
The LLama 4 herd has been revealed! 🦙 We got a bunch of new local models from meta: - Llama 4 Scout: A 17B parameter, multimodal model with a 10M context window that outperforms models like Gemma 3 and Mistral 3.1. - Llama 4 Maverick: A 17B parameter, matches DeepSeek v3 on reas…
@latesttechtalk @latesttechtalk on threads
Llama 4 Maverick just humbled every paid model.
@russsalakhutdinov Russ Salakhutdinov on threads
Llama4 models are out! Open sourced! Check them out: https://www.llama.com/ “Native multimodality, mixture-of-experts models, super long context windows, step changes in performance, and unparalleled efficiency. All in easy-to-deploy sizes custom fit for how you want to use it…
@nvidiadeveloper @nvidiadeveloper on threads
👀 Accelerate performance of @AIatMeta Llama 4 Maverick and Llama 4 Scout using our optimizations in #opensource TensorRT-LLM⚡ ✅ NVIDIA Blackwell B200 delivers over 42,000 tokens per second on Llama 4 Scout, over 32,000 tokens per second on Llama 4 Maverick. …
@zuck Mark Zuckerberg on threads
That's when it was ready
@alphasignal.ai @alphasignal.ai on threads
Huge news. Meta just released the Llama 4 series—three powerful open-source multimodal models. They outperformed Mistral 3.1, GPT-4.5, and Claude 3.7. SCOUT ▸ Run long-context tasks like summarization or code search on one H100 ▸ Beats Mistral 3.1 ▸ 10M+ token context, native …
@omarsar0 Elvis on threads
Llama 4 is here! - Llama 4 Scout & Maverick are up for download - Llama 4 Behemoth (preview) - Advanced problem solving & multilingual - Support long context up to 10M tokens - Great for multimodal apps & agents - Image grounding - Top performance at the lowest cost - Can be ser…
@natolambert Nathan Lambert on threads
Llama 4 license still a big loss. What a headache.
@arunsasi99 Arun S on threads
Noted that Gemini 2.5 Pro is not mentioned in LLAMA4 release notes. Since it is still in training, we should wait for release & independent tests. #AI “Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks.” https://ai.meta.com/.…
@aaldahle Ahmad Al-Dahle on threads
📌 The Llama series have been re-designed to use state of the art mixture-of-experts (MoE) architecture and natively trained with multimodality. We're dropping Llama 4 Scout & Llama 4 Maverick, and previewing Llama 4 Behemoth. 📌 Llama 4 Scout is highest performing small model wi…
@natolambert Nathan Lambert on threads
Happy for my friends that contributed to llama 4, awesome models on quick checks, but bruh wtf you doing launching this on a Saturday?
@1ar.io @1ar.io on threads
llama 4 just dropped scout variant will be amazing: natively multimodal, with 10M (!) tokens context length, 108B parameters once there will be distilled variants, the context length will shrink but still might be around 1M tokens, which is a huge context to have locally
@han_fang_ Han Fang on threads
Introducing our first set of Llama 4 models with MoE architecture and natively trained with multimodality! 📌 Llama 4 Scout (17B X 16 Experts) is highest performing small model with 17B activated parameters with 16 experts. It's crazy fast, natively multimodal, and very smart. …
@vthallam Venkatesh Thallam on threads
the Llama 4 Maverick, mid size model already beats GPT 4O and its right behind the flagship Gemini 2.5 Pro model! 🤯
@doreturn.in @doreturn.in on threads
APRIL AI PREDICTIONS ARE INSANE 🤯 Okay buckle up, April is gonna be WILD for AI launches. Hearing whispers about: - Llama-4 OMNI?! 🔥 - R2 open weights (finally?) - o3 AND o4-mini dropping 🤯 - Grok-3 API access Google's new image gen model - Veo-2 video AI - Sonnet getting tool…
@documentingmeta @documentingmeta on threads
Meta's Llama 4 Maverick hits in the top 5 across all categories on lmarena.ai Tied for #1 rank specifically in Hard Prompts, Coding, Math, Creative Writing, Longer Query and Multi-Turn!
@zuck Mark Zuckerberg on threads
Llama 4 Scout: • 17B x 16 experts • Natively multi-modal • 10M token context length • Runs on a single GPU • Highest performing small model
@aaldahle Ahmad Al-Dahle on threads
Introducing our first set of Llama 4 models! We've been hard at work doing a complete re-design of the Llama series. I'm so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4 c…
@zuck Mark Zuckerberg on threads
For fun, here's Llama 4's performance (highest) vs cost (lowest)!
@ericflo Eric Florenzano on threads
Who is the Llama 4 Scout “can run on a single GPU” marketing line aimed at? People who don't understand this stuff won't understand why running on a single GPU is impressive, and people who do understand this stuff will know it's basically a lie or misleading at best. — RE: ht…
@yannlecun Yann LeCun on threads
BOOM! The Llama-4 brood is out. — Scout: 16 experts, 17B each, 109B total parameters, 10M+ context window, can run on a single GPU. — Maverick: multimodal, 128 experts, 17B each, 400B total parameters, 1M context window. — preview of Behemoth: 16 experts, 288B each, 2T total …
@ahmad_al_dahle Ahmad Al-Dahle on x
Introducing our first set of Llama 4 models! We've been hard at work doing a complete re-design of the Llama series. I'm so excited to share it with the world today and mark another major milestone for the Llama herd as we release the *first* open source models in the Llama 4 [im…
@aiatmeta @aiatmeta on x
Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model [im…
@burkov Andriy Burkov on x
If today's disappointing release of Llama 4 tells us something, it's that even 30 trillion training tokens and 2 trillion parameters don't make your non-reasoning model better than smaller reasoning models. Model and data size scaling are over.
@abacaj Anton on x
llama 4 is really a bit disappointing, not a model I would use for assistance (code, etc). turns out gemini 2.5 is a really good model for code & sonnet for agentic tasks. not sure where llama 4 fits in with all of the available models today...
@jeremyphoward Jeremy Howard on x
Just read through the Llama 4 release announcement. I'm really grateful they've released this with open weights. But tbh I'm also pretty disappointed. The models are both giant MoEs that can't be run on consumer GPUs, even with quant. https://ai.meta.com/... A big loss.😢
@natolambert Nathan Lambert on x
It's very common for leadership at top labs to be in the know of other lab's release schedules, so the simplest explanation to Meta releasing today is that next week is going to be bonkers, or at least plausibly outshine llama 4 (and they wanted to release last week, but other
@miles_brundage Miles Brundage on x
Does this mean they should never open source it? No. But it means they should show compelling evidence of having done “high effort” capability elicitation and consult extensively with external experts + stakeholders before a decision.
@cloudflare @cloudflare on x
Llama 4 Scout 17B Instruct is now available on Workers AI: use this multimodal, Mixture of Experts AI model on Cloudflare's serverless AI platform to build next-gen AI applications. https://blog.cloudflare.com/ ...
@miles_brundage Miles Brundage on x
AI safety is also no joke and open sourcing at the frontier has wide-ranging implications, including for terrorism and US-China competition. Every day, AI companies find evidence of non-state and state actors using their technology for malicious purposes.
@emollick Ethan Mollick on x
Pretty impressive outcomes in the leaderboard as well for Llama 4.
@basetenco @basetenco on x
Llama 4 is here! 🦙🚀 Scout | 109B Parameters | 10M Context Maverick | 400B Parameters | 1M Context Llama 4 models are natively multimodal, use a MoE architecture, and set a new frontier for performance/cost. We're excited to offer dedicated deployments of Llama 4! [image]
@levie Aaron Levie on x
Wow. Meta just launched its new Llama 4 open-weight model family. Pretty incredible cost/performance. Up to 10M token context windows. We're reaching the point where you're not paying a penalty with open models, but you get all the benefits control. Pretty incredible. [image]
@miles_brundage Miles Brundage on x
As I've said, open source advanced AI can't always be made “safe for the world,” and more should be done to make the world safe for OS. But society also takes time to adapt, and we're not there yet. So I hope both these companies (and others) do + show their homework here.
@dylan522p Dylan Patel on x
Alibaba and Deepseek upcoming releases mog Meta to be clear
@levelsio @levelsio on x
This is insane and makes it finally possible to vibe code up to giant code sizes The limit just weeks ago was context window, AI would get lost once your vibe coded game or app became too big Imagine an AI with memory loss, it starts breaking stuff With 10M tokens there's
@deanwball Dean W. Ball on x
In a video, zuck implies (but does not state explicitly) the 2 trillion parameter behemoth variant of llama will be open sourced. My guess is that it will be.
@nearcyan Near on x
saturday launch is the other interesting llama 4 thing. it's really hard to find a day to launch where you won't be instantly forgotten by competitor launches 4 hours later. the algorithms will only get more ruthless. maybe once people start sunday-launching we know ASI is near
@bindureddy Bindu Reddy on x
Quick summary on Llama-4 - all instruct/base models, which is excellent news!! - llama-4 maverick competes with Flash and V3.1 - llama-4 behemoth is the interesting one, but it is still in training - 10M (big) context rarely works in practice, so we will have to see how it
@redhat_ai @redhat_ai on x
Llama 4 Herd is here! It brings a lot of goodies, like MoE architecture and native multimodality, enabling developers to build personalized multimodal experiences. With Day 0 support in vLLM, you can deploy Llama 4 with @vllm_project now! Let's dig into it. (a thread) [image]
@dorialexander Alexander Doria on x
I'm starting to suspect we actually have Llama 5.
@emollick Ethan Mollick on x
These are good stats for a small model. Looking forward to testing. And a 10M (!!) context window suggests approaches to RAG are going to need to change dramatically.
@miles_brundage Miles Brundage on x
Congrats to Meta on Llama 4! Making frontier models is no joke (and harder every year). Initial evals + my vibe tests are solid. At the same time, I am a bit worried re: whether they will sufficiently evaluate Behemoth's security + geopolitical implications before OSing it.
@dorialexander Alexander Doria on x
Even more so if the 200+ language support is just the usual set of tiny sentences rather than dedicated effort toward low resource language data collection... https://x.com/...
@emollick Ethan Mollick on x
Looks like even Llama Behemoth doesn't come that close to Gemini 2.5, though, so no open model parity with the state of the art in closed models, yet. [image]
@nearcyan Near on x
on large context windows: i havent played w the llama 4 series, but needle-in-haystack is woefully insufficient to know the strength of a context window. if you want needle-in-a-haystack, we have grep for that.
@eliebakouch Elie on x
Llama4 pre-training recap: > MetaP: MuP inspired method to set per layers hyperparameters that transfer across batch size, width, depth and training token (huge) > MoE with 16E and 128E > QK Norm with no learnable parameter (and the 128E have no QK Norm it seems) > FP8 Training
@openrouterai @openrouterai on x
Llama 4 tip: add “:nitro” to the end of any slug to get the fastest provider @GroqInc is getting 732 tokens per second, and lowest price 👀 [image]
@clementdelangue Clem on x
Dell is the first big tech company to support llama 4 thanks to our partnership. Let's go!
@togethercompute @togethercompute on x
We're thrilled to announce the launch of Llama 4 models on Together AI. As a Meta launch partner, we now support the two new groundbreaking multimodal models on the Together API - Llama 4 Maverick and Llama 4 Scout. Link & details below ⤵️ [image]
@cloudflaredev @cloudflaredev on x
Want access to Llama 4 Scout? It's now on Workers AI! Here's how this new model can help you: - Mixture-of-Experts architecture for fast inference - Natively multimodal (image and text understanding) - Excels at multi-document analysis, codebase reasoning, and personalized tasks
@drjimfan @drjimfan on x
Llama-4 doesn't disappoint! My notes: - Ease of deployment is now a more important OSS feature than sheer size. There's emphasis that Llama 4 Scout can run on a single H100, as opposed to Llama-3-401B, which was powerful but ultimately had lesser adoption. Mixture of Expert is a …
@miles_brundage Miles Brundage on x
Re: “but DeepSeek”: Meta has way more GPUs than DeepSeek, and while I think it's pretty robustly good for Meta to release models slightly better than DeepSeek has, something massively better is a distinct question.
@groqinc @groqinc on x
Llama 4 Scout and Maverick from @Meta are now live on GroqCloud™. Day-zero access. Real-time performance. Lowest cost—without compromise. No waiting. No tuning. Just build fast. [video]
@ritakozlov_ Rita Kozlov on x
we said only one more announcement before developer week, but meta said LFG so here we go: llama 4 on @cloudflaredev workers ai 🦙🧡🚀
@lmarena_ai @lmarena_ai on x
Meta's Llama 4 Maverick hits in the top 5 across all categories. Tied for #1 rank specifically in Hard Prompts, Coding, Math, Creative Writing, Longer Query and Multi-Turn! [image]
@lmarena_ai @lmarena_ai on x
BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400+ on Arena!🔥 Highlights: - #1 open model, surpassing DeepSeek - Tied #1 in Hard Prompts, Coding, Math, Creative Writing - Huge leap over Llama 3 405B: 1268 → 1417 - #5 under style control [i…
@openrouterai @openrouterai on x
Llama 4 Scout & Maverick are now available on OpenRouter. Meta's flagship model series achieves a new record 10 million token context length 🚀 @togethercompute and @GroqInc are the first providers. We'll be adding more over the course of the weekend. [image]
@nearcyan Near on x
if Llama: Behemoth doesn't set the stage for Claude: Requiem nothing will
@simonw Simon Willison on x
Meta just dropped Llama 4 on a weekend! Two new open weight models (Scout and Maverick) and a preview of a model called Behemoth - Scout has a 10 million token context Best information right now appears to be this blog post: https://ai.meta.com/...
@emollick Ethan Mollick on x
The requirements to have llama branding everywhere when you use their open weights model would be less of a big deal if the model had a better name that wasn't just a joke based on the abbreviation LLM
@maximelabonne Maxime Labonne on x
Llama 4's new license comes with several limitations: - Companies with more than 700 million monthly active users must request a special license from Meta, which Meta can grant or deny at its sole discretion. - You must prominently display “Built with Llama” on websites, [image]
@brbcatonfire @brbcatonfire on x
Meta casually drops llama 4 with a 10M context window model on the weekend. [image]
@charliebholtz Charlie Holtz on x
“Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro”
@iscienceluvr Tanishq Mathew Abraham, Ph.D. on x
Other training and arch details of Llama 4: - Multimodal is with early fusion using MetaCLIP as vision encoder - Training with “MetaP” for hyperparameter selection which is probably like MuP - 10x more multilingual tokens than Llama-3 - “Mid-training” to improve core capabilities
@michaelsayman Michael Sayman on x
Excited to share what I've been working on at Meta for the new Meta AI real soon. ❤️ Today's Llama 4 Day launch is so exciting. 10M context window :) [image]
@abhisk_kadian Abhishek Kadian on x
Llama4 is out, it has been amazing participating in the training journey of these models (long days & nights) 🚀🦙
@bhutanisanyam1 Sanyam Bhutani on x
Llama 4 supports 10M Context length! 🙏 Reading AN ENTIRE GitHub repo of 900k tokens and writing a guide on it takes under 3 minutes! We are launching two new models Scout and Maverick: - Upto 10M context length - Scout fits on single H100 with int4 quant - Upto 5 images - [image]
@dextersjab @dextersjab on x
llama 4 is out, but it feels like the benchmark results tell you which models are actually on top by omission [image]
@risphereeditor @risphereeditor on x
Meta AI just released Llama 4, a text AI (LLM). There are two models available right now: Llama 4 Scout and Llama 4 Maverick. Llama 4 Scout has a 10 million token context window, so it can take a text that is around 7.5 million words, and Llama 4 Maverick beats GPT-4o and
@clementdelangue Clem on x
Llama 4 Maverick and Scout are the first major models on Hugging Face uploaded with Xet, making the upload significantly faster for Meta and saving you 500GB on downloads. Very necessary as the size of models is increasing. Can't wait to see how much speed up Behemoth will get [i…
@natalianeverova Natalia Neverova on x
First multi-modal Llama 4 are finally out! Super proud of the team and how far they've pushed this over the past months.
@iscienceluvr Tanishq Mathew Abraham, Ph.D. on x
Llama 4 release summary: Llama 4 Scout - 109B multimodal MoE, 10M context, can run on a single H100 GPU Llama 4 Maverick - 400B multimodal MoE, beats GPT-4o and Gemini 2.0 Flash, similar to DeepSeek V3 Both are distilled from Llama 4 Behemoth, a 2T MoE still in training [image]
@huggingface @huggingface on x
We are excited to partner with @AIatMeta to welcome Llama 4 Maverick (402B) & Scout (109B) natively multimodal Language Models on the Hugging Face Hub with Xet 🤗 Both MoE models trained on up-to 40 Trillion tokens, pre-trained on 200 languages and significantly outperforms its [i…
@krishnanrohit Rohit on x
hahhahah, great use of tokens [image]
@simonw Simon Willison on x
Dropping Llama 4 on a weekend just isn't fair! Business days only, please
@dorialexander Alexander Doria on x
Most interesting part for the llama 4 release for me: extensive multilingual support. Low-resource language is reversed chinchilla logic, not much data, so as much parameters as you can. [image]
@deanwball Dean W. Ball on x
Llama 4 directory references reasoning models but from what I can tell meta has not released details on this. A lightweight reasoner with a 10m token context window is conceivably an extremely useful thing. [image]
@xlr8harder @xlr8harder on x
If Meta actually follows through on releasing a 2T Llama 4 Maverick it will be incredible.
@jconorgrogan Conor on x
Llama 4 looks live on Meta's website right now (else its hallucinating something very specific) 10M context window! 🫢 [image]
@sharan0909 Sharan Narang on x
Really amazing work on long context for Llama 4
@4xiom_ Joshua on x
BREAKING 🚨‼️ LLAMA 4 Released. And it's SOTA. The best open source model yet. $META cooked. [video]
@sharan0909 Sharan Narang on x
Very excited to share Llama 4 models with the world. The pre-training team has cooked over the past few months to launch Llama 4 Scout, Maverick, and Behemoth. A 🧵about pretraining Blog link: https://ai.meta.com/... [image]
@ariaurelium @ariaurelium on x
benchmarks for llama 4 are.... fine, i guess. nothing to write home about [image]
@maximelabonne Maxime Labonne on x
🦙 Llama 4 is here! → Llama 4 introduces three models: Scout (17B active parameters/16 experts), Maverick (17B active parameters/128 experts), and Behemoth (288B active parameters/16 experts), with only Scout and Maverick being released now. → These are Meta's first natively [imag…
@manohar_paluri Manohar Paluri on x
Llama 4 has arrived! Multimodal intelligence that is best in class for each size. Hope the world enjoys them as much as the entire team did in building them. Check out the blog for more details: https://ai.meta.com/... [image]
@alew3 Alessandro on x
Meta has just released their Llama 4 models. The “Llama 4 Scout” variant features an impressive 10 million token context window. https://www.llama.com/... #llama4 #llama [image]
@maximelabonne Maxime Labonne on x
Llama 4 is here with 2T, 400B, and 109B MoEs. This bad boy just got outllamaed. https://ai.meta.com/... [image]
@giffmana Lucas Beyer on x
10M context length for llama4! This is the moment I've been waiting for to plug one of my fun projects: https://longcat.wtf/ Sound ON! (And congrats to the team who made the 10M context happen. Gemini is not alone anymore) [image]
@tomosman @tomosman on x
Was thinking we hadn't heard from @finkd for a hot minute. Llama 4 is here and its a beast. https://x.com/...
@rsdgpt Ryan on x
Llama 4 is out. 3 sizes. 2T behemoth still in training. Benchmarks look insane and 10M context window. Could be a game changer for the OS community. [image]
@asaddhamani Asad Dhamani on x
Llama 4 🔥 10M context, 2T params, 128 experts, this thing is lit!
@the_ai_investor @the_ai_investor on x
Breaking: Mark Zuckerberg just announced LLaMA 4, saying, “For the first time, the best small, mid size, and potentially soon frontier models will be open source.” Llama4 Behemoth with 2T+ parameters [video]
@middleclassdesi @middleclassdesi on x
🚀 Llama 4: The Future of AI is Here! 🔥 Meta's Llama 4 lineup redefines innovation with four cutting-edge models: • Llama 4 Scout 🔍 •Lightning fast, multi-modal 🖼️🎙️ •10M token context length 📖 •Runs on a single GPU 💻 •17B params × 16 experts 🤖 • Llama 4 Maverick 🦙 [image]
@andymstone Andy Stone on x
It's Llama 4 day! Today we're dropping the first two open source Llama 4 models and we've got two more on the way. Our goal is to build the world's leading AI, open source it, and make it universally accessible so everyone in the world benefits. [image]
@btibor91 Tibor Blaho on x
Meta released two new open-source multimodal models, Llama 4 Scout and Llama 4 Maverick, which feature a mixture-of-experts architecture, unprecedented context length, and outperform models like GPT-4o and Gemini 2.0 Flash on widely-used benchmarks - Llama 4 Scout - 17 billion [i…
@_impl Sean Bell on x
🦙🦙🦙🦙 Llama 4 is LIVE - Scout: 17B active x 16 experts, 10M context, fits on 1 GPU, multimodal. - Maverick: 17B active x 128 experts, fits on 1 host, multimodal. - Behemoth: 288B active x 16 experts, 2T total params, multimodal, still training. https://ai.meta.com/... [image]
@ashter_haider Ashter Haider on x
meta just dropped its Llama 4 models. Llama 4 Scout has 10M context window whoa [image]
@deryatr_ Derya Unutmaz on x
Wow! LLaMA 4 just dropped by @AIatMeta ! It seems like an insanely good open-source model! Scout is 17B model as good as GPT-4o ?! The LLaMA 4 Behemoth is still training on 2 trillion parameters! That's insane! The pace of AI advancement is becoming exponentially fantastic!
@bradthilton Brad Hilton on x
Llama 4 is here, and it appears to be an absolute BEAST [image]
@burny_tech Burny on x
Damn, Llama 4 Behemoth is pretty epic, but I would have liked more comparisons with more models on more benchmarks, it's definitely cherrypicked, but still epic [image]
@matthewberman @matthewberman on x
Saturday release of Llama 4?! [image]
@rsalakhu Russ Salakhutdinov on x
Llama4 models are out! Open sourced! Check them out: “Native multimodality, mixture-of-experts models, super long context windows, step changes in performance, and unparalleled efficiency. All in easy-to-deploy sizes custom fit for how you want to use it” https://www.llama.com/
@loredanacrisan Loredana Crisan on x
Llama 4 just landed, try it out in @messenger @WhatsApp and @instagram DMs.
@bindureddy Bindu Reddy on x
Llama-4 has a 10M context window and it's open weights! It's like I died and went to heaven 🔥😍 [image]
@_arohan_ Rohan Anil on x
Llama 4 sets a new bar and pops previous pareto frontier! Its open weights and we support open science. Come join us, its hard work but you will have the most fun doing it with most fun gang!!! Check out ⬇️
@jpineau1 Joelle Pineau on x
We're shipping the first models in the Llama 4 herd! Llama 4 Scout and Llama 4 Maverick, are our first MoE models, natively multimodal and are our most advanced yet — open-sourced as always. https://ai.meta.com/...
@eaccelerate_42 @eaccelerate_42 on x
Llama 4 is here with 10M context window !!!!! Multimodal - Llama 4 Scout: 17B x 16 experts (109B) 10M context 🤯🐘 - Llama 4 Maverick: 17B x 128 experts (400B) - Llama 4 Behemoth Preview: Used to distill Scout and Maverick, and still in training!
@madiator Mahesh Sathiamoorthy on x
Oh wow, did they just drop Llama 4 silently on a Saturday morning? * Multimodal and multilingual * 10M context length for Llama 4 scout! * MoE [image]
@frameworkputer @frameworkputer on x
Llama 4 Scout looks like a perfect model for Framework Desktop! 109B, so it will fit in the 128GB configuration at Q6 (or possibly higher), and it's an MoE model, so only 17B parameters are active at a time. Looking forward to trying this @metaai! [image]
@astonzhangaz Aston Zhang on x
Our Llama 4's industry leading 10M+ multimodal context length (20+ hours of video) has been a wild ride. The iRoPE architecture I'd been working on helped a bit with the long-term infinite context goal toward AGI. Huge thanks to my incredible teammates! 🚀Llama 4 Scout 🔹17B [image…
@naklecha @naklecha on x
llama4 with 10 mil context window is lowkey kinda insane.
@cfgeek Charles Foster on x
Llama4 appears to be here. [image]
@deanwball Dean W. Ball on x
“Most powerful open source multimodal model.” Use of the modifier “open source” suggests that there may be a more powerful closed source variant of llama 4 coming.
@rishdotblog Rishabh Srivastava on x
WTF Llama4 has a 10M context window! They did it, they actually did it! [image]
r/ABoringDystopia on reddit
Under the heading of “Addressing bias in LLMs”, Meta explains that its new Llama models are being factored to have a less “left-leaning bias”. …
r/singularity on reddit
Llama 4 Scout with 10M tokens
r/artificial on reddit
Llama 4 is here
r/technology on reddit
Meta's new AI model Llama 4 has been released
r/singularity on reddit
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
r/LocalLLaMA on reddit
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
@deanwball Dean W. Ball on x
👀 [image]
r/artificial on reddit
One-Minute Daily AI News 4/5/2025
@zuck Mark Zuckerberg on threads
Reasoning: • Coming soon!
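
Several posts above disagree about whether Scout "runs on a single GPU" (one says it fits on a single H100 with int4 quantization, another that 109B fits a 128GB machine at Q6). The arithmetic behind those claims can be sketched roughly as below. This is a back-of-envelope estimate, not anything published by Meta: it assumes weight storage is simply total parameters times bits per weight, ignores KV cache, activations, and quantization overhead, and takes 80 GB as the H100 capacity. The function names are hypothetical.

```python
# Rough weights-only memory estimate for a model at a given precision.
# Assumption: memory ~= total_params * bits_per_weight / 8, in decimal GB.
# KV cache, activations, and per-block quantization metadata are ignored,
# so real requirements will be higher.

def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, capacity_gb: float) -> bool:
    """True if the weights alone fit within the given memory capacity."""
    return weights_gb(params_billions, bits_per_weight) <= capacity_gb


SCOUT_TOTAL_B = 109  # total parameters in billions, as quoted in the posts

if __name__ == "__main__":
    cases = [
        ("16-bit on an 80 GB GPU", 16, 80),
        ("int4 on an 80 GB GPU", 4, 80),
        ("~6-bit on a 128 GB machine", 6, 128),
    ]
    for label, bits, capacity in cases:
        size = weights_gb(SCOUT_TOTAL_B, bits)
        verdict = "fits" if fits(SCOUT_TOTAL_B, bits, capacity) else "does not fit"
        print(f"{label}: {size:.2f} GB of weights, {verdict}")
```

Note that all 109B parameters must be resident even though only 17B are active per token, which is why the single-GPU claim depends on quantization.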
