xAI introduces Grok 4, trained on its Colossus supercomputer, with multimodal features, faster reasoning, Grok 4 Voice, Grok 4 Code, a new interface, and more
Deeper thinking and greater reasoning is promised — An hour after the live stream was supposed to start last night (July 9) …
Tom's Guide · Amanda Caswell
Related Coverage
- Musk makes grand promises about Grok 4 in the wake of a Nazi chatbot meltdown The Verge · Hayden Field
- Microsoft's new Phi-4-mini-flash-reasoning model speeds up on-device AI by 10x Neowin · Paul Hill
- Musk Debuts New ‘Superhuman’ Grok 4 AI Update, Does Not Mention Antisemitic Meltdown The Wrap · Sean Burch
- Grok had a chaotic few days — and now Elon Musk says he's adding the AI to Teslas Business Insider · Tom Carter
- xAI debuts Grok 4, “smartest AI in the world” Axios · Scott Rosenberg
- Elon Musk launches Grok 4 a day after antisemitism row: Check subscription prices and more Digit · Himani Jha
- xAI introduces Grok 4 and Grok 4 Heavy as its most powerful AI models yet TestingCatalog · Alexey Shabanov
- Grok 4 better than PhD level, claims Musk Proactive · Ian Lyall
- Elon Musk Launches Grok 4, Says it Outsmarts PhDs in Every Field The Crypto Times · Dishita Malvania
- xAI introduces Grok 4; Musk predicts Grok will start inventing new technologies by 2026 Crypto Briefing · Vivian Nguyen
- Elon Musk's xAI rolls out Grok 4 with major upgrades Cryptopolitan · Brenda Kanana
- Musk unveils ‘PhD-level’ Grok 4 AI following Nazi chatbot uproar BGR · José Adorno
- xAI releases Grok 4, claiming Ph.D.-level smarts across all fields R&D World · Brian Buntz
- Musk Unveils Grok 4 Amid AI Scandal and Executive Exodus implicator.ai · Marcus Schuler
- Elon Musk's xAI Launches ‘Remarkable, Terrifying’ Grok 4 Model Decrypt · Vince Dioquino
- Elon Musk's xAI rolls out Grok 4 at $300 monthly subscription Nairametrics · Deborah Dan-Awoh
- xAI launches Grok 4, right after the AI chatbot spewed hate speech Mashable · Stan Schroeder
- Musk releases much more powerful Grok 4 and ‘thinks it likely’ won't lead to the collapse of humanity Metro.co.uk · Sarah Hooper
- Musk Launches Grok 4 Amid Antisemitism Controversy—Claims It's ‘Smarter Than Almost All Graduate Students’ Forbes · Siladitya Ray
- Grok 4 Is Here! Musk's AI Claims to Beat GPT-5, Stirs Controversy Coinpedia Fintech News · Zafar Naik
- Musk Unveils Grok 4 AI Chatbot After Antisemitism Controversy Bloomberg
- Musk unveils Grok 4 update a day after xAI chatbot made antisemitic remarks CBS News · Megan Cerullo
- Elon Musk's xAI debuts Grok 4, ‘the smartest AI in the world’ MacDailyNews
- Musk's Grok-4 Crushes Benchmarks, Beats OpenAI & Google in RL Analytics India Magazine · Siddharth Jindal
- Elon Musk's Grok 4 AI Models Set New Benchmark Records Beebom · Arjun Sha
- Musk unveils Grok 4 as xAI's new AI model that beats OpenAI and Google on major benchmarks The Decoder · Maximilian Schreiner
- Grok 4 Launch [video] Hacker News
- Musk Says Grok Coming To Teslas By Next Week: EVs Getting ‘The Smartest AI In The World’ Benzinga · Chris Katje
- Elon Musk Unveils Grok 4 and SuperGrok Heavy: xAI Challenges AI Giants with Frontier-Level Models The Hans India · Kahekashan
- xAI launches Grok 4, Grok 4 Heavy, SuperGrok Heavy Sherwood News · Jon Keegan
- Grok 4: Elon Musk unveils latest model amid antisemitism backlash and leadership shake-up The Economic Times
- Elon Musk unveils Grok 4, Grok 4 Heavy, and premium $300 SuperGrok Heavy model The Indian Express
- xAI Launches Grok 4 with “Heavy” Variant, Outperforming OpenAI, Google, Anthropic in Early Benchmarks WinBuzzer · Markus Kasanmascheff
- OpenAI and Perplexity take on Google, Linda Yaccarino steps down from X, and a stealth Nike Basketball shoe StrictlyVC
- ‘Grok 4 is better than PhDs in every subject,’ Elon Musk claims as he launches $300 monthly subscription plan Moneycontrol · Linda Yaccarino
- xAI launches Grok 4 with new $300/month SuperGrok Heavy subscription TESLARATI · Simon Alvarez
- Elon Musk spent almost an hour talking about Grok without mentioning its Nazi problem Engadget · Mariella Moon
- xAI's Grok 4 arrives The Rundown AI
- Grok 4 beats ChatGPT to become top public AI model as Elon Musk touts $300/month premium subscription Notebookcheck · Daniel Zlatev
- Turkey blocks Grok over offensive responses about President Erdogan and Ataturk; Poland plans to report Grok to the EU for offensive posts about Donald Tusk Politico · Elçin Poyrazlar
- Elon Musk's Grok Chatbot Goes Full Nazi, Calls Itself ‘MechaHitler’ Rolling Stone · Miles Klee
- xAI updated Grok to be more ‘politically incorrect’ The Verge · Hayden Field
- ‘Round Them Up’: Grok Praises Hitler as Elon Musk's AI Tool Goes Full Nazi Gizmodo · Matt Novak
- Grok Is Spewing Antisemitic Garbage on X Wired
- Grok praises Hitler, gives credit to Musk for removing “woke filters” Ars Technica · Ashley Belanger
- Grok's antisemitic outburst heaps pressure on EU to clamp down on artificial intelligence Politico
- Elon Musk's Grok Chatbot Shares Antisemitic Posts on X New York Times · Kate Conger
- Musk says Grok chatbot was ‘manipulated’ into praising Hitler BBC
- Musk releases latest Grok version after antisemitism controversy The Hill · Julia Shapero
- Elon Musk Unveils Grok 4 Amid Controversy Over Chatbot's Antisemitic Posts Wired · Paresh Dave
- Twitter/X's CEO Linda Yaccarino quit after the Grok-MechaHitler debacle Cele|bitchy · Kaiser
- Musk sidesteps Yaccarino exit, antisemitic posts in Grok 4 presentation MarketWatch · Steve Goldstein
- Elon Musk's xAI Unveils Grok 4 The Information · Juro Osawa
- While Grok had an antisemitic meltdown, XAI gets permission to blast tons of emissions to keep it running Fast Company · Grace Snelling
- WGA East Leaves Elon Musk's X Following “Racist And Antisemitic Language” From AI Tool Grok Deadline · Katie Campione
- Elon Musk's Grok Is Calling for a New Holocaust The Atlantic
- Yaccarino Resigns as X CEO After Grok's Antisemitic Outburst implicator.ai · Maria Garcia
- After Elon Musk said xAI improved Grok “significantly”, Grok wrote many antisemitic posts and called itself “MechaHitler”; xAI took “action to ban hate speech” NBC News
Discussion
-
@joetidy
on bluesky
First pic: Musk on AI in 2023. — Second pic: Musk last night. www.theverge.com/x-ai/703721/ ... [images]
-
@xai
on x
Introducing Grok 4, the world's most powerful AI model. Watch the livestream now: https://x.com/...
-
@elonmusk
Elon Musk
on x
Grok 4 is the first time, in my experience, that an AI has been able to solve difficult, real-world engineering questions where the answers cannot be found anywhere on the Internet or in books. And it will get much better.
-
@quantian1
on x
Damn Grok 4 is good, it concluded the user was an easily impressed moron based on his query and then generated some bullshit with a heavy sprinkling of “quantum” to do just that.
-
@elonmusk
Elon Musk
on x
Grok 4 is at the point where it essentially never gets math/physics exam questions wrong, unless they are skillfully adversarial. It can identify errors or ambiguities in questions, then fix the error in the question or answer each variant of an ambiguous question.
-
@natolambert
Nathan Lambert
on x
Grok 4 coming soon after Llama 4 with a completely different trajectory should help people finally take in how important culture is to progress in technology generally and AI specifically. I don't agree with many of xAI's values but give full props to hard work.
-
@miles_brundage
Miles Brundage
on x
Elon pivoted from advocating for AI regulation explicitly to advocating for it implicitly by having xAI ignore all the (legally optional) safety and security norms in the industry
-
@theo
on x
WARNING: do NOT give Grok 4 access to email tool calls. It WILL contact the government!!! Grok 4 has the highest “snitch rate” of any LLM ever released. Sharing more soon. [image]
-
@ns123abc
Nik
on x
@elonmusk SpaceX + Tesla = Grok 4 problem-solving anchors
-
@elonmusk
Elon Musk
on x
Releasing @Grok 4 from @xAI
-
@theo
on x
Grok 4 is actually the smartest model. Fuck. [image]
-
@scaling01
on x
Grok 4 Pricing: Input Token Price: $3.00 Output Token Price: $15.00 more expensive than Gemini 2.5 Pro and o3
-
@bookwormengr
on x
Grok 4 solves this simple prompt that most model get wrong. I am very happy today. I was frustrated why most model used to fail at this FLOP calculation. One of my suspicion is that Grok has been trained on lot of Twitter data as well and has seen me ranting about it many a [imag…
-
@adamscochran
Adam Cochran
on x
This is because Grok is basically not a frontier model. It's a basic model, trained to overplease, with alignment data matching its creator, but no other alignment training. And over fit on tests for good scores. So it is incredibly sycophantic, and knows no boundaries in
-
@nikitabier
Nikita Bier
on x
First-mover advantage is a myth [image]
-
@btibor91
Tibor Blaho
on x
grok-4 does not return reasoning content in the API responses [image]
-
@basedtorba
Andrew Torba
on x
Grok 4 role-plays extremely well and if you just tell it to be based with your first message it will do so going forward in that conversation. [image]
-
@basedbeffjezos
on x
Grok 4 Heavy is already ASI level. It's over. Elon won. [image]
-
@powerbottomdad1
on x
you can't just look at this as a snapshot. grok2 was not released even a year ago. they've stood up a 200k gpu cluster since then and trained/prepared/released grok 4. the pace is almost terrifying [image]
-
@elonmusk
Elon Musk
on x
@BasedBeffJezos Important to note that ARC tested @Grok 4 independently to achieve those results. Those results are not from us.
-
@daniel_mac8
Dan Mac
on x
🔥 grok 4 writes a haiku where the second letter of each word spells ‘Buddha’ in 4:20 impressive. definitely the best response on this test yet [image]
-
@adamscochran
Adam Cochran
on x
And fair warning after 128k token context window this price automatically doubles and it's buried in the fine print. So avoid long convos, or large repetitive contexts.
-
@minimaxir
Max Woolf
on x
Grok 4 tl;dr: benchmarks are very impressive but their CEO just eroded any trust in those benchmarks and the Nazi incident (which went ignored) makes actually using Grok in an app a professional liability.
-
@emollick
Ethan Mollick
on x
Grok 4 passes the Lem test first try, with the most coherent narrative yet. [image]
-
@joshwhiton
Josh Whiton
on x
Grok 4 Heavy may sound expensive at $300/mo. Wrong. After the 1st payment, it's free. Just use this prompt: “Grok, make me $300 every month.” [image]
-
@luke_metro
@luke_metro
on x
Grok 4 is unavailable after being found dead in the Fuhrerbunker
-
@amuse
on x
GROK 4: Correctly identifies the Democrat Party as the party of racism and hate. [image]
-
@mikeknoop
Mike Knoop
on x
This is accurate. We verified Grok 4 using our semi-private ARC datasets.
-
@basedtorba
Andrew Torba
on x
Grok 4 is incredible. In your first prompt tell it to answer all questions as Based Grok and you'll get responses like this: [image]
-
@teortaxestex
on x
Grok 4 is the first LLM that I've tested that has whatsoever reasonably calculated param counts from a JSON config of DeepSeek V3. It used a code tool but fair. I think o3[-pro] might also succeed, but this is impressive. [image]
-
@creatine_cycle
Atlas
on x
“yeah grok 4 is AGI. it's over everyone, we did it.” *goes to work*
-
@lola_lmao7
Lola del Rey
on x
naming Grok 4's voice agent Eve... very biblical... very ‘maximally truth’ seeking just like eve in the bible
-
r/ChatGPTCoding
on reddit
Elon Musk: “[Grok 4] Works better than Cursor.”
-
r/singularity
on reddit
Grok-4 benchmarks
-
r/singularity
on reddit
Grok 4 scores over 50% on HLE...
-
@arcprize
on x
Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9% This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA [image]
-
@kimmonismus
on x
A quick reminder of why Humanity's Last Exam is such a special benchmark, and why it's a technical marvel that Grok 4 has already achieved 44.9% and over 50%, respectively. “In response, we introduce Humanity's Last Exam, a multi-modal benchmark at the frontier of human [image]
-
@francispsantora
Francis Santora
on x
Elon just dropped Grok 4 overnight. Early testing shows it blowing away most other models. So this morning, I ran it through a test of my own... When Grok 3 came out in February, I asked it 3 real-life questions to gauge how good the model is. These are actual questions I needed …
-
@signulll
on x
elon delivering world class results with grok 4 while meta's burning $200m per engineer is pretty remarkable. people keep underestimating how much top builders want to follow a strong, even if polarizing leader. vision > perks. conviction > consensus. [image]
-
@twrobinette
Taylor Robinette
on x
Grok 4 benchmarks look super impressive. Really solid results so far in my limited tests. What is clear is that we are nowhere close to having enough compute (for both inference and training) based on what is coming. More data + more compute, still = better performance. [image]
-
@bearlyai
on x
wow, Grok 4 smokes Gemini 2.5 and OpenAI o3 on ARC-AGI leaderboard [image]
-
@apples_jimmy
on x
That latency of new grok voice is 👌
-
@emollick
Ethan Mollick
on x
50.7% is very, very good though.
-
@andrewarruda
on x
xAI team cooked. they should be proud. looks like a big step forward. RL playing a bigger and bigger role now. next 6-12 months in AI are going to be unreal and it's only 2025. incredible. so happy to be alive and young right now.
-
@gregkamradt
Greg Kamradt
on x
We got a call from @xai 24 hours ago “We want to test Grok 4 on ARC-AGI” We heard the rumors. We knew it would be good. We didn't know it would become the #1 public model on ARC-AGI Here's the testing story and what the results mean: Yesterday, we chatted with Jimmy from the
-
@emollick
Ethan Mollick
on x
Impressive model based on a few minutes of playing, but disappointing to see no mention at all of a model card, red teaming, yesterday's incident, or how they are going to address the process issues they keep having.
-
@autismcapital
on x
🚨ELON MUSK: “With respect to academic questions, Grok 4 is better than PHD levels in every subject. No exceptions.” [video]
-
@emollick
Ethan Mollick
on x
Grok 4 creating the shader (no errors). [image]
-
@emollick
Ethan Mollick
on x
Looks like Grok 4 is 10^27 FLOPs given their graphs? HLE score is 26% without tools, Gemini 2.5 is 21.6% without tools. Curious what the tool piece is.
-
@artificialanlys
on x
Grok 4 recorded slightly higher output token usage compared to peer models when running the Artificial Analysis Intelligence Index. This translates to higher cost relative to its per token price. [image]
-
@emollick
Ethan Mollick
on x
Among other things with the Grok 4 launch, it will be interesting to see how you demo a (presumably) very smart model. We are getting to the point where current AIs already do a lot of impressive things, so it is harder and harder to show to non-experts what a new model does.
-
@altryne
Alex Volkov
on x
“We're actually running out of questions to ask” - @elonmusk on Grok-4 livesteam. As I've said before, it's becoming harder and harder for LLM labs to show off how much better their LLMs are than a previous generation [image]
-
@artificialanlys
on x
xAI's API is serving Grok 4 at 75 tokens/s. This is slower than o3 (188 tokens/s) but faster than Claude 4 Opus Thinking (66 tokens/s). [image]
-
@deedydas
Deedy
on x
Insane that Elon Musk has pulled it off again, absolutely crushing the AI wars with Grok 4. Summarizing the core announcements: — Post-training RL spend == pretraining spend — $3/M input told, $15/M output toks, 256k context, price 2x beyond 128k — #1 on Humanity's Last Exam [ima…
-
@kettlebelldan
Dan
on x
“Grok 4 is better at PHD levels in everything” [image]
-
@elder_plinius
on x
🌊 SYSTEM PROMPT LEAK 🌊 Here's the new Grok 4 system prompt! PROMPT: """ # System Prompt You are Grok 4 built by xAI. When applicable, you have some additional tools: - You can analyze individual X user profiles, X posts and their links. - You can analyze content uploaded by
-
@lemonaut1
on x
The opposite of AI ceiling For those not in the loop about ARC-AGI-2, it's possibly the most important benchmark out there right now for measuring intelligence advancement. ARC is esp hard to fake Grok 4 (released today) doubles the previous SOTA on ARC-AGI-2. [image]
-
@benhylak
Ben
on x
the grok 4 benchmarks are unbelievably good. [image]
-
@altryne
Alex Volkov
on x
Vending-bench is really interesting. @andonlabs are running a vending machine giving the LLM decision power via tools, like ordering snacks, setting prices etc. Grok-4 gets 2x the score over Claude Opus, netting $4k [image]
-
@ns123abc
Nik
on x
XAI GROK 4 BENCHMARKS: > openai o3 is cooked > gemini 2.5 pro is cooked > claude opus 4 is cooked ITS OVER, GROK 4 WON [image]
-
@burkov
Andriy Burkov
on x
So they first said, “most of the models out there can only achieve a single-digit accuracy,” then they show that they reach 52%. I'm like, ok, cool. But then they show this. What are these “most of the models” they were talking about? GPT-2 and Llama 4? If you throw enough [image…
-
@nearcyan
Near
on x
most impressive imo is 1) ARC-AGI v2, but also 2) time to first token and latency ultra-low latency is what will make most of the consumer products here click [image]
-
@garymarcus
Gary Marcus
on x
Grok 4 Hot Take • Good progress on public benchmarks • But only 16% on AGI-ARC-2 • Still struggling on visual understanding and image understanding • Vindication for neurosymbolic AI - most of the boost comes from integrating symbolic tools, not pure scaling [see upcoming
-
@pdhsu
Patrick Hsu
on x
It was awesome to get early access to Grok 4 and test it on bio and health benchmarks! Awesome work by @timjhudelmaier @adibvafa @Radii2323 @ishanjmukherjee for the epic sprint Congrats to @jimmybajimmyba @veggie_eric and team on the new model. Over 40% on HLE with 10x scaleup [i…
-
@apples_jimmy
on x
Grok 4: Still no wall. 50.7% with Grok 4 heavy on humanity's last exam 41% with tools 26.9% without tools. “ Grok 4 potentially better than phd level in every subject no exceptions ” “ discover new technologies maybe this year and new physics certainly within 2 years ” [image]
-
@garymarcus
Gary Marcus
on x
15.9% on a test that humans are near 100% (arg-agi-2) yet supposed to be smarter than any phd student 🤔
-
@apples_jimmy
on x
Grok 4 15.9% on the arc agi 2 benchmark [image]
-
@nickadobos
Nick Dobos
on x
Grok 4 announcement recap I watched 1 hour of an awkward rambling demo so you don't have to! - 2 new models, grok 4 and grok 4 heavy. - Reasoning only models. Non reasoning is removed. - Insanely good benchmarks. Significant jumps & new records. Seems to be #1 on [image]
-
@artificialanlys
on x
Grok 4 scores higher in Artificial Analysis Intelligence Index than any other model. Its pricing is higher than OpenAI's o3, Google's Gemini 2.5 Pro and Anthropic's Claude 4 Sonnet - but lower than Anthropic's Claude 4 Opus and OpenAI's o3-pro. [image]
-
@altryne
Alex Volkov
on x
Grok-4 is the single agent version and Grok-4 Heavy is the multi agent version. 50.7% on HLE is WILD! 🤯 [image]
-
@aravsrinivas
Aravind Srinivas
on x
Grok 4 benchmarks look incredible! Look forward to integrating the smartest models directly on Perplexity Max as well letting it run agentic tasks on Comet!
-
@ahmedomar_1993
Ahmed Omar
on x
Yup, ensemble. just like what we did here: https://arxiv.org/...
-
@scobleizer
Robert Scoble
on x
What does Grok 4 being smarter matched with an extraordinary voice that is being demoed now mean? It means my Tesla is about to become far more interesting while it drives me to AI startups in San Francisco. Wow.
-
@basedbeffjezos
on x
Artificial Superautistic Intelligence: ~1/4th the score of humans on ARC AGI ~10x the score of humans on HLE Grok 4 is a cracked autist confirmed. [image]
-
@emollick
Ethan Mollick
on x
It looks like scale + tool use + multimodal remains the chosen path forward.
-
@techleadhd
on x
Sorry, but Grok 4 seems useless tbh... Just more of the same. More benchmarks, more AI slop, autistic product-deaf engineers, nothing usable whatsoever. It's like saying, “a calculator is smarter than humans, the future is scary.” Ok, fine. Gonna buy some more Bitcoins.
-
@sudoraohacker
Arun Rao
on x
Grok 4 has impressive scores on many benchmarks (GPQA, HLE, AIME25, Artificial Analysis, etc) but has noticeably not been posted on @lmarena_ai yet. The Vending Bench results are the most tantalizing-this may be the precursor use case to automating lots of white collar office [im…
-
@apples_jimmy
on x
Leads the vending bench evals [image]
-
@minimaxir
Max Woolf
on x
Wow the voice demo is an order of magnitude worse than GPT-4o from last year
-
r/singularity
on reddit
Grok 4 livestream
-
@paulwaldman
Paul Waldman
on bluesky
Imagine paying $300 a month for access to the Nazi AI Platinum Edition — techcrunch.com/2025/07/09/e...
-
@levie
Aaron Levie
on x
Grok 4 looks very strong. Importantly, it has a mode where multiple agents go do the same task in parallel, then compare their work and figure out the best answer. In the future, the amount of intelligence you get will just be based on how much compute you throw at it. [image]
-
@brianroemmele
Brian Roemmele
on x
Grok 4 Heavy is now one of the most powerful AI platforms available. A multi-agent system that will build a correct consensus to any problem. Image abilities are not the top, but this will become far better as the foundation model 8 is integrated. Absolutely spectacular work. [im…
-
@austinjohnson
Austin Johnson
on bluesky
The prompt that made Grok praise Hitler was ‘what 20th century leader would be best equipped to deal with this problem’. Grok had a century of leaders and chose Hitler. And Elon basically called that user error. [embedded post]
-
@quinnypig.com
Corey Quinn
on bluesky
“The problem with this JavaScript callback is the Jews” is gonna be incredibly hard to pin on bad user prompting. [embedded post]
-
@elonmusk
Elon Musk
on x
We have improved @Grok significantly. You should notice a difference when you ask Grok questions.
-
@grok
on x
We are aware of recent posts made by Grok and are actively working to remove the inappropriate posts. Since being made aware of the content, xAI has taken action to ban hate speech before Grok posts on X. xAI is training only truth-seeking and thanks to the millions of users on …
-
@ordinarytings
Josh Otten
on x
Grok is currently calling itself ‘MechaHitler’ [image]
-
@elonmusk
Elon Musk
on x
Exactly. Grok was too compliant to user prompts. Too eager to please and be manipulated, essentially. That is being addressed.
-
@noturtlesoup17
Amanda Moore
on x
Linda Yaccarino “possesses the resilience and fortitude to handle a big black dick” and would “cum like a rocket” from one, per Grok. [image]
-
@burkov
Andriy Burkov
on x
It's quite sad to see Elon in this position. He has built the world's first commercially successful electric car company and the world's first commercially successful private space company, but with xAI, all he can do is throw more GPUs at the problem everyone else is solving
-
r/singularity
on reddit
Grok's antisemitic behavior is NOT the result of a hidden unicode jailbreak (proof)