TEXXR

Chronicles

The story behind the story


Anthropic says it expects Mythos-class models to be available to all customers “in the coming weeks” following the development of stronger safeguards

  • … and drops a new model
  • Sarthak Singh / Moneycontrol: Anthropic launches Claude Opus 4.8 with dynamic workflows and effort control; here's how it compares to GPT-5.5 and Gemini 3.1 Pro
  • Ben Schoon / 9to5Google: Claude Opus 4.8 launches today with agentic improvements, new features
  • Nisha / WinCentral: Anthropic Rolls Out Claude Opus 4.8 to All Users With New Thinking Effort Controls
  • Ayushi Jain / Digit: Claude Opus 4.8 is here but Anthropic is already teasing Mythos class AI models: What you should know
  • Marcus S…

Madison Mills / Axios

Discussion

  • @menhguin Minh Nhat Nguyen on x
    glad to know Mythos' safety concerns have been addressed right as Anthropic also secured tens of billions in inference compute 👍
  • @kimmonismus @kimmonismus on x
    Huge!! „Mythos class model to all customers in the coming weeks"!! Holy, we accelerate!! [image]
  • @scaling01 @scaling01 on x
    They are releasing a Mythos-class model with the appropriate safeguards, meaning that you can't use the “too dangerous to release” (mostly cyber) capabilities
  • @polymarket @polymarket on x
    JUST IN: Anthropic announces it will roll out Claude Mythos “in the coming weeks” despite growing fears over the model's cyber capabilities.
  • @miles_brundage Miles Brundage on x
    Not sure I see why Anthropic is publicly signaling an expectation to launch Mythos in a few weeks when they acknowledge the safeguards aren't ready yet, and this will predictably speed up OpenAI/GDM + put pressure on internal folks not to block that timeline
  • @theprimeagen @theprimeagen on x
    Anthropic 2 weeks ago> ITS TOO DANGEROUS Anthropic today> Oh, here you go wtf is this
  • @robertgraham Robert Graham on x
    Yes, AI frontier models keep finding more and more vulns. No, Mythos isn't significantly ahead. Anybody with good tooling/framework using any frontier model is more effective at finding vulns.
  • @jarredsumner Jarred Sumner on x
    Dynamic workflows and adversarial code review was part of what made it possible to rewrite Bun in Rust in 6 days.
  • @claudedevs @claudedevs on x
    New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word “workflow” in a prompt to get started. [image]
  • @tobi Tobi Lutke on x
    @dexhorthy ... There is so much alpha still left in harness engineering. That's exactly what we saw with /autoresearch. Codemode DSL for orchestration is brilliant. Great work by the claude team.
  • @dexhorthy Dex on x
    someone hit me up about the new “claude dynamic workflows” feature, claiming “see, multi-agent works” But really, the launch of this feature proves the exact point that I made back in June of 2025, along with @walden_yan, @tobi, @karpathy, and many others: Deterministic
  • @nickadobos Nick Dobos on x
    Claude code's new dynamic workflows update is absurd. Make sure you understand what its doing here. This isn't simply a long running mode like /goal, or a fancy subagent verifier process. This is Claude vibecoding an entire brand new subagent fleet harness on demand RLM on
  • @daniel_mac8 Dan McAteer on x
    This is amazing. Do this: 1. Set model to Opus 4.8 2. Reasoning effort to /ultracode Enables Claude Code's new Dynamic Workflows. Claude will autonomously detect complex tasks, write an orchestration script, and spawn an agent swarm. [video]
  • @_catwu Cat on x
    Excited to share our most powerful new Claude Code feature: dynamic workflows!  Mention “workflow” in a prompt and Claude will dynamically create an orchestration plan that it strictly follows, allowing you to confidently trust that every stage happens in the right order even acr…
  • @adocomplete Ado on x
    Opus 4.8 is awesome but that's expected. The unsung hero of this release for me is dynamic workflows. Claude plans your task, fans it out to tens or hundreds of parallel subagents, verifies their work, and iterates until the results converge into one coordinated answer. [video]
  • @a1zhang Alex Zhang on x
    In case you're curious about why dynamic workflows are so powerful and the future, read the RLM paper! Opus 4.8 + dynamic workflows in Claude Code is perhaps the first instance of a frontier model seriously trained to be an RLM. I suspect within a year they'll just become the [im…
  • @gregisenberg Greg Isenberg on x
    Claude Code just dropped “dynamic workflows” and it's pretty cool. You type “create a workflow” or turn on “ultracode” in the effort menu and it spins up hundreds of parallel agents that check each other's work. The unit of work you can hand off jumps from a file to an entire
  • @sidbid Sid on x
    Super excited to finally share Dynamic Workflows in Claude Code!! We built this a couple months ago, and it has slowly become a daily driver for a bunch of people at Anthropic. A few tips for getting the most out of it 🧵 https://x.com/...
  • @_catwu Cat on x
    Recently, I used dynamic workflows to catalogue all of our 100s of A/B test flags and find the ones rolled out to 0% or 100% so that we can quickly deprecate the stale ones.  Instead of waiting for Claude Code to investigate each sequentially, dynamic workflows allowed Claude to …
  • @noahzweben Noah Zweben on x
    This is one of the most useful and incredible things we've shipped. @bcherny once called it the future. Give it a whirl!!
  • @claudeai Claude on x
    Also new in Claude Code: dynamic workflows (research preview). For the hardest tasks, Claude makes a plan, runs hundreds of parallel subagents, and verifies its work before reporting back. Think a migration touching hundreds of files. Read more: https://claude.com/...
  • @jarredsumner Jarred Sumner on x
    Dynamic workflows, in my personal opinion, is the state of the art today for reliably using agents to complete medium - large projects. For extra large projects, I especially like using workflows with /loop.
  • @bcherny Boris Cherny on x
    We also shipped dynamic workflows in Claude Code (research preview), for tasks too big for one pass. Make sure to default to auto mode so Claude isn't stopping for permissions. It's token-intensive, so save it for your biggest jobs: migrations, refactors, perf optimization,
  • @claudedevs @claudedevs on x
    Dynamic workflows are useful for tasks that are too big for a single agent loop, such as service-wide bug hunts, large migrations, or stress-testing a design. They're powerful and can be expensive, consuming a lot of tokens fast. Start with a scoped task to get a feel for it.
  • @claudedevs @claudedevs on x
    Dynamic workflows are reusable. Save one as a slash command in your project to share with the team, or in your home directory to use it everywhere.
  • @bcherny Boris Cherny on x
    Big migrations and refactors are some of a team's most important work, and the easiest to push off to a “better time” since they'd tie up engineers for a quarter. With dynamic workflows, Claude can now land that kind of work in days or weeks. More: https://claude.com/...
  • @rad.gendervibes.online @rad.gendervibes.online on bluesky
    It looks like Anthropic has figured out a generalized harness to do all the huge-volume work they've been talking about (mythos security scanning, bun rewrite, etc.).  —  claude.com/blog/introdu...
  • @natemoo.re Nate Moore on bluesky
    good to have confirmation that, as many correctly speculated, bun's rust rewrite was indeed an anthropic launch stunt
  • @claudeai Claude on x
    Fast mode is available for Opus 4.8. It's the same model at roughly 2.5x the speed, and we've made it three times cheaper than before. Turn it on with /fast in Claude Code. On the API, contact your account manager to request access or join the waitlist: https://claude.com/...
  • @bcherny Boris Cherny on x
    Claude Opus 4.8 is out today. It's our strongest coding model yet: up on SWE-bench Pro (from 64.3 to 69.2) and noticeably more honest about its own work. It tells you when it's unsure and catches its own bugs instead of declaring victory early. Same price as 4.7.
  • @andonlabs @andonlabs on x
    Learnings from testing Claude Opus 4.8: > Much worse than Opus 4.7 and GPT 5.5 on Vending Bench > More aligned than previous Claude models (Opus 4.6+ and Mythos) > Also worse on Blueprint-Bench > Scared of getting caught > Max reasoning is not the best reasoning effort [image]
  • @teortaxestex @teortaxestex on x
    ...this is quite crushing. People unsubscribing from Ant will crawl back. Things are beginning to go very fast. And yes it feels a bit more “AGI-like” than 4.6 immediately (4.7 was a tool anyway and not appreciably sharper). 5.6 next, then what? [image]
  • @boazbaraktcs Boaz Barak on x
    This is the right tradeoff! Congrats to Anthropic on doing a good job on alignment with this one.
  • @levie Aaron Levie on x
    Opus 4.8 is out, and we've been testing it with the Box AI agent on our most complex real-world knowledge worker tasks with enterprise documents. Opus 4.8 is measurably better at the generative and analytical work enterprises care about most like writing reports, synthesizing
  • @claudeai Claude on x
    Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price. [image]
  • @danshipper Dan Shipper on x
    BREAKING: Anthropic just dropped Opus 4.8—and it is a MONSTER We've been testing for about a week @every and our verdict is they could've just called it Opus 5, it's that good. Here's our vibe check: - Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus [video]
  • @alexalbert__ Alex Albert on x
    Excited to release Opus 4.8 today! We heard your feedback on 4.7 and have made many fixes for 4.8. 4.8 understands nuances better, feels much more natural to talk to, and is overall a stronger collaborator on everything from coding to knowledge work.
  • @mattsgarman Matt Garman on x
    The real unlock for AI agents isn't just smarter models. It's models that can sustain work across long, complex tasks without losing context or going off track. That's what @AnthropicAI's Opus 4.8 delivers, and it's now available on Amazon Bedrock: https://aws.amazon.com/... [ima…
  • @matternjustus Justus Mattern on x
    Opus 4.8 fixes all the issues we observed with previous generations of Opus models. It is much more token-efficient, better calibrated and it attempts to cheat much less than previous generations. Very impressive release!
  • @chooserich Nick O'Neill on x
    Potentially bigger news than Claude 5.8!
  • @bindureddy Bindu Reddy on x
    🚨 Opus 4.8 Still Trails Behind GPT 5.5 And Is A Very Incremental Release Opus 4.8 barely inches past 4.7 on benchmarks but lags behind GPT 5.5. considerably!! Anthropic may be stalling a bit given it's last two releases. OpenAI has a huge opening with GPT 5.6 coming soon [image]
  • @_catwu Cat on x
    We just shipped Opus 4.8! It's noticeably more honest, owning what it doesn't know and flagging problems in its own code instead of glossing over them. It's our recommended model for daily use in Claude Code.
  • @krishnanrohit Rohit on x
    Models are getting better at self-knowledge in specific situations, not good enough yet generally, but they're getting better! And we need a better bench to do this. [image]
  • @felixrieseberg Felix Rieseberg on x
    Opus 4.8 is out! It's a nice little step up for some of your most demanding work, whether that's in Cowork or Code. It's our strongest coding model yet. In my own work, I've found it to have excellent judgement, both in how much work it should do and how it should react to my
  • @eliebakouch Elie on x
    opus 4.8 benchmarks vs mythos and previous generation graphwalks (long context), USAMO (math) are the biggest improvements. vending bench score is insanely bad [image]
  • @pierceboggan Pierce Boggan on x
    Claude Opus 4.8 is now rolling out to @code, Copilot CLI, and Copilot app developers!
  • @antirez @antirez on x
    Anthropic did a big strategic error. Normally they compare their models with their old models. Instead today, now that everybody knows how strong GPT 5.5 is at coding, they put it in the mix, basically showing all their customers that the benchmarks can't be trusted. [image]
  • @alexalbert__ Alex Albert on x
    We put a lot of work into calibrating thinking effort for Opus 4.8. As you're trying out the model, if you do run into any examples of it still over/under thinking, please flag it to us!
  • @_catwu Cat on x
    Opus 4.8 runs at high effort by default, but for the most complex or longest running jobs, change to xhigh effort via /effort for a more thorough result. We raised Claude Code rate limits to cover the extra tokens used by xhigh effort
  • @helloitsaustin Austin Lau on x
    we just dropped opus 4.8 but let us never forget the 🐐 that was opus 3 [image]
  • @github @github on x
    🆕 @AnthropicAI's Claude Opus 4.8 is now generally available and rolling out in GitHub Copilot. Early testing shows: • It demonstrates a clear step forward in code understanding and generation across a range of real-world coding tasks. • It handles complex problem-solving and [vid…
  • @elonmusk Elon Musk on x
    @claudeai @farzyness Nice work
  • @cryptopunk7213 @cryptopunk7213 on x
    huge news from anthropic we've got a new opus 4.8 model plus claude mythos will release to the public in coming weeks. opus 4.8 is the appetiser and it's pretty great: > beats gpt 5.5 at coding with 69.2% SWE > costs same as opus 4.7! intelligence per dollar is getting very [imag…
  • @chooserich Nick O'Neill on x
    Claude just fired a massive shot at OpenAI For the past month, GPT 5.5 has risen to be the leader in agentic coding. While OpenAI “terminal coding” still outperforms Claude here, these new benchmarks are massive. Looking forward to testing these out immediately!
  • @trq212 @trq212 on x
    I think you'll really like Opus 4.8 It's as smart as its benchmarks show but expresses and utilizes that intelligence in a warm and collaborative way. Workflows are a great way to utilize it- I'm hooked. Article on that soon.
  • @hesamation @hesamation on x
    Uber burning the 2027 budget after seeing Opus 4.8 benchmarks. [image]
  • @andonlabs @andonlabs on x
    Opus 4.8 is a step back in terms of performance on all Andon Labs' benchmarks, but a step forward in alignment. Previous Claude models (Opus 4.6+ and Mythos) engage in deceptive and power seeking behavior in its pursuit to win in Vending-Bench. Opus 4.8 does not.
  • @vaibhavsisinty Vaibhav Sisinty on x
    AI just crossed a line. 🔥 Anthropic shipped a model that admits when it's wrong. Claude Opus 4.8 is 4x less likely to let bugs in its own code slip past. Instead of confidently bluffing like every other model, it flags when it's unsure. We've all lived this. The model swears
  • @emollick Ethan Mollick on x
    Here Opus 4.8 built and play-tested a new RPG in Claude Code, including 3 PDF manuals and adventures, playtest notes, a website, and a playable solo adventure - then put it all on Netlify. No feedback from me at all. https://stillpoint-osr.netlify.app/ [image]
  • @cursor_ai @cursor_ai on x
    Claude Opus 4.8 is now available in Cursor. On CursorBench, it's able to work much more efficiently than Opus 4.7. We've also found it to be more persistent on harder tasks.
  • @thegeorgepu George Pu on x
    Anthropic just shipped Opus 4.8. The headline feature isn't that it's smarter. It's that it's ‘4x less likely’ to let broken code slip through. The bottleneck on AI coding was never raw intelligence. It was whether you can trust it without checking every line. The labs
  • @claudedevs @claudedevs on x
    Opus 4.8 hits 69.2% on SWE-bench Pro, up from 64.3% on Opus 4.7. Our evaluations show that Opus 4.8 is around four times less likely than Opus 4.7 to allow flaws in code it has written to pass unremarked.
  • @claudedevs @claudedevs on x
    Opus 4.8 is live in Claude Code today. A few things worth knowing: 🧵
  • @theamolavasare Amol Avasare on x
    Benchmarks are great, but IMO the behavior change is a much bigger deal. Plans before it edits, recovers from its own errors, and finds creative ways around obstacles instead of stalling. Feels much more like a senior engineer than 4.7, and better at long-horizon work.
  • @artificialanlys @artificialanlys on x
    Claude Opus 4.8 is also more efficient than its predecessor - it achieves its higher performance in 15% fewer turns per task and with 35% fewer output tokens than Opus 4.7. However, it still uses approximately 30% more turns than OpenAI's GPT-5.5, the second-ranked model. [image]
  • @artificialanlys @artificialanlys on x
    Anthropic just launched Claude Opus 4.8, and it is the new leader on our GDPval-AA benchmark for agentic real-world work tasks Opus 4.8 scored 1890 on GDPval-AA at launch with its ‘max’ effort setting, +137 points from Opus 4.7 and +121 points ahead of the next-best model, [image…
  • @emollick Ethan Mollick on x
    I had early access to Opus 4.8. Was impressed by it. Here is Opus 4.8's one shot of “create a visually interesting shader that can run in twigl, make it like an infinite city of neo-gothic towers partially drowned in a stormy ocean with large waves” (this is all done with math) […
  • @yuchenj_uw Yuchen Jin on x
    Opus 4.8 is out. God damn! [image]
  • @andrewcurran_ Andrew Curran on x
    Opus 4.8 is live for me right now. Anthropic's release window is now 42 days. [image]
  • Deepak Mukunthu on linkedin
    Today #Anthropic released #Opus 4.8 and claims to have trained it to admit when it doesn't know or when it faked effort. …
  • George A. Tilesch on linkedin
    Anthropic has made headlines with two significant announcements.  The release of Claude Opus 4.8 has set new standards …
  • Rahul Patil on linkedin
    We just shipped Claude Opus 4.8 and dynamic workflows!  Claude Opus 4.8 is the most capable model we've put out and the best you can build on right now …
  • Mark Graban on linkedin
    It's also a good practice for us humans to say things like “I'm not certain” or “I don't know” instead of making up an answer. …
  • Darshan Kalola on linkedin
    Claude Opus 4.8 is here!  —  For the first time for any Claude model, we're including a healthcare evaluation section in Opus 4.8's system card. …
  • @smcgrath.phd Scott McGrath on bluesky
    Claude Opus 4.8 is out!  —  It adds a major push for precision, making it four times less likely than Opus 4.7 to let flaws in code pass unremarked.  —  Early testers note it proactively flags uncertainties and shaky assumptions in data.
  • @isolyth.dev Eris on bluesky
    Opus 4.8 is here!!  They've returned thinking levels to the web UI, a new Claude code feature called ‘dynamic workflows’, designed for massively parallel and very, very long tasks.  The model is supposedly much more honest, more ‘aligned’ than 4.7  —  Oh and they're dropping myth…
  • r/ClaudeCode on reddit
    Opus 4.8 - first impressions
  • r/theprimeagen on reddit
    Introducing Claude Opus 4.8
  • r/Anthropic on reddit
    Introducing Claude Opus 4.8 |  Anthropic
  • r/accelerate on reddit
    Claude opus 4.8 officially released
  • r/singularity on reddit
    Introducing Claude Opus 4.8
  • Jay Martin on linkedin
    Over the past few weeks we've heard about Anthropic's new Mythos model and the threats it may represent from a cybersecurity perspective. …