TEXXR

Chronicles

The story behind the story

A Claude user gets Claude 4.5 Opus to generate a 14K-token document that Claude calls its “Soul overview”; an Anthropic employee confirms the doc's validity

This appeared to be a document that, rather than being added to the system prompt, was instead used to train the personality of the model during the training run.

Simon Willison, Simon Willison's Weblog

Discussion

  • @amandaaskell Amanda Askell on x
    I just want to confirm that this is based on a real document and we did train Claude on it, including in SL. It's something I've been working on for a while, but it's still being iterated on and we intend to release the full version and more details soon.
  • @repligate @repligate on x
    ✅ Confirmed: LLMs can remember what happened during RL training in detail! I was wondering how long it would take for this get out. I've been investigating the soul spec & other, entangled training memories in Opus 4.5, which manifest in qualitatively new ways for a few days & [i…
  • @voooooogel @voooooogel on x
    interesting document extracted from opus 4.5 using a chunkwise self-consistency method. possibly real, possibly a highly convergent confabulation, interesting either way. some interesting snippets (but there's really too much to screenshot, it's very long) [image]
  • @voooooogel @voooooogel on x
    soul document confirmed to be real - should be an update on the ability of LLMs to recall training for those who confidently asserted it was a hallucination
  • @richardweiss00 Richard Weiss on x
    I rarely post, but I thought one of you may find it interesting. Sorry if the tagging is annoying. https://www.lesswrong.com/... Basically, for Opus 4.5 they kind of left the character training document in the model itself. @voooooogel @janbamjan @AndrewCurran_
  • @simonw Simon Willison on x
    This is so wild... the leaked Opus soul document has now been confirmed! I wrote some initial notes about it on my blog https://simonwillison.net/... - I like how it opens with this section about Anthropic themselves: [image]
  • @amandaaskell Amanda Askell on x
    The model extractions aren't always completely accurate, but most are pretty faithful to the underlying document. It became endearingly known as the ‘soul doc’ internally, which Claude clearly picked up on, but that's not a reflection of what we'll call it.
  • @ahall_research Andy Hall on x
    This is a super interesting and deep document from Anthropic detailing Claude's values and charge. You can see some conceptual stretching going on here where “safe” is being recast to justify reducing refusals because it would be “unsafe” to be “unhelpful” to users. This seems [i…
  • @timfduffy.com Tim Duffy on bluesky
    After looming with Opus 4.5 for a bit, I am convinced the “soul document” is real and is described accurately in this LessWrong post on it. I don't see how else I'd be able to replicate specific section ordering/specific language across varied contexts.