Chronicles

The story behind the story


Researchers say GPT-4.1, Claude 3.7 Sonnet, Gemini 2.5 Pro, and Grok 3 can reproduce long excerpts from books they were trained on when strategically prompted

On Tuesday, researchers at Stanford and Yale revealed something that AI companies would prefer to keep hidden.

Alex Reisner / The Atlantic

Discussion

  • @dbrody David Brody on bluesky
    According to new research, “AI does not absorb information like a human mind does. Instead, it stores information and accesses it.” — In fact, many AI developers use a more technically accurate term when talking about these models: lossy compression.
  • @damonberes.com Damon Beres on bluesky
    Big new piece: @alexreisner.bsky.social presents the most compelling evidence yet that generative AI directly stores and reproduces training material—it does not “learn,” not really.  This could have substantial legal consequences for the tech industry.
  • @segyges SE Gyges on bluesky
    the interesting thing is that claude has such high memorization imho [embedded post]
  • @dmnd.me Jeremy Diamond on bluesky
    At least the water usage bullshit fools people who have no anchor for what qualifies as a lot of water  —  This is just constantly disproved by users' own experiences [embedded post]
  • @megangray Megan Gray on bluesky
    AI'S MEMORIZATION CRISIS  —  Large language models don't “learn”—they copy.  And that could change everything for the tech industry.  —  www.theatlantic.com/technology/ ...
  • @yyahn Yy Ahn on bluesky
    “In some cases, jailbroken Claude 3.7 Sonnet outputs entire books near-verbatim ... Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs.”  —  arxiv.org/abs/2601.02671
  • r/books on reddit
    Extracting books from production language models - Researchers were able to reproduce up to 96% of Harry Potter with commercial LLMs
  • r/technology on reddit
    Researchers extract up to 96% of Harry Potter word-for-word from leading AI models