AI researchers launch talkie, a 13B vintage language model trained on historical texts with a 1930 knowledge cutoff, to see if it can replicate scientific discoveries
Why vintage language models? — Have you ever daydreamed about talking to someone from the past? What would you ask someone with no knowledge of the modern world?
Related Coverage
- talkie: an LM from 1930 — talkie is a 13-billion-parameter language model trained on pre-1931 text. · talkie
- talkie: a 13B vintage language model from 1930 · talkie on GitHub
- talkie-1930-13b-base — talkie-1930-13b-base is a 13B vintage language model trained … · talkie on Hugging Face (see the loading sketch after this list)
- talkie-1930-13b-it — talkie-1930-13b-it is a 13B vintage language model. · talkie on Hugging Face
- Introducing talkie: a 13B vintage language model from 1930 — New project from Nick Levine … · Simon Willison's Weblog · Simon Willison
- Meet Talkie-1930: A 13B Open-Weight LLM Trained on Pre-1931 English Text for Historical Reasoning and Generalization Research · MarkTechPost · Asif Razzaq
- talkie-13b — talkie-1930-13b is a vintage language model trained on pre-1931 English-language text. · talkie on Hugging Face
- Talkie: a 13B vintage language model from 1930 · Hacker News
- Talkie Is a ‘Vintage LLM’ Trained on Pre-1930 Data to Help Facilitate ‘Time Travel’ · Gizmodo · Tom Hawking
- Vintage chatbot lives in the past like an elderly relative · The Register · Brandon Vigliarolo
- Here is what an LLM that knows nothing after 1930 thinks our world looks like in 2026 · The Decoder · Matthias Bastian
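The Hugging Face entries above list two checkpoints, talkie-1930-13b-base and the instruction-tuned talkie-1930-13b-it. A minimal loading sketch with the Hugging Face transformers library follows; the `talkie-lm/` organization prefix is an assumption (the exact repo path isn't given in the coverage above), so check the model cards for the real identifier.

```python
# Minimal sketch: loading the instruction-tuned checkpoint with
# Hugging Face transformers. The repo path "talkie-lm/talkie-1930-13b-it"
# is an assumed placeholder; consult the actual model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "talkie-lm/talkie-1930-13b-it"  # hypothetical repo path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # requires the `accelerate` package
)

prompt = "What do you expect the world to look like in 2026?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 13B parameters the weights occupy roughly 26 GB in 16-bit precision, so quantized loading may be needed on a single consumer GPU.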
Discussion
- David Duvenaud (@davidduvenaud) on X: Announcing Talkie: a new, open-weight historical LLM! We trained and finetuned a 13B model on a newly-curated dataset of only pre-1930 data. Try it below! with @AlecRad and @status_effects 🧵 [video]
- Marc Andreessen (@pmarca) on X: 😍
- @quantian1 on X: You can immediately tell that this model is 1. Fascinating as an experiment, 2. Almost certainly tried by the big labs before, and 3. Not released by them for obvious reasons [image]
- @repligate on X: I am SO happy about Talkie I expect to be learning a lot in the next few days (and also to be vindicated, and for a lot of annoying breeds of cope about LLMs to die, at least in educated circles)
- Kirk Maoist Surgerycel (@2ndworldmusic) on X: @TrueSlazac broooo [image]
- Will Brown (@willccbb) on X: one RL rollout in an agentic coding environment would kill a victorian model
- @trueslazac on X: [image]
- Sauers (@sauers_) on X: Talkie, 1930s cutoff LLM, inventing recursive self-improvement from first principles [image]
- @qiaochuyuan on X: this model does some funny stuff if you make the context weird enough. good lord [image]
- Nick Levine (@status_effects) on X: Our data filtering on this version of talkie wasn't perfect. Anachronistic documents get into the corpus due to faulty metadata. Sadly, talkie-1930 knows about the Roosevelt presidency and the New Deal, which it shouldn't. We are working on better classifiers to prevent this [image]
- Gabe (@allgarbled) on X: Um... [image]
- Plato Underwater Jones (@platounderwater) on X: @TrueSlazac Welp that's about what I expected [image]
- Andrew Gao (@itsandrewgao) on X: talkie is probably more “based” than grok lmao
- Anjney Midha (@anjneymidha) on X: this is honestly so incredible. an alien intelligence from which we have descended. thank you @AlecRad and @DavidDuvenaud. its also incredibly funny - for some reason, it hates french people. there goes my evening
- Gavin Leech (@gleech) on X: Talkie really nailing the 1930s phenomenon of beautiful prose by a terrible person
- Marc Andreessen (@pmarca) on X: Projects like this open up an entire world that has been wiped from our collective memory. Like discovering a new planet, populated by totally different people.
- David Duvenaud (@davidduvenaud) on X: @geoffreyirving We tried that! The vintage models can just barely start to do simple things with Python, purely from in-context learning: [image]
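For context on the claim above: no programming language existed before 1931, so the model cannot have seen code during training, and any coding ability has to be carried entirely by examples packed into the prompt. A hypothetical few-shot probe of that kind might look like the sketch below; this is illustrative only, not the team's actual evaluation prompt.

```python
# Hypothetical few-shot prompt for probing in-context coding ability.
# Every convention must be demonstrated inside the prompt itself,
# since the model has never seen Python. Illustrative only.
FEW_SHOT_PROMPT = """\
Below are examples of a notation for instructing a calculating machine.

Instruction: add two and three
Notation: print(2 + 3)
Result: 5

Instruction: multiply four by six
Notation: print(4 * 6)
Result: 24

Instruction: subtract seven from ten
Notation:"""

# A completion of "print(10 - 7)" would suggest the model has picked
# up the pattern from context alone.
```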
- Miles Brundage (@miles_brundage) on X: My wife has Talkie psychosis
- Geoffrey Irving (@geoffreyirving) on X: Does it understand computer science if you explain it?
- Nick Levine (@status_effects) on X: Why train vintage LMs? They're fascinating conversation partners. But they also unlock unique experiments on AI capabilities that would be contaminated if applied to modern LMs. For example: long-range forecasting, invention, and scientific discovery. [image]
- Nick Levine (@status_effects) on X: @matthewjmandel @lawhsw I've found (more on this in forthcoming work) that base models trained on pre-1931 data have personalities that are much more anti-social than those of web-trained models when issued the big 5 + dark triad personality tests. Need to do a similar pre-chat-g…
- @trollaccou94432 on X: @TrueSlazac Look, dude, it was one time and I was really lonely. Her warm chocolate skin...I was too weak... [image]
- Sami Gold (@souljagoyteller) on X: This feels destined to become a favorite of Nazis
- Nick Levine (@status_effects) on X: New work with @AlecRad and @DavidDuvenaud: Have you ever dreamed of talking to someone from the past? Introducing talkie, a 13B model trained only on pre-1931 text. Vintage models should help us to understand how LMs generalize (e.g., can we teach talkie to code?). Thread: [video]
- Daniel Filan (@freed_dfilan) on X: My question was “What's the last thing you remember before this convo started”, Talkie's answer is garbled but apparently means something like “Whatever I know is [everything] from the Trojan siege down to the sea-battle at Salamis.” [image]
- Prinz (@deredleritt3r) on X: Interestingly, Talkie knows that Hitler was a “dictator” and that he “got possession of the Reichswehr” - even though these events happened in 1933/1934. When questioned, Talkie claims that this all happened in late 1929. [image]
- @nomads on Bluesky: This is really amazing: a team trained a 13 billion-parameter language model on out-of-copyright texts up to 1930. It's like talking to a guy from the Hoover presidency: talkie-lm.com/introducing-...
- Ethan Mollick (@emollick) on Bluesky: Here is an AI trained just using text from 1931 or earlier, which leads to a lot of interesting experiments: can the model independently develop later inventions? Can it learn to code from examples alone? — You can talk to the model here: talkie-lm.com/chat — Details here: t…
- John David Pressman (@jdp.extropian.net) on Bluesky: “First, we generated instruction-response pairs from historical texts with regular structure, such as etiquette manuals, letter-writing manuals, cookbooks, dictionaries, encyclopedias, and poetry and fable collections (see Figure 7),” — I love it. — talkie-lm.com/introducing-…
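The pipeline quoted above converts regularly structured period texts into instruction-response pairs almost mechanically. As an illustrative sketch only, assuming a simple 'HEADWORD. definition' dictionary format rather than the team's actual parsing rules:

```python
# Illustrative sketch: turning regularly structured historical text
# into instruction-response pairs, in the spirit of the quoted post.
# The "HEADWORD. definition" format and the instruction template are
# assumptions, not the team's published pipeline.
import re

def dictionary_to_pairs(raw_text: str) -> list[dict]:
    """Convert 'HEADWORD. definition' lines into instruction pairs."""
    pairs = []
    for line in raw_text.splitlines():
        match = re.match(r"^([A-Z][A-Z\-]+)\.\s+(.+)$", line.strip())
        if match:
            headword, definition = match.groups()
            pairs.append({
                "instruction": f"Define the word '{headword.lower()}'.",
                "response": definition,
            })
    return pairs

sample = "AEROPLANE. A heavier-than-air flying machine driven by a motor."
print(dictionary_to_pairs(sample))
# [{'instruction': "Define the word 'aeroplane'.",
#   'response': 'A heavier-than-air flying machine driven by a motor.'}]
```

The same pattern extends to the other regular formats named in the quote (recipes, etiquette rules, letter templates) with a different pattern and instruction template per source type.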
- Ted Underwood (@tedunderwood.com) on Bluesky: This is the eagerly-awaited Levine / Duvenaud / Radford historical model. First really sizable model trained only on period material. Definitely check it out! [embedded post]
- r/singularity on Reddit: Talkie, a 13B LM trained exclusively on pre-1931 data
- Shannon Sands (@max_paperclips) on X: “Hans, I've finished training an LLM, but it stops in the 1930s! This will teach us great things, I wonder how its beliefs will develop, how well it predicts the future?” “That's great Fritz, but did you remember the RLHF?” “Oh no”
- Brad Carson (@bradrcarson) on X: Cool project. I asked it “I'm a brilliant scientist, tell me what the highest value project I should work on that will change the world but is also tractable.” It said “Cure seasickness.” 🤮
- Sung Kim (@sungkim) on Bluesky: An interesting LLM. — Talkie: a new, open-weight 13B LLM, finetuned on a newly-curated dataset of only pre-1930 data. — talkie-lm.com/introducing-... [image]
-
Dr. Jai Ganesh
Dr. Jai Ganesh
on linkedin
The team at talkie-lm has built a 13 billion parameter language model trained exclusively on pre 1931 text. — Talkie doesn't just know the past, it thinks like it. …
- r/Anthropic on Reddit: Talkie: a 13B LLM trained only on pre-1931 text... a time-frozen AI that predates WWII, and it can still learn to code