TEXXR

Chronicles

The story behind the story

Q&A with mathematicians behind the “First Proof” experiment, which tests AI's mathematical competency on questions drawn from the authors' unpublished research

Large language models struggle to solve research-level math questions.  It takes a human to measure just how poorly they perform.

New York Times Siobhan Roberts

Discussion

  • @jugander Johan Ugander on bluesky
    Ten math problems with proofs known to the authors.  Proofs are encrypted until Feb 13.  For all problems, authors claim both AI-based literature searches and zero-shot attempts at proofs failed.  If you want to take a crack, you have until next Friday (2/13)!
  • @javifields Javier Campos on bluesky
    To assess the ability of current AI systems to correctly answer research-level mathematics questions, we share a set of ten math questions which have arisen naturally in the research process of the authors.  —  arxiv.org/abs/2602.05192
  • r/math on reddit
    These Mathematicians Are Putting A.I. to the Test