Q&A with mathematicians behind the “First Proof” experiment, which tests AI's mathematical competency on questions drawn from the authors' unpublished research
Large language models struggle to solve research-level math questions. It takes a human to measure just how poorly they perform.