2024-12-26
Time
1 related
A look at the more challenging AI evaluations emerging in response to the rapid progress of models, including FrontierMath, Humanity's Last Exam, and RE-Bench
more interesting than it sounds! LinkedIn: Ross Dawson: The frontier of “evals”. Evaluations comparing AI and human capabilities are evolving rapidly as AI rapidly leaves existing benchmarks in the ...
2024-12-18
Financial Times
1 related
Q&A with Microsoft Chief Product Officer of Responsible AI Sarah Bird on generative AI, its impact on work, Copilots, OpenAI, AGI, AI agents, bias, and more
Chief product officer of ‘responsible AI’ says the focus needs to be on augmenting — not replicating — human capabilities