Peter Wildeford

Coverage Timeline

2025-12-22

@metr_evals 4 related

METR: Claude Opus 4.5 has a 50% task completion time horizon of about 4 hours and 49 minutes, more than double that of Claude Opus 4 released earlier this year

just careful, meticulous rigor. Nikola Jurkovic / @nikolaj2030 : This result updates me towards 4 month doubling times being my median estimate for the next two years. That means by EOY 2026 the time ...

2025-12-22 View

2025-03-26

TechCrunch 8 related

The Arc Prize Foundation says its new ARC-AGI-2 test stumps most AI models; humans get 60% of the questions right but GPT-4.5 and Claude 3.7 Sonnet score ~1%

[image] François Chollet / @fchollet : Unlike ARC-AGI-1, this new version is not easily brute-forced. Current top AI approaches score 0-4%. All base LLMs (GPT-4.5, Claude 3.7 Sonnet, Gemini 2, etc.)...

2025-03-26 View

2025-02-27

TechCrunch 1 related

Anthropic's Claude 3.7 Sonnet reportedly cost “a few tens of millions of dollars” to train, similar to Claude 3.5 and cheaper than GPT-4, which cost over $100M

“Assuming Claude 3.7 Sonnet indeed cost just ‘a few tens of millions of dollars’ to train, not factoring in related expenses, it's a sign of how relatively cheap it's becoming to release state-of-the-...

2025-02-27 View

2025-02-25

TechCrunch 23 related

Anthropic releases Claude 3.7 Sonnet, a hybrid model that can produce fast responses or extended, step-by-step thinking, and Claude Code, an agentic coding tool

and it could be a game changer Ghacks : Anthropic Unveils Claude 3.7: First Hybrid Reasoning AI Model Rowan Cheung / The Rundown AI : Claude enters the reasoning era Siddharth Jindal / Analytics India...

2025-02-25 View

Loading articles...

Patterns

Related Entities

Top Voices

Explore Further

Coverage Timeline

METR: Claude Opus 4.5 has a 50% task completion time horizon of about 4 hours and 49 minutes, more than double that of Claude Opus 4 released earlier this year

The Arc Prize Foundation says its new ARC-AGI-2 test stumps most AI models; humans get 60% of the questions right but GPT-4.5 and Claude 3.7 Sonnet score ~1%

Anthropic's Claude 3.7 Sonnet reportedly cost “a few tens of millions of dollars” to train, similar to Claude 3.5 and cheaper than GPT-4, which cost over $100M

Anthropic releases Claude 3.7 Sonnet, a hybrid model that can produce fast responses or extended, step-by-step thinking, and Claude Code, an agentic coding tool

Quarterly Coverage

Top Sources

Narrative

Relationships