TEXXR

Chronicles

The story behind the story

A look at the more challenging AI evaluations emerging in response to the rapid progress of models, including FrontierMath, Humanity's Last Exam, and RE-Bench

More interesting than it sounds! LinkedIn: Ross Dawson: The frontier of “evals”. Evaluations comparing AI and human capabilities are evolving rapidly as AI rapidly leaves existing benchmarks in the dust. …

Time 2024-12-26