Artificial Analysis (Company)

The Decoder

Nvidia launches Nemotron 3 Ultra, a 550B-parameter MoE open model; Artificial Analysis says it is the smartest open US model, but trails Chinese model Kimi K2.6

It has roughly 550 billion total parameters, with about 55 billion active at any given time.

2026-06-02

@artificialanlys

Artificial Analysis announces AA-Omniscience, a benchmark for knowledge and hallucination across 40+ topics; Claude 4.1 Opus takes first place in its key metric

@artificialanlys: X: @artificialanlys, @emollick, @scaling01, @teortaxestex, @artificialanlys, @zephyr_z9, @artificialanlys, @artificialanlys, @mweinbach, @artificialanlys, and @artificial...

2025-11-18

Simon Willison's Weblog

A new Artificial Analysis benchmark, focusing on OpenAI's gpt-oss-120b, shows how open-weight LLMs exhibit inconsistent performance across hosting providers

Artificial Analysis published a new benchmark the other day, this time focusing on how an individual model, OpenAI's gpt-oss-120b, performs across different hosted providers.

2025-08-17

TechCrunch

Recraft, whose image model Recraft V3 beat OpenAI's DALL-E and Midjourney on the Artificial Analysis benchmark last year, raised a $30M Series B led by Accel

they empower creativity, enable brand storytelling, and give designers precision and control. …

2025-05-06

TechCrunch

AI reasoning models cost more to benchmark, making it harder to independently verify claims; Artificial Analysis says evaluating OpenAI's o1 costs $2,767.05

AI labs like OpenAI claim that their so-called “reasoning” AI models, which can “think” through problems step by step …

2025-04-11
