/
Navigation
Chronicles
Browse all articles
Explore
Semantic exploration
Research
Entity momentum
Nexus
Correlations & relationships
Story Arc
Topic evolution
Drift Map
Semantic trajectory animation
Posts
Analysis & commentary
Pulse API
Tech news intelligence API
Browse
Entities
Companies, people, products, technologies
Domains
Browse by publication source
Handles
Browse by social media handle
Detection
Concept Search
Semantic similarity search
High Impact Stories
Top coverage by position
Sentiment Analysis
Positive/negative coverage
Anomaly Detection
Unusual coverage patterns
Analysis
Rivalry Report
Compare two entities head-to-head
Semantic Pivots
Narrative discontinuities
Crisis Response
Event recovery patterns
Connected
Search: /
Command: ⌘K
Embeddings: large
TEXXR

Chronicles

The story behind the story

days · browse · Enter similar · o open

OpenAI and Paradigm announce EVMbench, a benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities

Making smart contracts safer by evaluating AI agents' ability to detect, patch, and exploit vulnerabilities in blockchain environments.

OpenAI

Discussion

  • @scaling01 @scaling01 on x
    EVMbench measures the ability of agents to detect, patch, and exploit smart contract vulnerabilities. Opus 4.6 getting mogged by GPT-5.2 and GPT-5.3. Although its detection accuracy is technically higher, it's precision is much lower. (Opus is going shizo) [image]
  • @0xalpo Alpin Yukseloglu on x
    new collab from @paradigm and @OpenAI: evmbench is a benchmark and agent harness for exploiting smart contract bugs a few months ago, the best models found <20% of critical, fund-draining @Code4rena bugs in our benchmark. today they find > 70% [video]
  • @openai @openai on x
    Introducing EVMbench—a new benchmark that measures how well AI agents can detect, exploit, and patch high-severity smart contract vulnerabilities. https://openai.com/...