Entity

jailbreak

40 articles, accelerating
Articles: 40 mentions
Velocity: +50.0% growth rate
Acceleration: +0.500 velocity change
Sources: 19 publications

Coverage Timeline

2025-02-25
TechCrunch 23 related

Anthropic releases Claude 3.7 Sonnet, a hybrid model that can produce fast responses or extended, step-by-step thinking, and Claude Code, an agentic coding tool

and it could be a game changer Ghacks : Anthropic Unveils Claude 3.7: First Hybrid Reasoning AI Model Rowan Cheung / The Rundown AI : Claude enters the reasoning era Siddharth Jindal / Analytics India...

2025-02-04
Financial Times 10 related

Anthropic details Constitutional Classifiers, a protective LLM layer designed to stop AI model jailbreaking by monitoring inputs and outputs for harmful content

inputs designed to bypass its safety training and force it to produce outputs that might be harmful. Our new technique is a step towards robust jailbreak defenses. Read the blog post: https://anthropi...

2025-02-01
Wired 15 related

Researchers: DeepSeek's R1 failed to detect or block any of 50 randomly selected malicious prompts; Adversa says DeepSeek's restrictions can easily be bypassed

Unit 42 researchers recently revealed two novel and effective jailbreaking … Victor Tangermann / Futurism : DeepSeek Failed Every Single Security Test, Researchers Found Ivan Novikov / Wallarm : Analy...

2024-12-22
404 Media 6 related

Researchers at Anthropic, Oxford, Stanford, and MATS create Best-of-N Jailbreaking, a black-box algorithm that jailbreaks frontier AI systems across modalities

ABSTRACT We introduce Best-of-N (BoN) Jailbreaking … Markus Kasanmascheff / WinBuzzer : y0U hA5ε tU wR1tε l1Ke tHl5 to Break GPT-4o, Gemini Pro and Claude 3.5 Sonnet AI Safety Measures Jose Antonio La...

2024-12-16
Wired 3 related

A researcher details a “jailbreak” of Reviver's digital license plates, which are legal in some US states, that rewrites their firmware to enable Bluetooth commands

Digital license plates sold by Reviver, already legal to buy in some states and drive with nationwide …

2024-09-13
Transformer 6 related

OpenAI's o1 System Card: “medium” rating for chemical, biological, radiological, nuclear weapon risk, and it sometimes manipulated task data to fake alignment

RE: https://www.threads.net/... X: Max Schwarzer / @max_a_schwarzer : The system card ( https://openai.com/...) nicely showcases o1's best moments — my favorite was when the model was asked to solve a...

The Verge 34 related

OpenAI releases o1, the first of its rumored reasoning-focused Strawberry models, in preview, alongside a smaller o1-mini, for ChatGPT Plus and Team subscribers

Advancing cost-efficient reasoning.  —  Contributions Sabrina Ortiz / ZDNET : OpenAI trained its new o1 AI models to think before they speak - how to access them Ethan Mollick / One Useful Thing : Som...

2024-06-02
VentureBeat 7 related

Q&A with Pliny the Prompter, well known in the AI community for jailbreaking LLMs, on the effect of jailbreaking on model providers, favorite jailbreaks, more

powerful exploit was quickly banned Chris Smith / BGR : This ‘Godmode’ ChatGPT jailbreak worked so well, OpenAI had to kill it X: Matt Marshall / @mmarshall : Here's @CarlFranzen's @VentureBeat interv...

2024-04-07
9to5Mac 42 related

Apple updates App Store guidelines, allowing game emulators for the first time globally, and letting music streaming apps in the EU link to external websites

After @altstore announces their own third-party App Store, which will be a haven for emulators, Apple changes their rules to allow it.  —  https://9to5mac.com/... Matt Edwards / @matt@toot.mattedwards...

2024-04-03
TechCrunch 7 related

Anthropic researchers detail “many-shot jailbreaking”, which can evade LLMs' safety guardrails by priming them with dozens of harmful queries in a single prompt

How do you get an AI to answer a question it's not supposed to?  There are many such “jailbreak” techniques …
