VOICE ARCHIVE

Marcy Murninghan

@marcymurninghan
2 posts
2024-12-22
How 'bout ethical discernment, too?  Moral agency matters!  —  Quoting @techmeme.com:  —  Anthropic research isn't meant to just show that these guardrails can be bypassd, but hopes that “generatng extensive data on successful attack patterns” will open up “novel opps to develop bettr defense mechanisms.” …
404 Media

Researchers at Anthropic, Oxford, Stanford, and MATS create Best-of-N Jailbreaking, a black-box algorithm that jailbreaks frontier AI systems across modalities

ABSTRACT We introduce Best-of-N (BoN) Jailbreaking … Markus Kasanmascheff / WinBuzzer : y0U hA5ε tU wR1tε l1Ke tHl5 to Break GPT-4o, Gemini Pro and Claude 3.5 Sonnet AI Safety Meas...

2024-12-21
How 'bout ethical discernment, too?  Moral agency matters!  —  Quoting @techmeme.com:  —  Anthropic research isn't meant to just show that these guardrails can be bypassd, but hopes that “generatng extensive data on successful attack patterns” will open up “novel opps to develop bettr defense mechanisms.” …
404 Media

Researchers at Anthropic, Oxford, Stanford, and MATS create Best-of-N Jailbreaking, a black-box algorithm that jailbreaks frontier AI systems across modalities

New research from Anthropic, one of the leading AI companies and the developer of the Claude family of Large Language Models …