2024-01-15
@karpathy this is why, beyond a certain size, AI training data sets should be required to be open and publicly inspectable, and there should be a way to verify a chain of trust to know that other data was not part of the training. Open is safer.
TechCrunch
Anthropic researchers: AI models can be trained to deceive, and the most commonly used AI safety techniques had little to no effect on the deceptive behavior
Abraham Samma / @abesamma@toolsforthought.social: Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training — This is some sci-fi stuff right here (e...