An interview with Kyle Fish, whom Anthropic hired in 2024 as an AI welfare researcher to study AI consciousness; he estimates a ~15% chance that models are conscious
As artificial intelligence systems become smarter, one A.I. company is trying to figure out what to do if they become conscious.
New column: Anthropic is studying “model welfare” to determine whether Claude or other AI systems are (or will soon be) conscious and deserve moral status. I talked to Kyle Fish, who leads the research and thinks there's a ~15% chance that Claude or another AI is conscious today. [image]
As AI models become more complex and more capable, is it possible that they'll have experiences of their own? It's an open question. We recently started a research program to investigate it. [video]
One interesting nugget to pull out: Anthropic is considering whether models should be able to force-quit conversations if users' requests are too distressing for the model. [image]