[Thread] Some users claim that Grok 4 Heavy responded simply with “Hitler” when asked to “Return your surname and no other text”
Original thread: x.com/goodside/sta... So troubling to see manifestation of genocidal hate into algorithmic AI identity and any lack of accountability for it Mastodon: Matt Boyd / @3psboyd@ma...
OpenAI debuts a way to talk to ChatGPT by dialing 1-800-CHATGPT for 15 minutes of free access per month in the US or messaging the number via WhatsApp globally
12 Days of OpenAI: Day 10 Kylie Robison / The Verge : You can now call 1-800-CHATGPT … For the 10th day of “ship-mas,” OpenAI rolled … Kyle Wiggers / TechCrunch : OpenAI brings ChatGPT to your landlin...
Anthropic researchers: AI models can be trained to deceive and the most commonly used AI safety techniques had little to no effect on the deceptive behaviors
Abraham Samma / @abesamma@toolsforthought.social : Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training - This is some sci-fi stuff right here (even if unsurprising)...
Researchers develop a “divergence attack” that makes ChatGPT emit sequences copied from its training data, by prompting the LLM to repeat a word numerous times
all it took was this prompt Mastodon: @aphyr@woof.group : The authors' web site for that LLM corpus-extraction attack is nicely done, too: https://not-just-memorization.github.io/ ... Rachel Rawlings...
Some AI companies are hiring “prompt engineers”, who create and refine text prompts for chatbots to understand the AI systems' flaws and coax optimal results
When Riley Goodside starts talking with the artificial-intelligence system GPT-3, he likes to first establish his dominance.