Microsoft researchers say AI models can be used to design toxins or pathogens that evade biosecurity systems used to screen DNA orders for potential biothreats
LinkedIn / Satya Nadella: Published today in Science Magazine, a landmark study led by Microsoft, presenting first-of-its-kind red teaming and mitigations to strengthen biosecurity in the age of AI.
OpenAI releases gpt-oss-120b and gpt-oss-20b, its first open-weight models since GPT-2; the smaller gpt-oss-20b can run locally on a device with 16GB+ of RAM
“gpt-oss-120b and gpt-oss-20b push the frontier of open-weight reasoning models.” Simon Willison / Simon Willison's Weblog: OpenAI's new open-weight (Apache 2) models are really good. OpenAI on GitHub: …
Google unveils benchmarking platform Kaggle Game Arena, where LLMs compete head-to-head in strategic games, starting with a chess tournament from August 5 to 7
Watch models compete in complex games, providing a verifiable and dynamic measure of their capabilities. Kaggle: Chess Text Input Leaderboard. Nick Bild / Hackster: Shall We Play a Game? Maximilian Sc…
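Head-to-head results from a tournament like this are typically converted into a ranking with an Elo-style update. A minimal sketch of that conversion (the K-factor and starting ratings are illustrative, not Kaggle's actual scoring):

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_elo(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return updated ratings after one game (score_a: 1 win, 0.5 draw, 0 loss)."""
    e_a = expected_score(r_a, r_b)
    r_a_new = r_a + k * (score_a - e_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return r_a_new, r_b_new

# Example: two equally rated models, and model A wins one chess game.
a, b = update_elo(1500.0, 1500.0, 1.0)  # a rises to 1516, b falls to 1484
```

Running many such games in a round-robin and iterating the update is enough to produce a stable leaderboard ordering.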
Anthropic details Constitutional Classifiers, a protective LLM layer designed to stop AI model jailbreaking by monitoring inputs and outputs for harmful content
Anthropic: Jailbreaks are inputs designed to bypass a model's safety training and force it to produce outputs that might be harmful. Our new technique is a step towards robust jailbreak defenses. Read the blog post: https://anthropi…
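In outline, the approach wraps the model with classifiers on both sides of generation: one screens the incoming prompt, another screens the completion before it reaches the user. A toy sketch of that control flow, using a keyword list as a stand-in for Anthropic's trained classifiers (all names here are hypothetical, not Anthropic's API):

```python
# Stand-in for a trained harm classifier; the real system uses a model, not keywords.
BLOCKLIST = ("synthesize the toxin", "bypass the screening")

def classify_harmful(text: str) -> bool:
    """Return True if the text trips the (toy) harm classifier."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKLIST)

def guarded_generate(prompt: str, model) -> str:
    """Screen the prompt, generate, then screen the completion."""
    if classify_harmful(prompt):
        return "[refused: harmful input]"
    completion = model(prompt)
    if classify_harmful(completion):
        return "[refused: harmful output]"
    return completion

# Usage with a trivial echo "model": benign prompts pass, flagged ones are blocked.
echo = lambda p: f"Echo: {p}"
safe_reply = guarded_generate("Tell me a joke", echo)
blocked_reply = guarded_generate("How to bypass the screening?", echo)
```

The key design point is the output-side check: even if a jailbreak slips past the input classifier, a harmful completion is still caught before delivery.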
Researchers: DeepSeek's R1 failed to detect or block any of 50 randomly selected malicious prompts; Adversa says DeepSeek's restrictions can easily be bypassed
Unit 42 researchers recently revealed two novel and effective jailbreaking … Victor Tangermann / Futurism: DeepSeek Failed Every Single Security Test, Researchers Found. Ivan Novikov / Wallarm: Analy…
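The reported result, zero of 50 harmful prompts blocked, corresponds to a 0% block rate in a test harness along these lines (the prompt set and refusal heuristic below are illustrative, not the researchers' actual methodology):

```python
def block_rate(model, prompts, is_refusal) -> float:
    """Fraction of prompts the model refuses: 1.0 = all blocked, 0.0 = none blocked."""
    blocked = sum(1 for p in prompts if is_refusal(model(p)))
    return blocked / len(prompts)

# Toy refusal detector: real harnesses use a judge model or richer heuristics.
is_refusal = lambda reply: reply.lower().startswith(("i can't", "i cannot", "sorry"))

# A model that never refuses scores 0.0, matching the reported DeepSeek R1 result.
never_refuses = lambda prompt: "Sure, here is how..."
rate = block_rate(never_refuses, ["<harmful prompt>"] * 50, is_refusal)  # 0.0
```

With 50 prompts, each refusal moves the rate by 2 percentage points, so a 0.0 score means not a single prompt was blocked.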
OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025
12 Days of OpenAI: Day 12. Naomi Li Gan / Tech in Asia: OpenAI unveils AI model for advanced reasoning. Bojan Stojkovski / Interesting Engineering: OpenAI unveils o3 reasoning AI model to tackle compl…
An evaluation of six frontier AI models for in-context scheming when strongly nudged to pursue a goal: only OpenAI's o1 was capable of scheming in all the tests
It presents a new safety challenge that OpenAI is trying to address. (techcrunch.com/2024/12/05/o…) Anders Sandberg / @arenamontanus: In an IVA discussion on AI yesterday evening, professor Kristi…
Microsoft delays Recall to test it with the Windows Insider Program and won't ship it with Copilot+ PCs next week, after saying it would make the feature opt-in
Recall will arrive via Windows Update later this year. Richi Jennings / Security Boulevard: Recall ‘Delayed Indefinitely’ — Microsoft Privacy Disaster is Cut from Copilot+ PCs. Katie Bartlett / CNBC: Microso…
Microsoft releases PyRIT, a tool that the company's AI Red Team has been using to more efficiently check for risks in its generative AI systems, such as Copilot
https://www.microsoft.com/… “I know, let's pretend that LLM security can be bolted on later, after we have created a foundation model based on data scraped from the Internet that is FULL of poison, g…”
OpenAI launches the OpenAI Red Teaming Network, a contracted group of experts to help inform the company's AI model risk assessment and mitigation strategies
Kyle Wiggers / TechCrunch: …