VOICE ARCHIVE

Benj Edwards

@benjedwards
39 posts
2024-12-22
I think we should call o1-style AI models “simulated reasoning” or “SR models,” since they don't reason like humans, but they do simulate a type of artificial reasoning process that can produce useful results https://arstechnica.com/...
2024-12-22 View on X
TechCrunch

OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025

12 Days of OpenAI: Day 12 Naomi Li Gan / Tech in Asia : OpenAI unveils AI model for advanced reasoning Bojan Stojkovski / Interesting Engineering : OpenAI unveils o3 reasoning AI m...

2024-12-21
I think we should call o1-style AI models “simulated reasoning” or “SR models,” since they don't reason like humans, but they do simulate a type of artificial reasoning process that can produce useful results https://arstechnica.com/...
2024-12-21 View on X
TechCrunch

OpenAI unveils o3 and o3-mini, trained to “think” before responding via what OpenAI calls a “private chain of thought”, and plans to launch them in early 2025

OpenAI announced its new o3 models on Friday.  —  In a tweet ahead of its final livestream for its …

2024-10-15
I usually write about AI for Ars Technica, but BBS history was more important today. Special thanks to @textfiles for the interview and the images
2024-10-15 View on X
Ars Technica

Ward Christensen, co-inventor of the computer bulletin board system (BBS), a foundational technology of the internet age, died at age 78 on October 11

On Friday, Ward Christensen, co-inventor of the computer bulletin board system (BBS), died at age 78 in Rolling Meadows, Illinois. Threads: @blinkenjim . Mastodon: @CyberpunkLibrar...

2024-09-21
I've created a new fruit-based benchmark for LLMs: “How many Rs are *not* in the word strawberry?” 😁 See how o1-preview fares vs. GPT-4o in the screenshots below (cc:@goodside) [image]
2024-09-21 View on X
Understanding AI

OpenAI's o1 models are dramatically better at reasoning than previous LLMs, but they struggle with spatial reasoning and are far from human-level intelligence

Timothy B Lee / Understanding AI :

2024-09-13
OpenAI's awkward “o1” AI model branding is kinda strange. “Strawberry” was right there, already christened and used by people to describe it for months
2024-09-13 View on X
The Verge

OpenAI releases o1, the first of its rumored reasoning-focused Strawberry models, in preview, alongside a smaller o1-mini, for ChatGPT Plus and Team subscribers

Advancing cost-efficient reasoning.  —  Contributions Sabrina Ortiz / ZDNET : OpenAI trained its new o1 AI models to think before they speak - how to access them Ethan Mollick / On...

OpenAI's o1-preview does pretty well on my “magenta” test. But the first LLM that just answers “no” without any qualifications will probably be AGI.😁 Reading its internal reasoning can be pretty amusing [image]
2024-09-13 View on X
Simon Willison's Weblog

OpenAI's o1 models aren't as simple as the next step up from GPT-4o as they introduce major cost and performance trade-offs in exchange for improved “reasoning”

OpenAI released two major new preview models today: o1-preview and o1-mini (that mini one is also a preview …

OpenAI's awkward “o1” AI model branding is kinda strange. “Strawberry” was right there, already christened and used by people to describe it for months
2024-09-13 View on X
Simon Willison's Weblog

OpenAI's o1 models aren't as simple as the next step up from GPT-4o as they introduce major cost and performance trade-offs in exchange for improved “reasoning”

OpenAI released two major new preview models today: o1-preview and o1-mini (that mini one is also a preview …

OpenAI's o1-preview does pretty well on my “magenta” test. But the first LLM that just answers “no” without any qualifications will probably be AGI.😁 Reading its internal reasoning can be pretty amusing [image]
2024-09-13 View on X
The Verge

OpenAI releases o1, the first of its rumored reasoning-focused Strawberry models, in preview, alongside a smaller o1-mini, for ChatGPT Plus and Team subscribers

Advancing cost-efficient reasoning.  —  Contributions Sabrina Ortiz / ZDNET : OpenAI trained its new o1 AI models to think before they speak - how to access them Ethan Mollick / On...

2024-06-06
@simonw The word “Open” has sort of become like greenwashing in software. The appearance of doing good without necessarily doing it
2024-06-06 View on X
TechCrunch

Stability AI releases Stable Audio Open, a text-to-audio model that generates up to 47 seconds of samples and sound effects, prohibited for commercial use

Kyle Wiggers / TechCrunch :

2024-05-14
With the release of GPT-4o and its apparent “artificial emotional intelligence,” you might call it, this seems like a good day to resurface this 2016 tweet by @sama
2024-05-14 View on X
TechCrunch

OpenAI unveils GPT-4o, a new flagship generative AI model that is faster and natively multimodal, rolling out for free to all ChatGPT users in the coming weeks

There are two things from our announcement today I wanted to highlight. OpenAI : Hello GPT-4o  —  We're announcing GPT-4o, our new flagship model that can reason across audio, visi...

2024-03-28
For the first time since it appeared on the Chatbot Arena in May 2023, reigning champ GPT-4 (and family) has been surpassed in #1 ranking. Anthropic's Claude 3 Opus is now the top-ranked LLM on the leaderboard, GPT-4 Turbo is #2. https://arstechnica.com/...
2024-03-28 View on X
Ars Technica

Anthropic's Claude 3 Opus surpassed OpenAI's GPT-4 on Chatbot Arena, a crowdsourced LLM leaderboard used by AI researchers; GPT-4 has been first since launch

Anthropic's Claude 3 is first to unseat GPT-4 since launch of Chatbot Arena in May '23.  —  On Tuesday, Anthropic's Claude 3 …

2024-02-15
@Techmeme @aaronpholmes “The search engine would be partly powered by Bing.” [image]
2024-02-15 View on X
The Information

Source: OpenAI has been developing a web search product partly powered by Bing

OpenAI has been developing a web search product that would bring the Microsoft-backed startup into more direct competition with Google, according to someone with knowledge of OpenA...

2023-12-10
Nothing inspires confidence like a company that acknowledges that its software is basically acting up—and they have no idea why. btw Microsoft just reoriented its entire company around this technology
2023-12-10 View on X
@chatgptapp

OpenAI says it is aware of feedback about GPT-4 getting “lazier” and is “looking into fixing it”, and notes that “model behavior can be unpredictable”

we've heard all your feedback about GPT4 getting lazier! we haven't updated the model since Nov 11th, and this certainly isn't intentional. model behavior can be unpredictable, and...

2023-12-09
Nothing inspires confidence like a company that acknowledges that its software is basically acting up—and they have no idea why. btw Microsoft just reoriented its entire company around this technology
2023-12-09 View on X
@chatgptapp

OpenAI says it is aware of feedback about GPT-4 getting “lazier” and is “looking into fixing it”, and explains that model behavior can be unpredictable

we've heard all your feedback about GPT4 getting lazier! we haven't updated the model since Nov 11th, and this certainly isn't intentional. model behavior can be unpredictable, and...

2023-07-17
As educators panic about students using ChatGPT to write papers, many reach for AI detectors that perform slightly better than random chance. Many painful false accusations are the result. I wrote about why people should not rely on this imperfect tech: https://arstechnica.com/...
2023-07-17 View on X
Ars Technica

Experts explain why there's no magic formula to always distinguish human-written and AI-written text, meaning AI writing detectors can only make a strong guess

Can AI writing detectors be trusted?  We dig into the theory behind them.  —  If you feed America's most important legal document …

2023-04-09
Right now, AI chatbots are the ultimate bullsh*t machines, easily inventing histories, citations, and biographical details that don't exist. Why is that, and is there anything researchers can do to fix it? I asked several experts about it for Ars: https://arstechnica.com/... https://twitter.com/...
2023-04-09 View on X
Ars Technica

Experts say that when ChatGPT confabulates, the bot is reaching for information not in its training data and filling in the blanks with plausible-sounding words

A look inside the hallucinating artificial minds of the famous text prediction bots.  —  Over the past few months …

2023-04-08
Right now, AI chatbots are the ultimate bullsh*t machines, easily inventing histories, citations, and biographical details that don't exist. Why is that, and is there anything researchers can do to fix it? I asked several experts about it for Ars: https://arstechnica.com/... https://twitter.com/...
2023-04-08 View on X
Ars Technica

Experts say when ChatGPT confabulates, it is reaching for information that is absent from its training data and filling in blanks with plausible-sounding words

A look inside the hallucinating artificial minds of the famous text prediction bots.  —  Over the past few months …

2022-10-15
It's remarkable that Meta gives Carmack a large platform to openly critique its products. I'm not sure I've ever seen any other situation like it in business (And if they're wise, they will take his advice) https://twitter.com/...
2022-10-15 View on X
Kotaku

Meta says the segment at Connect 2022 announcing the addition of legs to Horizon World avatars in 2023 “featured animations created from motion capture”

Parmy Olson of Bloomberg was not impressed with Meta's announcements at Connect 2022: James Troughton / TheGamer : Meta's Virtual Leg Update Was Actually Motion Capture Jonathan Va...

2022-10-14
It's remarkable that Meta gives Carmack a large platform to openly critique its products. I'm not sure I've ever seen any other situation like it in business (And if they're wise, they will take his advice) https://twitter.com/...
2022-10-14 View on X
Kotaku

Meta says the segment at Connect 2022 announcing the addition of legs to Horizon World avatars in 2023 “featured animations created from motion capture”

A subsequent statement from Meta says ‘the segment featured animations created from motion capture’

It's remarkable that Meta gives Carmack a large platform to openly critique its products. I'm not sure I've ever seen any other situation like it in business (And if they're wise, they will take his advice) https://twitter.com/...
2022-10-14 View on X
Ars Technica

Meta adviser and ex-Oculus CTO John Carmack expresses heavy skepticism at pushing for avatar fidelity, focusing on the Quest Pro to drive VR adoption, and more