_simonsmith · TEXXR

We've seen how much speed affects people's model preferences recently (e.g. the arena @swyx is running), so I think Codex Spark will be well-received. Also interesting that this initial release is a step towards combining long-horizon and real-time agents, including delegating to [image]

2026-02-14 View on X

ZDNET

OpenAI debuts a research preview of GPT-5.3-Codex-Spark, a smaller version of GPT-5.3-Codex that it claims generates code 15 times faster, for ChatGPT Pro users

We've seen how much speed affects people's model preferences recently (e.g. the arena @swyx is running), so I think Codex Spark will be well-received. Also interesting that this initial release is a step towards combining long-horizon and real-time agents, including delegating to [image]

2026-02-13 View on X

ZDNET

OpenAI debuts a research preview of GPT-5.3-Codex-Spark, a smaller version of GPT-5.3-Codex that it claims generates code 15 times faster, for ChatGPT Pro users

ZDNET's key takeaways: OpenAI targets “conversational” coding, not slow batch-style agents. Big latency wins: 80% faster roundtrip, 50% faster time-to-first-token.

Microsoft is ubiquitous in businesses, yet Suleyman believes in 1-1.5 years AI will automate most of its customers' knowledge work. What's the pivot for Microsoft if companies don't need as many seat licenses, because they have fewer people, and AI agents don't need office apps,

2026-02-12 View on X

Financial Times

Mustafa Suleyman says Microsoft is pursuing “true self-sufficiency” in AI by building models for enterprise and health care and reducing its reliance on OpenAI

The GLM-5 benchmark chart doesn't compare the model to Opus 4.6 or GPT-5.3, but is still impressive and, on the heels of Kimi K2.5, suggests China is very close to the frontier in many (but not all) domains. And with video, looking at Seedance 2, China might be ahead.

2026-02-12 View on X

Reuters

Z.ai says it will raise prices by at least 30% for new GLM coding plan subscribers to accommodate surging demand for its AI coding tools

The GLM-5 benchmark chart doesn't compare the model to Opus 4.6 or GPT-5.3, but is still impressive and, on the heels of Kimi K2.5, suggests China is very close to the frontier in many (but not all) domains. And with video, looking at Seedance 2, China might be ahead.

2026-02-12 View on X

Z.ai

Z.ai launches GLM-5, saying its flagship open-weight model has “best-in-class performance among all open-source models” in reasoning, coding, and agentic tasks

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways …

We've seen how much speed affects people's model preferences recently (e.g. the arena @swyx is running), so I think Codex Spark will be well-received. Also interesting that this initial release is a step towards combining long-horizon and real-time agents, including delegating to [image]

2026-02-12 View on X

ZDNET

OpenAI launches a research preview of GPT-5.3-Codex-Spark, a smaller version of GPT-5.3-Codex that it claims generates code 15 times faster, for Pro users

ZDNET's key takeaways: OpenAI targets “conversational” coding, not slow batch-style agents. Big latency wins: 80% faster roundtrip, 50% faster time-to-first-token.

Substantial deep research update in ChatGPT:
- Can specify apps and sites
- Makes a plan, which you can modify
- Puts output in a nice report viewer
The report viewer looks slick, similar to the spreadsheet viewer and PowerPoint viewer OpenAI previously released, and leading me

2026-02-11 View on X

The Decoder

OpenAI updates ChatGPT's deep research tool with GPT-5.2, a full-screen report view, an option to focus research on specific websites, and search interruption

The feature now runs on the new GPT-5.2 model, as OpenAI announced on X. A key addition is that users can connect apps to ChatGPT and—potentially very useful—search specific websit...

Substantial deep research update in ChatGPT:
- Can specify apps and sites
- Makes a plan, which you can modify
- Puts output in a nice report viewer
The report viewer looks slick, similar to the spreadsheet viewer and PowerPoint viewer OpenAI previously released, and leading me

2026-02-11 View on X

Matt Shumer

GPT-5.3-Codex and Claude Opus 4.6 can meaningfully contribute to the improvement of AI models, a sign of what's coming for most knowledge work within five years

Think back to February 2020. If you were paying close attention, you might have noticed a few people talking about a virus spreading overseas.

Substantial deep research update in ChatGPT:
- Can specify apps and sites
- Makes a plan, which you can modify
- Puts output in a nice report viewer
The report viewer looks slick, similar to the spreadsheet viewer and PowerPoint viewer OpenAI previously released, and leading me

2026-02-11 View on X

Sources

Q&A with Fidji Simo on ChatGPT ads, OpenAI's efforts to ship a new model soon to end Sam Altman's Code Red, Anthropic's Super Bowl ads, Sora, Codex, and more

How ads in ChatGPT will work, what will end the Code Red, those Anthropic attack ads, working with Sam Altman, and much more...

OpenAI isn't saying it, but Frontier to me is basically like a platform for digital employees. Or, like OpenClaw for organizations, where you create autonomous digital agents, give them skills and tools, and oversee their work. We're truly going through an evolution into an era [image]

2026-02-06 View on X

The Verge

OpenAI launches Frontier, an AI agent management platform that provides shared context, onboarding, and permission boundaries, for “a limited set of customers”

Again, I don't want ads in my chatbot either, but buying Super Bowl ads to criticize ads in chatbots feels... off. It's an implicit moral statement about the value of time spent in chatbots versus time spent watching sports.

2026-02-05 View on X

@sama

Sam Altman says Anthropic's Super Bowl ads are funny but “dishonest”, and Anthropic serves a “product to rich people” while OpenAI is “committed to free access”

First, the good part of the Anthropic ads: they are funny, and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for a...

Again, I don't want ads in my chatbot either, but buying Super Bowl ads to criticize ads in chatbots feels... off. It's an implicit moral statement about the value of time spent in chatbots versus time spent watching sports.

2026-02-05 View on X

Wall Street Journal

Anthropic plans a 30-second ad during the Super Bowl that parodies the prospect of intrusive ads in AI conversations, and will also air a 60-second pregame ad

Ads are coming to AI. Taylor Herzlich / New York Post: Super Bowl commercial sees Anthropic mock OpenAI for bringing ads to ChatGPT. James Peckham / PCMag: Anthropic Says No Ads on Claude...

So, so close to massive automation. Put these specialists into an agent swarm coordinated by an orchestrator, and...

2026-01-31 View on X

TechCrunch

Anthropic expands its agentic plugins, which let enterprise users automate department-specific workflows, from Claude Code to its new general-use tool Cowork

Earlier this month, Anthropic launched Cowork, a new agentic tool designed to take the benefits of its AI coding assistant Claude Code …

So, so close to massive automation. Put these specialists into an agent swarm coordinated by an orchestrator, and...

2026-01-30 View on X

TechCrunch

Anthropic expands its agentic plugins, which let enterprise users automate department-specific workflows, from Claude Code to its new general-use tool Cowork

Earlier this month, Anthropic launched Cowork, a new agentic tool designed to take the benefits of its AI coding assistant Claude Code …

Very cool new tool from OpenAI for writing scientific papers. Two thoughts came to mind. First, while not a new model, this can help accelerate science. Second, this could be a precedent for other OpenAI document editing tools, such as for docs and sheets and presentations.

2026-01-28 View on X

MIT Technology Review

OpenAI launches Prism, a free cloud-based LaTeX editor that embeds GPT-5.2 to assist in scientific paper drafting and citation management

OpenAI just revealed what its new in-house team, OpenAI for Science, has been up to. The firm has released a free LLM-powered tool for scientists called Prism …

I tried Kimi K2.5 in Agent Swarm mode today and can say that the benchmarks don't lie. This is a great model and I don't understand how they've made something as powerful and user-friendly as Agent Swarm ahead of the big US labs.

2026-01-27 View on X

Kimi

Moonshot says Kimi K2.5 builds on K2 with “pretraining over ~15T mixed visual and text tokens” and “can self-direct an agent swarm with up to 100 sub-agents”

Today, we are introducing Kimi K2.5, the most powerful open-source model to date.

New Qwen model significantly improves at Humanity's Last Exam with tool use and now appears to be SOTA here by a large margin? Main innovation driving this seems to be a non-parallel approach to test time scaling where the model avoids redundant thinking. On the surface the [image]

2026-01-27 View on X

Qwen

Qwen releases Qwen3-Max-Thinking, its flagship reasoning model that it says demonstrates performance comparable to models such as GPT-5.2 Thinking and Opus 4.5

We present Qwen3-Max-Thinking, our latest flagship reasoning model.

Very cool new tool from OpenAI for writing scientific papers. Two thoughts came to mind. First, while not a new model, this can help accelerate science. Second, this could be a precedent for other OpenAI document editing tools, such as for docs and sheets and presentations.

2026-01-27 View on X

MIT Technology Review

OpenAI launches Prism, a free cloud-based LaTeX editor that embeds GPT-5.2 to assist in scientific paper drafting and citation management

Accelerating science writing and collaboration with AI.

New Qwen model significantly improves at Humanity's Last Exam with tool use and now appears to be SOTA here by a large margin? Main innovation driving this seems to be a non-parallel approach to test time scaling where the model avoids redundant thinking. On the surface the [image]

2026-01-26 View on X

Qwen

Qwen releases Qwen3-Max-Thinking, its flagship reasoning model that it says demonstrates performance comparable to models such as GPT-5.2 Thinking and Opus 4.5

Charging per ad view is interesting for a few reasons, one of which being that Altman and Ive said they're trying to get people off screens with OpenAI's hardware. So not sure a CPM model will work with those future form factors.

2026-01-22 View on X

The Information

Sources: OpenAI has begun offering its chatbot ads to dozens of advertisers, initially charging based on ad view, not ad click, asking for <$1M in commitments
