Agentic inference is set to be different from today's inference, and it will change compute infrastructure, because speed won't matter when humans aren't involved.
Great insights from @benthompson on how inference, especially agentic inference, will reshape chip demand: “If latency isn't the top priority, then slower and cheaper memory — like traditional DRAM, for example — makes a lot more sense. And if the entire system is mostly
The Inference Shift: Agentic inference is going to be different from the inference we use today, and it will change compute infrastructure because speed won't matter when humans aren't involved. https://stratechery.com/...