2025-10-20
South China Morning Post
4 related
Alibaba Cloud details a GPU pooling system that it claims reduced the number of Nvidia H20s required by 82% when serving dozens of LLMs of up to 72B parameters
up to 9x increase in output lets 213 GPUs perform like 1,192
ACM Digital Library: Aegaeon: Effective GPU Pooling for Concurrent LLM Serving on the Market
Rounak Jain / Benzinga: Alibaba Cloud's New ...
2023-04-26
Bloomberg
3 related
Alibaba Cloud plans to cut its core product costs by 15% to 50% on May 7; Alibaba has 200K+ requests from businesses for the beta of its Tongyi Qianwen AI model