How Alibaba builds its most efficient AI model to date

A technical innovation has allowed Alibaba Group Holding, one of the leading players in China’s artificial intelligence boom, to develop a new generation of foundation models that match the strong performance of larger predecessors while being significantly smaller and more cost-efficient.

Alibaba Cloud, the AI and cloud computing division of Alibaba, unveiled on Friday a new generation of large language models that it said heralded “the future of efficient LLMs”. The new models are nearly 13 times smaller than the company’s largest AI model, released just a week earlier.

Despite its compact size, Qwen3-Next-80B-A3B is among Alibaba’s best models to date, according to developers. The key lies in its efficiency: the model is said to perform 10 times faster in some tasks than the preceding Qwen3-32B released in April, while achieving a 90 per cent reduction in training costs.

Emad Mostaque, co-founder of the UK-based start-up Stability AI, said on X that Alibaba’s new model outperformed “pretty much any model from last year” despite an estimated training cost of less than US$500,000.

For comparison, training Google’s Gemini Ultra, released in February 2024, cost an estimated US$191 million, according to Stanford University’s AI Index.

Alibaba says its new generation of AI foundation models heralds “the future of efficient LLMs”. Photo: Handout

Artificial Analysis, a leading AI benchmarking firm, said Qwen3-Next-80B-A3B surpassed the latest versions of both DeepSeek R1 and Alibaba-backed start-up Moonshot AI’s Kimi-K2. Alibaba owns the South China Morning Post.

Several AI researchers attributed the success of Alibaba’s new model to a relatively new technique called “hybrid attention”.

Existing models face diminishing returns on efficiency as input lengths increase because of the way AI models determine which inputs are the most relevant. This “attention” mechanism involves trade-offs: better attention accuracy leads to higher computational expenses.

Those costs compound when models handle long context inputs, making it expensive to train sophisticated AI agents that autonomously execute tasks for users.
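To illustrate why long inputs are costly, the sketch below counts the query-key comparisons made by standard full self-attention, where every token is scored against every other token. It is a simplified illustration for a single attention head in a single layer, not a description of Alibaba's hybrid attention design, and the function names are hypothetical.

```python
def attention_score_pairs(num_tokens: int) -> int:
    """Count query-key score pairs in standard full self-attention.

    Assumption: one attention head in one layer, with no sparsity or
    caching, so each of the n tokens is compared with all n tokens.
    """
    return num_tokens * num_tokens


def relative_cost(short_len: int, long_len: int) -> float:
    """Ratio of score pairs needed for a longer input versus a shorter one."""
    return attention_score_pairs(long_len) / attention_score_pairs(short_len)


if __name__ == "__main__":
    # Illustrative input lengths only; these are not figures from Alibaba.
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} tokens -> {attention_score_pairs(n):,} score pairs")
    print("10x longer input costs", relative_cost(1_000, 10_000), "times more")
```

Under this simplification, making the input 10 times longer multiplies the number of comparisons by 100, which is the kind of compounding cost that more efficient attention schemes aim to reduce.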




