Ant Group’s use of China-made GPUs, not Nvidia, cuts AI model training costs by 20%

The fintech affiliate of Alibaba said its Ling-Plus-Base model can be ‘effectively trained on lower-performance devices’

Ant Group’s latest research into AI development places it among domestic firms such as DeepSeek and ByteDance in the search for more efficient ways to train and run models. Photo: Shutterstock
Ann Cao in Shanghai
Ant Group, the fintech affiliate of Alibaba Group Holding, is able to train large language models (LLMs) using locally produced graphics processing units (GPUs), reducing reliance on Nvidia’s advanced chips and cutting training costs by 20 per cent, according to a research paper and media reports.

Ant’s Ling team, responsible for LLM development, revealed that its Ling-Plus-Base model, a Mixture-of-Experts (MoE) model with 300 billion parameters, can be “effectively trained on lower-performance devices”. The finding was published in a recent paper on arXiv, an open-access platform for professionals in the scientific community.

By avoiding high-performance GPUs, the model reduces computing costs by a fifth in the pre-training process, while still achieving performance comparable to other models such as Qwen2.5-72B-Instruct and DeepSeek-V2.5-1210-Chat, according to the paper.

The development positions the Hangzhou-based fintech giant alongside domestic peers like DeepSeek and ByteDance in reducing reliance on advanced Nvidia chips, which are subject to strict US export controls.

“These results demonstrate the feasibility of training state-of-the-art large-scale MoE models on less powerful hardware, enabling a more flexible and cost-effective approach to foundational model development with respect to computing resource selection,” the team wrote in the paper.

MoE is a machine learning technique in which a model is split into multiple specialised sub-networks, or “experts”, with a gating network routing each input to only the most relevant ones. The technique has been widely adopted by leading artificial intelligence (AI) models – Grok, DeepSeek and Alibaba’s Qwen included – to scale LLMs beyond a trillion parameters while keeping per-token computing costs roughly constant. Alibaba owns the South China Morning Post.
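To illustrate the routing idea behind MoE, here is a minimal sketch (not Ant’s actual code; all sizes, weights and function names are hypothetical). A gating network scores each expert per input, and only the top-k experts run, which is why per-token compute stays roughly constant even as total parameters grow:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
D, H, N_EXPERTS, TOP_K = 8, 16, 4, 2

# Each "expert" is a small two-layer feed-forward network.
experts = [
    (rng.standard_normal((D, H)) * 0.1, rng.standard_normal((H, D)) * 0.1)
    for _ in range(N_EXPERTS)
]
gate_w = rng.standard_normal((D, N_EXPERTS)) * 0.1  # gating network weights


def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector through its top-k experts only."""
    scores = x @ gate_w                    # one gating score per expert
    top = np.argsort(scores)[-TOP_K:]      # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen experts
    out = np.zeros_like(x)
    for w, idx in zip(weights, top):
        w1, w2 = experts[idx]
        out += w * (np.maximum(x @ w1, 0) @ w2)  # ReLU feed-forward expert
    return out


token = rng.standard_normal(D)
y = moe_forward(token)
print(y.shape)
```

Only 2 of the 4 experts execute per token here, so adding more experts grows total capacity without growing the per-token cost, which is the property that makes MoE attractive for trillion-parameter models.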