Tencent boosts AI training efficiency without Nvidia’s most advanced chips

  • Tencent says it has focused on speeding up network communications to tap idle GPU capacity, yielding a 20 per cent improvement in LLM training efficiency

Tencent has been finding ways to improve the training efficiency of its artificial intelligence models without upgrading to the most advanced chips. Photo: Shutterstock
Iris Deng in Shenzhen
Tencent Holdings has upgraded its high-performance computing (HPC) network, improving its artificial intelligence (AI) capabilities, as Chinese tech giants seek ways to boost large language model (LLM) training with existing systems and equipment amid a domestic push for technological self-reliance.

The 2.0 version of Tencent’s Intelligent High-Performance Network, known as Xingmai in Chinese, will improve the efficiency of network communications and LLM training by 60 per cent and 20 per cent, respectively, the company’s cloud unit said on Monday.

The performance enhancement comes as China looks for ways to advance its AI ambitions with restricted access to advanced chips from Nvidia owing to strict US export rules. China’s most valuable tech giant achieved the performance gains by optimising existing facilities rather than trying to compete head-to-head with US rivals such as OpenAI in terms of spending and cutting-edge semiconductors.

An HPC network connects clusters of powerful graphics processing units (GPUs) to process data and solve problems at extremely high speeds.

Under its previous HPC networking technology, computing clusters spent too much time communicating with one another, leaving a significant portion of GPU capacity idle, according to Tencent. The company therefore upgraded the network to speed up communications while reducing costs, it said.
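As a rough illustration of the arithmetic involved, the sketch below uses an Amdahl's-law-style model to estimate how a faster network can lift end-to-end training throughput. The 30 per cent communication share is an assumed figure for illustration only, not a number disclosed by Tencent.

```python
# Illustrative sketch (assumed figures, not Tencent's): how reducing the time
# GPUs spend waiting on inter-cluster communication lifts training throughput.

def training_speedup(comm_fraction: float, comm_efficiency_gain: float) -> float:
    """Amdahl's-law-style estimate of end-to-end training speedup.

    comm_fraction: assumed share of a training step spent on communication
        before the upgrade.
    comm_efficiency_gain: fractional reduction in communication time,
        e.g. 0.6 for a 60 per cent improvement.
    """
    compute_fraction = 1.0 - comm_fraction
    new_comm_time = comm_fraction * (1.0 - comm_efficiency_gain)
    return 1.0 / (compute_fraction + new_comm_time)


# Example: if ~30 per cent of each step were communication (an assumption),
# a 60 per cent communication improvement would give roughly a 22 per cent
# overall speedup -- in the same ballpark as the 20 per cent figure Tencent cites.
if __name__ == "__main__":
    speedup = training_speedup(comm_fraction=0.3, comm_efficiency_gain=0.6)
    print(f"Estimated training speedup: {speedup:.2f}x")  # ~1.22x
```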

The Xingmai network can support a single computing cluster with more than 100,000 GPUs, according to the company, doubling the scale from the initial version of the network released in 2023. The improved performance shortens the time needed for identifying problems to just minutes, down from days previously, Tencent said.

Tencent has recently made a big push to strengthen its technologies in the rapidly growing AI field. The Shenzhen-based firm has been promoting its in-house LLMs for enterprise use and also offers services that help other companies build their own models.
