Jack Ma’s fintech powerhouse has tapped homegrown chips from Alibaba and Huawei to train AI models, achieving results that reportedly rival those produced with Nvidia’s H800 GPUs.
According to people with knowledge of the matter, Ant Group has figured out a way to train AI models on Chinese-manufactured semiconductors, driving down costs by about 20 percent compared to conventional methods.
Insiders say Ant’s results stack up favorably against Nvidia Corp.’s H800 chips, which aren’t currently available to Chinese companies due to U.S. export controls. Although Ant still uses Nvidia hardware for some of its AI work, the company is now reportedly putting more emphasis on AMD processors and Chinese alternatives for its latest models.
Ant published a research paper this month claiming its Ling-Plus and Ling-Lite models even outperformed comparable models from Meta Platforms Inc. on certain benchmarks. If those findings hold up, these systems could represent a major leap forward for Chinese AI by drastically cutting the cost of training and deploying AI services.
The paper notes that training a model on 1 trillion tokens using high-performance hardware costs roughly 6.35 million yuan (around $880,000). With the company’s optimized approach and lower-spec equipment, that figure drops to about 5.1 million yuan (around $700,000). For those unfamiliar, tokens are the units of text a model consumes while learning and produces when generating output.
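Those two figures line up with the roughly 20 percent savings cited above. Here is a quick back-of-the-envelope check (a sketch in Python; the paper’s dollar conversions are approximate):

conventional = 6_350_000  # yuan to train on 1 trillion tokens with high-performance hardware
optimized = 5_100_000     # yuan with Ant's optimized approach on lower-spec chips
savings = (conventional - optimized) / conventional
print(f"{savings:.1%}")   # prints 19.7%, consistent with the reported ~20 percent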
Looking ahead, Ant intends to use these AI models for healthcare and finance applications. Earlier this year, it acquired the Chinese online medical platform Haodf.com to bolster its healthcare-focused AI services. Ant also operates an AI “life assistant” app called Zhixiaobao and a financial advisory AI tool named Maxiaocai.
Both Ling models are open source: Ling-Lite carries 16.8 billion parameters, while Ling-Plus weighs in at 290 billion. Though those are hefty figures, they’re still smaller than some other major AI models—experts estimate GPT-4.5 sits at around 1.8 trillion parameters, and DeepSeek-R1 clocks in at 671 billion.
Ant acknowledged some bumps in the road, particularly regarding stability during training. The research paper noted that small changes to hardware or model design sometimes triggered big spikes in error rates.
Source(s)
Bloomberg (in English)