Huawei Technologies has unveiled its most ambitious AI system to date, the CloudMatrix 384, at the World Artificial Intelligence Conference (WAIC) in Shanghai. Shown publicly for the first time, the system is designed to accelerate large-scale model training and positions the company as a domestic alternative to Nvidia’s high-end GB200 NVL72 platform.
The system’s core comprises 384 Ascend 910C accelerators linked by a proprietary “super-node” interconnect. By clustering more chips, the design compensates for lower per-device throughput, achieving aggregate performance that, according to SemiAnalysis, can surpass Nvidia’s GB200 on some benchmarks. Huawei did not disclose exact performance figures at WAIC, but analysts note the company is prioritizing interconnect bandwidth and latency over individual processor performance.
The launch occurs as US export restrictions block Nvidia’s fastest GPUs from China, creating an opening for Huawei. As noted by Nvidia’s CEO in May, Huawei is "moving quite fast." The company can now supply domestic hardware to cloud providers and research institutes, utilizing an in-house approach that bypasses licensing constraints that limit many local chip designers.
Founder Ren Zhengfei acknowledges that Ascend chips trail US rivals in raw power, but claims that mathematical optimization and cluster computing can close performance gaps for real workloads. The company dedicates approximately ¥180 billion (≈ US$25 billion) annually to R&D, with a third allocated to long-term theoretical research, which Ren considers essential for reducing reliance on Moore's Law.
Whether the CloudMatrix 384 translates into wide commercial adoption will depend on price, software maturity and Beijing’s evolving cloud-procurement policies. Nonetheless, its appearance underlines how quickly China’s AI-hardware ecosystem is pivoting toward home-grown solutions, and how competition is shifting from individual chips to full-stack, system-level innovation.
Source: Reuters