Chinese GPU maker Moore Threads has unveiled its next-gen "Huagang" (or "Flowerpot") architecture at its MUSA Developer Conference. The architecture is set to launch in 2026 and will span both gaming and AI products. The presentation leaned heavily on performance claims but offered few concrete technical specifications.
An upcoming gaming GPU called "Lushan" will be built on the Huagang architecture and will succeed the current MTT S80 and S90 models. The company claims a whopping 15x performance improvement in AAA game rendering and a massive 50x boost in ray tracing performance. The GPU will also reportedly feature a 2nd-gen hardware ray tracing engine and full DirectX 12 Ultimate support for better game compatibility. None of these claims has been independently verified yet, so they are best taken with a grain of salt.
Memory-wise, the GPU is expected to offer up to 64 GB of memory, four times the 16 GB of GDDR6 on current models. The company also claims a 64x improvement in AI compute performance, 16x in geometry processing, 4x in texture fill performance, and 8x in atomic memory access. The GPU will also reportedly feature a new "UniTE" unified rendering architecture with a dedicated AI hardware block. It remains to be seen whether these figures hold up in shipping hardware, though.
Alongside Lushan, the company also teased its Huashan AI GPU, which reportedly features a dual-chiplet design with nine HBM modules. The company claims performance will be comparable to Nvidia's Hopper and Blackwell GPUs, and that memory bandwidth will exceed that of Nvidia's B200. The AI GPU will also support FP4 through FP64 compute along with proprietary formats (MTFP4, MTFP6, MTFP8), and can scale to clusters of over 100,000 GPUs via the MTLink 4.0 interconnect at 1314 GB/s. The company claims a 50 percent increase in compute density and a 10x improvement in efficiency over current models.
While there were no gaming demos for the new GPUs, the company did show a performance demo of DeepSeek V3 running on the MTT S5000 (another GPU releasing next year but not part of the Huashan lineup). The GPU apparently achieved 1000 tokens/second in decode and 4000 tokens/second in prefill, which, according to the company, puts it slightly ahead of Nvidia's Hopper lineup. The upcoming GPUs showcase China's push for GPU self-reliance amid export restrictions, and more details are expected in the coming months as the products approach launch.
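To put the quoted decode and prefill figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It assumes the reported 4000 tokens/second (prefill) and 1000 tokens/second (decode) apply to a single request, which the presentation did not specify, and the prompt and response lengths are hypothetical values chosen only for illustration.

```python
def estimate_latency_seconds(prompt_tokens, output_tokens,
                             prefill_tps=4000.0, decode_tps=1000.0):
    """Rough time to serve one request from throughput figures alone.

    Defaults are the figures quoted for the MTT S5000 demo; treating them
    as per-request rates is an assumption, not something the source states.
    """
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("throughput must be positive")
    prefill_time = prompt_tokens / prefill_tps   # time to ingest the prompt
    decode_time = output_tokens / decode_tps     # time to generate the reply
    return prefill_time + decode_time


# Hypothetical request: 2000-token prompt, 500-token response.
print(estimate_latency_seconds(2000, 500))  # 0.5 s prefill + 0.5 s decode = 1.0
```

Real-world latency would also depend on batching, scheduling, and interconnect overhead, none of which this estimate models.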
Source(s)
Fast Technology (in Chinese)






