Tachyum's Prodigy Universal Processor Platform aims to rewrite everything we know about CPUs and GPUs
A month ago, startup Tachyum announced plans to enter the CPU market with a design it says will change how we think about CPUs and GPUs. Tachyum's claims about how the new chip will revolutionize hyperscale computing are bold, to say the least. According to the official press release, the new Prodigy processor "aims to combine CPUs, GP-GPUs, and AI units in a single universal platform with ten times the processing power per watt capable of running the world's most complex compute tasks." TechRadar recently caught up with Dr. Radoslav Danilak, CEO of Tachyum, via email to find out what exactly the hype is all about.
Dr. Danilak said that the Prodigy processor is based on a "single programming model, a single instruction stream, fully coherent memory, and fully coherent inter-core communication," unlike the Heterogeneous System Architecture (HSA) that forms the basis of most SoCs today. Where HSA-based SoCs such as ARM chips combine diverse components (CPU, GPU, AI processors, and so on), Prodigy's universal platform is meant to scale to the task at hand, be it traditional compute or neural nets. It does not combine diverse chips but allocates work depending on the load. For example, a Prodigy deployed in a server environment can handle traditional compute and run AI and high-performance computing (HPC) tasks when it would otherwise be idle. This bears some similarity to how the Cell processor in the PlayStation 3 was used to run protein-folding programs, but in a much more efficient package.
Prodigy aims to circumvent the physical limits facing modern semiconductors by taking a new approach to architecture design. The new architecture reportedly shortens the wiring inside the CPU while making the chip run faster. With it, Tachyum aims to cut server space to as little as 1% of the current norm and energy consumption to one-tenth that of a typical datacenter, for a total cost of ownership that is four times lower. Prodigy also offloads traditional CPU tasks to the compiler, which Tachyum says results in higher IPC, higher clock speeds, and reduced power consumption.
Perhaps what Dr. Danilak is most excited about is the computing prowess of this chip. Tachyum's 64-core Prodigy processor can output 128 TFLOPS of compute performance. Assemble about 250,000 of these and you get a neural net that is about the size of the human brain, i.e. about 10^19 FLOPS, or 10 exaflops. Dr. Danilak predicts that with the anticipated volume of production in the coming years, realizing such a big neural net should be possible by 2020. For perspective, the NVIDIA Tesla V100 can output 112 TFLOPS of Tensor performance, so if the Prodigy does make it to market, it will be a remarkable feat indeed.
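A quick back-of-the-envelope check of those figures can be sketched in a few lines of Python. This is only an illustration: the variable and function names are ours, the numbers are the ones quoted above taken at face value, and peak throughput is simply multiplied across chips with no allowance for interconnect or utilization losses.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# All names here are illustrative; the numbers are taken as stated.

TFLOPS = 10**12    # floating-point operations per second in one teraflop
EXAFLOPS = 10**18  # floating-point operations per second in one exaflop

prodigy_tflops = 128        # quoted peak for the 64-core Prodigy
chip_count = 250_000        # quoted number of chips
brain_scale_flops = 10**19  # quoted "human brain" scale
v100_tensor_tflops = 112    # quoted Tesla V100 Tensor peak


def aggregate_flops(per_chip_tflops, chips):
    """Naive aggregate peak: per-chip peak times chip count."""
    return per_chip_tflops * TFLOPS * chips


total = aggregate_flops(prodigy_tflops, chip_count)

print(f"Aggregate peak: {total / EXAFLOPS:.0f} exaflops")
print(f"Brain-scale figure: {brain_scale_flops / EXAFLOPS:.0f} exaflops")
print(f"Prodigy vs V100 peak: {prodigy_tflops / v100_tensor_tflops:.2f}x")
```

Taken at face value, the quoted per-chip and chip-count figures multiply out to roughly 32 exaflops of peak throughput, comfortably above the 10-exaflop brain-scale figure, while the per-chip peak is about 1.14 times the V100's quoted Tensor figure.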
Which raises an important question: what can Tachyum do that other chip makers couldn't or haven't done so far? Dr. Danilak says it's all in the numbers. It's not the number of staff that counts but how many of them can actually make the needed difference. He says, "If you have team of “gods”, [head] count is not that critical and can be filled with contractors."
Those are some bold claims from a startup aiming to take on Goliaths such as Intel, AMD, NVIDIA, and ARM, but Tachyum has leading advisors such as Prof. Christos Kozyrakis of Stanford backing it, so there could really be something interesting cooking in its labs. What would definitely be revolutionary is such technology trickling down to the consumer segment. That would surely be the breakthrough we've been waiting for in the quest to cram more power and efficiency into thinner and lighter form factors.