
Way to run DeepSeek's 671B AI model without expensive GPUs discovered

Image source: Aristal, Pixabay
Hugging Face engineer Matthew Carrigan recently shared on X a method for running DeepSeek's advanced R1 model locally with 8-bit quantization and without expensive GPUs, at a reported hardware cost of about $6,000. The key? Plenty of memory rather than vast reserves of computing power.

Launched on January 20, 2025, DeepSeek-R1 is a 671B parameter Mixture-of-Experts (MoE) model with 37B active parameters per token. Designed for advanced reasoning, it supports 128K token inputs and generates up to 32K tokens. Thanks to its MoE architecture, it delivers top-tier performance while using fewer resources than traditional dense models.
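To picture how that MoE routing works, here is a minimal Python sketch of top-k expert routing: a router picks a few experts per token, so only a fraction of the total parameters is exercised at each step. The expert count, hidden size, and top-k value are illustrative placeholders, not DeepSeek-R1's actual configuration.

```python
import numpy as np

# Toy Mixture-of-Experts layer: the router picks the top-k experts per token,
# so only a fraction of the total parameters is used for any given token.
# Sizes here are illustrative placeholders, not DeepSeek-R1's real config.
num_experts, top_k, hidden = 8, 2, 16
rng = np.random.default_rng(0)

router_w = rng.standard_normal((hidden, num_experts))                 # router weights
experts = [rng.standard_normal((hidden, hidden)) for _ in range(num_experts)]

def moe_forward(token_vec):
    scores = token_vec @ router_w              # router logits, one per expert
    chosen = np.argsort(scores)[-top_k:]       # indices of the top-k experts
    gate = np.exp(scores[chosen])
    gate /= gate.sum()                         # softmax over the chosen experts
    # Only the selected experts run; all others are skipped entirely.
    return sum(g * (token_vec @ experts[i]) for g, i in zip(gate, chosen))

out = moe_forward(rng.standard_normal(hidden))
print(out.shape)  # (16,) -- one token processed using just 2 of the 8 experts
```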

Independent testing suggests that the R1 language model achieves performance comparable to OpenAI's o1, positioning it as a competitive alternative in high-stakes AI applications. Let's find out what we need to run it locally.

The hardware

This build centers on dual AMD EPYC CPUs and 768GB of DDR5 RAM, with no expensive GPUs needed.
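A quick back-of-the-envelope calculation shows why memory, not compute, is the constraint: at 8-bit quantization each parameter takes roughly one byte, so the weights alone need about 670GB, which just fits into 768GB. The sketch below works that out; the headroom left for the KV cache and operating system is an estimate, not a measured figure.

```python
# Rough sizing check for running DeepSeek-R1 at Q8 entirely from RAM.
total_params = 671e9            # total parameters
bytes_per_param = 1             # ~1 byte per parameter at 8-bit (Q8) quantization
weights_gb = total_params * bytes_per_param / 1e9
ram_gb = 768                    # installed DDR5
print(f"weights: ~{weights_gb:.0f} GB")                        # ~671 GB, i.e. the ~700 GB download
print(f"headroom: ~{ram_gb - weights_gb:.0f} GB for KV cache and OS")  # estimate, not measured
```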

Software & Setup

Once the system is assembled, Linux and llama.cpp need to be installed in order to run the model. A crucial BIOS tweak, setting the number of NUMA groups to 0, doubles RAM efficiency for better performance. The full ~700GB of DeepSeek-R1 weights can then be downloaded from Hugging Face.
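As a rough illustration of that flow, the sketch below fetches quantized weights with the Hugging Face hub client and loads them through the llama-cpp-python bindings (a Python wrapper around llama.cpp, used here instead of the command-line tools). The repository ID is a placeholder, and the context size and thread count are assumptions to tune for your own hardware.

```python
# Sketch: fetch Q8 GGUF weights and load them via llama-cpp-python.
# The repo ID and file pattern are placeholders; n_ctx and n_threads are
# assumptions to adjust for the machine at hand.
import glob
import os

from huggingface_hub import snapshot_download
from llama_cpp import Llama

model_dir = snapshot_download(
    repo_id="your-org/DeepSeek-R1-Q8-GGUF",      # placeholder repository ID
    allow_patterns=["*Q8_0*.gguf"],              # download only the Q8 shards
    local_dir="models/deepseek-r1-q8",
)

# Multi-part GGUF models are loaded by pointing at the first shard;
# llama.cpp picks up the remaining files automatically.
first_shard = sorted(glob.glob(os.path.join(model_dir, "*Q8_0*.gguf")))[0]

llm = Llama(
    model_path=first_shard,
    n_ctx=8192,       # context window; raise it if RAM allows
    n_threads=64,     # tune to the machine's core count
)

result = llm("Explain Mixture-of-Experts routing in one sentence.", max_tokens=128)
print(result["choices"][0]["text"])
```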

Performance

This setup generates 6-8 tokens per second—not bad for a fully local high-end AI model. It skips GPUs entirely, but that's intentional. Running Q8 quantization (for high quality) on GPUs would require 700GB+ of VRAM, costing over $100K. Despite its raw power, the entire system consumes under 400W, making it surprisingly efficient.
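That figure lines up with a simple memory-bandwidth model: each decoding step only has to stream the ~37B active parameters from RAM, so throughput is roughly bandwidth divided by active bytes per token. The estimate below assumes an effective bandwidth value for a dual-socket DDR5 system; it is a ballpark, not a measurement.

```python
# Back-of-the-envelope decode-rate estimate for CPU-only MoE inference.
# The bandwidth value is an assumed ballpark for a dual-socket DDR5 EPYC
# system, not a measured number.
active_params = 37e9        # parameters activated per token (MoE)
bytes_per_param = 1         # ~1 byte per parameter at Q8
bandwidth = 300e9           # assumed effective RAM bandwidth in bytes/s
tokens_per_second = bandwidth / (active_params * bytes_per_param)
print(f"~{tokens_per_second:.1f} tokens/s")  # ~8.1, the same order as the observed 6-8
```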

For those who want full control over frontier AI, with no cloud and no restrictions, this is a game changer. It proves that high-end AI can be run locally and fully open source while prioritizing data privacy, minimizing exposure to breaches, and eliminating reliance on external systems.

Daniel Miron, 2025-02-04 (Update: 2025-02-04)