DeepSeek has unveiled its latest large language model (LLM), DeepSeek V3, and both the model and its chatbot are available for free.
The LLMs behind today's popular chatbots are trained on millions of documents to learn the connections between words and topics. Generally, the more parameters a model has, the better it answers user prompts. But billions of parameters demand enormous computing power and energy, so careful tuning of the training process is key to keeping costs and training time down.
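To get a feel for why parameter count drives cost, here is a back-of-the-envelope estimate in Python. The 6-FLOPs-per-parameter-per-token rule of thumb comes from the general scaling-law literature, not from DeepSeek's paper, and the model sizes and token count below are purely illustrative.

```python
# Rough rule of thumb from the scaling-law literature (not DeepSeek's own
# accounting): training takes ~6 floating-point operations per parameter
# per training token. Model sizes and token count here are illustrative.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute in FLOPs."""
    return 6 * params * tokens

for params in (7e9, 70e9, 700e9):
    flops = training_flops(params, 15e12)  # ~15 trillion training tokens
    print(f"{params / 1e9:>5.0f}B params -> {flops:.1e} FLOPs")
# 7B -> 6.3e+23, 70B -> 6.3e+24, 700B -> 6.3e+25
```

Every tenfold increase in parameters multiplies the training compute tenfold, which is why efficiency tricks matter so much at this scale.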
To achieve these goals, DeepSeek combined an innovative load-balancing strategy, lower-precision 8-bit floating-point (FP8) arithmetic, the company's own memory-saving attention mechanism (Multi-Head Latent Attention, or MLA), and other techniques detailed in its technical paper.
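The core idea behind MLA, as described in DeepSeek's papers, is to cache one small latent vector per token instead of full-size attention keys and values, then reconstruct them on demand. Below is a minimal NumPy sketch of that low-rank compression; the dimensions are invented for illustration, and the real design adds per-head projections, decoupled rotary embeddings, and more.

```python
import numpy as np

# Minimal sketch of the low-rank KV-cache compression idea behind
# Multi-Head Latent Attention (MLA). All dimensions are made up.

d_model, d_latent, seq_len = 4096, 512, 1024
rng = np.random.default_rng(0)

W_down = rng.standard_normal((d_model, d_latent)) * 0.02  # compress
W_up_k = rng.standard_normal((d_latent, d_model)) * 0.02  # rebuild keys
W_up_v = rng.standard_normal((d_latent, d_model)) * 0.02  # rebuild values

hidden = rng.standard_normal((seq_len, d_model))

# Only the small latent is cached between decoding steps...
latent_cache = hidden @ W_down            # (seq_len, d_latent)
# ...and full-size keys/values are reconstructed on demand.
keys, values = latent_cache @ W_up_k, latent_cache @ W_up_v

full_kv = 2 * seq_len * d_model           # floats cached without MLA
print(f"cache shrinks {full_kv / latent_cache.size:.0f}x")  # 16x here
```

Shrinking the cache this way cuts the memory traffic that dominates inference, and switching the cached values from 32-bit to FP8 would cut the bytes by another factor of four on top.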
Careful optimization kept DeepSeek V3's training costs under $6 million, compared with the reported $78 million to train OpenAI's GPT-4 and an estimated $500+ million per training run for GPT-5. Cheaper, faster training also translates into lower prices for DeepSeek's commercial users, and the ecologically minded can celebrate the reduced energy use and carbon emissions of V3's training.
DeepSeek V2 already ranked among the ten most powerful LLMs available, and the company's preliminary benchmarks indicate that V3 wins 12 of 21 tests against top-ranked models such as Anthropic's Claude 3.5 Sonnet and OpenAI's GPT-4o.
Readers can use the DeepSeek V3 chatbot for free to draft essays, answer questions, and simplify everyday work. Businesses can start building apps against the DeepSeek Platform API, as sketched below. Those trying to keep secrets should know that all chat data is stored on servers located in the People's Republic of China. Then again, the largest American companies behind today's top LLMs, such as Facebook, have been caught sharing data, too.
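For developers, a first API call might look like the sketch below. It assumes DeepSeek's documented OpenAI-compatible endpoint and the deepseek-chat model name; check the platform documentation for current values, and note the prompt string is just an example.

```python
# Minimal sketch of calling the DeepSeek chat API. DeepSeek documents an
# OpenAI-compatible endpoint, so the standard openai client is reused here.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],  # issued on the platform site
    base_url="https://api.deepseek.com",     # DeepSeek's documented endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                   # alias for the V3 chat model
    messages=[{"role": "user", "content": "Summarize FP8 training in one sentence."}],
)
print(response.choices[0].message.content)
```

Because the endpoint mirrors OpenAI's API shape, existing tooling built for GPT-4o can often be pointed at DeepSeek by changing only the base URL and model name.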