China-based artificial intelligence startup DeepSeek has released its new large language model, DeepSeek-V3-0324, under the MIT license. The model is available for free download on the Hugging Face platform and is fully open for commercial use.
New DeepSeek-V3-0324 unveiled
The 641-gigabyte model draws attention with its ability to run on consumer-grade hardware. The model, which has 685 billion parameters, is said to run smoothly even on Mac Studio devices with Apple’s M3 Ultra chip.
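As a rough sanity check, the two figures above can be related with simple arithmetic. The sketch below assumes decimal gigabytes (10^9 bytes) and that the download consists almost entirely of model weights; neither assumption is stated in the article, so the result is only indicative.

```python
# Back-of-the-envelope check using the article's figures.
# Assumptions (not from the article): 1 GB = 10**9 bytes, and the
# download is dominated by the weights themselves.
model_size_bytes = 641e9   # 641 gigabytes
parameter_count = 685e9    # 685 billion parameters

bytes_per_param = model_size_bytes / parameter_count
bits_per_param = bytes_per_param * 8

print(f"{bytes_per_param:.2f} bytes per parameter")
print(f"{bits_per_param:.1f} bits per parameter")
```

A value just under one byte per parameter would be consistent with weights stored at roughly 8-bit precision, although the article does not say which format is used.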
Artificial intelligence researcher Xeophon says the model could be a serious competitor to Anthropic’s Claude Sonnet 3.5. The fact that DeepSeek-V3-0324 can be downloaded completely free of charge, whereas Sonnet is offered by subscription, could make a big difference.
The model is based on the Mixture of Experts (MoE) architecture. Unlike traditional dense large language models, which use all of their parameters for every token, DeepSeek-V3-0324 activates only the parameters most relevant to the input at hand. Out of 685 billion parameters, only about 37 billion (roughly 5 per cent) are active for any given token.
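The idea of activating only some parameters can be illustrated with a minimal sketch of generic top-k expert routing. This is not DeepSeek's actual router: the function names, the four toy experts and the gate scores are all hypothetical, and the real model uses many more experts and its own gating scheme.

```python
import math

def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest gate logits.
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_logits, k=2):
    """Combine outputs of the selected experts; the others are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Four toy "experts" (hypothetical): each simply scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_logits = [0.1, 2.0, -1.0, 2.0]  # illustrative router scores for one token

routes = top_k_route(gate_logits, k=2)
output = moe_forward(1.0, experts, gate_logits, k=2)

# Share of parameters active per token, using the article's figures.
active_fraction = 37e9 / 685e9
print(routes, output, f"{active_fraction:.1%}")
```

Only the two selected experts are evaluated here, which is where the compute saving comes from: the cost per token scales with the active parameters rather than the total.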
This approach significantly reduces the computation required per token without sacrificing performance. In performance tests, the model reportedly achieved results very similar to those of models that activate far more parameters.
DeepSeek-V3-0324 also includes two important innovations, Multi-Head Latent Attention (MLA) and Multi-Token Prediction (MTP). MLA improves the ability to preserve context across long texts, while MTP allows multiple tokens to be generated at each step.
These technologies are said to increase the model’s output speed by about 80 per cent. Awni Hannun of Apple’s research group said the model was tested on a Mac Studio, where it produced output at a rate of about 20 tokens per second.
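One way to see how multi-token prediction could yield a speed-up of this order is a simple expected-value calculation. The sketch below is an illustration only: it assumes one extra drafted token per step, a hypothetical acceptance rate of 0.8, and no overhead for checking the draft. None of these values come from the article.

```python
def expected_tokens_per_step(acceptance_rate, extra_tokens=1):
    """Expected tokens emitted per decoding step with drafted extra tokens.

    One token is always produced. Draft token j counts only if it and all
    earlier drafts are accepted, which happens with probability rate**j.
    """
    return 1.0 + sum(acceptance_rate ** j for j in range(1, extra_tokens + 1))

# Hypothetical acceptance rate, chosen purely for illustration.
speedup = expected_tokens_per_step(0.8)
print(f"{(speedup - 1) * 100:.0f} per cent more tokens per step")

# At the reported rate of about 20 tokens per second, a 1,000-token reply
# would take roughly this many seconds.
seconds_for_reply = 1000 / 20
print(seconds_for_reply)
```

Under these assumptions each step yields 1.8 tokens on average, which is the same order as the figure quoted above; real gains depend on how often the drafted tokens are accepted and on the cost of verifying them.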
Users report a significant change in the model’s communication style. Whereas earlier DeepSeek models had a human-like, conversational tone, V3-0324 is more formal and technical.
DeepSeek’s move has taken the competition between major language models to a new level.