DeepSeek V4 Pro has 1.6T total parameters, making it DeepSeek's largest model by that metric, and V4 Flash has 284B parameters; both models have a 1M-token context window (South China Morning Post)

South China Morning Post:
DeepSeek says its cost-efficient new V4 model is competitive with top closed-source models from OpenAI and Google DeepMind.



from Techmeme https://ift.tt/Pi3EHwt
