Skip to content

DeepSeek

  • A company, founded in May 2023, by Liang Wenfeng
  • Focuses on training foundational LLMs with reasoning capabilities

Timeline

2024-01-05 DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
2024-05-07 DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
2024-12-27 DeepSeek-V3 Technical Report
2025-01-22 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
2025-01-** On DeepSeek and Export Controls
2025-01-26 DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
2025-02-03 DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters | Lex Fridman Podcast #459
  • It's became famous after the release of DeepSeek-R1.
  • Ollama added the model to its repository, along with a bunch of other models distilled using DeepSeek.
  • They uploaded various models to Huggingface
  • Their API Documentation
  • open-r1 is a huggingface project aiming to fully reproduce DeepSeek-R1
  • Janus is another DeepSeek project training Unified Multimodal Understanding and Generation Models