Overview
DeepSeek Coder is a family of code language models trained from scratch on 2T tokens, 87% code and 13% natural language (English and Chinese). Models range from 1B to 33B parameters, letting users choose the size that best fits their hardware and requirements. Each model is pre-trained on a project-level code corpus with a 16K context window and a fill-in-the-blank objective, so it supports both project-level code completion and code infilling. Among open-source code models, DeepSeek Coder achieves state-of-the-art performance on benchmarks including HumanEval, MultiPL-E, MBPP, DS-1000, and APPS.
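As a sketch of how the infilling capability might be invoked, the snippet below builds a fill-in-the-middle (FIM) prompt. The special-token names follow the format published on the DeepSeek Coder model cards; treat them as assumptions and verify against the tokenizer of the specific checkpoint you use.

```python
# Sketch: constructing a fill-in-the-middle (FIM) prompt for DeepSeek Coder.
# The token strings below are taken from the DeepSeek Coder model card
# (assumption -- check your checkpoint's tokenizer for the exact names).
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in FIM special tokens.

    The model is expected to generate the code that belongs in the hole.
    """
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"

prefix = (
    "def quick_sort(arr):\n"
    "    if len(arr) <= 1:\n"
    "        return arr\n"
    "    pivot = arr[0]\n"
)
suffix = "\n    return quick_sort(left) + [pivot] + quick_sort(right)\n"
prompt = build_fim_prompt(prefix, suffix)
# `prompt` would then be tokenized and passed to the model, e.g. via
# transformers' AutoModelForCausalLM.generate(); the model fills the hole.
```

In practice the assembled prompt is fed to a base (non-instruct) checkpoint, and the generated tokens between the hole position and the end-of-sequence token are spliced back into the source file.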
