Overview
StarCoder2-15B is a 15-billion-parameter language model designed for code generation. It was trained on more than 600 programming languages from The Stack v2 dataset, excluding code covered by opt-out requests. The model uses Grouped Query Attention, a context window of 16,384 tokens, and sliding-window attention of 4,096 tokens. It was trained with the Fill-in-the-Middle objective on over 4 trillion tokens using the NVIDIA NeMo framework on NVIDIA DGX H100 systems. StarCoder2-15B is not an instruction-following model, but it excels at generating code snippets given some context. Fine-tuning scripts are available in the StarCoder2 GitHub repository, and quantized versions are available through bitsandbytes for memory-efficient inference.
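Because the model was trained with the Fill-in-the-Middle (FIM) objective, a prompt can be arranged so the model completes a hole between existing code rather than only appending to it. A minimal sketch of building such a prompt is below; the special-token names (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) follow the StarCoder family's convention and should be verified against the model's tokenizer before use.

```python
# Sketch: composing a Fill-in-the-Middle prompt for a StarCoder-style model.
# Assumption: the tokenizer exposes the FIM special tokens named below
# (<fim_prefix>, <fim_suffix>, <fim_middle>); check tokenizer.special_tokens_map.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Place code-before (prefix) and code-after (suffix) around the hole
    the model is asked to fill; the completion is generated after
    <fim_middle>."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Example: ask the model to fill in the body of a function.
prompt = build_fim_prompt(
    prefix="def fibonacci(n):\n    ",
    suffix="\n    return a\n",
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model's `generate` method; the text produced after `<fim_middle>` is the infill for the gap between prefix and suffix.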
