| sidebar_position |
|---|
| 4 |
🧑‍🔬 Models Directory
We recommend the following hardware for each model size:
- For small models (fewer than 400M parameters), a CPU is sufficient.
- For 1B to 7B models, use at least an NVIDIA T4, 10 Series, or 20 Series GPU.
- For 7B to 13B models, use an NVIDIA V100, A100, 30 Series, or 40 Series GPU.
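The recommendations above can be read as a small lookup from parameter count to hardware tier. The following is a minimal sketch of that lookup, not part of any tool described here: the function name `recommend_hardware` and the tier constants are hypothetical. It assumes the boundaries are taken literally, so a 7B model matches both GPU tiers and sizes between 400M and 1B match none, since the list does not cover them.

```python
from typing import List

# Hardware tiers transcribed from the recommendations above.
# The constant names are illustrative, not part of any real API.
CPU = "CPU"
ENTRY_GPU = "NVIDIA T4, 10 Series, or 20 Series GPU"
HIGH_END_GPU = "NVIDIA V100, A100, 30 Series, or 40 Series GPU"


def recommend_hardware(params_billions: float) -> List[str]:
    """Return every hardware tier whose size range contains the model.

    Assumption: ranges are read literally from the list, so 7B falls
    into both GPU tiers and 0.4B to 1B falls into none.
    """
    matches = []
    if params_billions < 0.4:
        matches.append(CPU)
    if 1.0 <= params_billions <= 7.0:
        matches.append(ENTRY_GPU)
    if 7.0 <= params_billions <= 13.0:
        matches.append(HIGH_END_GPU)
    return matches


if __name__ == "__main__":
    for size in (0.35, 1.0, 7.0, 13.0):
        print(f"{size}B -> {recommend_hardware(size)}")
```

Returning a list rather than a single value keeps the overlap at 7B visible instead of silently picking one tier.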
CodeLlama-7B / CodeLlama-13B
Code Llama is a collection of pretrained and fine-tuned generative text models. These models are designed for general code synthesis and understanding.
StarCoder-1B / StarCoder-3B / StarCoder-7B
The StarCoder series models are trained on 80+ programming languages from The Stack (v1.2), with opt-out requests excluded. The models use Multi Query Attention, a context window of 8192 tokens, and were trained using the Fill-in-the-Middle objective on 1 trillion tokens.
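To illustrate what the Fill-in-the-Middle objective means at inference time, here is a minimal sketch of how a prompt can be assembled from the code before and after the cursor. The helper `build_fim_prompt` is hypothetical. The sentinel strings are the ones we understand the StarCoder tokenizer to use, but treat them as an assumption and check them against the tokenizer of the model you actually run.

```python
# Sentinel tokens assumed for StarCoder-style models; verify against
# the tokenizer's special tokens before relying on them.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Arrange the code around the cursor so the model generates the middle.

    prefix: text before the cursor; suffix: text after the cursor.
    The model is expected to continue after the middle sentinel.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


if __name__ == "__main__":
    before = "def add(a, b):\n    return "
    after = "\n\nprint(add(1, 2))\n"
    print(build_fim_prompt(before, after))
```

The text the model generates after the final sentinel is what gets inserted at the cursor, between `before` and `after`.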
SantaCoder-1B
SantaCoder is the smallest member of the BigCode family of models, with just 1.1 billion parameters. It is trained with a fill-in-the-middle objective, which lets it efficiently auto-complete function parameters. It supports three programming languages: Python, Java, and JavaScript.
J-350M
Derived from Salesforce/codegen-350M-multi, a model in the CodeGen family.
T5P-220M
Derived from Salesforce/codet5p-220m, a model in the CodeT5+ family.