---
sidebar_position: 4
---
# 🧑‍🔬 Models Directory

## Completion models (`--model`)
We recommend:

- Small models (under 400M parameters) for CPU devices.
- For 1B to 7B models, at least an NVIDIA T4, 10 Series, or 20 Series GPU.
- For 7B to 13B models, an NVIDIA V100, A100, 30 Series, or 40 Series GPU.
| Model ID | License | Infilling Support | Apple M1/M2 Supports |
|---|---|---|---|
| TabbyML/CodeLlama-13B | Llama2 | ✅ | ✅ |
| TabbyML/CodeLlama-7B | Llama2 | ✅ | ✅ |
| TabbyML/StarCoder-7B | BigCode-OpenRAIL-M | ✅ | ✅ |
| TabbyML/StarCoder-3B | BigCode-OpenRAIL-M | ✅ | ✅ |
| TabbyML/StarCoder-1B | BigCode-OpenRAIL-M | ✅ | ✅ |
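As a minimal sketch, serving one of the completion models above might look like the following (the `--device` value is an assumption; adjust it to match your hardware):

```shell
# Serve a small completion model on CPU.
# Swap --device to cuda (NVIDIA) or metal (Apple M1/M2) as appropriate.
tabby serve --model TabbyML/StarCoder-1B --device cpu
```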
## Chat models (`--chat-model`)
To ensure optimal response quality, and given that latency requirements are not stringent in this scenario, we recommend using a model with at least 3B parameters.
| Model ID | License | Apple M1/M2 Supports |
|---|---|---|
| TabbyML/Mistral-7B | Apache 2.0 | ✅ |
| TabbyML/WizardCoder-3B | OpenRAIL-M | ✅ |
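A completion model and a chat model can be served together by passing both flags; the model choices below are illustrative, not prescriptive:

```shell
# Run a completion model and a chat model in one server
# (example pairing; pick models suited to your GPU from the tables above).
tabby serve --model TabbyML/StarCoder-1B --chat-model TabbyML/Mistral-7B --device cuda
```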
## Alternative Registry

By default, Tabby uses the Hugging Face organization as its model registry. Users in mainland China have encountered difficulties accessing Hugging Face for various reasons. The Tabby team has established a mirror on ModelScope, which can be used via the following environment variable:
```shell
TABBY_REGISTRY=modelscope tabby serve --model TabbyML/StarCoder-1B
```