---
sidebar_position: 4
---
# 🧑‍🔬 Models Directory

## Completion models (`--model`)
We recommend:

- Small models (under 400M parameters) for CPU devices.
- For 1B to 7B models, at least an NVIDIA T4, 10 Series, or 20 Series GPU.
- For 7B to 13B models, an NVIDIA V100, A100, 30 Series, or 40 Series GPU.
| Model ID | License | Infilling Support | Apple M1/M2 Supports |
|---|---|---|---|
| TabbyML/CodeLlama-13B | Llama2 | ✅ | ✅ |
| TabbyML/CodeLlama-7B | Llama2 | ✅ | ✅ |
| TabbyML/StarCoder-7B | BigCode-OpenRAIL-M | ✅ | ✅ |
| TabbyML/StarCoder-3B | BigCode-OpenRAIL-M | ✅ | ✅ |
| TabbyML/StarCoder-1B | BigCode-OpenRAIL-M | ✅ | ✅ |
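As a minimal sketch, serving one of the completion models above might look like the following (the `--device` value is an assumption; adjust it to match your hardware):

```shell
# Serve a small completion model on CPU.
# Swap --device to cuda (NVIDIA) or metal (Apple M1/M2) as appropriate.
tabby serve --model TabbyML/StarCoder-1B --device cpu
```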
## Chat models (`--chat-model`)
To ensure optimal response quality, and given that latency requirements are not stringent in this scenario, we recommend using a model with at least 3B parameters.
| Model ID | License | Apple M1/M2 Supports |
|---|---|---|
| TabbyML/Mistral-7B | Apache 2.0 | ✅ |
| TabbyML/WizardCoder-3B | OpenRAIL-M | ✅ |
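A completion model and a chat model can be served together by passing both flags; the model choices below are illustrative, not prescriptive:

```shell
# Run a completion model and a chat model in one server
# (example pairing; pick models suited to your GPU from the tables above).
tabby serve --model TabbyML/StarCoder-1B --chat-model TabbyML/Mistral-7B --device cuda
```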
## Alternative Registry

By default, Tabby uses the Hugging Face organization as its model registry. Users in mainland China have encountered difficulties accessing Hugging Face for various reasons. The Tabby team has established a mirror on ModelScope, which can be used via the following environment variable:
```shell
TABBY_REGISTRY=modelscope tabby serve --model TabbyML/StarCoder-1B
```