parent cfbcff64ec
commit c184166944
21
README.md
@@ -37,33 +37,24 @@ Self-hosted AI coding assistant. An opensource / on-prem alternative to GitHub C
 
 ### Docker
 
-The easiest way of getting started is using the docker image:
+**NOTE**: To run Tabby, it is required to have a GPU and CUDA. However, you can refer to [Skypilot](./deployment/skypilot/README.md) for alternative solutions.
+Before running Tabby, ensure the installation of the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).
+We suggest using NVIDIA drivers that are compatible with CUDA version 11.8 or higher.
 ```bash
 # Create data dir and grant owner to 1000 (Tabby run as uid 1000 in container)
 mkdir -p data/hf_cache && chown -R 1000 data
 
-docker run \
--it --rm \
--v ./data:/data \
--v ./data/hf_cache:/home/app/.cache/huggingface \
--p 5000:5000 \
--e MODEL_NAME=TabbyML/J-350M \
-tabbyml/tabby
-```
-
-To use the GPU backend (triton) for a faster inference speed:
-```bash
 docker run \
 --gpus all \
 -it --rm \
--v ./data:/data \
--v ./data/hf_cache:/home/app/.cache/huggingface \
+-v "/$(pwd)/data:/data" \
+-v "/$(pwd)/data/hf_cache:/home/app/.cache/huggingface" \
 -p 5000:5000 \
 -e MODEL_NAME=TabbyML/J-350M \
 -e MODEL_BACKEND=triton \
+--name=tabby \
 tabbyml/tabby
 ```
-Note: To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). We also recommend using NVIDIA drivers with CUDA version 11.8 or higher.
 
 You can then query the server using `/v1/completions` endpoint:
 ```bash