<div align="center">

# 🐾 Tabby

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Code style: black](https://github.com/psf/black)

[Demo on Hugging Face Spaces](https://tabbyml-tabby.hf.space/_admin/)

</div>

Self-hosted AI coding assistant. An open-source, on-premises alternative to GitHub Copilot.

> **Warning**
> Tabby is still in the alpha phase.

## Features

* Self-contained, with no need for a DBMS or cloud service.
* Web UI for visualizing and configuring models and MLOps.
* OpenAPI interface, easy to integrate with existing infrastructure (e.g. Cloud IDE).
* Consumer-level GPU support (FP-16 weight loading with various optimizations).

## Get started

### Docker

The easiest way to get started is to use the official Docker image:

```bash
# Create the data directory and give ownership to uid 1000 (Tabby runs as uid 1000 in the container)
mkdir -p data/hf_cache && chown -R 1000 data

docker run \
  -it --rm \
  -v ./data:/data \
  -v ./data/hf_cache:/home/app/.cache/huggingface \
  -p 5000:5000 \
  -e MODEL_NAME=TabbyML/J-350M \
  tabbyml/tabby
```

To use the GPU backend (triton) for faster inference:

```bash
docker run \
  --gpus all \
  -it --rm \
  -v ./data:/data \
  -v ./data/hf_cache:/home/app/.cache/huggingface \
  -p 5000:5000 \
  -e MODEL_NAME=TabbyML/J-350M \
  -e MODEL_BACKEND=triton \
  tabbyml/tabby
```

Note: To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). We also recommend using NVIDIA drivers with CUDA version 11.8 or higher.

You can then query the server using the `/v1/completions` endpoint:

```bash
curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
  "prompt": "def binarySearch(arr, left, right, x):\n mid = (left +"
}'
```
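
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_completion_request` and `complete` are hypothetical, the base URL assumes the Docker example is running locally, and the response is returned as decoded JSON without assuming any particular fields.

```python
import json
import urllib.request

# Assumption: the server from the Docker example is listening on this address.
BASE_URL = "http://localhost:5000"


def build_completion_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, base_url: str = BASE_URL, timeout: float = 30.0) -> dict:
    """Send the prompt and return the decoded JSON response unchanged."""
    request = build_completion_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with a Tabby server running:
#   result = complete("def binarySearch(arr, left, right, x):\n mid = (left +")
#   print(json.dumps(result, indent=2))
```

Keeping request construction separate from the network call makes it easy to inspect the payload without a running server.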

We also provide an interactive playground in the admin panel at [localhost:5000/_admin](http://localhost:5000/_admin).

### SkyPilot

See [deployment/skypilot/README.md](./deployment/skypilot/README.md).

## API documentation

Tabby runs a FastAPI server at [localhost:5000](http://localhost:5000), which serves OpenAPI documentation for the HTTP API.

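To inspect the API programmatically, the OpenAPI schema can be fetched and its operations listed. This is a minimal sketch, assuming Tabby keeps FastAPI's default schema path `/openapi.json` (not confirmed here) and that the server is reachable at `localhost:5000`. The helpers `fetch_schema` and `list_operations` are hypothetical names.

```python
import json
import urllib.request

# Assumptions: local server address, and FastAPI's default schema path.
BASE_URL = "http://localhost:5000"
OPENAPI_PATH = "/openapi.json"

# Operation keys allowed on an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(schema: dict) -> list:
    """Return sorted (METHOD, path) pairs from the schema's `paths` object."""
    operations = []
    for path, item in sorted(schema.get("paths", {}).items()):
        for key in sorted(item):
            # Skip non-operation keys such as "parameters" or "summary".
            if key in HTTP_METHODS:
                operations.append((key.upper(), path))
    return operations


def fetch_schema(base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Download and decode the OpenAPI schema (hypothetical helper)."""
    with urllib.request.urlopen(f"{base_url}{OPENAPI_PATH}", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with a Tabby server running:
#   for method, path in list_operations(fetch_schema()):
#       print(method, path)
```

If the schema path has been customized, adjust `OPENAPI_PATH` accordingly.
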
## Development

Go to the `development` directory and run:

```bash
make dev
```

or

```bash
make dev-triton # Enable the triton backend (for developers with a CUDA environment)
```