

# 🐾 Tabby


*(architecture diagram)*

> **Warning** Tabby is still in the alpha phase.

An open-source / on-premises alternative to GitHub Copilot.

## Features

- Self-contained, with no need for a DBMS or cloud service.
- Web UI for visualizing and configuring models and MLOps.
- OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
- Consumer-grade GPU support (FP16 weight loading with various optimizations).

## Get started

### Docker

The easiest way to get started is with the official Docker image:

```bash
docker run \
  -it --rm \
  -v ./data:/data \
  -v ./data/hf_cache:/root/.cache/huggingface \
  -p 5000:5000 \
  -p 8501:8501 \
  -p 8080:8080 \
  -e MODEL_NAME=TabbyML/J-350M \
  tabbyml/tabby
```

You can then query the server using the `/v1/completions` endpoint:

```bash
curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
    "prompt": "def binarySearch(arr, left, right, x):\n    mid = (left +"
}'
```
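The same request can be issued programmatically. Below is a minimal Python sketch using only the standard library; it assumes the server started above is listening on `localhost:5000` and that the endpoint accepts the JSON body shown in the `curl` example (the `build_completion_request` helper is illustrative, not part of Tabby):

```python
import json
import urllib.request


def build_completion_request(prompt, host="http://localhost:5000"):
    """Build a POST request for the /v1/completions endpoint."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_completion_request(
        "def binarySearch(arr, left, right, x):\n    mid = (left +"
    )
    # Requires a running Tabby server; prints the JSON completion response.
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))
```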

To use the GPU backend (Triton) for faster inference, use `deployment/docker-compose.yml`:

```bash
docker-compose up
```

> **Note** To use GPUs, you need to install the NVIDIA Container Toolkit. We also recommend using NVIDIA drivers with CUDA version 11.8 or higher.
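If you maintain your own Compose file rather than using `deployment/docker-compose.yml`, GPU access is granted through the standard Compose device reservation. A minimal sketch (the service layout here is illustrative; refer to `deployment/docker-compose.yml` for the actual configuration):

```yaml
services:
  tabby:
    image: tabbyml/tabby
    environment:
      - MODEL_NAME=TabbyML/J-350M
    ports:
      - "5000:5000"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```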

We also provide an interactive playground in the admin panel at `localhost:8501`.


### SkyPilot

See `deployment/skypilot/README.md`.

## API documentation

Tabby runs a FastAPI server at `localhost:5000`, which embeds OpenAPI documentation for the HTTP API.
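FastAPI serves its machine-readable schema at `/openapi.json` by default (with interactive docs at `/docs`). A small helper to enumerate the endpoints declared in such a schema — a sketch only; the sample document below is illustrative, not Tabby's actual schema:

```python
import json


def list_endpoints(openapi_doc):
    """Return sorted (METHOD, path) pairs declared in an OpenAPI document."""
    pairs = []
    for path, operations in openapi_doc.get("paths", {}).items():
        for method in operations:
            pairs.append((method.upper(), path))
    return sorted(pairs)


# Illustrative fragment of an OpenAPI schema (not Tabby's real one).
sample = json.loads(
    '{"paths": {"/v1/completions": {"post": {"summary": "Completion"}}}}'
)
print(list_endpoints(sample))  # [('POST', '/v1/completions')]
```

In practice you would fetch the document from `http://localhost:5000/openapi.json` while the server is running.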

## Development

Go to the `development` directory.

```bash
make dev
```

or

```bash
make dev-python  # Turn off the Triton backend (for non-CUDA development environments)
```

## TODOs

- VIM client (#36)
- Fine-tuning models on private code repositories. (#23)
- Production readiness (OpenTelemetry, Prometheus metrics).