🐾 Tabby


(architecture diagram)

Warning: Tabby is still in the alpha phase.

An open-source, on-premises alternative to GitHub Copilot.

Features

  • Self-contained, with no need for a DBMS or cloud service
  • Web UI for visualizing and configuring models and MLOps.
  • OpenAPI interface, easy to integrate with existing infrastructure (e.g., Cloud IDE).
  • Consumer-grade GPU support (FP16 weight loading with various optimizations).

Get started

Docker

The easiest way to get started is with the official Docker image:

docker run \
  -it --rm \
  -v ./data:/data \
  -v ./data/hf_cache:/root/.cache/huggingface \
  -p 5000:5000 \
  -p 8501:8501 \
  -p 8080:8080 \
  -e MODEL_NAME=TabbyML/J-350M tabbyml/tabby
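For reference, the published ports correspond to the services described later in this README (the purpose of 8080 is not stated here):

```shell
# Port reference (from this README):
#   5000 -> HTTP API server (e.g. /v1/completions) and OpenAPI docs
#   8501 -> admin panel with the interactive playground
#   8080 -> not documented in this README
```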

You can then query the server using the /v1/completions endpoint:

curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
    "prompt": "def binarySearch(arr, left, right, x):\n    mid = (left +"
}'
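The response is JSON; its exact schema lives in the server's OpenAPI documentation rather than in this README. As a rough sketch, assuming an OpenAI-style shape with a `choices[].text` field (an assumption, not confirmed here), the generated text could be extracted like so:

```shell
# The response shape below is assumed for illustration only --
# consult the OpenAPI docs on port 5000 for the actual schema.
response='{"choices": [{"text": "right) // 2"}]}'
echo "$response" | python3 -c 'import json, sys; print(json.load(sys.stdin)["choices"][0]["text"])'
```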

To use the GPU backend (Triton) for faster inference, use deployment/docker-compose.yml:

docker-compose up

Note: To use GPUs, you need to install the NVIDIA Container Toolkit. We also recommend using NVIDIA drivers with CUDA version 11.8 or higher.

We also provide an interactive playground in the admin panel at localhost:8501.


API documentation

Tabby runs a FastAPI server at localhost:5000, which embeds OpenAPI documentation for the HTTP API.
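FastAPI serves its generated documentation at well-known default paths, so with the port mapping above these should be available (assuming the defaults are not overridden):

```shell
# FastAPI's default documentation endpoints:
#   http://localhost:5000/docs          - interactive Swagger UI
#   http://localhost:5000/openapi.json  - machine-readable OpenAPI spec
```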

Development

Go to the development directory and run:

make dev

or

make dev-python  # Disable the Triton backend (for development environments without CUDA)

TODOs

  • DuckDB integration, to plot metrics in the admin panel (e.g., acceptance rate). #24
  • Fine-tuning models on private code repository. #23
  • Production readiness (OpenTelemetry, Prometheus metrics).
  • Token streaming using Server-Sent Events (SSE)