feat: add deployment script on lambda cloud with skypilot (#37)
* feat: add deployment script on lambda cloud with skypilot
* docs: adjust API documentation level
* fix: move docker-compose install before the docker-compose pull
* Fix documentation
* Update replica updating

add-more-languages

parent 0a66a9d498
commit 22fbaefbd4

@@ -56,7 +56,10 @@ We also provides an interactive playground in admin panel [localhost:8501](http:

-### API documentation
+### SkyPilot
+See [deployment/skypilot/README.md](./deployment/skypilot/README.md)
+
+## API documentation

 Tabby opens a FastAPI server at [localhost:5000](https://localhost:5000), which embeds OpenAPI documentation for the HTTP API.

@@ -19,7 +19,7 @@ fi
 sed -i 's@${MODEL_DIR}@'$MODEL_DIR'@g' $MODEL_DIR/triton/fastertransformer/config.pbtxt

 # SET model replica in triton config.
-sed -i "s/count: 1/count: $MODEL_REPLICA/g" $MODEL_DIR/triton/fastertransformer/config.pbtxt
+sed -i "s/count: [[:digit:]]\+/count: $MODEL_REPLICA/g" $MODEL_DIR/triton/fastertransformer/config.pbtxt

 # Start triton server.
 mpirun -n 1 \
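
The pattern change in this hunk matters once the script runs more than once against the same `config.pbtxt`. The old expression only matches the literal text `count: 1`, so after the first substitution a later run with a different `MODEL_REPLICA` finds nothing to replace. The sketch below illustrates this under a few assumptions: GNU sed is available (for `-i` and `\+`), and a throwaway temp file with a single `count:` line stands in for the real Triton config. The helper names `old_set` and `new_set` are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for config.pbtxt; only the "count:" line matters here.
cfg="$(mktemp)"
printf 'instance_group [\n  {\n    count: 1\n  }\n]\n' > "$cfg"

# Old substitution: only matches the literal text "count: 1".
old_set() { sed -i "s/count: 1/count: $1/g" "$cfg"; }
# New substitution from this commit: matches any run of digits (GNU sed).
new_set() { sed -i "s/count: [[:digit:]]\+/count: $1/g" "$cfg"; }

# Two consecutive runs with the old pattern: the second one is a no-op.
old_set 8
old_set 2
old_result="$(grep -o 'count: [0-9]*' "$cfg")"

# Two consecutive runs with the new pattern: the last value wins.
new_set 8
new_set 2
new_result="$(grep -o 'count: [0-9]*' "$cfg")"

echo "old pattern: $old_result"
echo "new pattern: $new_result"
rm -f "$cfg"
```

With the old pattern the file stays at `count: 8` after the second call, while the new pattern updates it to `count: 2`. This is relevant to the SkyPilot flow because a stopped cluster keeps its disk contents, so the config file may already have been rewritten by an earlier run.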

@@ -1,14 +0,0 @@
resources:
  cloud: aws
  accelerators: T4:1

setup: |
  set -ex
  git clone https://github.com/TabbyML/tabby.git || true
  sudo curl -L "https://github.com/docker/compose/releases/download/v2.12.1/docker-compose-linux-x86_64" -o /usr/local/bin/docker-compose
  sudo chmod +x /usr/local/bin/docker-compose
  cd tabby/deployment && docker-compose pull

run: |
  cd tabby/deployment
  MODEL_REPLICA=8 docker-compose up

@@ -0,0 +1,50 @@
# Run Tabby server on any cloud with one click

## Background

[**SkyPilot**](https://github.com/skypilot-org/skypilot) is an open-source framework for seamlessly running machine learning on any cloud. With a simple CLI, users can easily launch many clusters and jobs, while substantially lowering their cloud bills. Currently, [Lambda Labs](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#lambda-cloud) (low-cost GPU cloud), [AWS](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#aws), [GCP](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#gcp), and [Azure](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#azure) are supported. See [docs](https://skypilot.readthedocs.io/en/latest/) to learn more.

## Steps

1. Install SkyPilot and [check that cloud credentials exist](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#cloud-account-setup):
   ```bash
   pip install "skypilot[aws,gcp,azure,lambda]"  # pick your clouds
   sky check
   ```
   <img src="https://i.imgur.com/7BUci5n.png" width="485" alt="`sky check` output showing enabled clouds for SkyPilot"/>

2. Get the [deployment folder](./):
   ```bash
   git clone https://github.com/TabbyML/tabby
   cd tabby/deployment/skypilot
   ```

3. Launch the cluster:
   ```bash
   sky launch -c tabby default.yml
   ```

4. Open another terminal and forward the Tabby ports to your machine:
   ```bash
   ssh -L 8501:localhost:8501 -L 5000:localhost:5000 -L 8080:localhost:8080 tabby
   ```

5. Open http://localhost:8501 in your browser and start coding!
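
Each `-L` option in step 4 maps a local port to the same port on the cluster, which is why the admin panel (8501) and the API server (5000) become reachable on `localhost`. The sketch below only assembles and prints that command from a list of ports, so it is easy to add or drop one; `forward_args` is a hypothetical helper, not part of SkyPilot or Tabby, and `tabby` is assumed to be the cluster name given to `sky launch -c`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Build one "-L <port>:localhost:<port>" option per port.
# forward_args is a hypothetical helper, not part of SkyPilot or Tabby.
forward_args() {
  local args=()
  local port
  for port in "$@"; do
    args+=("-L" "${port}:localhost:${port}")
  done
  # Join the array elements with single spaces.
  echo "${args[*]}"
}

# "tabby" is the cluster name passed to `sky launch -c`.
cmd="ssh $(forward_args 8501 5000 8080) tabby"
echo "$cmd"
```

The printed command is identical to the one in step 4; run it as-is to open the tunnel.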

## Cleaning up

When you are done, you can stop or tear down the cluster:

- **To stop the cluster**, run:
  ```bash
  sky stop tabby  # or pass your custom name if you used "-c <other name>"
  ```
  You can restart a stopped cluster and relaunch the Tabby server (the `run` section of the YAML) with:
  ```bash
  sky launch default.yml -c tabby --no-setup
  ```
  Note the `--no-setup` flag: a stopped cluster preserves its disk contents, so the setup step can be skipped.
- **To tear down the cluster** (non-restartable), run:
  ```bash
  sky down tabby  # or pass your custom name if you used "-c <other name>"
  ```

@@ -0,0 +1,22 @@
resources:
  accelerators: A100:1
  disk_size: 1024

setup: |
  set -ex

  # On some cloud providers, docker-compose is not installed by default.
  sudo curl -L https://github.com/docker/compose/releases/download/v2.17.2/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
  sudo chmod a+x /usr/local/bin/docker-compose

  # Pull tabby images.
  git clone https://github.com/TabbyML/tabby.git || true
  cd tabby/deployment

  # On certain cloud providers (e.g. Lambda Cloud), the default user is not added to the docker group, so we need sudo here.
  sudo docker-compose pull


run: |
  cd tabby/deployment
  MODEL_REPLICA=${MODEL_REPLICA:-8} sudo docker-compose up
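
The `run` section now uses `${MODEL_REPLICA:-8}` instead of a hard-coded `MODEL_REPLICA=8`, so a value supplied by the caller takes precedence and 8 is only a fallback. A minimal sketch of that expansion in plain bash, with the hypothetical helper `replica_count` wrapping the same expression:

```bash
#!/usr/bin/env bash
set -eu

# Same expansion as in the run section, wrapped in a function for illustration.
replica_count() { echo "${MODEL_REPLICA:-8}"; }

unset MODEL_REPLICA
unset_result="$(replica_count)"      # variable unset: falls back to 8

MODEL_REPLICA=""
empty_result="$(replica_count)"      # variable empty: ":-" also falls back to 8

MODEL_REPLICA=2
override_result="$(replica_count)"   # variable set: the caller's value wins

echo "unset=$unset_result empty=$empty_result override=$override_result"
```

Whether the variable then survives `sudo` depends on the host's sudoers policy, since many distributions reset the environment by default. If the replica count appears to be ignored, passing it on the sudo command line (`sudo MODEL_REPLICA=2 docker-compose up`) or using `sudo -E` may be worth trying.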