feat: add deployment script on lambda cloud with skypilot (#37)

* feat: add deployment script on lambda cloud with skypilot
* docs: adjust API documentation level
* fix: move docker-compose install before the docker-compose pull
* Fix documentation
* Update replica updating
parent 0a66a9d498
commit 22fbaefbd4
@@ -56,7 +56,10 @@ We also provide an interactive playground in admin panel [localhost:8501](http:
 
-### API documentation
+### Skypilot
+
+See [deployment/skypilot/README.md](./deployment/skypilot/README.md)
+
+## API documentation
 
 Tabby opens a FastAPI server at [localhost:5000](https://localhost:5000), which embeds an OpenAPI documentation of the HTTP API.
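Since the server is FastAPI-based, the machine-readable schema behind that documentation page is normally served at `/openapi.json` (a FastAPI default). A self-contained sketch of fetching it; a stub server with a hypothetical minimal schema stands in for a running Tabby instance, so the titles and paths here are illustrative, not Tabby's real API:

```python
# Fetch an OpenAPI schema the way you would from a running Tabby server at
# http://localhost:5000/openapi.json. A local stub stands in for Tabby so the
# example runs anywhere; SCHEMA contents are hypothetical.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

SCHEMA = {"openapi": "3.0.2", "info": {"title": "Tabby Server"}, "paths": {}}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(SCHEMA).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *_args):  # silence per-request logging
        pass

server = HTTPServer(("localhost", 0), Handler)  # port 0: pick any free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

with urllib.request.urlopen(f"http://localhost:{port}/openapi.json") as resp:
    schema = json.load(resp)
server.shutdown()
print(schema["info"]["title"])  # -> Tabby Server
```

Against a real deployment you would point `urlopen` at port 5000 instead of the stub.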
@@ -19,7 +19,7 @@ fi
 sed -i 's@${MODEL_DIR}@'$MODEL_DIR'@g' $MODEL_DIR/triton/fastertransformer/config.pbtxt
 
 # SET model replica in triton config.
-sed -i "s/count: 1/count: $MODEL_REPLICA/g" $MODEL_DIR/triton/fastertransformer/config.pbtxt
+sed -i "s/count: [[:digit:]]\+/count: $MODEL_REPLICA/g" $MODEL_DIR/triton/fastertransformer/config.pbtxt
 
 # Start triton server.
 mpirun -n 1 \
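The point of this one-line change is idempotency: matching the literal `count: 1` only works the first time, while matching any digit run keeps working on re-runs after the count has already been rewritten. A throwaway reproduction (illustrative file contents; assumes GNU sed, as on the Linux deployment hosts):

```shell
# Write a sample config with Triton's default replica count.
printf 'instance_group [ { count: 1 } ]\n' > /tmp/demo_config.pbtxt

# First run: the old pattern "s/count: 1/..." and the digit pattern both match.
MODEL_REPLICA=4
sed -i "s/count: [[:digit:]]\+/count: $MODEL_REPLICA/g" /tmp/demo_config.pbtxt

# Second run with a new value: "s/count: 1/..." would no longer match
# (the file now says "count: 4"), but the digit pattern still does.
MODEL_REPLICA=8
sed -i "s/count: [[:digit:]]\+/count: $MODEL_REPLICA/g" /tmp/demo_config.pbtxt

grep 'count:' /tmp/demo_config.pbtxt   # -> instance_group [ { count: 8 } ]
```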
@@ -1,14 +0,0 @@
resources:
  cloud: aws
  accelerators: T4:1

setup: |
  set -ex
  git clone https://github.com/TabbyML/tabby.git || true
  sudo curl -L "https://github.com/docker/compose/releases/download/v2.12.1/docker-compose-linux-x86_64" -o /usr/local/bin/docker-compose
  sudo chmod +x /usr/local/bin/docker-compose
  cd tabby/deployment && docker-compose pull

run: |
  cd tabby/deployment
  MODEL_REPLICA=8 docker-compose up
@@ -0,0 +1,50 @@
# Run Tabby server on any cloud with one click

## Background

[**SkyPilot**](https://github.com/skypilot-org/skypilot) is an open-source framework for seamlessly running machine learning on any cloud. With a simple CLI, users can easily launch many clusters and jobs, while substantially lowering their cloud bills. Currently, [Lambda Labs](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#lambda-cloud) (low-cost GPU cloud), [AWS](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#aws), [GCP](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#gcp), and [Azure](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#azure) are supported. See [docs](https://skypilot.readthedocs.io/en/latest/) to learn more.

## Steps

1. Install SkyPilot and [check that cloud credentials exist](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#cloud-account-setup):

   ```bash
   pip install "skypilot[aws,gcp,azure,lambda]"  # pick your clouds
   sky check
   ```

   <img src="https://i.imgur.com/7BUci5n.png" width="485" alt="`sky check` output showing enabled clouds for SkyPilot"/>

2. Get the [deployment folder](./):

   ```bash
   git clone https://github.com/TabbyML/tabby
   cd tabby/deployment/skypilot
   ```

3. Launch a cluster running Tabby:

   ```bash
   sky launch -c tabby default.yml
   ```

4. Open another terminal and forward the server ports over SSH:

   ```bash
   ssh -L 8501:localhost:8501 -L 5000:localhost:5000 -L 8080:localhost:8080 tabby
   ```

5. Open http://localhost:8501 in your browser and start coding!

   

## Cleaning up

When you are done, you can stop or tear down the cluster:

- **To stop the cluster**, run

  ```bash
  sky stop tabby  # or pass your custom name if you used "-c <other name>"
  ```

  You can restart a stopped cluster and relaunch Tabby (the `run` section in the YAML) with

  ```bash
  sky launch default.yml -c tabby --no-setup
  ```

  Note the `--no-setup` flag: a stopped cluster preserves its disk contents, so we can skip redoing the setup.

- **To tear down the cluster** (non-restartable), run

  ```bash
  sky down tabby  # or pass your custom name if you used "-c <other name>"
  ```
@@ -0,0 +1,22 @@
resources:
  accelerators: A100:1
  disk_size: 1024

setup: |
  set -ex

  # On some cloud providers, docker-compose is not installed by default.
  sudo curl -L https://github.com/docker/compose/releases/download/v2.17.2/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
  sudo chmod a+x /usr/local/bin/docker-compose

  # Pull tabby images.
  git clone https://github.com/TabbyML/tabby.git || true
  cd tabby/deployment

  # On certain cloud providers (e.g. lambda cloud), the default user is not added to the docker group, so we need sudo here.
  sudo docker-compose pull

run: |
  cd tabby/deployment
  MODEL_REPLICA=${MODEL_REPLICA:-8} sudo docker-compose up
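The `run` command relies on shell parameter default expansion: `${MODEL_REPLICA:-8}` uses the caller's value when the variable is set and non-empty, and falls back to 8 otherwise. A quick standalone illustration of the idiom:

```shell
# ${VAR:-default} expands to $VAR if set and non-empty, else to the default.
unset MODEL_REPLICA
echo "replicas=${MODEL_REPLICA:-8}"   # -> replicas=8

MODEL_REPLICA=4
echo "replicas=${MODEL_REPLICA:-8}"   # -> replicas=4
```

This is why `sky launch` (or any caller) can override the replica count simply by exporting `MODEL_REPLICA` before the `run` section executes, while a bare invocation keeps the default of 8.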