Update README.md
parent 495e6aaf76
commit aa7ed053ec
README.md: 14 lines changed
@@ -42,13 +42,6 @@ docker run \
 tabbyml/tabby
 ```

-You can then query the server using `/v1/completions` endpoint:
-```bash
-curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
-"prompt": "def binarySearch(arr, left, right, x):\n mid = (left +"
-}'
-```
-
 To use the GPU backend (triton) for a faster inference speed:
 ```bash
 docker run \
@@ -65,6 +58,13 @@ docker run \
 ```
 Note: To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). We also recommend using NVIDIA drivers with CUDA version 11.8 or higher.

+You can then query the server using `/v1/completions` endpoint:
+```bash
+curl -X POST http://localhost:5000/v1/completions -H 'Content-Type: application/json' --data '{
+"prompt": "def binarySearch(arr, left, right, x):\n mid = (left +"
+}'
+```
+
 We also provides an interactive playground in admin panel [localhost:8501](http://localhost:8501)
