tabby/crates/http-api-bindings
Meng Zhang 44f013f26e
feat: add /generate and /generate_streaming (#482)
* feat: add generate_stream interface

* extract engine::create_engine

* feat add generate::generate

* support streaming in llama.cpp

* support streaming in ctranslate2

* update

* fix formatting

* refactor: extract helpers functions
2023-09-28 17:20:50 +00:00
examples feat: add http api bindings (#410) 2023-09-09 03:59:42 +00:00
src feat: add /generate and /generate_streaming (#482) 2023-09-28 17:20:50 +00:00
Cargo.toml feat: add /generate and /generate_streaming (#482) 2023-09-28 17:20:50 +00:00
README.md feat: add support vertex-ai http bindings (#419) 2023-09-09 11:22:58 +00:00

README.md

Examples

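# Vertex AI model to query, and the project from the active gcloud configuration.
export MODEL_ID="code-gecko"
export PROJECT_ID="$(gcloud config get project)"
# Predict endpoint for that model in us-central1.
export API_ENDPOINT="https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:predict"
# Access tokens are short-lived; re-run this export if requests start failing with auth errors.
export AUTHORIZATION="Bearer $(gcloud auth print-access-token)"

cargo run --example simple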

Usage

# Same environment as in Examples: model, project, predict endpoint, and bearer token.
export MODEL_ID="code-gecko"
export PROJECT_ID="$(gcloud config get project)"
export API_ENDPOINT="https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/publishers/google/models/${MODEL_ID}:predict"
export AUTHORIZATION="Bearer $(gcloud auth print-access-token)"

# Serve with the experimental HTTP device; --model takes a JSON spec with the endpoint and credentials.
cargo run serve --device experimental-http --model "{\"kind\": \"vertex-ai\", \"api_endpoint\": \"$API_ENDPOINT\", \"authorization\": \"$AUTHORIZATION\"}"
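
The hand-escaped quotes in the `--model` argument above are easy to get wrong. A minimal sketch of one alternative is to assemble the same JSON with `printf` and pass it as a variable. `MODEL_SPEC` is a name introduced here for illustration only, and the fallback endpoint and token are placeholders, not real values. This assumes neither value contains a double quote or backslash, which would need JSON escaping.

```shell
# Fallback placeholders (hypothetical values) so the snippet runs on its own;
# values already exported as shown above take precedence.
API_ENDPOINT="${API_ENDPOINT:-https://example.invalid/v1/models/code-gecko:predict}"
AUTHORIZATION="${AUTHORIZATION:-Bearer placeholder-token}"

# Assemble the JSON model spec; single quotes keep the JSON quotes literal,
# and each %s is filled from the corresponding variable.
MODEL_SPEC="$(printf '{"kind": "vertex-ai", "api_endpoint": "%s", "authorization": "%s"}' "$API_ENDPOINT" "$AUTHORIZATION")"

# Print the spec to inspect it before starting the server.
printf '%s\n' "$MODEL_SPEC"

# Then pass it along unchanged:
# cargo run serve --device experimental-http --model "$MODEL_SPEC"
```

Printing the spec first makes it easy to spot an empty `API_ENDPOINT` or an expired token before the server starts.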