Commit Graph

123 Commits (1d6ac7836b8bf3a65e087fe8b27f0847af51f1d6)

Author SHA1 Message Date
Meng Zhang d42942c379
feat: support ModelScope for model registry downloading (#477)
* feat: update cache info file after each file is downloaded

* refactor: extract Downloader for model downloading logic

* refactor: extract HuggingFaceRegistry

* refactor: extract serde_json to workspace dependency

* feat: add ModelScopeRegistry

* refactor: extract registry to its sub dir.

* feat: add scripts to mirror hf model to modelscope
2023-09-26 11:52:11 -07:00
胡锋 fb5a5971d3
feat: proxy server address mapping to the model server (#461)
* feat: proxy server address mapping to the model server

* fix: add swagger in Config

* refactor: add_proxy_server

* fix: missing semicolon
2023-09-21 07:06:51 +00:00
胡锋 de3a2271d6
fix(tabby): fix swagger's local server use local port (#458)
* fixed: swagger's local server use local port

* fix: extract fn add_localhost_server

* fix: add_localhost_server return doc
2023-09-19 04:36:08 +00:00
Meng Zhang c107c991ff chore: bump tabby version to 0.1.1 2023-09-17 17:09:56 +08:00
Meng Zhang 97eeb6b926
feat: update llama.cpp to fetch latest starcoder support (#452)
* feat: bump llama.cpp to HEAD

* fix: turn off add_bos by default
2023-09-16 03:41:49 +00:00
Meng Zhang 86e48afbe0 feat: bump ctranslate2 to HEAD 2023-09-16 02:40:23 +08:00
Meng Zhang 30afa19bc0
feat: add LLAMA_CPP_LOG_LEVEL to control log level of llama.cpp (#436) 2023-09-12 14:41:39 +00:00
Meng Zhang 1e7ecce697 fix: usize overflow issue in ctranslate2 max_length truncation 2023-09-12 20:56:35 +08:00
Meng Zhang 1ccf9b2323 refactor: run make fix 2023-09-11 12:58:38 +08:00
Meng Zhang 09efa1b22b docs: add client extensions link in swagger landing page 2023-09-11 12:55:42 +08:00
leiwen83 e3c4a77fff
feat: add support fastchat http bindings (#421)
* feat: add support fastchat http bindings

Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
2023-09-10 22:17:58 +08:00
Meng Zhang f0ed366420
feat: add support vertex-ai http bindings (#419)
* feat: add support vertex-ai http bindings

* support prefix / suffix
2023-09-09 11:22:58 +00:00
Meng Zhang 17397c8c8c
feat: add http api bindings (#410)
* feat: add http-api-bindings

* feat: add http-api-bindings

* handle max_input_length

* rename

* update

* update

* add examples/simple.rs

* update

* add default value for stop words

* update

* fix lint

* update
2023-09-09 03:59:42 +00:00
Meng Zhang ad3b974d5c
feat: implement input truncation for llama-cpp-bindings (#416)
* feat: implement input truncation for llama-cpp-bindings

* set max input length to 1024

* fix: batching tokens with n_batches

* fix batching
2023-09-09 00:20:51 +08:00
Meng Zhang 87b6b34120
feat: implement input truncation with options.max_input_length (#415) 2023-09-08 10:01:03 +00:00
Meng Zhang e780031ed6
feat: add ggml fp16 / q8_0 files (#407)
* feat: add ggml fp16 / q8_0 files

* add q8_0.gguf to optional download files

* add download options to split ctranslate2 files and ggml files
2023-09-06 17:12:29 +00:00
Meng Zhang a207520571
feat: turn on metal device by default on macosx / aarch64 devices (#398) 2023-09-05 13:03:49 +08:00
Meng Zhang d85cd81139
fix: ensure default suffix to be non-empty (#400) 2023-09-05 03:45:29 +00:00
Meng Zhang e93a971d0e
feat: tune llama metal backend performance (#393)
* feat: support eos based stop

* feat: print performance stats after each inference

* update llama.cpp

* update commits
2023-09-05 10:14:29 +08:00
vodkaslime 2472cf3b55
test: use function call style snippet for prompt builder unit test (#395)
* test: better tests for build_prefix()

* chore

* chore: resolve comments
2023-09-04 04:54:18 +00:00
vodkaslime 74073aa77a
test: add build prefix test and debug chars counting [TAB-184] (#394)
* test: add count char test

* chore: fix lint

* chore

* chore
2023-09-03 20:57:26 +08:00
vodkaslime 3c7c8d9293
feat: add cargo test to github actions and run only unit tests in ci [TAB-185] (#390)
* feat: add cargo test to github actions

* chore: fix lint

* chore: add openblas dependency

* chore: update build dependency

* chore: resolve comments

* chore: fix lint

* chore: fix lint

* chore: test installing dependencies

* chore: refactor integ test

* update ci

* cleanup

---------

Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-09-03 05:04:52 +00:00
Meng Zhang c8339a2912 refactor: use TabbyML/llama.cpp submodule 2023-09-03 12:38:54 +08:00
Meng Zhang 3acd5d9bc4 refactor: remove llama.cpp subtree 2023-09-03 12:37:26 +08:00
Meng Zhang 92c8ae8ee7 feat: embed ggml-metal.metal 2023-09-03 10:41:03 +08:00
Meng Zhang ed6c5b2e60 Merge commit 'aad80a58b07836bfbf6aedd50993bc54b4257388' as 'crates/llama-cpp-bindings/llama.cpp' 2023-09-03 10:07:10 +08:00
Meng Zhang d4137463ef remove llama.cpp submodule 2023-09-03 10:04:26 +08:00
Meng Zhang e360b438b4 fix lint 2023-09-03 10:01:28 +08:00
Meng Zhang 3f7aa99b0d feat: support cancellation in llama backend 2023-09-03 09:59:40 +08:00
Meng Zhang 3573d4378e
feat: llama.cpp for metal support [TAB-146] (#391)
* feat: init commit adding llama-cpp-bindings

* add llama.cpp submodule

* add LlamaEngine to hold llama context / llama model

* add cxxbridge

* add basic greedy sampling

* move files

* make compile success

* connect TextGeneration with LlamaEngine

* experimental support llama.cpp

* add metal device

* add Accelerate

* fix namespace for llama-cpp-bindings

* fix lint

* move stepping logic to rust

* add stop words package

* use stop-words in ctranslate2-bindings

* use raw string for regex

* use Arc<Tokenizer> for sharing tokenizers

* refactor: remove useless stop_words_encoding_offset

* switch to tokenizers 0.13.4-rc.3

* fix lints in cpp

* simplify implementation of greedy decoding

* feat: split metal feature for llama backend

* add ci

* update ci

* build tabby bin in ci build
2023-09-03 09:59:07 +08:00
vodkaslime 5dff349801
add single line comment to languages so they can be used in prompting [TAB-181] (#388)
* chore: add comment signs to extended languages

* Update crates/tabby/src/serve/completions/prompt.rs

---------

Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-09-01 03:43:27 +00:00
vodkaslime 63c00494f3
test: unit tests to prompt builder [TAB-180] (#387)
* test: unit tests to prompt builder

* chore: fix typo

* chore: fix lint

* chore: resolve comments
2023-09-01 09:20:20 +08:00
vodkaslime 90aadad3ce
feat: map js,ts,jsx and tsx to js-ts as unified language [TAB-181] (#386)
* feat: reduce js, ts, jsx and tsx to js-ts

* chore: refactor and add language reducing to both indexing and dataset jobs

* chore: only reduce language in dataset job

* chore: only reduce language in index job

* chore: fix lint

* chore: resolve comments
2023-08-31 17:21:39 +00:00
vodkaslime e5598e63f2
feat: extend language [TAB-181] (#385)
* feat: extend indexer's language support

* feat: extend language support

* chore: add support for mjs and mts

* chore: fix lint
2023-08-31 07:36:57 +00:00
Meng Zhang c44a9c7195
fix: correct git_describe in /health (#383)
* fix: add missing Version component in OpenAPI definition

* fix: allow tag / dirty in git describe
2023-08-31 01:06:36 +00:00
Meng Zhang 57baecb370
fix: switch default running backend to openblas on x86 linux (#380) 2023-08-30 14:19:35 +00:00
Meng Zhang 054aefaf15
chore: add linux static build (#379)
* chore: add linux static build

* add touch

* update build env

* add sudo

* fix: protobuf ubuntu target
2023-08-30 18:45:05 +08:00
Meng Zhang 3526ca3164
chore: build with ruy (cpu only) on static mode for linux. (#378)
* chore: build with ruy (cpu only) on static mode for linux.

* update cmake min version
2023-08-30 18:04:40 +08:00
Meng Zhang fc9a623e72
feat: add logging on server starting (#372) 2023-08-28 06:12:00 +00:00
Meng Zhang 65836ee199
feat: add stop words encoding offset for ctranslate model config (#371)
* feat: add stop words encoding offset for ctranslate model config

* feat: set default suffix to \n

* add special treatment for bytefallback tokens
2023-08-28 14:07:01 +08:00
vodkaslime 2a91a21787
feat: add gpu info to health state [TAB-162] (#364)
* feat: add gpu info to health response

* chore: error handling

* chore: refactor cpu manager code

* chore: typo

* chore: fix context mutability

* chore: fix context mutability

* feat: add link to NVML lib

* chore: refactor

* lint

* chore: resolve comments

* chore: fix typo

* chore: fix

* chore: resolve comments

* chore: fix

* chore: resolve comments
2023-08-21 18:06:38 +08:00
Meng Zhang b1ad936033
feat: add version information in health state. (#363)
* feat: add git_hash in health state

* add more version information in health state
2023-08-20 15:21:12 +00:00
Meng Zhang df45573501
feat: reduce ServeHealth event to every 300s to reduce event volume (#362) 2023-08-20 12:36:59 +00:00
vodkaslime 2026b4dd0e
feat: add architecture/cpu info to health api response [TAB-162] (#355)
* feat: add architecture, cpu and gpu info to health command

* chore: fix

* chore: fix

* chore: fix

* chore: fix lint

* chore: fix lint

* chore: remove gpu

* chore: resolve comments

* chore: resolve comments

* Update health.rs

---------

Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-08-15 15:22:03 +00:00
Meng Zhang dbc89831b1
feat: add serve health heartbeat (#343)
* add serve health tracking

* fix lint

* fix
2023-08-09 08:08:42 +00:00
Meng Zhang d0f6ad2d2a
feat: add anonymous usage tracker (#342)
* feat: add anonymous usage tracker

* improve deps

* update

* update
2023-08-09 07:31:13 +00:00
Meng Zhang 220fcc0d65
fix: make `config.experimental` optional (#339)
* fix: make `config.experimental` optional

* add unit test for empty toml config
2023-08-07 09:53:00 +00:00
Meng Zhang 4eaae27ed3
Update Cargo.toml (#331) 2023-08-03 19:55:00 +08:00
Meng Zhang 6a50902ca7
fix: support ctranslate2 rev7 vocab files (.json) (#327) 2023-08-02 13:36:31 +00:00
Meng Zhang 57c811b30f
fix: improve download logging (#325)
* Suggest using `-it` so that docker run renders the download progress bar properly

* add info! log for model download
2023-08-02 06:30:35 +00:00