Commit Graph

156 Commits (e466c1d6cb672b9f9f89cd3e851f5554b4fdae2e)

Author SHA1 Message Date
Meng Zhang e466c1d6cb Release 0.2.2-rc.3
ctranslate2-bindings@0.2.2-rc.3
tabby@0.2.2-rc.3

Generated by cargo-workspaces
2023-10-09 18:43:49 -07:00
Meng Zhang 019c745ac6 debug: adjust channel to unbounded 2023-10-09 18:43:26 -07:00
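The one-line change above swaps a bounded channel for an unbounded one. A minimal std-library sketch of the trade-off (the real code presumably uses an async channel; all names here are illustrative, not Tabby's actual code):

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // A bounded channel (mpsc::sync_channel) blocks the sender when full;
    // an unbounded channel (mpsc::channel) never blocks the sender, at the
    // cost of potentially unbounded memory growth if the consumer lags.
    let (tx, rx) = mpsc::channel::<String>(); // unbounded

    let producer = thread::spawn(move || {
        for i in 0..1000 {
            // send() never blocks on an unbounded channel
            tx.send(format!("token {i}")).unwrap();
        }
    });

    producer.join().unwrap();
    let received: Vec<String> = rx.iter().collect();
    assert_eq!(received.len(), 1000);
}
```

The upside for a streaming debug scenario is that a slow consumer can no longer stall the producer; the downside is losing backpressure.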
Meng Zhang 26a1a1e164 Release 0.2.2-rc.2
ctranslate2-bindings@0.2.2-rc.2
tabby@0.2.2-rc.2

Generated by cargo-workspaces
2023-10-09 18:21:04 -07:00
Meng Zhang 64e0f92837 feat: add back streaming for ctranslate2 2023-10-09 18:20:39 -07:00
Meng Zhang c8d9b4d9ce Release 0.2.2-rc.1
ctranslate2-bindings@0.2.2-rc.1
tabby@0.2.2-rc.1

Generated by cargo-workspaces
2023-10-09 16:35:40 -07:00
Meng Zhang 6923d1b90f fix format 2023-10-09 16:35:12 -07:00
Meng Zhang 90e85c79c2 Release 0.2.2-rc.0
ctranslate2-bindings@0.2.2-rc.0
tabby@0.2.2-rc.0

Generated by cargo-workspaces
2023-10-09 16:32:03 -07:00
Meng Zhang 3c7af24047 fix: switch ctranslate2 to synchronous implementation 2023-10-09 16:30:46 -07:00
Meng Zhang 2d5b3e4ff5 chore: release v0.2.1 2023-10-03 17:13:39 -07:00
Meng Zhang 503c44e7c5 fix: playground environment misconfig 2023-10-03 17:10:02 -07:00
Meng Zhang b3b498624c feat: deprecate num_replicas_per_thread, generate default value for it 2023-10-03 17:02:37 -07:00
Meng Zhang 1afba47059 feat: allow set num_replicas_per_device for CUDA to increase throughput 2023-10-03 15:52:25 -07:00
Meng Zhang ceaa7ab012 chore: update main branch to v0.3.0-dev 2023-10-03 13:38:27 -07:00
Meng Zhang 1731c3075e chore: Update version to 0.2.0 2023-10-03 13:32:21 -07:00
Meng Zhang 0e5128e8fb feat: add chat_template field in tabby.json 2023-10-03 11:46:05 -07:00
Meng Zhang 7fc76228f7 chore: add debug log for /chat interface 2023-10-03 11:38:58 -07:00
Meng Zhang 692c2fe0fd Release 0.2.0-rc.0
ctranslate2-bindings@0.2.0-rc.0
http-api-bindings@0.2.0-rc.0
llama-cpp-bindings@0.2.0-rc.0
tabby@0.2.0-rc.0
tabby-common@0.2.0-rc.0
tabby-download@0.2.0-rc.0
tabby-inference@0.2.0-rc.0
tabby-scheduler@0.2.0-rc.0

Generated by cargo-workspaces
2023-10-02 19:14:12 -07:00
Meng Zhang 6306bb3f01
fix: if local file doesn't exist, local_cache_key should be cleared (#501)
* fix: if local file doesn't exist, local_cache_key should be cleared

* fix
2023-10-02 23:48:35 +00:00
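The fix in #501 above guards against a stale cache key pointing at a deleted file. A hedged sketch of the idea (the struct and field names are hypothetical, not tabby-download's actual types):

```rust
use std::path::Path;

// Hypothetical cache entry: if the file it points to has been removed
// from disk, the cache key must be cleared so the downloader fetches
// the file again instead of treating it as present.
struct CacheEntry {
    local_path: String,
    local_cache_key: Option<String>,
}

impl CacheEntry {
    fn validate(&mut self) {
        if !Path::new(&self.local_path).exists() {
            self.local_cache_key = None;
        }
    }
}

fn main() {
    let mut entry = CacheEntry {
        local_path: "/nonexistent/model.bin".into(),
        local_cache_key: Some("abc123".into()),
    };
    entry.validate();
    assert!(entry.local_cache_key.is_none());
}
```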
Meng Zhang ce20bd6154
refactor: use RegexSet for clearer stop regex construction (#499)
* fix: add a regression test case for stop words regex matching

* refactor: use RegexSet for clearer stop regex construction
2023-10-02 23:21:51 +00:00
Meng Zhang 63612d5a67
fix(tabby-download): even when prefer_local_file is set to true, we should still check the remote (if the network is available) to see whether a file should be upgraded (#500) 2023-10-02 23:09:57 +00:00
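The download decision described in #500 can be sketched as follows (a hedged illustration under assumed names; tabby-download's real logic and version comparison are not shown in this log):

```rust
// Even when a local copy exists and prefer_local_file is set, compare
// against the remote version whenever the network is reachable, so an
// upgraded remote file is not silently ignored.
#[derive(Debug, PartialEq)]
enum Action {
    UseLocal,
    Download,
}

fn decide(local_exists: bool, remote_version: Option<&str>, local_version: Option<&str>) -> Action {
    if !local_exists {
        return Action::Download;
    }
    match remote_version {
        // Network reachable and the remote file changed: upgrade.
        Some(remote) if Some(remote) != local_version => Action::Download,
        // Offline, or the local copy is current: keep the local file.
        _ => Action::UseLocal,
    }
}

fn main() {
    assert_eq!(decide(true, Some("v2"), Some("v1")), Action::Download);
    assert_eq!(decide(true, None, Some("v1")), Action::UseLocal);
    assert_eq!(decide(false, None, None), Action::Download);
}
```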
Meng Zhang 80a17aea37
feat: only show /v1/chat api if --chat-model is set (#498) 2023-10-02 17:17:27 +00:00
Meng Zhang f05dd3a2f6
refactor: cleanup chat api make it message oriented (#497)
* refactor: refactor into /chat/completions api

* Revert "feat: support request level stop words (#492)"

This reverts commit 0d6840e372.

* feat: adjust interface

* switch interface in tabby-playground

* move to chat/prompt, add unit test

* update interface
2023-10-02 15:39:15 +00:00
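The message-oriented refactor in #497 replaces a raw-prompt interface with a list of role-tagged messages. A minimal sketch of that shape (field names and the prompt template are illustrative, not Tabby's actual /v1/chat/completions schema):

```rust
// A chat request carries structured messages; the server, not the
// client, is responsible for flattening them into a model prompt.
struct Message {
    role: String,
    content: String,
}

fn build_prompt(messages: &[Message]) -> String {
    messages
        .iter()
        .map(|m| format!("<|{}|>{}", m.role, m.content))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let msgs = vec![
        Message { role: "user".into(), content: "hi".into() },
        Message { role: "assistant".into(), content: "hello".into() },
    ];
    assert_eq!(build_prompt(&msgs), "<|user|>hi\n<|assistant|>hello");
}
```

Keeping prompt construction server-side is what lets the chat template vary per model (see the later chat_template field in tabby.json).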
Meng Zhang dfdd0373a6 fix: when llama model load fails, panic in Rust stack 2023-10-01 22:25:25 -07:00
Meng Zhang 2171ba72ff
refactor: cleanup llama cpp implementations to fix warnings (#495) 2023-09-30 08:37:36 -07:00
Meng Zhang aea8c74bdc feat: add OpenAPI link to playground 2023-09-29 18:20:38 -07:00
Meng Zhang 10bf2d6c0c
feat: add param --instruct-model, allowing a different model to be specified for q&a use cases. (#494) 2023-09-29 23:44:53 +00:00
Meng Zhang eb15933255
feat: add tabby playground for q&a use case (#493)
* init commit

* support chat

* add theme toggle

* limit message to 2 lines

* update

* update formatting

* update

* update

* update

* fix formatting

* update
2023-09-29 15:51:54 -07:00
Meng Zhang 0d6840e372
feat: support request level stop words (#492) 2023-09-29 18:21:57 +00:00
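Request-level stop words (#492, later reverted and reworked) mean each request can supply sequences at which generation is cut. A hedged sketch of the cutting step (function and names are hypothetical):

```rust
// Cut generated text at the earliest occurrence of any stop sequence
// supplied with the request.
fn apply_stop_words(text: &str, stop_words: &[&str]) -> String {
    let cut = stop_words
        .iter()
        .filter_map(|s| text.find(s))
        .min()
        .unwrap_or(text.len());
    text[..cut].to_string()
}

fn main() {
    let out = apply_stop_words("fn main() {}\n\n// next", &["\n\n", "///"]);
    assert_eq!(out, "fn main() {}");
}
```

In a streaming server the real implementation must match stop sequences across chunk boundaries, which is what the regex-based DecodingFactory work elsewhere in this log addresses.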
Meng Zhang 486e507079
fix: correct Decoding behavior in incremental manner (#491)
* feat: implement IncrementalDecoding

* refactor: use IncrementalDecoding for ctranslate2

* refactor: rename StopWords to DecodingFactory

* refactor: move decoding logic to tabby-inference

* feat: optimize decoding range

* cleanup
2023-09-29 13:06:47 +00:00
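The IncrementalDecoding idea in #491 is to stop re-emitting the whole decoded sequence on every new token and instead yield only the newly produced suffix. A simplified sketch (assumes each decode extends the previous text; the type and method names are illustrative, not tabby-inference's API):

```rust
// Track how many bytes have already been yielded and return just the
// part of the decoded text the caller has not seen yet.
struct IncrementalDecoder {
    emitted: usize, // bytes already yielded
}

impl IncrementalDecoder {
    fn new() -> Self {
        Self { emitted: 0 }
    }

    // `full_text` is the decoding of all tokens so far.
    fn step<'a>(&mut self, full_text: &'a str) -> &'a str {
        let new = &full_text[self.emitted..];
        self.emitted = full_text.len();
        new
    }
}

fn main() {
    let mut dec = IncrementalDecoder::new();
    assert_eq!(dec.step("Hello"), "Hello");
    assert_eq!(dec.step("Hello, wor"), ", wor");
    assert_eq!(dec.step("Hello, world"), "ld");
}
```

A production decoder additionally has to hold back bytes that sit on an incomplete UTF-8 boundary, which this sketch omits.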
Meng Zhang 5d9ca6928c
feat: update llama.cpp (#488)
* feat: update llama.cpp

* remove useless include
2023-09-28 23:59:59 +00:00
Meng Zhang a159c2358d
refactor: move generate / generate_stream to /v1beta (#487) 2023-09-28 23:58:17 +00:00
Meng Zhang 56b7b850af
fix: Linkage issue on latest xcode commandline tools clang (#486) 2023-09-28 17:46:02 +00:00
Meng Zhang 44f013f26e
feat: add /generate and /generate_streaming (#482)
* feat: add generate_stream interface

* extract engine::create_engine

* feat: add generate::generate

* support streaming in llama.cpp

* support streaming in ctranslate2

* update

* fix formatting

* refactor: extract helpers functions
2023-09-28 17:20:50 +00:00
Meng Zhang d42942c379
feat: support ModelScope for model registry downloading (#477)
* feat: update cache info file after each file got downloaded

* refactor: extract Downloader for model downloading logic

* refactor: extract HuggingFaceRegistry

* refactor: extract serde_json to workspace dependency

* feat: add ModelScopeRegistry

* refactor: extract registry to its sub dir.

* feat: add scripts to mirror hf model to modelscope
2023-09-26 11:52:11 -07:00
胡锋 fb5a5971d3
feat: proxy server address mapping to the model server (#461)
* feat: proxy server address mapping to the model server

* fix: add swagger in Config

* refactor: add_proxy_server

* fix: missing semicolon
2023-09-21 07:06:51 +00:00
胡锋 de3a2271d6
fix(tabby): fix swagger's local server use local port (#458)
* fix: make swagger's local server use the local port

* fix: extract fn add_localhost_server

* fix: add_localhost_server return doc
2023-09-19 04:36:08 +00:00
Meng Zhang c107c991ff chore: bump tabby version to 0.1.1 2023-09-17 17:09:56 +08:00
Meng Zhang 97eeb6b926
feat: update llama.cpp to fetch latest starcoder support (#452)
* feat: bump llama.cpp to HEAD

* fix: turn off add_bos by default
2023-09-16 03:41:49 +00:00
Meng Zhang 86e48afbe0 feat: bump ctranslate2 to HEAD 2023-09-16 02:40:23 +08:00
Meng Zhang 30afa19bc0
feat: add LLAMA_CPP_LOG_LEVEL to control log level of llama.cpp (#436) 2023-09-12 14:41:39 +00:00
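Reading a log level from an environment variable like the LLAMA_CPP_LOG_LEVEL added above can be sketched as follows (the numeric scale and default are illustrative assumptions, not llama.cpp's actual values):

```rust
use std::env;

// Parse a numeric log level from an env var, falling back to a default
// when the variable is unset or malformed.
fn parse_level(raw: Option<&str>) -> u8 {
    raw.and_then(|v| v.trim().parse().ok()).unwrap_or(2)
}

fn main() {
    let level = parse_level(env::var("LLAMA_CPP_LOG_LEVEL").ok().as_deref());
    println!("effective log level: {level}");
    assert_eq!(parse_level(Some("3")), 3);
    assert_eq!(parse_level(None), 2);
    assert_eq!(parse_level(Some("not-a-number")), 2);
}
```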
Meng Zhang 1e7ecce697 fix: usize overflow issue in ctranslate2 max_length truncation 2023-09-12 20:56:35 +08:00
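The usize overflow class fixed above is the classic unsigned subtraction trap: with `len - max_length`, a prompt shorter than the limit panics in debug builds or wraps around in release builds. `saturating_sub` clamps the start index at 0 instead (a sketch; the function name is illustrative):

```rust
// Keep only the last `max_length` tokens without risking underflow.
fn truncate_start(tokens: &[u32], max_length: usize) -> &[u32] {
    let start = tokens.len().saturating_sub(max_length);
    &tokens[start..]
}

fn main() {
    let tokens: Vec<u32> = vec![1, 2, 3];
    // Shorter than the limit: kept whole, no underflow.
    assert_eq!(truncate_start(&tokens, 10), &[1u32, 2, 3][..]);
    // Longer than the limit: only the last `max_length` tokens survive.
    assert_eq!(truncate_start(&tokens, 2), &[2u32, 3][..]);
}
```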
Meng Zhang 1ccf9b2323 refactor: run make fix 2023-09-11 12:58:38 +08:00
Meng Zhang 09efa1b22b docs: add client extensions link in swagger landing page 2023-09-11 12:55:42 +08:00
leiwen83 e3c4a77fff
feat: add support fastchat http bindings (#421)
* feat: add support fastchat http bindings

Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
2023-09-10 22:17:58 +08:00
Meng Zhang f0ed366420
feat: add support vertex-ai http bindings (#419)
* feat: add support vertex-ai http bindings

* support prefix / suffix
2023-09-09 11:22:58 +00:00
Meng Zhang 17397c8c8c
feat: add http api bindings (#410)
* feat: add http-api-bindings

* feat: add http-api-bindings

* handle max_input_length

* rename

* update

* update

* add examples/simple.rs

* update

* add default value for stop words

* update

* fix lint

* update
2023-09-09 03:59:42 +00:00
Meng Zhang ad3b974d5c
feat: implement input truncation for llama-cpp-bindings (#416)
* feat: implement input truncation for llama-cpp-bindings

* set max input length to 1024

* fix: batching tokens with n_batches

* fix batching
2023-09-09 00:20:51 +08:00
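The two steps named in #416 — truncate the prompt to the last max_input_length tokens, then feed it to the model in bounded batches — can be sketched together (a hedged illustration; the function is hypothetical, and llama.cpp's actual batching API is not shown here):

```rust
// Truncate to the newest `max_input_length` tokens, then split into
// chunks of at most `n_batch` tokens for batched evaluation.
fn truncate_and_batch(tokens: &[u32], max_input_length: usize, n_batch: usize) -> Vec<Vec<u32>> {
    let start = tokens.len().saturating_sub(max_input_length);
    tokens[start..]
        .chunks(n_batch)
        .map(|c| c.to_vec())
        .collect()
}

fn main() {
    let tokens: Vec<u32> = (0..10).collect();
    let batches = truncate_and_batch(&tokens, 7, 3);
    // The last 7 tokens (3..10), split into chunks of 3.
    assert_eq!(batches, vec![vec![3, 4, 5], vec![6, 7, 8], vec![9]]);
}
```

Truncating from the front keeps the most recent context, which matters for code completion where the text nearest the cursor is the most informative.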
Meng Zhang 87b6b34120
feat: implement input truncation with options.max_input_length (#415) 2023-09-08 10:01:03 +00:00
Meng Zhang e780031ed6
feat: add ggml fp16 / q8_0 files (#407)
* feat: add ggml fp16 / q8_0 files

* add q8_0.gguf to optional download files

* add download options to split ctranslate2 files and ggml files
2023-09-06 17:12:29 +00:00
Meng Zhang a207520571
feat: turn on metal device by default on macosx / aarch64 devices (#398) 2023-09-05 13:03:49 +08:00