Meng Zhang
dfdd0373a6
fix: panic in the Rust stack when the llama model fails to load
2023-10-01 22:25:25 -07:00
Meng Zhang
2171ba72ff
refactor: cleanup llama cpp implementations to fix warnings ( #495 )
2023-09-30 08:37:36 -07:00
Meng Zhang
aea8c74bdc
feat: add OpenAPI link to playground
2023-09-29 18:20:38 -07:00
Meng Zhang
10bf2d6c0c
feat: add param --instruct-model, allowing a different model to be specified for q&a use cases. ( #494 )
2023-09-29 23:44:53 +00:00
Meng Zhang
eb15933255
feat: add tabby playground for q&a use case ( #493 )
...
* init commit
* support chat
* add theme toggle
* limit message to 2 lines
* update
* update formatting
* update
* update
* update
* fix formatting
* update
2023-09-29 15:51:54 -07:00
Meng Zhang
0d6840e372
feat: support request level stop words ( #492 )
2023-09-29 18:21:57 +00:00
Meng Zhang
486e507079
fix: correct Decoding behavior in incremental manner ( #491 )
...
* feat: implement IncrementalDecoding
* refactor: use IncrementalDecoding for ctranslate2
* refactor: rename StopWords to DecodingFactory
* refactor: move decoding logic to tabby-inference
* feat: optimize decoding range
* cleanup
2023-09-29 13:06:47 +00:00
Meng Zhang
5d9ca6928c
feat: update llama.cpp ( #488 )
...
* feat: update llama.cpp
* remove useless include
2023-09-28 23:59:59 +00:00
Meng Zhang
a159c2358d
refactor: move generate / generate_stream to /v1beta ( #487 )
2023-09-28 23:58:17 +00:00
Meng Zhang
56b7b850af
fix: Linkage issue on latest xcode commandline tools clang ( #486 )
2023-09-28 17:46:02 +00:00
Meng Zhang
44f013f26e
feat: add /generate and /generate_streaming ( #482 )
...
* feat: add generate_stream interface
* extract engine::create_engine
* feat: add generate::generate
* support streaming in llama.cpp
* support streaming in ctranslate2
* update
* fix formatting
* refactor: extract helper functions
2023-09-28 17:20:50 +00:00
Meng Zhang
d42942c379
feat: support ModelScope for model registry downloading ( #477 )
...
* feat: update cache info file after each file got downloaded
* refactor: extract Downloader for model downloading logic
* refactor: extract HuggingFaceRegistry
* refactor: extract serde_json to workspace dependency
* feat: add ModelScopeRegistry
* refactor: extract registry to its sub dir.
* feat: add scripts to mirror hf model to modelscope
2023-09-26 11:52:11 -07:00
胡锋
fb5a5971d3
feat: proxy server address mapping to the model server ( #461 )
...
* feat: proxy server address mapping to the model server
* fix: add swagger in Config
* refactor: add_proxy_server
* fix: missing semicolon
2023-09-21 07:06:51 +00:00
胡锋
de3a2271d6
fix(tabby): fix swagger's local server use local port ( #458 )
...
* fixed: swagger's local server use local port
* fix: extract fn add_localhost_server
* fix: add_localhost_server return doc
2023-09-19 04:36:08 +00:00
Meng Zhang
c107c991ff
chore: bump tabby version to 0.1.1
2023-09-17 17:09:56 +08:00
Meng Zhang
97eeb6b926
feat: update llama.cpp to fetch latest starcoder support ( #452 )
...
* feat: bump llama.cpp to HEAD
* fix: turn off add_bos by default
2023-09-16 03:41:49 +00:00
Meng Zhang
86e48afbe0
feat: bump ctranslate2 to HEAD
2023-09-16 02:40:23 +08:00
Meng Zhang
30afa19bc0
feat: add LLAMA_CPP_LOG_LEVEL to control log level of llama.cpp ( #436 )
2023-09-12 14:41:39 +00:00
Meng Zhang
1e7ecce697
fix: usize overflow issue in ctranslate2 max_length truncation
2023-09-12 20:56:35 +08:00
Meng Zhang
1ccf9b2323
refactor: run make fix
2023-09-11 12:58:38 +08:00
Meng Zhang
09efa1b22b
docs: add client extensions link in swagger landing page
2023-09-11 12:55:42 +08:00
leiwen83
e3c4a77fff
feat: add support for fastchat http bindings ( #421 )
...
* feat: add support for fastchat http bindings
Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
2023-09-10 22:17:58 +08:00
Meng Zhang
f0ed366420
feat: add support for vertex-ai http bindings ( #419 )
...
* feat: add support for vertex-ai http bindings
* support prefix / suffix
2023-09-09 11:22:58 +00:00
Meng Zhang
17397c8c8c
feat: add http api bindings ( #410 )
...
* feat: add http-api-bindings
* feat: add http-api-bindings
* handle max_input_length
* rename
* update
* update
* add examples/simple.rs
* update
* add default value for stop words
* update
* fix lint
* update
2023-09-09 03:59:42 +00:00
Meng Zhang
ad3b974d5c
feat: implement input truncation for llama-cpp-bindings ( #416 )
...
* feat: implement input truncation for llama-cpp-bindings
* set max input length to 1024
* fix: batching tokens with n_batches
* fix batching
2023-09-09 00:20:51 +08:00
Meng Zhang
87b6b34120
feat: implement input truncation with options.max_input_length ( #415 )
2023-09-08 10:01:03 +00:00
Meng Zhang
e780031ed6
feat: add ggml fp16 / q8_0 files ( #407 )
...
* feat: add ggml fp16 / q8_0 files
* add q8_0.gguf to optional download files
* add download options to split ctranslate2 files and ggml files
2023-09-06 17:12:29 +00:00
Meng Zhang
a207520571
feat: turn on metal device by default on macosx / aarch64 devices ( #398 )
2023-09-05 13:03:49 +08:00
Meng Zhang
d85cd81139
fix: ensure default suffix to be non-empty ( #400 )
2023-09-05 03:45:29 +00:00
Meng Zhang
e93a971d0e
feat: tune llama metal backend performance ( #393 )
...
* feat: support eos based stop
* feat: print performance stats after each inference
* update llama.cpp
* update commits
2023-09-05 10:14:29 +08:00
vodkaslime
2472cf3b55
test: use function call style snippet for prompt builder unit test ( #395 )
...
* test: better tests for build_prefix()
* chore
* chore: resolve comments
2023-09-04 04:54:18 +00:00
vodkaslime
74073aa77a
test: add build prefix test and debug chars counting [TAB-184] ( #394 )
...
* test: add count char test
* chore: fix lint
* chore
* chore
2023-09-03 20:57:26 +08:00
vodkaslime
3c7c8d9293
feat: add cargo test to github actions and run only unit tests in ci [TAB-185] ( #390 )
...
* feat: add cargo test to github actions
* chore: fix lint
* chore: add openblas dependency
* chore: update build dependency
* chore: resolve comments
* chore: fix lint
* chore: fix lint
* chore: test installing dependencies
* chore: refactor integ test
* update ci
* cleanup
---------
Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-09-03 05:04:52 +00:00
Meng Zhang
c8339a2912
refactor: use TabbyML/llama.cpp submodule
2023-09-03 12:38:54 +08:00
Meng Zhang
3acd5d9bc4
refactor: remove llama.cpp subtree
2023-09-03 12:37:26 +08:00
Meng Zhang
92c8ae8ee7
feat: embed ggml-metal.metal
2023-09-03 10:41:03 +08:00
Meng Zhang
ed6c5b2e60
Merge commit 'aad80a58b07836bfbf6aedd50993bc54b4257388' as 'crates/llama-cpp-bindings/llama.cpp'
2023-09-03 10:07:10 +08:00
Meng Zhang
d4137463ef
remove llama.cpp submodule
2023-09-03 10:04:26 +08:00
Meng Zhang
e360b438b4
fix lint
2023-09-03 10:01:28 +08:00
Meng Zhang
3f7aa99b0d
feat: support cancellation in llama backend
2023-09-03 09:59:40 +08:00
Meng Zhang
3573d4378e
feat: llama.cpp for metal support [TAB-146] ( #391 )
...
* feat: init commit adding llama-cpp-bindings
* add llama.cpp submodule
* add LlamaEngine to hold llama context / llama model
* add cxxbridge
* add basic greedy sampling
* move files
* make compile success
* connect TextGeneration with LlamaEngine
* experimental support llama.cpp
* add metal device
* add Accelerate
* fix namespace for llama-cpp-bindings
* fix lint
* move stepping logic to rust
* add stop words package
* use stop-words in ctranslate2-bindings
* use raw string for regex
* use Arc<Tokenizer> for sharing tokenizers
* refactor: remove useless stop_words_encoding_offset
* switch to tokenizers 0.13.4-rc.3
* fix lints in cpp
* simplify implementation of greedy decoding
* feat: split metal feature for llama backend
* add ci
* update ci
* build tabby bin in ci build
2023-09-03 09:59:07 +08:00
vodkaslime
5dff349801
add single line comment to languages so they can be used in prompting [TAB-181] ( #388 )
...
* chore: add comment signs to extended languages
* Update crates/tabby/src/serve/completions/prompt.rs
---------
Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-09-01 03:43:27 +00:00
vodkaslime
63c00494f3
test: unit tests to prompt builder [TAB-180] ( #387 )
...
* test: unit tests to prompt builder
* chore: fix typo
* chore: fix lint
* chore: resolve comments
2023-09-01 09:20:20 +08:00
vodkaslime
90aadad3ce
feat: map js,ts,jsx and tsx to js-ts as unified language [TAB-181] ( #386 )
...
* feat: reduce js, ts, jsx and tsx to js-ts
* chore: refactor and add language reducing to both indexing and dataset jobs
* chore: only reduce language in dataset job
* chore: only reduce language in index job
* chore: fix lint
* chore: resolve comments
2023-08-31 17:21:39 +00:00
vodkaslime
e5598e63f2
feat: extend language [TAB-181] ( #385 )
...
* feat: extend indexer's language support
* feat: extend language support
* chore: add support for mjs and mts
* chore: fix lint
2023-08-31 07:36:57 +00:00
Meng Zhang
c44a9c7195
fix: correct git_describe in /health ( #383 )
...
* fix: add missing Version component in OpenAPI definition
* fix: allow tag / dirty in git describe
2023-08-31 01:06:36 +00:00
Meng Zhang
57baecb370
fix: switch default running backend to openblas on x86 linux ( #380 )
2023-08-30 14:19:35 +00:00
Meng Zhang
054aefaf15
chore: add linux static build ( #379 )
...
* chore: add linux static build
* add touch
* update build env
* add sudo
* fix: protobuf ubuntu target
2023-08-30 18:45:05 +08:00
Meng Zhang
3526ca3164
chore: build with ruy (cpu only) on static mode for linux. ( #378 )
...
* chore: build with ruy (cpu only) on static mode for linux.
* update cmake min version
2023-08-30 18:04:40 +08:00
Meng Zhang
fc9a623e72
feat: add logging on server starting ( #372 )
2023-08-28 06:12:00 +00:00