Meng Zhang
e466c1d6cb
Release 0.2.2-rc.3
...
ctranslate2-bindings@0.2.2-rc.3
tabby@0.2.2-rc.3
Generated by cargo-workspaces
2023-10-09 18:43:49 -07:00
Meng Zhang
019c745ac6
debug: adjust channel to unbounded
2023-10-09 18:43:26 -07:00
Meng Zhang
26a1a1e164
Release 0.2.2-rc.2
...
ctranslate2-bindings@0.2.2-rc.2
tabby@0.2.2-rc.2
Generated by cargo-workspaces
2023-10-09 18:21:04 -07:00
Meng Zhang
64e0f92837
feat: add back streaming for ctranslate2
2023-10-09 18:20:39 -07:00
Meng Zhang
c8d9b4d9ce
Release 0.2.2-rc.1
...
ctranslate2-bindings@0.2.2-rc.1
tabby@0.2.2-rc.1
Generated by cargo-workspaces
2023-10-09 16:35:40 -07:00
Meng Zhang
6923d1b90f
fix format
2023-10-09 16:35:12 -07:00
Meng Zhang
90e85c79c2
Release 0.2.2-rc.0
...
ctranslate2-bindings@0.2.2-rc.0
tabby@0.2.2-rc.0
Generated by cargo-workspaces
2023-10-09 16:32:03 -07:00
Meng Zhang
3c7af24047
fix: switch ctranslate2 to synchronous implementation
2023-10-09 16:30:46 -07:00
Meng Zhang
1731c3075e
chore: Update version to 0.2.0
2023-10-03 13:32:21 -07:00
Meng Zhang
692c2fe0fd
Release 0.2.0-rc.0
...
ctranslate2-bindings@0.2.0-rc.0
http-api-bindings@0.2.0-rc.0
llama-cpp-bindings@0.2.0-rc.0
tabby@0.2.0-rc.0
tabby-common@0.2.0-rc.0
tabby-download@0.2.0-rc.0
tabby-inference@0.2.0-rc.0
tabby-scheduler@0.2.0-rc.0
Generated by cargo-workspaces
2023-10-02 19:14:12 -07:00
Meng Zhang
f05dd3a2f6
refactor: cleanup chat api make it message oriented (#497)
...
* refactor: refactor into /chat/completions api
* Revert "feat: support request level stop words (#492)"
This reverts commit 0d6840e372.
* feat: adjust interface
* switch interface in tabby-playground
* move to chat/prompt, add unit test
* update interface
2023-10-02 15:39:15 +00:00
Meng Zhang
0d6840e372
feat: support request level stop words (#492)
2023-09-29 18:21:57 +00:00
Meng Zhang
486e507079
fix: correct Decoding behavior in incremental manner (#491)
...
* feat: implement IncrementalDecoding
* refactor: use IncrementalDecoding for ctranslate2
* refactor: rename StopWords to DecodingFactory
* refactor: move decoding logic to tabby-inference
* feat: optimize decoding range
* cleanup
2023-09-29 13:06:47 +00:00
Meng Zhang
44f013f26e
feat: add /generate and /generate_streaming (#482)
...
* feat: add generate_stream interface
* extract engine::create_engine
* feat: add generate::generate
* support streaming in llama.cpp
* support streaming in ctranslate2
* update
* fix formatting
* refactor: extract helper functions
2023-09-28 17:20:50 +00:00
Meng Zhang
86e48afbe0
feat: bump ctranslate2 to HEAD
2023-09-16 02:40:23 +08:00
Meng Zhang
1e7ecce697
fix: usize overflow issue in ctranslate2 max_length truncation
2023-09-12 20:56:35 +08:00
Meng Zhang
87b6b34120
feat: implement input truncation with options.max_input_length (#415)
2023-09-08 10:01:03 +00:00
Meng Zhang
3573d4378e
feat: llama.cpp for metal support [TAB-146] (#391)
...
* feat: init commit adding llama-cpp-bindings
* add llama.cpp submodule
* add LlamaEngine to hold llama context / llama model
* add cxxbridge
* add basic greedy sampling
* move files
* make compile success
* connect TextGeneration with LlamaEngine
* experimental support llama.cpp
* add metal device
* add Accelerate
* fix namespace for llama-cpp-bindings
* fix lint
* move stepping logic to rust
* add stop words package
* use stop-words in ctranslate2-bindings
* use raw string for regex
* use Arc<Tokenizer> for sharing tokenizers
* refactor: remove useless stop_words_encoding_offset
* switch to tokenizers 0.13.4-rc.3
* fix lints in cpp
* simplify implementation of greedy decoding
* feat: split metal feature for llama backend
* add ci
* update ci
* build tabby bin in ci build
2023-09-03 09:59:07 +08:00
Meng Zhang
57baecb370
fix: switch default running backend to openblas on x86 linux (#380)
2023-08-30 14:19:35 +00:00
Meng Zhang
3526ca3164
chore: build with ruy (cpu only) on static mode for linux. (#378)
...
* chore: build with ruy (cpu only) on static mode for linux.
* update cmake min version
2023-08-30 18:04:40 +08:00
Meng Zhang
65836ee199
feat: add stop words encoding offset for ctranslate model config (#371)
...
* feat: add stop words encoding offset for ctranslate model config
* feat: set default suffix to \n
* add special treatment for bytefallback tokens
2023-08-28 14:07:01 +08:00
Meng Zhang
b8308b7118
refactor: extract TextGeneration trait (#324)
...
* add tabby-inference
* extract TextGeneration trait
* format
* Rename TextInferenceEngine to CTranslate2Engine
2023-08-02 06:12:51 +00:00
Meng Zhang
83e1cf76d8
feat: Upgrade ctranslate2 to v3.17.1 (#323)
2023-08-02 05:46:08 +00:00
Meng Zhang
be90047477
fix: fix int8 compute type, fix auto compute type selection (include float32 into consideration for cuda compute capability <= 6.0) (#291)
2023-07-12 11:09:38 +08:00
Meng Zhang
9c9e46c6f4
feat: support set compute_type through commandline arguments
2023-06-13 12:04:07 -07:00
Meng Zhang
5985d91782
fix: use int8_float16 to fix SantaCoder-1B (#237)
...
#236
2023-06-13 01:13:06 -07:00
Meng Zhang
4cb672ec39
feat: improve error handling and messages [TAB-58] (#213)
...
* add fatal macro
* switch expect to fatal
* improve error handling of serve
* improve error handling on download module
* improve error handling in scheduler
* improve error handling
* fmt
* fmt
2023-06-07 02:02:58 +00:00
Meng Zhang
fd1baff8d5
feat: support stop sequences [TAB-52] (#212)
...
* refactor: pass step and string token to callback
* add token to callback
* add stop regexp
* implement stop words logic
* pass token_ids from inference
* improve efficiency of regexp match with reversed regex
* fmt
* add typescript and javascript stop words
* add cache for stop words regexp
2023-06-06 23:28:58 +00:00
Meng Zhang
007a40c582
feat: support early stop [TAB-51] (#208)
...
* bump ctranslate2 to v3.15.0
* enable early stop
* support early stop
2023-06-06 12:46:17 +00:00
Meng Zhang
272dde9769
refactor: rust nightly format (#197)
...
* chore: turn on group format
* turn on nightly fmt
2023-06-05 14:17:07 -07:00
Meng Zhang
2bf5bcd0cf
refactor: extract TextInferenceEngineImpl to reduce duplication between EncoderDecoderImpl and DecoderImpl (#189)
2023-06-04 22:28:39 +00:00
Meng Zhang
6de61f45bb
chore: mark thread safety [TAB-52] (#186)
...
* mark thread safety
* use shared_ptr to ensure thread safety
* fmt
2023-06-04 06:23:31 +00:00
Meng Zhang
3cac2607e7
refactor: improve error handling, fix clippy warnings (#181)
...
* refactor: minor improvements on error handling
* refactor: cleanup error handling
* update
* update
* fix
* add clippy / test workflow
* fix clippy
* fix clippy
* update
2023-06-01 17:23:05 -07:00
Meng Zhang
48796ecd77
feat: add `tabby download` command (#157)
...
* simplify fmt-display
* cleanup
* move tabby-admin to reduce nest
* add model downloader
* get rid of model-type
* improve commands
* fix fmt
2023-05-28 14:36:11 -07:00
Meng Zhang
b8309d98cc
Switch to sccache (#154)
...
* fix fmt
* fix
* fix test
* fix clippy
* switch to sccache
* fix
* update
* update
* update
* fix
* add test
* remove clippy
* update
* disable incremental
* update
* simplify
2023-05-27 16:20:17 -07:00
Meng Zhang
552711a560
Support causal lm (decoder only model) (#151)
...
* support
* support causal lm
2023-05-27 01:26:33 -07:00
Meng Zhang
72ed30e9ff
Build link shared in docker for ctranslate2 (#150)
...
* Build link shared in docker
* update
* update
2023-05-27 00:05:56 -07:00
Meng Zhang
06cf34a007
support static linking of ctranslate2 (#148)
...
* support static linking of ctranslate2
* update
* remove submodule rust-cxx-cmake-bridge
* support alwayslink with whole-archive
* update
* move export_libs
* update docker config
* update ctranslate2
* remove
* update
* update build.rs
* parse external libs
* cleanup
* add cargo fmt
2023-05-26 21:34:31 -07:00
Meng Zhang
7b10340e67
feat: add --port to serve command
2023-05-26 00:32:11 -07:00
Meng Zhang
c296b83de9
chore: remove unused lock
2023-05-26 00:06:10 -07:00
Meng Zhang
8dfe49ec6c
feat: support cuda devices in rust tabby (#149)
2023-05-25 23:23:07 -07:00
Meng Zhang
0acc975618
Support linux ctranslate2 cuda build (#147)
...
* Support linux build
* add <memory> to fix build error in linux
* add Dockerfile.tabby
* update
* update
* add rust docker image pipeline
* add docker.rust.yml
2023-05-25 18:18:22 -07:00
Meng Zhang
80588ddd22
fix: remove wrongly added submodule
2023-05-25 15:08:34 -07:00
Meng Zhang
a2476af373
add ctranslate2-bindings / tabby rust packages (#146)
...
* add ctranslate2-bindings
* add fixme for linux build
* turn off shared lib
* add tabby-cli
2023-05-25 14:05:28 -07:00