Commit Graph

18 Commits (52c4ef38d3364cce27f07f2d6d1dae5854a414ce)

Author SHA1 Message Date
Meng Zhang 44f013f26e
feat: add /generate and /generate_streaming (#482)
* feat: add generate_stream interface

* extract engine::create_engine

* feat add generate::generate

* support streaming in llama.cpp

* support streaming in ctranslate2

* update

* fix formatting

* refactor: extract helpers functions
2023-09-28 17:20:50 +00:00
Meng Zhang 1e7ecce697 fix: usize overflow issue in ctranslate2 max_length truncation 2023-09-12 20:56:35 +08:00
Meng Zhang 87b6b34120
feat: implement input truncation with options.max_input_length (#415) 2023-09-08 10:01:03 +00:00
Meng Zhang 3573d4378e
feat: llama.cpp for metal support [TAB-146] (#391)
* feat: init commit adding llama-cpp-bindings

* add llama.cpp submodule

* add LlamaEngine to hold llama context / llama model

* add cxxbridge

* add basic greedy sampling

* move files

* make compile success

* connect TextGeneration with LlamaEngine

* experimental support llama.cpp

* add metal device

* add Accelerate

* fix namespace for llama-cpp-bindings

* fix lint

* move stepping logic to rust

* add stop words package

* use stop-words in ctranslate2-bindings

* use raw string for regex

* use Arc<Tokenizer> for sharing tokenizers

* refactor: remove useless stop_words_encoding_offset

* switch to tokenizers 0.13.4-rc.3

* fix lints in cpp

* simplify implementation of greedy decoding

* feat: split metal feature for llama backend

* add ci

* update ci

* build tabby bin in ci build
2023-09-03 09:59:07 +08:00
Meng Zhang 65836ee199
feat: add stop words encoding offset for ctranslate model config (#371)
* feat: add stop words encoding offset for ctranslate model config

* feat: set default suffix to \n

* add special treatment for bytefallback tokens
2023-08-28 14:07:01 +08:00
Meng Zhang b8308b7118
refactor: extract TextGeneration trait (#324)
* add tabby-inference

* extract TextGeneration trait

* format

* Rename TextInferenceEngine to CTranslate2Engine
2023-08-02 06:12:51 +00:00
Meng Zhang 9c9e46c6f4 feat: support set compute_type through commandline arguments 2023-06-13 12:04:07 -07:00
Meng Zhang 4cb672ec39
feat: improve error handling and messages [TAB-58] (#213)
* add fatal macro

* switch expect to fatal

* improve error handling of serve

* improve error handling on download module

* improve error handling in scheduler

* improve error handling

* fmt

* fmt
2023-06-07 02:02:58 +00:00
Meng Zhang fd1baff8d5
feat: support stop sequences [TAB-52] (#212)
* refactor: pass step and string token to callback

* add token to callback

* add stop regexp

* implement stop words logic

* pass token_ids from inference

* improve efficiency of regexp match with reversed regex

* fmt

* add typescript and javascript stop words

* add cache for stop words regexp
2023-06-06 23:28:58 +00:00
Meng Zhang 007a40c582
feat: support early stop [TAB-51] (#208)
* bump ctranslate2 to v3.15.0

* enable early stop

* support early stop
2023-06-06 12:46:17 +00:00
Meng Zhang 2bf5bcd0cf
refactor: extract TextInferenceEngineImpl to reduce duplications between EncoderDecoderImpl and DecoderImpl #189 2023-06-04 22:28:39 +00:00
Meng Zhang 6de61f45bb
chore: mark thread safety [TAB-52] (#186)
* mark thread safety

* use shared_ptr to ensure thread safety

* fmt
2023-06-04 06:23:31 +00:00
Meng Zhang b8309d98cc
Switch to sccache (#154)
* fix fmt

* fix

* fix test

* fix clippy

* switch to sc cache

* fix

* update

* update

* update

* fix

* add test

* remove clippy

* update

* disable incremental

* update

* simply
2023-05-27 16:20:17 -07:00
Meng Zhang 552711a560
Support causal lm (decoder only model) (#151)
* support

* support causal lm
2023-05-27 01:26:33 -07:00
Meng Zhang 7b10340e67 feat: add --port to serve command 2023-05-26 00:32:11 -07:00
Meng Zhang c296b83de9 chore: remove unused lock 2023-05-26 00:06:10 -07:00
Meng Zhang 8dfe49ec6c
feat: support cuda devices in rust tabby (#149) 2023-05-25 23:23:07 -07:00
Meng Zhang a2476af373
add ctranslate2-bindings / tabby rust packages (#146)
* add ctranslate2-bindings

* add fixme for linux build

* turn off shared lib

* add tabby-cli
2023-05-25 14:05:28 -07:00