tabby

Commit Graph

Author	SHA1	Message	Date
Meng Zhang	f05dd3a2f6	refactor: cleanup chat api make it message oriented (#497) * refactor: refactor into /chat/completions api * Revert "feat: support request level stop words (#492)" This reverts commit `0d6840e372`. * feat: adjust interface * switch interface in tabby-playground * move to chat/prompt, add unit test * update interface	2023-10-02 15:39:15 +00:00
Meng Zhang	0d6840e372	feat: support request level stop words (#492)	2023-09-29 18:21:57 +00:00
Meng Zhang	486e507079	fix: correct Decoding behavior in incremental manner (#491) * feat: implement IncrementalDecoding * refactor: use IncrementalDecoding for ctranslate2 * refactor: rename StopWords to DecodingFactory * refactor: move decoding logic to tabby-inference * feat: optimize decoding range * cleanup	2023-09-29 13:06:47 +00:00
Meng Zhang	44f013f26e	feat: add /generate and /generate_streaming (#482) * feat: add generate_stream interface * extract engine::create_engine * feat add generate::generate * support streaming in llama.cpp * support streaming in ctranslate2 * update * fix formatting * refactor: extract helpers functions	2023-09-28 17:20:50 +00:00
Meng Zhang	1e7ecce697	fix: usize overflow issue in ctranslate2 max_length truncation	2023-09-12 20:56:35 +08:00
Meng Zhang	87b6b34120	feat: implement input truncation with options.max_input_length (#415)	2023-09-08 10:01:03 +00:00
Meng Zhang	3573d4378e	feat: llama.cpp for metal support [TAB-146] (#391) * feat: init commit adding llama-cpp-bindings * add llama.cpp submodule * add LlamaEngine to hold llama context / llama model * add cxxbridge * add basic greedy sampling * move files * make compile success * connect TextGeneration with LlamaEngine * experimental support llama.cpp * add metal device * add Accelerate * fix namespace for llama-cpp-bindings * fix lint * move stepping logic to rust * add stop words package * use stop-words in ctranslate2-bindings * use raw string for regex * use Arc<Tokenizer> for sharing tokenizers * refactor: remove useless stop_words_encoding_offset * switch to tokenizers 0.13.4-rc.3 * fix lints in cpp * simplify implementation of greedy decoding * feat: split metal feature for llama backend * add ci * update ci * build tabby bin in ci build	2023-09-03 09:59:07 +08:00
Meng Zhang	65836ee199	feat: add stop words encoding offset for ctranslate model config (#371) * feat: add stop words encoding offset for ctranslate model config * feat: set default suffix to \n * add special treatment for bytefallback tokens	2023-08-28 14:07:01 +08:00
Meng Zhang	b8308b7118	refactor: extract TextGeneration trait (#324) * add tabby-inference * extract TextGeneration trait * format * Rename TextInferenceEngine to CTranslate2Engine	2023-08-02 06:12:51 +00:00
Meng Zhang	9c9e46c6f4	feat: support set compute_type through commandline arguments	2023-06-13 12:04:07 -07:00
Meng Zhang	4cb672ec39	feat: improve error handling and messages [TAB-58] (#213) * add fatal macro * switch expect to fatal * improve error handling of serve * improve error handling on download module * improve error handling in scheduler * improve error handling * fmt * fmt	2023-06-07 02:02:58 +00:00
Meng Zhang	fd1baff8d5	feat: support stop sequences [TAB-52] (#212) * refactor: pass step and string token to callback * add token to callback * add stop regexp * implement stop words logic * pass token_ids from inference * improve effiency of regexp match with reversed regex * fmt * add typescript and javascript stop words * add cache for stop words regexp	2023-06-06 23:28:58 +00:00
Meng Zhang	007a40c582	feat: support early stop [TAB-51] (#208) * bump ctranslate2 to v3.15.0 * enable early stop * support early stop	2023-06-06 12:46:17 +00:00
Meng Zhang	2bf5bcd0cf	refactor: extract TextInferenceEngineImpl to reduce duplications between EncoderDecoderImpl and DecoderImpl #189	2023-06-04 22:28:39 +00:00
Meng Zhang	6de61f45bb	chore: mark thread safety [TAB-52] (#186) * mark thread safety * use shared_ptr to ensure thread safety * fmt	2023-06-04 06:23:31 +00:00
Meng Zhang	b8309d98cc	Switch to sccache (#154) * fix fmt * fix * fix test * fix clippy * switch to sc cache * fix * update * update * update * fix * add test * remove clippy * update * disable incremental * update * simply	2023-05-27 16:20:17 -07:00
Meng Zhang	552711a560	Support causal lm (decoder only model) (#151) * support * support causal lm	2023-05-27 01:26:33 -07:00
Meng Zhang	7b10340e67	feat: add --port to serve command	2023-05-26 00:32:11 -07:00
Meng Zhang	c296b83de9	chore: remove unused lock	2023-05-26 00:06:10 -07:00
Meng Zhang	8dfe49ec6c	feat: support cuda devices in rust tabby (#149)	2023-05-25 23:23:07 -07:00
Meng Zhang	a2476af373	add ctranslate2-bindings / tabby rust packages (#146) * add ctranslate2-bindings * add fixme for linux build * turn off shared lib * add tabby-cli	2023-05-25 14:05:28 -07:00

21 Commits (f05dd3a2f68754fe4b8349190f1c9dcb7c90e889)