tabby

Commit Graph

Author	SHA1	Message	Date
Meng Zhang	2171ba72ff	refactor: cleanup llama cpp implementations to fix warnings (#495)	2023-09-30 08:37:36 -07:00
Meng Zhang	486e507079	fix: correct Decoding behavior in incremental manner (#491) * feat: implement IncrementalDecoding * refactor: use IncrementalDecoding for ctranslate2 * refactor: rename StopWords to DecodingFactory * refactor: move decoding logic to tabby-inference * feat: optimize decoding range * cleanup	2023-09-29 13:06:47 +00:00
Meng Zhang	ad3b974d5c	feat: implement input truncation for llama-cpp-bindings (#416) * feat: implement input truncation for llama-cpp-bindings * set max input length to 1024 * fix: batching tokens with n_batches * fix batching	2023-09-09 00:20:51 +08:00
Meng Zhang	e93a971d0e	feat: tune llama metal backend performance (#393) * feat: support eos based stop * feat: print performance stats after each inference * update llama.cpp * update commits	2023-09-05 10:14:29 +08:00
Meng Zhang	3573d4378e	feat: llama.cpp for metal support [TAB-146] (#391) * feat: init commit adding llama-cpp-bindings * add llama.cpp submodule * add LlamaEngine to hold llama context / llama model * add cxxbridge * add basic greedy sampling * move files * make compile success * connect TextGeneration with LlamaEngine * experimental support llama.cpp * add metal device * add Accelerate * fix namespace for llama-cpp-bindings * fix lint * move stepping logic to rust * add stop words package * use stop-words in ctranslate2-bindings * use raw string for regex * use Arc<Tokenizer> for sharing tokenizers * refactor: remove useless stop_words_encoding_offset * switch to tokenizers 0.13.4-rc.3 * fix lints in cpp * simplify implementation of greedy decoding * feat: split metal feature for llama backend * add ci * update ci * build tabby bin in ci build	2023-09-03 09:59:07 +08:00

5 Commits (r0.3)