tabby

Commit Graph

Author	SHA1	Message	Date
Meng Zhang	9309e0314f	fix: fix docker build	2023-10-27 21:25:45 -07:00
Meng Zhang	6dd12ce1ec	fix: adding cuda search path to docker build.	2023-10-27 19:40:35 -07:00
Meng Zhang	2d948639be	fix: docker build for llama cuda backend	2023-10-27 16:36:54 -07:00
Meng Zhang	23bd542cec	feat: switch cuda backend to llama.cpp (#656) * feat: switch cuda backend to llama.cpp * fix * fix	2023-10-27 13:41:22 -07:00
Meng Zhang	f37840566b	feat: upgrade llama.cpp (#645) * feat: upgrade llama.cpp * update download files * update changelog * Update CHANGELOG.md * Update CHANGELOG.md	2023-10-27 12:18:46 -07:00
Meng Zhang	1a4c2aa71f	feat: swtich cpu backend to llama.cpp (#638) * feat: swtich Cpu backend to llama.cpp * feat: switch cpu serving to ggml * fix cargo.toml * use optional dependency * fix compliation * update ci target	2023-10-25 15:40:11 -07:00
Meng Zhang	e171776774	Release 0.5.0-dev ctranslate2-bindings@0.5.0-dev http-api-bindings@0.5.0-dev llama-cpp-bindings@0.5.0-dev rust-cxx-cmake-bridge@0.5.0-dev tabby@0.5.0-dev tabby-common@0.5.0-dev tabby-download@0.5.0-dev tabby-inference@0.5.0-dev tabby-scheduler@0.5.0-dev Generated by cargo-workspaces	2023-10-24 13:05:33 -07:00
Meng Zhang	99a7053b6f	refactor: extract language configuration into individual toml file (#564) * refactor: extract language configuration into individual toml file * feat: add golang language configuration (#565)	2023-10-16 00:24:44 +00:00
Meng Zhang	82e893d569	Release 0.4.0-dev ctranslate2-bindings@0.4.0-dev http-api-bindings@0.4.0-dev llama-cpp-bindings@0.4.0-dev rust-cxx-cmake-bridge@0.4.0-dev tabby@0.4.0-dev tabby-common@0.4.0-dev tabby-download@0.4.0-dev tabby-inference@0.4.0-dev tabby-scheduler@0.4.0-dev Generated by cargo-workspaces	2023-10-13 17:54:14 -07:00
Meng Zhang	4dbaf4f312	Release 0.3.0 ctranslate2-bindings@0.3.0 http-api-bindings@0.3.0 llama-cpp-bindings@0.3.0 rust-cxx-cmake-bridge@0.3.0 tabby@0.3.0 tabby-common@0.3.0 tabby-download@0.3.0 tabby-inference@0.3.0 tabby-scheduler@0.3.0 Generated by cargo-workspaces	2023-10-13 17:45:07 -07:00
Meng Zhang	eb463ba496	Release 0.3.0-rc.1 ctranslate2-bindings@0.3.0-rc.1 http-api-bindings@0.3.0-rc.1 llama-cpp-bindings@0.3.0-rc.1 rust-cxx-cmake-bridge@0.3.0-rc.1 tabby@0.3.0-rc.1 tabby-common@0.3.0-rc.1 tabby-download@0.3.0-rc.1 tabby-inference@0.3.0-rc.1 tabby-scheduler@0.3.0-rc.1 Generated by cargo-workspaces	2023-10-13 11:43:34 -07:00
Meng Zhang	182aceed41	Release 0.3.0-rc.0 ctranslate2-bindings@0.3.0-rc.0 http-api-bindings@0.3.0-rc.0 llama-cpp-bindings@0.3.0-rc.0 tabby@0.3.0-rc.0 tabby-common@0.3.0-rc.0 tabby-download@0.3.0-rc.0 tabby-inference@0.3.0-rc.0 tabby-scheduler@0.3.0-rc.0 Generated by cargo-workspaces	2023-10-13 11:24:36 -07:00
Meng Zhang	6dbb712918	Release 0.3.0-dev ctranslate2-bindings@0.3.0-dev http-api-bindings@0.3.0-dev llama-cpp-bindings@0.3.0-dev tabby@0.3.0-dev tabby-common@0.3.0-dev tabby-download@0.3.0-dev tabby-inference@0.3.0-dev tabby-scheduler@0.3.0-dev Generated by cargo-workspaces	2023-10-09 19:39:27 -07:00
Meng Zhang	1731c3075e	chore: Update version to 0.2.0	2023-10-03 13:32:21 -07:00
Meng Zhang	692c2fe0fd	Release 0.2.0-rc.0 ctranslate2-bindings@0.2.0-rc.0 http-api-bindings@0.2.0-rc.0 llama-cpp-bindings@0.2.0-rc.0 tabby@0.2.0-rc.0 tabby-common@0.2.0-rc.0 tabby-download@0.2.0-rc.0 tabby-inference@0.2.0-rc.0 tabby-scheduler@0.2.0-rc.0 Generated by cargo-workspaces	2023-10-02 19:14:12 -07:00
Meng Zhang	f05dd3a2f6	refactor: cleanup chat api make it message oriented (#497) * refactor: refactor into /chat/completions api * Revert "feat: support request level stop words (#492)" This reverts commit `0d6840e372`. * feat: adjust interface * switch interface in tabby-playground * move to chat/prompt, add unit test * update interface	2023-10-02 15:39:15 +00:00
Meng Zhang	dfdd0373a6	fix: when llama model loads failed, panic in rust stack	2023-10-01 22:25:25 -07:00
Meng Zhang	2171ba72ff	refactor: cleanup llama cpp implementations to fix warnings (#495)	2023-09-30 08:37:36 -07:00
Meng Zhang	0d6840e372	feat: support request level stop words (#492)	2023-09-29 18:21:57 +00:00
Meng Zhang	486e507079	fix: correct Decoding behavior in incremental manner (#491) * feat: implement IncrementalDecoding * refactor: use IncrementalDecoding for ctranslate2 * refactor: rename StopWords to DecodingFactory * refactor: move decoding logic to tabby-inference * feat: optimize decoding range * cleanup	2023-09-29 13:06:47 +00:00
Meng Zhang	5d9ca6928c	feat: update llama.cpp (#488) * feat: update llama.cpp * remove useless include	2023-09-28 23:59:59 +00:00
Meng Zhang	56b7b850af	fix: Linkage issue on latest xcode commandline tools clang (#486)	2023-09-28 17:46:02 +00:00
Meng Zhang	44f013f26e	feat: add /generate and /generate_streaming (#482) * feat: add generate_stream interface * extract engine::create_engine * feat add generate::generate * support streaming in llama.cpp * support streaming in ctranslate2 * update * fix formatting * refactor: extract helpers functions	2023-09-28 17:20:50 +00:00
Meng Zhang	97eeb6b926	feat: update llama.cpp to fetch latest starcoder support (#452) * feat: bump llama.cpp to HEAD * fix: turn off add_bos by default	2023-09-16 03:41:49 +00:00
Meng Zhang	30afa19bc0	feat: add LLAMA_CPP_LOG_LEVEL to control log level of llama.cpp (#436)	2023-09-12 14:41:39 +00:00
Meng Zhang	ad3b974d5c	feat: implement input truncation for llama-cpp-bindings (#416) * feat: implement input truncation for llama-cpp-bindings * set max input length to 1024 * fix: batching tokens with n_batches * fix batching	2023-09-09 00:20:51 +08:00
Meng Zhang	e93a971d0e	feat: tune llama metal backend performance (#393) * feat: support eos based stop * feat: print performance stats after each inference * update llama.cpp * update commits	2023-09-05 10:14:29 +08:00
vodkaslime	3c7c8d9293	feat: add cargo test to github actions and run only unit tests in ci [TAB-185] (#390) * feat: add cargo test to github actions * chore: fix lint * chore: add openblas dependency * chore: update build dependency * chore: resolve comments * chore: fix lint * chore: fix lint * chore: test installing dependencies * chore: refactor integ test * update ci * cleanup --------- Co-authored-by: Meng Zhang <meng@tabbyml.com>	2023-09-03 05:04:52 +00:00
Meng Zhang	c8339a2912	refactor: use TabbyML/llama.cpp submodule	2023-09-03 12:38:54 +08:00
Meng Zhang	3acd5d9bc4	refactor: remove llama.cpp subtree	2023-09-03 12:37:26 +08:00
Meng Zhang	92c8ae8ee7	feat: embed ggml-metal.metal	2023-09-03 10:41:03 +08:00
Meng Zhang	ed6c5b2e60	Merge commit 'aad80a58b07836bfbf6aedd50993bc54b4257388' as 'crates/llama-cpp-bindings/llama.cpp'	2023-09-03 10:07:10 +08:00
Meng Zhang	d4137463ef	remove llama.cpp submodule	2023-09-03 10:04:26 +08:00
Meng Zhang	e360b438b4	fix lint	2023-09-03 10:01:28 +08:00
Meng Zhang	3f7aa99b0d	feat: support cancellation in llama backend	2023-09-03 09:59:40 +08:00
Meng Zhang	3573d4378e	feat: llama.cpp for metal support [TAB-146] (#391) * feat: init commit adding llama-cpp-bindings * add llama.cpp submodule * add LlamaEngine to hold llama context / llama model * add cxxbridge * add basic greedy sampling * move files * make compile success * connect TextGeneration with LlamaEngine * experimental support llama.cpp * add metal device * add Accelerate * fix namespace for llama-cpp-bindings * fix lint * move stepping logic to rust * add stop words package * use stop-words in ctranslate2-bindings * use raw string for regex * use Arc<Tokenizer> for sharing tokenizers * refactor: remove useless stop_words_encoding_offset * switch to tokenizers 0.13.4-rc.3 * fix lints in cpp * simplify implementation of greedy decoding * feat: split metal feature for llama backend * add ci * update ci * build tabby bin in ci build	2023-09-03 09:59:07 +08:00

36 Commits (3151d9100befa9d99b9217fa3b2d2d380c105ef1)