Commit Graph

25 Commits (05038d8b7e657c65da2e84d28f14d9698708e073)

Author SHA1 Message Date
Meng Zhang 182aceed41 Release 0.3.0-rc.0
ctranslate2-bindings@0.3.0-rc.0
http-api-bindings@0.3.0-rc.0
llama-cpp-bindings@0.3.0-rc.0
tabby@0.3.0-rc.0
tabby-common@0.3.0-rc.0
tabby-download@0.3.0-rc.0
tabby-inference@0.3.0-rc.0
tabby-scheduler@0.3.0-rc.0

Generated by cargo-workspaces
2023-10-13 11:24:36 -07:00
Meng Zhang 6dbb712918 Release 0.3.0-dev
ctranslate2-bindings@0.3.0-dev
http-api-bindings@0.3.0-dev
llama-cpp-bindings@0.3.0-dev
tabby@0.3.0-dev
tabby-common@0.3.0-dev
tabby-download@0.3.0-dev
tabby-inference@0.3.0-dev
tabby-scheduler@0.3.0-dev

Generated by cargo-workspaces
2023-10-09 19:39:27 -07:00
Meng Zhang 1731c3075e chore: Update version to 0.2.0 2023-10-03 13:32:21 -07:00
Meng Zhang 692c2fe0fd Release 0.2.0-rc.0
ctranslate2-bindings@0.2.0-rc.0
http-api-bindings@0.2.0-rc.0
llama-cpp-bindings@0.2.0-rc.0
tabby@0.2.0-rc.0
tabby-common@0.2.0-rc.0
tabby-download@0.2.0-rc.0
tabby-inference@0.2.0-rc.0
tabby-scheduler@0.2.0-rc.0

Generated by cargo-workspaces
2023-10-02 19:14:12 -07:00
Meng Zhang f05dd3a2f6
refactor: cleanup chat api make it message oriented (#497)
* refactor: refactor into /chat/completions api

* Revert "feat: support request level stop words (#492)"

This reverts commit 0d6840e372.

* feat: adjust interface

* switch interface in tabby-playground

* move to chat/prompt, add unit test

* update interface
2023-10-02 15:39:15 +00:00
Meng Zhang dfdd0373a6 fix: when llama model loads failed, panic in rust stack 2023-10-01 22:25:25 -07:00
Meng Zhang 2171ba72ff
refactor: cleanup llama cpp implementations to fix warnings (#495) 2023-09-30 08:37:36 -07:00
Meng Zhang 0d6840e372
feat: support request level stop words (#492) 2023-09-29 18:21:57 +00:00
Meng Zhang 486e507079
fix: correct Decoding behavior in incremental manner (#491)
* feat: implement IncrementalDecoding

* refactor: use IncrementalDecoding for ctranslate2

* refactor: rename StopWords to DecodingFactory

* refactor: move decoding logic to tabby-inference

* feat: optimize decoding range

* cleanup
2023-09-29 13:06:47 +00:00
Meng Zhang 5d9ca6928c
feat: update llama.cpp (#488)
* feat: update llama.cpp

* remove useless include
2023-09-28 23:59:59 +00:00
Meng Zhang 56b7b850af
fix: Linkage issue on latest xcode commandline tools clang (#486) 2023-09-28 17:46:02 +00:00
Meng Zhang 44f013f26e
feat: add /generate and /generate_streaming (#482)
* feat: add generate_stream interface

* extract engine::create_engine

* feat add generate::generate

* support streaming in llama.cpp

* support streaming in ctranslate2

* update

* fix formatting

* refactor: extract helpers functions
2023-09-28 17:20:50 +00:00
Meng Zhang 97eeb6b926
feat: update llama.cpp to fetch latest starcoder support (#452)
* feat: bump llama.cpp to HEAD

* fix: turn off add_bos by default
2023-09-16 03:41:49 +00:00
Meng Zhang 30afa19bc0
feat: add LLAMA_CPP_LOG_LEVEL to control log level of llama.cpp (#436) 2023-09-12 14:41:39 +00:00
Meng Zhang ad3b974d5c
feat: implement input truncation for llama-cpp-bindings (#416)
* feat: implement input truncation for llama-cpp-bindings

* set max input length to 1024

* fix: batching tokens with n_batches

* fix batching
2023-09-09 00:20:51 +08:00
Meng Zhang e93a971d0e
feat: tune llama metal backend performance (#393)
* feat: support eos based stop

* feat: print performance stats after each inference

* update llama.cpp

* update commits
2023-09-05 10:14:29 +08:00
vodkaslime 3c7c8d9293
feat: add cargo test to github actions and run only unit tests in ci [TAB-185] (#390)
* feat: add cargo test to github actions

* chore: fix lint

* chore: add openblas dependency

* chore: update build dependency

* chore: resolve comments

* chore: fix lint

* chore: fix lint

* chore: test installing dependencies

* chore: refactor integ test

* update ci

* cleanup

---------

Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-09-03 05:04:52 +00:00
Meng Zhang c8339a2912 refactor: use TabbyML/llama.cpp submodule 2023-09-03 12:38:54 +08:00
Meng Zhang 3acd5d9bc4 refactor: remove llama.cpp subtree 2023-09-03 12:37:26 +08:00
Meng Zhang 92c8ae8ee7 feat: embed ggml-metal.metal 2023-09-03 10:41:03 +08:00
Meng Zhang ed6c5b2e60 Merge commit 'aad80a58b07836bfbf6aedd50993bc54b4257388' as 'crates/llama-cpp-bindings/llama.cpp' 2023-09-03 10:07:10 +08:00
Meng Zhang d4137463ef remove llama.cpp submodule 2023-09-03 10:04:26 +08:00
Meng Zhang e360b438b4 fix lint 2023-09-03 10:01:28 +08:00
Meng Zhang 3f7aa99b0d feat: support cancellation in llama backend 2023-09-03 09:59:40 +08:00
Meng Zhang 3573d4378e
feat: llama.cpp for metal support [TAB-146] (#391)
* feat: init commit adding llama-cpp-bindings

* add llama.cpp submodule

* add LlamaEngine to hold llama context / llama model

* add cxxbridge

* add basic greedy sampling

* move files

* make compile success

* connect TextGeneration with LlamaEngine

* experimental support llama.cpp

* add metal device

* add Accelerate

* fix namespace for llama-cpp-bindings

* fix lint

* move stepping logic to rust

* add stop words package

* use stop-words in ctranslate2-bindings

* use raw string for regex

* use Arc<Tokenizer> for sharing tokenizers

* refactor: remove useless stop_words_encoding_offset

* switch to tokenizers 0.13.4-rc.3

* fix lints in cpp

* simplify implementation of greedy decoding

* feat: split metal feature for llama backend

* add ci

* update ci

* build tabby bin in ci build
2023-09-03 09:59:07 +08:00