Commit Graph

166 Commits (72d1d9f0bb6f150efaffba40258800ca3a544171)

Author SHA1 Message Date
Meng Zhang 1a87c99488
feat: add debug flag disable_prompt_rewrite (#545) 2023-10-13 06:55:41 +00:00
Meng Zhang 1ad871e1ff
feat: add debug request / response to visualize prompting with source code index (#544)
* feat: logs segments in completion log

* feat: tune prompt format and improve testing

* add debug options for easier visualization of the prompt

* update
2023-10-12 19:27:52 -07:00
Meng Zhang 92c1f5a8c0
feat: turn on index server if available (#536) 2023-10-11 23:27:41 +00:00
Meng Zhang 1f9e248dfa
refactor: support multiple page in playground (#537) 2023-10-11 23:27:20 +00:00
Meng Zhang 99c39375fd
feat: set default empty screen questions for tabby-playground (#535) 2023-10-11 22:31:35 +00:00
Meng Zhang 75d2944fb6
feat: support loading the source code index whenever it's ready in file system (#530)
* feat: support loading index whenever it's ready

* fix test
2023-10-10 21:35:20 -07:00
Meng Zhang 6dbb712918 Release 0.3.0-dev
ctranslate2-bindings@0.3.0-dev
http-api-bindings@0.3.0-dev
llama-cpp-bindings@0.3.0-dev
tabby@0.3.0-dev
tabby-common@0.3.0-dev
tabby-download@0.3.0-dev
tabby-inference@0.3.0-dev
tabby-scheduler@0.3.0-dev

Generated by cargo-workspaces
2023-10-09 19:39:27 -07:00
Meng Zhang d21a4de79c
chore: set max timeout for /v1/completions handler (#526)
* chore: set max timeout for /v1/completions handler

* refactor: extract sub routers

* fix
2023-10-09 18:44:55 -07:00
Meng Zhang 24eadf0de8
refactor: make /v1/health accept GET requests (#527) 2023-10-09 18:34:56 +00:00
Meng Zhang 3eb5f4132c chore: add 'v*' match pattern to restrict git describe to only compare against an actual release 2023-10-09 11:13:37 -07:00
Meng Zhang 0f8ee7f589
refactor: move http engine creation to its sub crates (#524) 2023-10-09 17:37:04 +00:00
Meng Zhang 8c09f75360
refactor: extract language related data into languages.rs (#518)
* refactor: extract language related data into languages.rs

* fix

* cleanup index

* fix

* further sanitize

* add a score threshold
2023-10-07 01:40:21 +00:00
Meng Zhang d85a7892d1
feat: connect prompt rewriting part (#517)
* feat: enable /v1beta/search if index is available

* make prompt rewriting work

* update

* fix test

* fix api doc
2023-10-07 00:29:24 +00:00
Meng Zhang 8497fb1372
feat: implement /v1beta/search interface (#516)
* feat: implement /v1beta/search interface

* update

* update

* improve debugger
2023-10-06 18:54:12 +00:00
Meng Zhang 4c00ac06fb
fix(download): mark ggml model downloading as optional, as ggml is only used for the metal backend for now (#512) 2023-10-05 16:54:56 +00:00
Meng Zhang 9cd2accbaa
feat: adjust code indexing logic (#510) 2023-10-05 05:29:41 +00:00
Meng Zhang f7ebce2514
refactor: deprecate --compute-type (#505) 2023-10-04 18:45:34 +00:00
Meng Zhang 8a03c9bf17 refactor: use / as server url
Swagger access to the page, regardless of host / port, will just work.
2023-10-03 18:33:56 -07:00
Meng Zhang 2d5b3e4ff5 chore: release v0.2.1 2023-10-03 17:13:39 -07:00
Meng Zhang 503c44e7c5 fix: playground environment misconfig 2023-10-03 17:10:02 -07:00
Meng Zhang b3b498624c feat: deprecate num_replicas_per_thread, generate default value for it 2023-10-03 17:02:37 -07:00
Meng Zhang 1afba47059 feat: allow set num_replicas_per_device for CUDA to increase throughput 2023-10-03 15:52:25 -07:00
Meng Zhang ceaa7ab012 chore: update main branch to v0.3.0-dev 2023-10-03 13:38:27 -07:00
Meng Zhang 1731c3075e chore: Update version to 0.2.0 2023-10-03 13:32:21 -07:00
Meng Zhang 0e5128e8fb feat: add chat_template field in tabby.json 2023-10-03 11:46:05 -07:00
Meng Zhang 7fc76228f7 chore: add debug log for /chat interface 2023-10-03 11:38:58 -07:00
Meng Zhang 692c2fe0fd Release 0.2.0-rc.0
ctranslate2-bindings@0.2.0-rc.0
http-api-bindings@0.2.0-rc.0
llama-cpp-bindings@0.2.0-rc.0
tabby@0.2.0-rc.0
tabby-common@0.2.0-rc.0
tabby-download@0.2.0-rc.0
tabby-inference@0.2.0-rc.0
tabby-scheduler@0.2.0-rc.0

Generated by cargo-workspaces
2023-10-02 19:14:12 -07:00
Meng Zhang ce20bd6154
refactor: use RegexSet for clearer stop regex construction (#499)
* fix: add a regression test case for stop words regex matching

* refactor: use RegexSet for clearer stop regex construction
2023-10-02 23:21:51 +00:00
Meng Zhang 80a17aea37
feat: only show /v1/chat api if --chat-model is set (#498) 2023-10-02 17:17:27 +00:00
Meng Zhang f05dd3a2f6
refactor: clean up chat api to make it message oriented (#497)
* refactor: refactor into /chat/completions api

* Revert "feat: support request level stop words (#492)"

This reverts commit 0d6840e372.

* feat: adjust interface

* switch interface in tabby-playground

* move to chat/prompt, add unit test

* update interface
2023-10-02 15:39:15 +00:00
Meng Zhang aea8c74bdc feat: add OpenAPI link to playground 2023-09-29 18:20:38 -07:00
Meng Zhang 10bf2d6c0c
feat: add param --instruct-model, allowing specify different model for q&a use cases. (#494) 2023-09-29 23:44:53 +00:00
Meng Zhang eb15933255
feat: add tabby playground for q&a use case (#493)
* init commit

* support chat

* add theme toggle

* limit message to 2 lines

* update

* update formatting

* update

* update

* update

* fix formatting

* update
2023-09-29 15:51:54 -07:00
Meng Zhang 0d6840e372
feat: support request level stop words (#492) 2023-09-29 18:21:57 +00:00
Meng Zhang a159c2358d
refactor: move generate / generate_stream to /v1beta (#487) 2023-09-28 23:58:17 +00:00
Meng Zhang 44f013f26e
feat: add /generate and /generate_streaming (#482)
* feat: add generate_stream interface

* extract engine::create_engine

* feat: add generate::generate

* support streaming in llama.cpp

* support streaming in ctranslate2

* update

* fix formatting

* refactor: extract helper functions
2023-09-28 17:20:50 +00:00
Meng Zhang d42942c379
feat: support ModelScope for model registry downloading (#477)
* feat: update cache info file after each file got downloaded

* refactor: extract Downloader for model downloading logic

* refactor: extract HuggingFaceRegistry

* refactor: extract serde_json to workspace dependency

* feat: add ModelScopeRegistry

* refactor: extract registry to its sub dir.

* feat: add scripts to mirror hf model to modelscope
2023-09-26 11:52:11 -07:00
胡锋 fb5a5971d3
feat: proxy server address mapping to the model server (#461)
* feat: proxy server address mapping to the model server

* fix: add swagger in Config

* refactor: add_proxy_server

* fix: missing semicolon
2023-09-21 07:06:51 +00:00
胡锋 de3a2271d6
fix(tabby): fix swagger's local server use local port (#458)
* fixed: swagger's local server use local port

* fix: extract fn add_localhost_server

* fix: add_localhost_server return doc
2023-09-19 04:36:08 +00:00
Meng Zhang c107c991ff chore: bump tabby version to 0.1.1 2023-09-17 17:09:56 +08:00
Meng Zhang 1ccf9b2323 refactor: run make fix 2023-09-11 12:58:38 +08:00
Meng Zhang 09efa1b22b docs: add client extensions link in swagger landing page 2023-09-11 12:55:42 +08:00
leiwen83 e3c4a77fff
feat: add support fastchat http bindings (#421)
* feat: add support fastchat http bindings

Signed-off-by: Lei Wen <wenlei03@qiyi.com>
Co-authored-by: Lei Wen <wenlei03@qiyi.com>
2023-09-10 22:17:58 +08:00
Meng Zhang f0ed366420
feat: add support vertex-ai http bindings (#419)
* feat: add support vertex-ai http bindings

* support prefix / suffix
2023-09-09 11:22:58 +00:00
Meng Zhang 87b6b34120
feat: implement input truncation with options.max_input_length (#415) 2023-09-08 10:01:03 +00:00
Meng Zhang e780031ed6
feat: add ggml fp16 / q8_0 files (#407)
* feat: add ggml fp16 / q8_0 files

* add q8_0.gguf to optional download files

* add download options to split ctranslate2 files and ggml files
2023-09-06 17:12:29 +00:00
Meng Zhang a207520571
feat: turn on metal device by default on macosx / aarch64 devices (#398) 2023-09-05 13:03:49 +08:00
Meng Zhang d85cd81139
fix: ensure default suffix to be non-empty (#400) 2023-09-05 03:45:29 +00:00
vodkaslime 2472cf3b55
test: use function call style snippet for prompt builder unit test (#395)
* test: better tests for build_prefix()

* chore

* chore: resolve comments
2023-09-04 04:54:18 +00:00
vodkaslime 74073aa77a
test: add build prefix test and debug chars counting [TAB-184] (#394)
* test: add count char test

* chore: fix lint

* chore

* chore
2023-09-03 20:57:26 +08:00
Meng Zhang 3573d4378e
feat: llama.cpp for metal support [TAB-146] (#391)
* feat: init commit adding llama-cpp-bindings

* add llama.cpp submodule

* add LlamaEngine to hold llama context / llama model

* add cxxbridge

* add basic greedy sampling

* move files

* make compile success

* connect TextGeneration with LlamaEngine

* experimental support llama.cpp

* add metal device

* add Accelerate

* fix namespace for llama-cpp-bindings

* fix lint

* move stepping logic to rust

* add stop words package

* use stop-words in ctranslate2-bindings

* use raw string for regex

* use Arc<Tokenizer> for sharing tokenizers

* refactor: remove useless stop_words_encoding_offset

* switch to tokenizers 0.13.4-rc.3

* fix lints in cpp

* simplify implementation of greedy decoding

* feat: split metal feature for llama backend

* add ci

* update ci

* build tabby bin in ci build
2023-09-03 09:59:07 +08:00
vodkaslime 5dff349801
add single line comment to languages so they can be used in prompting [TAB-181] (#388)
* chore: add comment signs to extended languages

* Update crates/tabby/src/serve/completions/prompt.rs

---------

Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-09-01 03:43:27 +00:00
vodkaslime 63c00494f3
test: unit tests to prompt builder [TAB-180] (#387)
* test: unit tests to prompt builder

* chore: fix typo

* chore: fix lint

* chore: resolve comments
2023-09-01 09:20:20 +08:00
Meng Zhang c44a9c7195
fix: correct git_describe in /health (#383)
* fix: add missing Version component in OpenAPI definition

* fix: allow tag / dirty in git describe
2023-08-31 01:06:36 +00:00
Meng Zhang 054aefaf15
chore: add linux static build (#379)
* chore: add linux static build

* add touch

* update build env

* add sudo

* fix: protobuf ubuntu target
2023-08-30 18:45:05 +08:00
Meng Zhang fc9a623e72
feat: add logging on server starting (#372) 2023-08-28 06:12:00 +00:00
Meng Zhang 65836ee199
feat: add stop words encoding offset for ctranslate model config (#371)
* feat: add stop words encoding offset for ctranslate model config

* feat: set default suffix to \n

* add special treatment for bytefallback tokens
2023-08-28 14:07:01 +08:00
vodkaslime 2a91a21787
feat: add gpu info to health state [TAB-162] (#364)
* feat: add gpu info to health response

* chore: error handling

* chore: refactor cpu manager code

* chore: typo

* chore: fix context mutability

* chore: fix context mutability

* feat: add link to NVML lib

* chore: refactor

* lint

* chore: resolve comments

* chore: fix typo

* chore: fix

* chore: resolve comments

* chore: fix

* chore: resolve comments
2023-08-21 18:06:38 +08:00
Meng Zhang b1ad936033
feat: add version information in health state. (#363)
* feat: add git_hash in health state

* add more version information in health state
2023-08-20 15:21:12 +00:00
Meng Zhang df45573501
feat: reduce ServeHealth event to every 300s to reduce event volume (#362) 2023-08-20 12:36:59 +00:00
vodkaslime 2026b4dd0e
feat: add architecture/cpu info to health api response [TAB-162] (#355)
* feat: add architecture, cpu and gpu info to health command

* chore: fix

* chore: fix

* chore: fix

* chore: fix lint

* chore: fix lint

* chore: remove gpu

* chore: resolve comments

* chore: resolve comments

* Update health.rs

---------

Co-authored-by: Meng Zhang <meng@tabbyml.com>
2023-08-15 15:22:03 +00:00
Meng Zhang dbc89831b1
feat: add serve health heartbeat (#343)
* add serve health tracking

* fix lint

* fix
2023-08-09 08:08:42 +00:00
Meng Zhang b8308b7118
refactor: extract TextGeneration trait (#324)
* add tabby-inference

* extract TextGeneration trait

* format

* Rename TextInferenceEngine to CTranslate2Engine
2023-08-02 06:12:51 +00:00
Meng Zhang 95bd53ac9c
feat: add select kind param. Supported editors could log line select … (#299)
* feat: add select kind param. Supported editors could log line select or block select

* fix lint
2023-07-16 16:02:40 +08:00
Meng Zhang be5fe0d737
feat: add rust prompt rewrite support (#296) 2023-07-13 09:31:44 +00:00
Meng Zhang 4388fd0050
feat: support prompt rewriting (#295)
* refactor: extract PromptBuilder

* feat: load tantivy index in prompt builder

* integrate with searcher

* add enable_prompt_rewrite to control rewrite behavior

* nit docs

* limit 1 snippet per identifier

* extract magic numbers
2023-07-13 09:05:41 +00:00
Meng Zhang be90047477
fix: fix int8 compute type, fix auto compute type selection (include float32 into consideration for cuda compute capability <= 6.0) (#291) 2023-07-12 11:09:38 +08:00
Meng Zhang 9ca1f7e5f1
fix: add additional whitespace to match tokens that combining space and li… (#270)
* fix: add additional whitespace to match tokens that combining space and line break

* fix lint
2023-06-25 01:15:52 +00:00
Meng Zhang 631cff3aed docs: update url of playground server 2023-06-23 18:55:23 -07:00
Meng Zhang fcbc5edc55
Revert "feat: add /experimental/search endpoint (#258)" (#260)
This reverts commit 04980160e5.
2023-06-22 14:23:35 -07:00
Meng Zhang 04980160e5
feat: add /experimental/search endpoint (#258)
* feat: add /experimental/search endpoint

* fix format
2023-06-22 20:47:32 +00:00
Meng Zhang 7ed5dd584d
feat: experiment ctags support in scheduler (#207)
* experiment ctags support

* add document.rs

* extract Document to common

* integrate tags into dataset builder

* skip if none

* do not add scheduler in client binary

* fix fmt
2023-06-21 19:48:13 -07:00
Meng Zhang 6eae16d475 fix: typo in openapi documentation 2023-06-16 20:22:38 -07:00
Meng Zhang 8ee700089f feat: do not use fim template when suffix is empty string 2023-06-15 09:27:32 -07:00
Meng Zhang d572cf7d6d api: add user field in completion api 2023-06-14 10:50:03 -07:00
Meng Zhang 93d5d8b297 docs: update website openapi.json 2023-06-13 13:13:03 -07:00
Meng Zhang b2734aed59 feat: returns more information in /v1/health 2023-06-13 13:11:20 -07:00
Meng Zhang 9c9e46c6f4 feat: support set compute_type through commandline arguments 2023-06-13 12:04:07 -07:00
Meng Zhang ba7e04d030 refactor: remove admin 2023-06-13 11:37:55 -07:00
Meng Zhang 3b7153ba23 docs: update website 2023-06-11 12:28:23 -07:00
Meng Zhang de546b03fe
feat: add otlp-endpoint for OpenTelemetry support [TAB-67] (#227)
* feat: add otlp-endpoint for OpenTelemetry support

* set default log level for axum tracing to INFO

* update build environment

* update
2023-06-10 22:46:25 -07:00
Meng Zhang 6180b32980
feat: add /v1/health (#226)
* feat: add /v1/health

* fix fmt
2023-06-10 22:37:42 -07:00
Meng Zhang 6718afbf67 fix: server should still support prompt only use case 2023-06-10 00:39:32 -07:00
Meng Zhang 40460885b0 docs: improve website theming 2023-06-07 11:17:00 -07:00
Meng Zhang 1aaf29c968
docs: switch openapi docs (#215)
* update openapi

* update

* fix: shared_vocabulary is not a required file

* docs: improve docs
2023-06-07 01:58:05 -07:00
Meng Zhang 4cb672ec39
feat: improve error handling and messages [TAB-58] (#213)
* add fatal macro

* switch expect to fatal

* improve error handling of serve

* improve error handling on download module

* improve error handling in scheduler

* improve error handling

* fmt

* fmt
2023-06-07 02:02:58 +00:00
Meng Zhang fd1baff8d5
feat: support stop sequences [TAB-52] (#212)
* refactor: pass step and string token to callback

* add token to callback

* add stop regexp

* implement stop words logic

* pass token_ids from inference

* improve efficiency of regexp match with reversed regex

* fmt

* add typescript and javascript stop words

* add cache for stop words regexp
2023-06-06 23:28:58 +00:00
Meng Zhang 249d51d0f5
feat: add indexer [TAB-17] (#199)
* add basic indexer

* formatting
2023-06-05 22:18:10 +00:00
Meng Zhang 272dde9769
refactor: rust nightly format (#197)
* chore: turn on group format

* turn on nightly fmt
2023-06-05 14:17:07 -07:00
Meng Zhang f4442b104f docs: usage string for scheduler 2023-06-05 12:57:18 -07:00
Meng Zhang e8a33312bb
refactor: extract download into tabby-download (#195)
* refactor: extract download into tabby-download

* remove unused deps
2023-06-05 18:40:24 +00:00
Meng Zhang e8b1c10738
feat: add `tabby scheduler` command (#194)
* feat: add `tabby scheduler` command

* update test cases

* fix fmt
2023-06-05 18:29:38 +00:00
Meng Zhang 6de61f45bb
chore: mark thread safety [TAB-52] (#186)
* mark thread safety

* use shared_ptr to ensure thread safety

* fmt
2023-06-04 06:23:31 +00:00
Meng Zhang 775576b53e
fix: only use prompt_template when suffix presents [TAB-46] (#184)
* fix: only use prompt_template when suffix presents

* lint
2023-06-03 17:29:04 +00:00
Meng Zhang 2779da3cba
feat: supports FIM inference [TAB-46] (#183)
* Add prefix / suffix

* update

* feat: support segments in inference

* chore: add tabby.json in model repository to store prompt_template

* make prompt_template optional.

* download tabby.json in downloader
2023-06-02 16:47:48 -07:00
Meng Zhang 950a7a795f fix: when model_id is a local dir, don't try to download model from remote 2023-06-02 13:48:53 -07:00
Meng Zhang 3cac2607e7
refactor: improve error handlings, fix clippy warnings (#181)
* refactor: minor improvements on error handling

* refactor: cleanup error handlings

* update

* update

* fix

* add clippy / test workflow

* fix clippy

* fix clippy

* update
2023-06-01 17:23:05 -07:00
Meng Zhang ca077a3403
feat: ensure model exist before serving (#180)
* chore: migrate completion to new metadata format

* feat: ensure model exist before serving
2023-06-01 07:26:21 +00:00
Meng Zhang 9131567257
chore: migrate completion to new metadata format (#179) 2023-06-01 07:08:09 +00:00
Meng Zhang e8dbd36663
feat: improve download command - support local cache checking behavior (#178)
* move download.rs

* add metadata

* support prefer local args

* fix format

* replace errorchain with anyhow
2023-06-01 06:42:04 +00:00