---
sidebar_position: 5
---

# ⚙️ Configuration

The Tabby server looks for a configuration file at `~/.tabby/config.toml` to enable advanced features.

## Repository context for code completion

To enable repository-level context for code completion, add the following to your configuration file:

```toml title="~/.tabby/config.toml"
# Index the source code of the repositories below as additional context for code completion.

[[repositories]]
git_url = "https://github.com/TabbyML/tabby.git"

# Git over the SSH protocol.
[[repositories]]
git_url = "git@github.com:OpenNMT/CTranslate2.git"

# A local directory is also supported.
[[repositories]]
git_url = "file:///home/users/repository_a"
```
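The three `git_url` forms above differ only in their prefix. As a rough illustration of that distinction (this is not Tabby's actual parsing code; the `RepoSource` enum and the `classify_git_url` function are hypothetical names introduced here), the forms could be told apart like this:

```rust
/// Hypothetical classification of the `git_url` forms shown in the config above.
#[derive(Debug, PartialEq)]
enum RepoSource {
    /// Remote repository over HTTP(S), e.g. "https://github.com/TabbyML/tabby.git".
    Http,
    /// Remote repository over SSH, e.g. "git@github.com:OpenNMT/CTranslate2.git".
    Ssh,
    /// Local directory, e.g. "file:///home/users/repository_a"; holds the filesystem path.
    LocalDir(String),
    /// Anything this sketch does not recognize.
    Unknown,
}

/// Classify a `git_url` value by its prefix (illustrative heuristic only).
fn classify_git_url(git_url: &str) -> RepoSource {
    if let Some(path) = git_url.strip_prefix("file://") {
        RepoSource::LocalDir(path.to_string())
    } else if git_url.starts_with("https://") || git_url.starts_with("http://") {
        RepoSource::Http
    } else if git_url.starts_with("ssh://") || (git_url.contains('@') && git_url.contains(':')) {
        RepoSource::Ssh
    } else {
        RepoSource::Unknown
    }
}
```

Under these assumptions, `classify_git_url("file:///home/users/repository_a")` yields `RepoSource::LocalDir` with the path `/home/users/repository_a`, while a bare relative path such as `repository_a` is reported as `Unknown`.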

Once this is set, run `tabby scheduler` to index the configured source code repositories.

:::tip
By default, `tabby scheduler` runs as a daemon and processes its pipeline every 5 hours. To run the pipeline immediately, use `tabby scheduler --now`.
:::

```bash title="artifacts produced by tabby scheduler"
~/.tabby % ls dataset
data.jsonl

~/.tabby % ls index
1a8729fa34d844df984b444f4def1456.fast      2ed712d4a7a44ed797dd4ff5ceaf4312.fieldnorm
b42ca53fe6f94d0c8e96f947318278ba.idx       1a8729fa34d844df984b444f4def1456.fieldnorm 
2ed712d4a7a44ed797dd4ff5ceaf4312.idx       b42ca53fe6f94d0c8e96f947318278ba.pos
...
```

In a code completion request, additional context from the source code repository will be attached to the prompt for better completion quality. For example:

```rust title="Example prompt for code completion, with retrieval augmentation enabled"
// Path: crates/tabby/src/serve/engine.rs
// fn create_llama_engine(model_dir: &ModelDir) -> Box<dyn TextGeneration> {
//     let options = llama_cpp_bindings::LlamaEngineOptionsBuilder::default()
//         .model_path(model_dir.ggml_q8_0_file())
//         .tokenizer_path(model_dir.tokenizer_file())
//         .build()
//         .unwrap();
//
//     Box::new(llama_cpp_bindings::LlamaEngine::create(options))
// }
//
// Path: crates/tabby/src/serve/engine.rs
// create_local_engine(args, &model_dir, &metadata)
//
// Path: crates/tabby/src/serve/health.rs
// args.device.to_string()
//
// Path: crates/tabby/src/serve/mod.rs
// download_model(&args.model, &args.device)
    } else {
        create_llama_engine(model_dir)
    }
}

fn create_ctranslate2_engine(
    args: &crate::serve::ServeArgs,
    model_dir: &ModelDir,
    metadata: &Metadata,
) -> Box<dyn TextGeneration> {
    let device = format!("{}", args.device);
    let options = CTranslate2EngineOptionsBuilder::default()
        .model_path(model_dir.ctranslate2_dir())
        .tokenizer_path(model_dir.tokenizer_file())
        .device(device)
        .model_type(metadata.auto_model.clone())
        .device_indices(args.device_indices.clone())
        .build()
        .⮹
```
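The prompt above follows a simple layout: each retrieved snippet is rendered as a block of line comments headed by its path, blocks are separated by an empty comment line, and the code before the cursor follows. The sketch below reproduces that layout only. It is not Tabby's actual implementation; `Snippet` and `build_prompt` are hypothetical names, and the `//` comment prefix is hard-coded on the assumption that the target language is Rust.

```rust
/// A retrieved code snippet (hypothetical type, for illustration only).
struct Snippet {
    path: String,
    body: String,
}

/// Render snippets as `//` comments, then append the code before the cursor.
fn build_prompt(snippets: &[Snippet], prefix: &str) -> String {
    let mut blocks = Vec::new();
    for snippet in snippets {
        // Each block starts with the path of the file the snippet came from.
        let mut lines = vec![format!("// Path: {}", snippet.path)];
        for line in snippet.body.lines() {
            if line.is_empty() {
                lines.push("//".to_string());
            } else {
                lines.push(format!("// {}", line));
            }
        }
        blocks.push(lines.join("\n"));
    }
    // Blocks are separated by an empty comment line, as in the example prompt.
    let mut prompt = blocks.join("\n//\n");
    if !prompt.is_empty() {
        prompt.push('\n');
    }
    prompt.push_str(prefix);
    prompt
}
```

Calling `build_prompt` with the `health.rs` and `mod.rs` snippets from the example and the code preceding the cursor as `prefix` produces the same comment-then-code shape shown above. With no snippets, the prompt is just the prefix.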

## Usage Collection
Tabby collects usage stats by default. This data will only be used by the Tabby team to improve its services.

### What data is collected?
We collect non-sensitive data that helps us understand how Tabby is used. For now, we collect information about the `serve` command you used to start the server.
As of October 7, 2023, the following information is collected:

```rust
struct HealthState {
    model: String,
    chat_model: Option<String>,
    device: String,
    arch: String,
    cpu_info: String,
    cpu_count: usize,
    cuda_devices: Vec<String>,
    version: Version,
}
```

For an up-to-date list of the collected fields, please refer to [health.rs](https://github.com/TabbyML/tabby/blob/main/crates/tabby/src/serve/health.rs#L11).

### How to disable it
To disable usage collection, set the `TABBY_DISABLE_USAGE_COLLECTION` environment variable, for example by running `export TABBY_DISABLE_USAGE_COLLECTION=1`.
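An opt-out switch of this kind usually amounts to a presence check on the variable at startup. The sketch below shows that general pattern only. How Tabby actually reads the variable is not stated here; `usage_collection_disabled` is a hypothetical helper, and treating any value as "disabled" is an assumption.

```rust
use std::env;

/// Hypothetical helper: returns true when the opt-out variable is set.
/// Assumption: any value (even an empty one) counts as "disabled".
fn usage_collection_disabled() -> bool {
    env::var_os("TABBY_DISABLE_USAGE_COLLECTION").is_some()
}
```

Because `export` only affects the current shell session and the processes it starts, the variable needs to be set in the same environment that launches the server.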