diff --git a/website/docs/getting-started.md b/website/docs/01-gettings-started.md
similarity index 100%
rename from website/docs/getting-started.md
rename to website/docs/01-gettings-started.md
diff --git a/website/docs/self-hosting/01-docker.mdx b/website/docs/02-self-hosting/01-docker.mdx
similarity index 100%
rename from website/docs/self-hosting/01-docker.mdx
rename to website/docs/02-self-hosting/01-docker.mdx
diff --git a/website/docs/self-hosting/02-apple.md b/website/docs/02-self-hosting/02-apple.md
similarity index 100%
rename from website/docs/self-hosting/02-apple.md
rename to website/docs/02-self-hosting/02-apple.md
diff --git a/website/docs/self-hosting/self-hosting.md b/website/docs/02-self-hosting/02-self-hosting.md
similarity index 100%
rename from website/docs/self-hosting/self-hosting.md
rename to website/docs/02-self-hosting/02-self-hosting.md
diff --git a/website/docs/03-faq.md b/website/docs/03-faq.md
new file mode 100644
index 0000000..9edb500
--- /dev/null
+++ b/website/docs/03-faq.md
@@ -0,0 +1,20 @@
+# Frequently Asked Questions
+
+
+ How much VRAM does an LLM consume? +
By default, Tabby operates in int8 mode with CUDA, requiring approximately 8GB of VRAM for CodeLlama-7B.
+
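As a rough sanity check on the figure above, the memory taken by the weights alone can be estimated from the parameter count and the bytes stored per weight. The sketch below is a back-of-the-envelope calculation, not Tabby's own accounting: the function name `weight_memory_gib` is hypothetical, and the parameter count of 7e9 is an assumption read off the "7B" in the model name.

```python
def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the model weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: "7B" means roughly 7e9 parameters.
params_7b = 7e9

int8_gib = weight_memory_gib(params_7b, 1)  # int8 stores 1 byte per weight
fp16_gib = weight_memory_gib(params_7b, 2)  # fp16 stores 2 bytes per weight

print(f"int8 weights: {int8_gib:.1f} GiB")
print(f"fp16 weights: {fp16_gib:.1f} GiB")
```

Under these assumptions the int8 weights come to about 6.5 GiB. The gap to the quoted figure of approximately 8GB would then be runtime overhead, such as activations and the CUDA context, though that breakdown is an assumption rather than something stated in the FAQ.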
+ +
+ What GPUs are required for reduced-precision inference (e.g., int8)? +
+ +

+ To determine the mapping between the GPU card type and its compute capability, please visit this page +

+
+