diff --git a/website/docs/getting-started.md b/website/docs/01-gettings-started.md
similarity index 100%
rename from website/docs/getting-started.md
rename to website/docs/01-gettings-started.md
diff --git a/website/docs/self-hosting/01-docker.mdx b/website/docs/02-self-hosting/01-docker.mdx
similarity index 100%
rename from website/docs/self-hosting/01-docker.mdx
rename to website/docs/02-self-hosting/01-docker.mdx
diff --git a/website/docs/self-hosting/02-apple.md b/website/docs/02-self-hosting/02-apple.md
similarity index 100%
rename from website/docs/self-hosting/02-apple.md
rename to website/docs/02-self-hosting/02-apple.md
diff --git a/website/docs/self-hosting/self-hosting.md b/website/docs/02-self-hosting/02-self-hosting.md
similarity index 100%
rename from website/docs/self-hosting/self-hosting.md
rename to website/docs/02-self-hosting/02-self-hosting.md
diff --git a/website/docs/03-faq.md b/website/docs/03-faq.md
new file mode 100644
index 0000000..9edb500
--- /dev/null
+++ b/website/docs/03-faq.md
@@ -0,0 +1,20 @@
+# Frequently Asked Questions
+
+<details>
+  <summary>How much VRAM does an LLM consume?</summary>
+  <div>By default, Tabby runs in int8 mode with CUDA, which requires approximately 8 GB of VRAM for CodeLlama-7B.</div>
+</details>
+
+<details>
+  <summary>What GPUs are required for reduced-precision inference (e.g., int8)?</summary>
+  <div>
+    <ul>
+      <li>int8: Compute Capability >= 7.0 or Compute Capability 6.1</li>
+      <li>float16: Compute Capability >= 7.0</li>
+      <li>bfloat16: Compute Capability >= 8.0</li>
+    </ul>
+    <p>
+      To find the compute capability of a specific GPU model, see <a href="https://developer.nvidia.com/cuda-gpus">NVIDIA's CUDA GPUs page</a>.
+    </p>
+  </div>
+</details>