docs: add blog for v0.1.1 release (#454)

2023-09-17 17:31:41 +08:00 · 2023-09-17 17:31:41 +08:00 · 0b4206e6f8
parent c107c991ff
commit 0b4206e6f8
3 changed files with 30 additions and 24 deletions
--- a/website/blog/2023-09-13-release-0-1-0-metal/index.md
+++ b/website/blog/2023-09-13-release-0-1-0-metal/index.md
@@ -1,21 +0,0 @@
---
-authors: [ meng ]
---
-# Highlights of Tabby v0.1.0: Apple M1/M2 Support
-We are thrilled to announce the release of Tabby v0.1.0👏🏻.
-
-Thanks to [llama.cpp](https://github.com/ggerganov/llama.cpp), Apple M1/M2 Tabby users can now harness Metal inference support on Apple's M1 and M2 chips by using the `--device metal` flag.
-
-This enhancement leads to a significant inference speed upgrade🚀. It marks a meaningful milestone in Tabby's adoption on Apple devices. Check out our [Model Directory](/docs/models) to discover LLM models with Metal support! 🎁
-
-<center>
-
-![Inference](./inference.png)
-
-*An example inference benchmarking with [CodeLlama-7B](https://huggingface.co/TabbyML/CodeLlama-7B) on Apple M2 Max, takes ~600ms.*
-
-</center>
-
-:::tip
-Check out latest Tabby updates on [Linkedin](https://www.linkedin.com/company/tabbyml/) and [Slack community](https://join.slack.com/t/tabbycommunity/shared_invite/zt-1xeiddizp-bciR2RtFTaJ37RBxr8VxpA)! Our Tabby community is eager for your participation. ❤️ 
-:::
--- a/website/blog/2023-09-13-release-0-1-0-metal/inference.png
+++ b/website/blog/2023-09-13-release-0-1-0-metal/inference.png
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:702e9b69b54a0b86731c23d199ffe454a2f03437b25f0fe8c25257e9c71b8877
-size 19495
--- a/website/blog/2023-09-18-release-0-1-1-metal/index.md
+++ b/website/blog/2023-09-18-release-0-1-1-metal/index.md
@@ -0,0 +1,30 @@
+---
+authors: [ meng ]
+---
+# Highlights of Tabby v0.1.1: Apple M1/M2 Support
+We are thrilled to announce the release of Tabby [v0.1.1](https://github.com/TabbyML/tabby/releases/tag/v0.1.1) 👏🏻.
+
+Thanks to [llama.cpp](https://github.com/ggerganov/llama.cpp)'s Metal support, Tabby users on Apple M1 and M2 chips can now run inference on Metal by passing the `--device metal` flag.
+
+The Tabby team also [contributed](https://github.com/ggerganov/llama.cpp/pull/3187) support for the StarCoder series models (1B/3B/7B) to llama.cpp, making models of a size better suited to code completion usable on edge devices.
+
+<center>
+
+```
+llama_print_timings:        load time =   105.15 ms
+llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
+llama_print_timings: prompt eval time =    25.07 ms /     6 tokens (    4.18 ms per token,   239.36 tokens per second)
+llama_print_timings:        eval time =   311.80 ms /    28 runs   (   11.14 ms per token,    89.80 tokens per second)
+llama_print_timings:       total time =   340.25 ms
+```
+
+*Inference benchmarking with [StarCoder-1B](https://huggingface.co/TabbyML/StarCoder-1B) on Apple M2 Max now takes approximately 340ms, compared to the previous time of around 1790ms. This represents a roughly 5x speed improvement.*
+
+</center>
+
+
+This enhancement leads to a significant inference speed upgrade 🚀 and marks a meaningful milestone in Tabby's adoption on Apple devices. Check out our [Model Directory](/docs/models) to discover LLM models with Metal support! 🎁
+
+:::tip
+Check out the latest Tabby updates on [LinkedIn](https://www.linkedin.com/company/tabbyml/) and in our [Slack community](https://join.slack.com/t/tabbycommunity/shared_invite/zt-1xeiddizp-bciR2RtFTaJ37RBxr8VxpA)! Our Tabby community is eager for your participation. ❤️
+:::
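
The figures in the new post's benchmark caption can be sanity-checked against the `llama_print_timings` log with a few lines of arithmetic. The sketch below is only an illustration: the constants are copied from the log and caption, and the helper names (`tokens_per_second`, `ms_per_token`, `speedup`) are hypothetical, not part of Tabby or llama.cpp.

```python
# Values copied from the llama_print_timings output and the caption in the post.
PROMPT_EVAL_MS = 25.07      # prompt eval time
PROMPT_TOKENS = 6           # tokens in the prompt
EVAL_MS = 311.80            # eval (generation) time
EVAL_RUNS = 28              # generated tokens
TOTAL_MS = 340.25           # total time reported by the log
PREVIOUS_TOTAL_MS = 1790.0  # "around 1790ms" per the caption (approximate)


def tokens_per_second(count: int, elapsed_ms: float) -> float:
    """Throughput in tokens per second for `count` tokens in `elapsed_ms`."""
    return count / (elapsed_ms / 1000.0)


def ms_per_token(count: int, elapsed_ms: float) -> float:
    """Average latency per token in milliseconds."""
    return elapsed_ms / count


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the new timing is than the old one."""
    return before_ms / after_ms


if __name__ == "__main__":
    print(f"prompt eval: {ms_per_token(PROMPT_TOKENS, PROMPT_EVAL_MS):.2f} ms per token, "
          f"{tokens_per_second(PROMPT_TOKENS, PROMPT_EVAL_MS):.2f} tokens per second")
    print(f"eval:        {ms_per_token(EVAL_RUNS, EVAL_MS):.2f} ms per token, "
          f"{tokens_per_second(EVAL_RUNS, EVAL_MS):.2f} tokens per second")
    print(f"speedup:     {speedup(PREVIOUS_TOTAL_MS, TOTAL_MS):.2f}x")
```

The recomputed prompt throughput can differ from the logged value in the last digits, presumably because the log derives it from unrounded timings. The speedup comes out slightly above 5, consistent with the caption's "roughly 5x".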