docs: add blog for v0.1.1 release (#454)
parent
c107c991ff
commit
0b4206e6f8
@ -1,21 +0,0 @
---
authors: [ meng ]
---
# Highlights of Tabby v0.1.0: Apple M1/M2 Support
We are thrilled to announce the release of Tabby v0.1.0👏🏻.

Thanks to [llama.cpp](https://github.com/ggerganov/llama.cpp), Apple M1/M2 Tabby users can now harness Metal inference support on Apple's M1 and M2 chips by using the `--device metal` flag.

This enhancement leads to a significant inference speed upgrade🚀. It marks a meaningful milestone in Tabby's adoption on Apple devices. Check out our [Model Directory](/docs/models) to discover LLM models with Metal support! 🎁

<center>

*An example inference benchmarking with [CodeLlama-7B](https://huggingface.co/TabbyML/CodeLlama-7B) on Apple M2 Max, takes ~600ms.*

</center>

:::tip
Check out latest Tabby updates on [Linkedin](https://www.linkedin.com/company/tabbyml/) and [Slack community](https://join.slack.com/t/tabbycommunity/shared_invite/zt-1xeiddizp-bciR2RtFTaJ37RBxr8VxpA)! Our Tabby community is eager for your participation. ❤️
:::
@ -1,3 +0,0 @
version https://git-lfs.github.com/spec/v1
oid sha256:702e9b69b54a0b86731c23d199ffe454a2f03437b25f0fe8c25257e9c71b8877
size 19495
@ -0,0 +1,30 @
---
authors: [ meng ]
---
# Highlights of Tabby v0.1.1: Apple M1/M2 Support
We are thrilled to announce the release of Tabby [v0.1.1](https://github.com/TabbyML/tabby/releases/tag/v0.1.1) 👏🏻.

Tabby users on Apple M1/M2 chips can now use Metal inference by passing the `--device metal` flag, thanks to [llama.cpp](https://github.com/ggerganov/llama.cpp)'s awesome Metal support.

The Tabby team made a [contribution](https://github.com/ggerganov/llama.cpp/pull/3187) by adding support for the StarCoder series models (1B/3B/7B) in llama.cpp, which makes models better suited to code completion available on edge devices.

<center>

```
llama_print_timings: load time = 105.15 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_print_timings: prompt eval time = 25.07 ms / 6 tokens ( 4.18 ms per token, 239.36 tokens per second)
llama_print_timings: eval time = 311.80 ms / 28 runs ( 11.14 ms per token, 89.80 tokens per second)
llama_print_timings: total time = 340.25 ms
```

*Inference benchmarking with [StarCoder-1B](https://huggingface.co/TabbyML/StarCoder-1B) on Apple M2 Max now takes approximately 340 ms, down from around 1790 ms previously: a roughly 5x speedup.*

</center>
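
The figures in the timing output and caption can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of Tabby or llama.cpp: the helper names `throughput` and `speedup` are hypothetical, and the inputs are copied by hand from the log and caption rather than parsed from real output.

```python
def throughput(tokens: int, elapsed_ms: float) -> float:
    """Tokens per second for `tokens` processed in `elapsed_ms` milliseconds."""
    return tokens / (elapsed_ms / 1000.0)


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the new run is than the old one."""
    return before_ms / after_ms


# Values copied from the llama_print_timings lines (assumed inputs).
prompt_tps = throughput(6, 25.07)    # prompt eval: 6 tokens in 25.07 ms
eval_tps = throughput(28, 311.80)    # eval: 28 runs in 311.80 ms

# Previous total (~1790 ms, from the caption) versus the new total (340.25 ms).
gain = speedup(1790.0, 340.25)

print(f"prompt eval: {prompt_tps:.1f} tokens/s")
print(f"eval:        {eval_tps:.1f} tokens/s")
print(f"speedup:     {gain:.2f}x")
```

Small differences from the logged rates are expected, since the log prints elapsed times rounded to two decimals. The ratio of the two totals comes out slightly above 5, which is consistent with the "roughly 5x" claim.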


This enhancement leads to a significant inference speed upgrade 🚀 and marks a meaningful milestone in Tabby's adoption on Apple devices. Check out our [Model Directory](/docs/models) to discover LLM models with Metal support! 🎁

:::tip
Check out the latest Tabby updates on [LinkedIn](https://www.linkedin.com/company/tabbyml/) and in our [Slack community](https://join.slack.com/t/tabbycommunity/shared_invite/zt-1xeiddizp-bciR2RtFTaJ37RBxr8VxpA)! Our Tabby community is eager for your participation. ❤️
:::