diff --git a/website/blog/2023-11-23-coding-llm-leaderboard/index.mdx b/website/blog/2023-11-23-coding-llm-leaderboard/index.mdx
index b76cab4..3f0be8c 100644
--- a/website/blog/2023-11-23-coding-llm-leaderboard/index.mdx
+++ b/website/blog/2023-11-23-coding-llm-leaderboard/index.mdx
@@ -8,7 +8,7 @@ image: ./leaderboard.png
 ---
 
 # Introducing the Coding LLM Leaderboard
 
-In our previous post on [Cracking the Coding Evaluation](../13/model-evaluation), we shed light on the limitations of relying on HumanEval pass@1 as a code completion benchmark.
+In our previous post on *Cracking the Coding Evaluation*, we shed light on the limitations of relying on HumanEval pass@1 as a code completion benchmark.
 In response, we've launched the [Coding LLMs Leaderboard](http://leaderboard.tabbyml.com), embracing **Next Line Accuracy** as a metric inspired by academic works such as [RepoCoder](https://arxiv.org/abs/2303.12570), [RepoBench](https://arxiv.org/abs/2306.03091), and [CCEval](https://arxiv.org/abs/2310.11248).
 
 ![Leaderboard](./leaderboard.png)