From f7ecab5bca8f822892dc6cdfd9a2b959ebe49ac5 Mon Sep 17 00:00:00 2001
From: Meng Zhang
Date: Sat, 30 Sep 2023 17:42:24 -0700
Subject: [PATCH] docs: change `consumer` to `client`

---
 website/blog/2023-09-30-stream-laziness-in-tabby/index.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/website/blog/2023-09-30-stream-laziness-in-tabby/index.md b/website/blog/2023-09-30-stream-laziness-in-tabby/index.md
index 363fcff..0f120c2 100644
--- a/website/blog/2023-09-30-stream-laziness-in-tabby/index.md
+++ b/website/blog/2023-09-30-stream-laziness-in-tabby/index.md
@@ -50,7 +50,7 @@ function server(llm) {
   app.listen(8080);
 }
 
-async function consumer() {
+async function client() {
   const resp = await fetch('http://localhost:8080');
 
   // Read values from our stream
@@ -64,12 +64,12 @@ async function consumer() {
 }
 
 server(llm());
-consumer();
+client();
 ```
 
 ## Stream Laziness
 
-If you were to run this program, you'd notice something interesting. We'll observe the LLM continuing to output `producing ${i}` even after the consumer has finished reading three times. This might seem obvious, given that the LLM is generating an infinite stream of integers. However, it represents a problem: our server must maintain an ever-expanding queue of items that have been pushed in but not pulled out.
+If you were to run this program, you'd notice something interesting. We'll observe the LLM continuing to output `producing ${i}` even after the client has finished reading three times. This might seem obvious, given that the LLM is generating an infinite stream of integers. However, it represents a problem: our server must maintain an ever-expanding queue of items that have been pushed in but not pulled out.
 
 Moreover, the workload involved in creating the stream is typically both expensive and time-consuming, such as computation workload on the GPU. But what if the client aborts the in-flight request due to a network issue or other intended behaviors?
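The laziness the patched prose is contrasting against can be sketched with a plain generator. This is a hypothetical standalone example, not code from the blog post or the patch: with a pull-based iterator, the producer runs only when the client asks for the next value, so when the client stops reading after three items, no extra `producing ${i}` work happens and no queue builds up.

```javascript
// Minimal sketch of pull-based laziness (assumed names, not from the post).
let produced = 0; // counts how many values the generator actually computed

function* llm() {
  let i = 0;
  while (true) {
    produced++;
    console.log(`producing ${i}`);
    yield i++; // execution pauses here until the next pull
  }
}

const stream = llm();
const received = [];
for (const value of stream) {
  received.push(value);
  if (received.length === 3) break; // client stops pulling; for..of closes the generator
}

// Only three values were ever produced; nothing piles up unread.
console.log(received);
```

Breaking out of the `for...of` loop calls the generator's `return()` method, so the producer is cleanly shut down rather than left generating into an ever-growing buffer, which is the failure mode the push-based server in the post runs into.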