Dive into Paged Attention

A deep dive into the paged attention mechanism of vLLM.

Oct-07-2024 · 11 min · 5109 words · jamesnulliu