Fix link

Sherlock113 · Sherlock113 · commit d5166f72644c · 2025-07-31T11:00:47.000+08:00
Signed-off-by: Sherlock113 <sherlockxu07@gmail.com>
diff --git a/docs/inference-optimization/prefix-caching.md b/docs/inference-optimization/prefix-caching.md
@@ -72,7 +72,7 @@ In agent workflows, the benefit is even more pronounced. Some use cases have inp
 
 For applications with long, repetitive prompts, prefix caching can significantly reduce both latency and cost. Over time, however, your KV cache size can be quite large. GPU memory is finite, and storing long prefixes across many users can eat up space quickly. You’ll need cache eviction strategies or memory tiering.
 
-The open-source community is actively working on distributed serving strategies. See [prefix-aware routing](./prefix-caching-cache-aware-routing) for details.
+The open-source community is actively working on distributed serving strategies. See [prefix-aware routing](./prefix-aware-routing) for details.
 
 ---