LMCache/LMCache

GitHub Trending

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

A KV Cache Management Layer for Scalable LLM Inference

Blog | Documentation | Join Slack | Community Meeting | Roadmap

Updates
[2026/05] 🔥 Agentic workload benchmark on AMD MI300X (blog).
[2026/04] 🔥 LMCache's new multiprocess (MP) architecture release (blog).
[2026/03] LMCache at GTC 2026 (post).
[2026/01] LMCache multi-node P2P CPU memory sharing, from experimental feature to production (blog).
More
[2025/11] LMCache x CoreWeave ac...

