GateGPT: 56k tokens per second Transformer (KV cache) on FPGA at 80 MHz

Hacker News

56,000+ tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip, designed gate by gate as a 100% digital integrated circuit and prototyped on an FPGA. No GPU. No CPU. Just pure digital silicon running @karpathy microGPT, spelling out names on a https://t.co/hXX8kKIxiA
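
As a rough sanity check on the headline figures, the quoted throughput and clock rate imply a per-token cycle budget. The sketch below assumes the 56,000 tokens/sec figure is sustained throughput in a single 80 MHz clock domain with one token in flight at a time, which the post does not state; the constant and function names are illustrative only.

```python
# Figures quoted in the post. Treating them as sustained, single-stream
# throughput is an assumption, not something the post confirms.
CLOCK_HZ = 80_000_000      # 80 MHz FPGA clock
TOKENS_PER_SEC = 56_000    # "56,000+ tokens/sec" (a lower bound)


def cycles_per_token(clock_hz: int, tokens_per_sec: int) -> float:
    """Clock cycles available per generated token at the given rates."""
    return clock_hz / tokens_per_sec


budget = cycles_per_token(CLOCK_HZ, TOKENS_PER_SEC)
latency_us = 1e6 / TOKENS_PER_SEC  # microseconds per token

print(f"{budget:.0f} cycles per token, {latency_us:.1f} us per token")
# prints "1429 cycles per token, 17.9 us per token"
```

Under those assumptions the design has roughly 1,400 clock cycles to produce each token. Because the post says "56,000+", that number is an upper bound on the per-token budget.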
