Overfitted a 900KB Transformer to Compress a 100MB CSV into 7MB

Hacker News

I built an experiment that uses an overfitted transformer and arithmetic coding to compress individual files. Instead of training the model to generalize, I train a 900KB transformer to memorize a single file and predict the next byte. Those predictions are fed into an arithmetic coder to produce the compressed output.

On a 100MB NYC taxi CSV, it compresses to about 7MB (~0.5 bits/byte). On a 100MB slice of enwik9, it compresses to about 21MB (~1.68 bits/byte).

It's pretty slow right now: roughly 20–30 minutes of training, plus 45 minutes each for compression and decompression on my AMD 7800XT.

Check out the repo: https://github.com/samyak112/pym-particles
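
For readers new to the idea, here is a minimal sketch of how next-byte predictions drive an arithmetic coder. It is not the code from the repo. `AdaptiveByteModel` is a hypothetical stand-in for the transformer (a simple adaptive byte-frequency counter), and the coder uses exact `Fraction` arithmetic for clarity, which only suits tiny inputs; practical coders use fixed-width integers with renormalization. The point it shows is that the decoder can rebuild the same predictions step by step, so only the final code number needs to be stored.

```python
import math
from fractions import Fraction


class AdaptiveByteModel:
    """Hypothetical stand-in for the transformer: adaptive byte counts."""

    def __init__(self):
        self.counts = [1] * 256  # start with every byte equally likely
        self.total = 256

    def interval(self, sym):
        """Cumulative count range [lo, hi) assigned to byte `sym`."""
        lo = sum(self.counts[:sym])
        return lo, lo + self.counts[sym]

    def update(self, sym):
        self.counts[sym] += 1
        self.total += 1


def encode(data: bytes):
    """Narrow [low, low + width) once per byte, then pick a short binary
    fraction code / 2**nbits that lies inside the final interval."""
    model = AdaptiveByteModel()
    low, width = Fraction(0), Fraction(1)
    for sym in data:
        lo, hi = model.interval(sym)
        low += width * Fraction(lo, model.total)
        width *= Fraction(hi - lo, model.total)
        model.update(sym)
    nbits = 0
    while Fraction(1, 2 ** nbits) > width:
        nbits += 1
    code = math.ceil(low * 2 ** nbits)
    return code, nbits


def decode(code: int, nbits: int, length: int) -> bytes:
    """Replay the same model and find which byte's range holds the code."""
    model = AdaptiveByteModel()
    value = Fraction(code, 2 ** nbits)
    low, width = Fraction(0), Fraction(1)
    out = bytearray()
    for _ in range(length):
        scaled = (value - low) / width * model.total
        hi = 0
        for sym in range(256):
            lo, hi = hi, hi + model.counts[sym]
            if scaled < hi:
                break
        low += width * Fraction(lo, model.total)
        width *= Fraction(hi - lo, model.total)
        model.update(sym)
        out.append(sym)
    return bytes(out)


data = b"1,2.5,1\n" * 20  # made-up CSV-like sample, not the taxi data
code, nbits = encode(data)
assert decode(code, nbits, len(data)) == data
print(f"{nbits / len(data):.2f} bits/byte")
```

The better the model predicts each byte, the less the interval shrinks per step and the fewer bits the final code needs, which is why overfitting a model to one file can pay off for that file.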
