Gradient-free Single-pass Model Beats nanoGPT on Shakespeare

LessWrong

Beam is a character-level language model that computes count tables mapping character contexts to next-character frequencies. At prediction time, each order...
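To make the idea of count tables concrete, here is a minimal sketch that maps character contexts of several lengths (orders) to next-character counts. It is an illustration under stated assumptions, not Beam's actual code: the function names `build_count_tables` and `next_char_frequencies`, the `max_order` parameter, and the toy corpus are all hypothetical. How Beam combines the orders at prediction time is not covered by the excerpt, so the sketch only reads frequencies from a single order.

```python
from collections import Counter, defaultdict


def build_count_tables(text, max_order):
    """Build {order: {context: Counter(next_char)}} for orders 1..max_order.

    Hypothetical helper for illustration; not taken from the Beam source.
    """
    tables = {k: defaultdict(Counter) for k in range(1, max_order + 1)}
    for i in range(len(text)):
        for k in range(1, max_order + 1):
            if i - k < 0:
                break  # not enough preceding characters for this order
            context = text[i - k:i]
            tables[k][context][text[i]] += 1
    return tables


def next_char_frequencies(tables, context, order):
    """Relative next-character frequencies after the last `order` characters."""
    counts = tables[order].get(context[-order:], Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # context never seen at this order
    return {ch: n / total for ch, n in counts.items()}


# Toy corpus (an assumption for the example, not the Shakespeare dataset).
tables = build_count_tables("to be or not to be", max_order=2)
freqs = next_char_frequencies(tables, "to", order=2)
```

Building the tables takes one pass over the text and involves no gradients, which is the property the title refers to. An unseen context returns an empty dictionary here; a real model would need some fallback for that case.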

