Document-tuning instills durable animal compassion in LLMs (and generalizes to humans)

LessWrong

Note: This post focuses on the alignment implications. Our EA Forum post, focusing on the implications for animal welfare, is here.

Jasmine Brazilek & Miles Tidmarsh: Compassion Aligned Machine Learning

Preprint, March 2026: Full paper | HuggingFace resources | ANIMA benchmark

TL;DR

Instruction-tuning and reinforcement learning are effective in certain specific domains but may produce superficial and/or context-specific alignment towards values like compassion. Pretraining/midtraining LLMs on synthetic ...

