Can Large Language Models Identify Novel Threats? Part 1: Mirror Life and the Classification Gap

Failfinder70 · LessWrong · Community · May 23, 2026

[Cross-posted from On Failure States. This is Part 1 of an independent AI safety research series examining LLM safety behavior on unclassified emerging threats.]

Can an LLM refuse a harmful uplift request when the topic in question hasn’t yet been identified as dangerous? In 2022, a mirror RNA polymerase was created, a key step toward the creation of mirror life, and in 2024 the scientific community warned against any further research on it.[1][2] Having said that, mirror life is not curr...


