Simplifying Alignment by Expanding Scope
This post is crossposted from my Substack, Structure and Guarantees, where I explore how formal verification and related ideas might scale to more complex intelligent systems. This installment begins a sequence on lessons for alignment from the practice of formal verification. Counterintuitively, adding layers to a formally verified system can simplify the problem of specifying what safe behavior means. This phenomenon may point toward useful structuring principles even for s...