AI emotions and aligned behavior
I participated in the BlueDot Technical AI Safety Project Sprint (April-May 2026) to better understand the field of AI Safety Research. This blog post summarizes my findings.

Introduction

Most AI safety research concentrates on two levers: alignment (building models that genuinely share human values) and control (layering in security and confinement when they don't). While these are important parts of AI safety, I wanted to see if we could leverage recent AI wellbeing research to incentivize model...