A Geometric Account of Activation Steering through Angle–Norm Decomposition

LessWrong

This blog post provides an overview of our recent paper: A Geometric Account of Activation Steering through Angle–Norm Decomposition.

TL;DR: We decompose linear activation steering into two distinct operations: one that changes the angle of the activation toward a concept direction, and one that changes its norm. Through controlled experiments, we analyze the role of each component. We find that concept information is indeed primarily encoded in the angular component of activations. However, the ...
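To make the two operations concrete, here is a minimal sketch of one natural way to separate them. It is an illustration under stated assumptions, not the paper's implementation. It assumes steering takes the usual additive form `h + alpha * v`, and the helper names (`steer`, `angle_only`, `norm_only`) are hypothetical. The angle-only variant takes the direction of the steered activation but keeps the original norm. The norm-only variant keeps the original direction but takes the steered norm.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle in radians between two nonzero vectors."""
    c = dot(a, b) / (norm(a) * norm(b))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def steer(h, v, alpha):
    """Assumed additive steering: h + alpha * v."""
    return [x + alpha * y for x, y in zip(h, v)]

def angle_only(h, v, alpha):
    """Direction of the steered activation, norm of the original.
    Assumes h + alpha * v is not the zero vector."""
    s = steer(h, v, alpha)
    scale = norm(h) / norm(s)
    return [scale * x for x in s]

def norm_only(h, v, alpha):
    """Direction of the original activation, norm of the steered one.
    Assumes h is not the zero vector."""
    s = steer(h, v, alpha)
    scale = norm(s) / norm(h)
    return [scale * x for x in h]

# Toy 2-D example (made-up numbers, for illustration only).
h = [3.0, 4.0]   # activation
v = [1.0, 0.0]   # concept direction
alpha = 2.0      # steering strength

full = steer(h, v, alpha)
rotated = angle_only(h, v, alpha)
rescaled = norm_only(h, v, alpha)

print("angle to v:", angle(h, v), "->", angle(full, v))
print("norm:      ", norm(h), "->", norm(full))
```

In this toy case the full steering step both shrinks the angle to `v` and increases the norm. `rotated` reproduces only the first effect and `rescaled` only the second, which is the kind of separation that lets each component be studied on its own.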

