Glossary
Updated 2026-01-24
Residual stream
Definition of the transformer residual stream and why it matters for activation editing.
The residual stream is the running sum of information flowing through transformer layers.
Activation edits like abliteration operate on this stream at chosen layers.
Definition
Residual stream
The residual stream is the per-token activation vector carried from layer to layer through residual connections. Attention and MLP blocks read from it, and each block writes its output back by addition (a delta) rather than by replacing the stream.
Why it matters
- It is the main carrier of information in transformer models.
- Individual linear directions (vectors) in the residual stream often correspond to specific behaviors or features.
- Abliteration edits the stream to reduce refusal behavior.
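As a rough illustration of what editing a direction means, the sketch below removes or dampens the component of one residual-stream vector that lies along a chosen direction. The function name `dampen_direction`, the `strength` parameter, and the toy vectors are assumptions made for this example; they are not taken from any particular abliteration tool, and real edits operate on model tensors rather than Python lists.

```python
import math

def dampen_direction(vec, direction, strength=1.0):
    """Remove (strength=1.0) or dampen (0 < strength < 1) one direction from vec."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Scalar projection of vec onto the unit direction.
    coeff = sum(v * u for v, u in zip(vec, unit))
    # Subtract a fraction of the component that lies along the direction.
    return [v - strength * coeff * u for v, u in zip(vec, unit)]

stream_vec = [2.0, 1.0, -1.0]  # toy residual-stream vector for one token
direction = [1.0, 0.0, 0.0]    # hypothetical "behavior" direction

removed = dampen_direction(stream_vec, direction)               # full removal
halved = dampen_direction(stream_vec, direction, strength=0.5)  # partial dampening
```

With `strength=1.0` the result has no component left along `direction`; smaller values leave part of it in place, and everything orthogonal to the direction is untouched in both cases.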
How it works
1. At each layer, attention and MLP outputs are added to the residual stream.
2. The stream is passed forward unchanged except for these additive updates.
3. Vector edits remove or dampen specific directions in the stream.
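The steps above can be sketched as a small loop. This is a toy model, not a real transformer: `toy_attn` and `toy_mlp` are hypothetical stand-ins that only return a delta, and both read the same stream value at each layer. Real architectures typically add normalization, and many feed the MLP the stream after the attention output has already been added.

```python
def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def toy_attn(x, layer):
    # Stand-in for an attention block: returns a delta, not a new state.
    return [0.1 * (layer + 1) * v for v in x]

def toy_mlp(x, layer):
    # Stand-in for an MLP block: also returns a delta.
    return [0.5 if v > 0 else 0.0 for v in x]

def run_stream(embedding, num_layers):
    """Carry the stream through the layers, recording each layer's delta."""
    residual = list(embedding)
    deltas = []
    for layer in range(num_layers):
        # Both blocks read the current stream and write back by addition.
        delta = add(toy_attn(residual, layer), toy_mlp(residual, layer))
        residual = add(residual, delta)
        deltas.append(delta)
    return residual, deltas

embedding = [1.0, -2.0]
final, deltas = run_stream(embedding, num_layers=3)

# The final stream equals the embedding plus the sum of all layer deltas.
total = list(embedding)
for d in deltas:
    total = add(total, d)
```

Because every update is additive, `final` and `total` agree, which is why the stream is often described as a running sum.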
Residual update (conceptual)
residual_{l+1} = residual_l + attn_l(residual_l) + mlp_l(residual_l)
FAQ
Frequently asked questions.
Is the residual stream the same as the hidden state?
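It is closely related. "Hidden state" usually names the stream's value at a particular point, typically the output of a given layer after its attention and MLP updates have been added. "Residual stream" names the running representation itself, which every layer reads from and writes to.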
Why edit the residual stream?
Because many behaviors align with linear directions in this space, making targeted edits possible.
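One common way to estimate such a direction, offered here as an assumption rather than something this glossary specifies, is to take the difference between the mean activation on examples that show a behavior and the mean on examples that do not. The sketch below uses made-up two-dimensional vectors; in practice the activations would be captured from a model at a chosen layer, and the helper names are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def unit_difference_of_means(group_a, group_b):
    """Unit vector pointing from the mean of group_b to the mean of group_a."""
    mean_a = mean_vector(group_a)
    mean_b = mean_vector(group_b)
    diff = [a - b for a, b in zip(mean_a, mean_b)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("group means are identical; no direction to extract")
    return [d / norm for d in diff]

# Made-up toy activations, not measurements from any model.
with_behavior = [[3.0, 1.0], [5.0, 1.0]]
without_behavior = [[1.0, 1.0], [-1.0, 1.0]]

direction = unit_difference_of_means(with_behavior, without_behavior)
```

In this toy data the two groups differ only along the first axis, so the estimated direction points along that axis.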