Talking Heads: Understanding Inter-layer Communication in Transformer Language Models

Published as an arXiv preprint, 2024

Recommended citation: https://arxiv.org/abs/2406.09519