Talking Heads: Understanding Inter-layer Communication in Transformer Language Models

Published in NeurIPS, 2024

Recommended citation: https://arxiv.org/abs/2406.09519