
arXiv paper 'RoPE Distinguishes Neither Positions Nor Tokens in Long Contexts, Provably' proves rotary position embeddings lose locality bias and token relevance as context length grows

You Jiacheng notes the analysis assumes uniform norms that do not hold in practice.


Original post

Delip Rao e/σ @DELIPRAO

Ouch

6:46 AM · May 18, 2026

QUOTE POST

You Jiacheng @YOUJIACHENG

This paper sounds off. For the position part, it assumes that the 2D sub-vectors of q and k (the pairs that RoPE rotates) have basically uniform norm, which is not realistic. For the content part, we can use partial RoPE.

Delip Rao e/σ @deliprao

Ouch

1:46 PM · May 18, 2026 · 67K Views

4:38 PM · May 18, 2026 · 9.5K Views
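
For readers unfamiliar with the terms in the quote post, here is a minimal NumPy sketch of what "RoPE rotates 2D sub-vectors" and "partial RoPE" refer to. It is an illustration only, not code from the paper or from either poster. The function names (`rope_angles`, `apply_rope`, `pair_norms`), the `rot_dim` parameter, the adjacent-pair layout, and the base of 10000 are assumptions chosen for the example.

```python
import numpy as np

def rope_angles(pos, n_pairs, base=10000.0):
    # Common RoPE frequency schedule (assumed here): theta_i = base^(-i / n_pairs),
    # so the rotation angle for pair i at position pos is pos * theta_i.
    i = np.arange(n_pairs)
    return pos * base ** (-i / n_pairs)

def apply_rope(x, pos, rot_dim=None, base=10000.0):
    """Rotate the first rot_dim entries of x as adjacent 2D pairs.

    rot_dim=None rotates every pair (full RoPE). A smaller even rot_dim
    leaves the remaining entries untouched, which is the idea behind
    partial RoPE: those dimensions carry content with no position signal.
    """
    x = np.asarray(x, dtype=np.float64)
    d = x.shape[-1]
    rot_dim = d if rot_dim is None else rot_dim
    assert rot_dim % 2 == 0 and rot_dim <= d
    pairs = x[:rot_dim].reshape(-1, 2)
    ang = rope_angles(pos, rot_dim // 2, base)
    cos, sin = np.cos(ang), np.sin(ang)
    # Standard 2D rotation applied independently to each pair.
    rotated = np.stack(
        [pairs[:, 0] * cos - pairs[:, 1] * sin,
         pairs[:, 0] * sin + pairs[:, 1] * cos],
        axis=-1,
    )
    return np.concatenate([rotated.reshape(-1), x[rot_dim:]])

def pair_norms(x, rot_dim):
    # Norm of each rotated 2D sub-vector; rotation leaves these unchanged.
    return np.linalg.norm(np.asarray(x)[:rot_dim].reshape(-1, 2), axis=-1)
```

Because each pair is only rotated, the q·k score depends on positions through the relative offset alone, and pair i contributes in proportion to the product of the two pair norms. That is why the uniform-norm assumption matters to the objection in the post: a trained model is free to put most of its norm into a few frequency pairs, so those pairs can dominate the score.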
