
arXiv paper states rotary position embeddings lose locality bias in long contexts


An arXiv paper titled "RoPE Distinguishes Neither Positions Nor Tokens in Long Contexts, Provably" states that rotary position embeddings (RoPE) lose both their locality bias and their token-relevance consistency as context length grows. Its theoretical proofs show the failure probabilities approaching chance level. AI researcher Delip Rao posted a screenshot of the paper on X. Researcher You Jiacheng replied that the position analysis assumes the 2D sub-vectors of the query and key vectors, which are what RoPE rotates, have roughly uniform norm, an assumption the reply calls unrealistic. For the content part, the reply suggests partial RoPE as an alternative.
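For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of RoPE and partial RoPE in plain Python. It is not code from the paper or from either post. The names `rope`, `dot`, `pair_norms`, and `rotary_dims`, the interleaved pairing of dimensions, and the base of 10000 are assumptions chosen for illustration; real implementations differ in layout and details.

```python
import math

def rope(vec, pos, rotary_dims=None, base=10000.0):
    """Rotate consecutive 2D sub-vectors of `vec` by position-dependent angles.

    rotary_dims (hypothetical parameter): number of leading dimensions to
    rotate. None rotates all of them (full RoPE); a smaller even value
    leaves the remaining dimensions position-free (partial RoPE).
    """
    d = len(vec) if rotary_dims is None else rotary_dims
    if d % 2 or d > len(vec):
        raise ValueError("rotary_dims must be even and at most len(vec)")
    out = list(vec)
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # each pair gets its own frequency
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out[i] = x * c - y * s
        out[i + 1] = x * s + y * c
    return out

def dot(a, b):
    """Plain dot product, standing in for an unscaled attention score."""
    return sum(x * y for x, y in zip(a, b))

def pair_norms(vec):
    """Norm of each 2D sub-vector; rotation leaves these unchanged."""
    return [math.hypot(vec[i], vec[i + 1]) for i in range(0, len(vec), 2)]

if __name__ == "__main__":
    q = [0.3, -1.2, 0.8, 0.5, 2.0, -0.7, 0.1, 0.9]
    k = [1.0, 0.4, -0.6, 0.2, 0.5, 1.5, -0.3, 0.7]
    # Full RoPE: the score depends only on the offset between positions.
    print(dot(rope(q, 5), rope(k, 2)), dot(rope(q, 103), rope(k, 100)))
    # Partial RoPE: only the first 4 dimensions carry position.
    print(rope(q, 7, rotary_dims=4))
    # Per-pair norms are typically far from uniform for arbitrary vectors.
    print(pair_norms(q))
```

With full RoPE, the score between a rotated query and key can be written as a sum over pairs of the product of the two pair norms times a cosine of the position offset scaled by that pair's frequency, so the per-pair norms weight how much each frequency contributes. That is why an assumption about those norms matters to the position argument, and why leaving some dimensions unrotated gives the content part a position-free channel.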

Original post

this paper sounds off. for position part, it assumes that 2d sub-vectors of q&k (RoPE rotates these 2d sub-vectors) have basically uniform norm, which is not realistic. for content part, we can use partial RoPE.

Delip Rao e/σ @deliprao

Ouch

1:46 PM · May 18, 2026 · 42.4K Views
4:38 PM · May 18, 2026 · 4.3K Views