11h ago

Fuli Luo, who builds Xiaomi's MiMo LLM, details how Hybrid Sliding Window Attention and KVCache optimizations cut MiMo-V2.5 serving costs

The optimizations delivered nearly 5x higher effective cache capacity.

Sentiment: 81.8% positive, 18.2% negative (34 comments with sentiment)

Many users praised Xiaomi's MiMo-V2.5 API price cuts with hybrid optimizations for making the model more affordable and effective, while others criticized ongoing reliability issues like frequent timeouts and speed limits.