LLaMA.cpp Patch Delivers 40% Faster Qwen 27B Inference on M5 Max · Digg