2h ago

Andrew Lanpouthakoun proposes Prefill-Only Fine Tuning to avoid decode-stage bottlenecks and increase multi-adapter LLM throughput by 2.21x

The method trains adapters exclusively during the prefill phase.

Sentiment: Pos 100%, Neg 0%

Positive users share excitement about leading the PreFT adapters project, which boosts multi-LoRA inference throughput with minimal accuracy loss.

1 comment with sentiment.