Songlin Yang releases ReplaySSM, cutting hybrid SSM memory traffic in half to enable 2x faster speculative decoding · Digg