9/ The erase factor makes each step depend on the previous one, so the naive version is sequential and slow on a GPU. DeltaNet runs it in chunks: within a chunk, it's matrix multiplication and a triangular solve, and only the state is carried over between chunks.
8/ β is the write strength, learned per token. At 1 it fully overwrites. Below 1 it writes more gently, which matters because keys aren't orthogonal, so a hard write along one key also perturbs the keys that overlap it.
