Erik Kaunismäki, SWE at Hugging Face, releases MaxSim kernel for ColBERT retrieval
Erik Kaunismäki, a software engineer at Hugging Face, released the MaxSim kernel on Hugging Face under erikkaum/maxsim. For ColBERT and PyLate models, the kernel avoids materializing the full query-by-document similarity matrix: it scores in tiles, using Metal simdgroup_matrix operations and WMMA instructions. The reported speedup is 3–5× over naive PyTorch baselines. At the same time, Perplexity AI open-sourced pplx-embed-v1-late-0.6b, a related multilingual ColBERT-style model whose model page includes a section on using the new kernel.
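To make the difference between the two scoring strategies concrete, here is a minimal sketch in plain Python. It is not the kernel's code and does not use its API: the function names maxsim_naive and maxsim_tiled and the tile_size parameter are hypothetical, chosen only for illustration. It assumes the standard late-interaction score, where each query token takes its maximum dot product over all document tokens and those maxima are summed.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length token embeddings."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_naive(query: List[Vector], doc: List[Vector]) -> float:
    """Baseline: build the full |query| x |doc| similarity matrix, then reduce."""
    if not doc:
        raise ValueError("doc must contain at least one token embedding")
    sim = [[dot(q, d) for d in doc] for q in query]
    # Max over document tokens per query token, summed over query tokens.
    return sum(max(row) for row in sim)


def maxsim_tiled(query: List[Vector], doc: List[Vector], tile_size: int = 2) -> float:
    """Tiled variant: only one running max per query token is kept in memory.

    tile_size is a hypothetical parameter for this sketch; a real GPU kernel
    would pick tile shapes to match the hardware's matrix instructions.
    """
    if not doc:
        raise ValueError("doc must contain at least one token embedding")
    if tile_size < 1:
        raise ValueError("tile_size must be positive")
    running_max = [float("-inf")] * len(query)
    for start in range(0, len(doc), tile_size):
        tile = doc[start:start + tile_size]  # a small block of document tokens
        for i, q in enumerate(query):
            tile_best = max(dot(q, d) for d in tile)
            if tile_best > running_max[i]:
                running_max[i] = tile_best
    return sum(running_max)
```

Both functions compute the same score; the tiled one simply never holds more than one tile of similarities at a time, which is the property the kernel exploits with hardware matrix instructions.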
oh! cool to see @perplexity_ai train late interaction (colbert) models
okay maybe it's a good time? We have a small colbert model trained at pplx, it is a continue-training of pplx-embed-0.6b, so native multilingual, just made it open and added a section how to use MaxSim kernel: https://huggingface.co/perplexity-ai/pplx-embed-v1-late-0.6b