Deep dive into FNS: building a tokenizer that chunks text efficiently while keeping character-level resolution!
FNS augments the training loss with a character-level signal, and at inference time you can still decode single characters.
Full write-up: https://huggingface.co/spaces/HuggingFaceBio/carbon-tokenization




