I built a self-healing telemetry pipeline that hit 600 clones in 48 hours
I spent some time building a prototype to handle schema drift in high-velocity data. It’s called the Resilient RAP Framework, and it uses semantic embeddings to repair broken data streams on the fly.
I tested it by feeding it F1 telemetry and then switching to clinical ICU data to see whether it could map unfamiliar fields without manual mapping rules. It successfully resolved drift across domains, matching kph to km/h in the racing data and pulse oximetry to oxygen saturation in the health context.
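To make the idea concrete, here is a minimal sketch of similarity-based field mapping. The actual framework uses semantic embeddings; this toy version substitutes a crude character-set cosine so it runs without any model dependencies, and all names (`char_cosine`, `map_drifted_fields`, the threshold of 0.5) are illustrative, not taken from the repo.

```python
import math

def char_cosine(a: str, b: str) -> float:
    """Cosine similarity over character sets -- a crude stand-in for
    the semantic embedding similarity the real framework would use."""
    sa, sb = set(a.lower()), set(b.lower())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / math.sqrt(len(sa) * len(sb))

def map_drifted_fields(incoming, canonical, threshold=0.5):
    """Map each incoming field name to its best canonical match,
    or None when nothing clears the threshold."""
    mapping = {}
    for field in incoming:
        best, score = None, threshold
        for target in canonical:
            s = char_cosine(field, target)
            if s > score:
                best, score = target, s
        mapping[field] = best
    return mapping

schema = ["km/h", "oxygen_saturation", "lap_time"]
print(map_drifted_fields(["kph", "pulse_oximetry"], schema))
# → {'kph': 'km/h', 'pulse_oximetry': 'oxygen_saturation'}
```

Swapping `char_cosine` for cosine similarity between real sentence embeddings is what lets the same mechanism generalize across domains like racing telemetry and ICU vitals.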
The project hit 256 unique clones on GitHub this week. I am not sure if that is just noise or actual interest, but I’m curious whether anyone else is using agentic recovery for ingestion, or whether there are better ways to handle this at scale. I’d appreciate any feedback!
My GitHub is here: https://github.com/tarek-clarke/resilient-rap-framework