The first inherently interpretable AI platform is finally here. Welcome to Clarity.
Guide Labs launches Clarity, an interpretable AI platform that lets users inspect and steer the concepts driving model outputs
Users click text chunks to inspect underlying training data.
This is brilliant.
Guide Labs just launched "Clarity," billed as the first inherently interpretable AI platform.
It attacks the "black box" problem of AI.
The model generates text in chunks. You can click a chunk and see what concepts the model used to generate it.
With normal LLMs, if the model gives a wrong or biased answer, you mostly have to guess which words to change in the prompt.
Clarity changes that by trying to show the concepts the model is using while generating the answer, such as “marine life,” “African wildlife,” “computer science,” or “male role descriptions.”
In other words, you are not only seeing the final answer; you are also seeing some of the hidden ingredients that pushed the model toward that answer.
Clarity also adds training data attribution, which connects generated chunks to similar training chunks so mistakes can be diagnosed instead of treated as mystery failures.
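To make the attribution idea concrete, here is a minimal sketch of matching a generated chunk to its most similar training chunks. It is not Clarity's implementation: the bag-of-words `embed` function, the `attribute` helper, and the three example training chunks are all assumptions made up for illustration, and a real system would presumably compare learned representations rather than word counts.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector (assumption); stands in for a learned embedding.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def attribute(chunk, training_chunks, top_k=2):
    # Rank training chunks by similarity to the generated chunk (hypothetical helper).
    query = embed(chunk)
    scored = [(cosine(query, embed(t)), t) for t in training_chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Made-up training chunks, for illustration only.
training_chunks = [
    "dolphins are marine mammals that live in the ocean",
    "lions hunt on the african savanna",
    "a compiler translates source code into machine code",
]

matches = attribute("dolphins live in the ocean", training_chunks)
for score, text in matches:
    print(f"{score:.2f}  {text}")
```

The top match is the chunk to inspect first when a generated chunk looks wrong, which is the "diagnose instead of guess" workflow described above.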
The new control layer is concept steering, where users amplify or suppress a concept directly. For example, “marine life” can be raised without rewriting the question, and unwanted concept families can be reduced without retraining.
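A toy sketch of what amplifying or suppressing a concept could look like, assuming the model exposes named concept activations that feed linearly into output scores. The `steer` and `score_outputs` functions, the activation values, and the concept-to-output weights are all hypothetical and are not taken from Clarity.

```python
def steer(activations, adjustments):
    # Scale each named concept activation; concepts not listed stay unchanged.
    return {c: a * adjustments.get(c, 1.0) for c, a in activations.items()}

def score_outputs(activations, concept_to_output):
    # Output score = sum over concepts of (activation * weight), a linear toy model.
    scores = {}
    for concept, act in activations.items():
        for output, weight in concept_to_output.get(concept, {}).items():
            scores[output] = scores.get(output, 0.0) + act * weight
    return scores

# Made-up activations and weights, for illustration only.
activations = {"marine life": 0.4, "African wildlife": 0.6}
concept_to_output = {
    "marine life": {"dolphin": 1.0, "lion": 0.0},
    "African wildlife": {"dolphin": 0.0, "lion": 1.0},
}

before = score_outputs(activations, concept_to_output)

# Amplify "marine life" and suppress "African wildlife" without touching the prompt.
steered = steer(activations, {"marine life": 2.0, "African wildlife": 0.5})
after = score_outputs(steered, concept_to_output)

print(max(before, key=before.get), "->", max(after, key=after.get))
```

The prompt and the weights never change here; only the concept activations are scaled, which is the sense in which steering avoids both prompt rewriting and retraining.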
