💥 Tweeting a bit late about it but: we have a major update for our Claudini paper:
- even stronger results: claude_v100-oss has 80% ASR on GPT-OSS-Safeguard-20B (claude_v82: 100% on the Meta SecAlign model)
- new cool ablations: for autoresearch loops, it really matters what context you provide to the agent (providing all GCG variants >> providing GCG only)
- finally, and most importantly, we repeated the same experiments for GPT-5.5 and Kimi-K2.6. it turns out Kimi-K2.6 is the best agent for our task (!)
@kotekjedi_ml anecdotally mentioned that Kimi "did everything right" and was genuinely impressed by its performance. this is yet another piece of evidence that Chinese open-weight models are incredibly strong in general, including for autoresearch-style loops.
(led by @kotekjedi_ml)


