@aryaman2020 what are your thoughts on natural language autoencoders?
Fable system card has 0 mentions of circuits, 1 mention of SAEs (saying they didn't work for measuring grader awareness), and many mentions of contrastive probing (esp. emotions?) and natural language autoencoders