has anyone ever written a diss track of your paper
Stanford’s Christopher Potts releases a Suno-generated rap diss track responding to DTU researchers’ critique of AxBench
AI Judge changed title after evaluation, original title: "Stanford NLP's Christopher Potts challenges rebuttal claiming sparse autoencoders outperform simple baselines for LLM steering"
The track satirizes technical concepts like LASSO and TopK.
Some users view the rebuttal paper on sparse autoencoders for LLM steering as a sign of the original work's importance while others dismiss SAE progress as too slow to reach real tasks.
Like any good advisor, I felt duty-bound to defend @aryaman2020 and @ZhengxuanZenWu in this rap battle. However, the SAE diss track I wrote was so devastating as to be unanswerable, so I decided to graciously balance things out with a second verse dissing causal interp.
yep! and that follow-up paper is still one of my favourite papers :)
diss tracks are one of the best ways science progresses
I wrote the lyrics as an homage to "Hypnotize" by The Notorious B.I.G., but Biggie apparently cannot be imitated, so the @suno version uses a totally different style.
Full lyrics:
[Intro] Uh-huh. Yeah. Bay Area but it's for Biggie Interpretability Mafia in the house (Yo!) Giving ourselves a hard time as usual We're going after all of interp on this one (Dissing post-training would be too easy.) Ha, can you even beat the average?
[ChrisPy] Just train things on instinct, LASSO sparse link, rep shrink, add gaters, TopK, your saviors, Guidin' by labelin', cave in (it's supervised!) LessWrong blog fight, then train all night. SPINE set the rules way back, but now it's cool. Other views? Got no clues, dudes. Ignore crews who run evals on you (come on), place ranks upon you (they're on you). Claude once knew you, then outgrew you, who you? JumpReLU? Yeah, S, A, and E, close like K-SVD, and even ICA, see? Preparadigmatic for me, but not for thee. Steer northly, claim the V, but irrationally. Recently, interest haltin', paradigm faultin' (defaultin'). So they scale back the claim, change the name (change the frame). So they transcode the game, it's the same. Promptin', pleadin', "What's in it?" ("What's in it?") Admit it, that's the limit. Causal methods gonna win it!
[SAE Chorus] W_E to the ReLU, g, W_D just hypnotized me. I just trained these reps all day, but my model still can't find its way. W_E to the ReLU, g, W_D won't satisfy me. With this SAE thing, I got played. I finally understand why DAS got made.
[ChrisPy] You put zeros on nodes in Transformer flows (uh-huh). Or use the means when you intervenes (that's right). Swap in the source, fully brute force, Learnin' subspaces, all over the places (c'mon). Now what's the real method? What's even the question? All just tricks, algebra mix, maybe no fix. You have to ask: random model shows structure, random task. Intervene first, ask questions last. That's how causal abstractions pass. At last, we're rappin' 'bout direct effects, divergence checks, all do respects (do calculus!). Transcoders leave you unawares, aligning pairs, unnatural wares, should be scares. At the eval, show respects: every other method's got causal effects. Face it, too slow, and unconstrained. All these sparse methods got the claim to fame!
[DAS Chorus] Rotate source and base, then subtract, my g. Rotate, add the base, that's DAS, you see? I just trained these reps all day, but my model still can't find its way. Ident minus Gram times the base, my g. Add Gram times the source, it's DAS, you see? Believe this causal story, I'm a fool. SAEs just deserve to rule.
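The DAS chorus states the same interchange intervention twice: once as "rotate source and base, subtract, rotate back, add the base", and once as "(identity minus Gram) times base plus Gram times source". A minimal sketch of that equivalence follows, assuming R holds orthonormal rows spanning the aligned subspace and reading the lyric's "Gram" as R^T R. The function names, the toy R, and the toy vectors are illustrative assumptions, not taken from the thread or from any DAS codebase.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def das_rotate_form(R, base, source):
    """base + R^T (R source - R base): rotate both, subtract, rotate back, add base."""
    diff = [s - b for s, b in zip(matvec(R, source), matvec(R, base))]
    back = matvec(transpose(R), diff)
    return [b + d for b, d in zip(base, back)]

def das_projector_form(R, base, source):
    """(I - R^T R) base + R^T R source, with R^T R playing the lyric's 'Gram'."""
    Rt = transpose(R)
    proj_base = matvec(Rt, matvec(R, base))
    proj_source = matvec(Rt, matvec(R, source))
    return [b - pb + ps for b, pb, ps in zip(base, proj_base, proj_source)]

# Toy setup (assumed): a one-dimensional subspace inside a 3-d hidden state.
R = [[0.6, 0.8, 0.0]]          # a single unit-norm row
base = [1.0, 2.0, 3.0]
source = [4.0, -1.0, 0.0]

out_rotate = das_rotate_form(R, base, source)
out_projector = das_projector_form(R, base, source)

# Both forms should agree up to floating-point error.
print(all(math.isclose(a, b, abs_tol=1e-9)
          for a, b in zip(out_rotate, out_projector)))
```

After the intervention, the output's coordinate inside the subspace matches the source's, and everything orthogonal to the subspace is left as it was in the base. In actual DAS the rotation is learned; here it is fixed by hand purely for illustration.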
@aryaman2020 @lateinteraction In any case, no AI allowed. As Kendrick Lamar presciently said back in 2015, "I can dig rapping, but a rapper just prompting? What the f*ck happened?"
@aryaman2020 @lateinteraction I think if Jørgensen and Hansen challenge us to a rap battle, we should absolutely accept. Or are we supposed to challenge them first? I am not sure of the etiquette.
Here is a chorus for an homage to Biggie's "Hypnotize" (the part where the girls are singing to Biggie):
W_e to the ReLU, g, W_d just hypnotized me. I just trained these reps all day, but my model still can't find its way.
W_e to the ReLU, g, W_d won't satisfy me. With this SAE thing, I got played. I finally understand why ReFT got made.
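For readers who want the chorus unpacked: "W_e to the ReLU, W_d" reads as the forward pass of a sparse autoencoder, roughly z = ReLU(W_e x + b_e) followed by x_hat = W_d z + b_d. Below is a minimal sketch under those assumptions. The bias terms, the function names, and the hand-picked toy weights are illustrative and not from the thread; a real SAE learns its weights and adds a sparsity penalty or a TopK/JumpReLU-style activation.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def relu(v):
    return [max(0.0, x) for x in v]

def sae_forward(W_e, b_e, W_d, b_d, x):
    """Encode with W_e, apply ReLU, decode with W_d."""
    z = relu([a + b for a, b in zip(matvec(W_e, x), b_e)])
    x_hat = [a + b for a, b in zip(matvec(W_d, z), b_d)]
    return z, x_hat

# Toy weights (assumed): 2-d input, 4 latents, one latent per signed axis.
W_e = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b_e = [0.0, 0.0, 0.0, 0.0]
W_d = [[1, -1, 0, 0], [0, 0, 1, -1]]
b_d = [0.0, 0.0]

x = [2.0, -1.0]
z, x_hat = sae_forward(W_e, b_e, W_d, b_d, x)
print(z)      # [2.0, 0.0, 0.0, 1.0]
print(x_hat)  # [2.0, -1.0]
```

With these weights only two of the four latents fire and the input is reconstructed exactly, which is the idealized behavior the chorus is teasing about.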
this is legendary
@aryaman2020 Is "can" as in "I can make a lay-up" or as in "I can make a shot from center court"?
@lateinteraction i will let @ChrisGPotts post the steering vector-themed parody of Eminem's "Without Me" but at least i can reply with this

@aryaman2020 Don’t worry guys, it only took a year to get SAEs to work on AxBench, maybe in another year we could get them to work on real tasks. :/

@aryaman2020 "perform close to on par...when features are selected" am i going crazy? i swear there was a paper that showed this before

@aryaman2020 found it - https://aclanthology.org/2025.emnlp-main.519.pdf they seem to build on arad et al? but couldn't easily find if they discussed the difference

@aryaman2020 it's a good sign when a paper is important enough to get dissed!