Instead of token leaderboards, companies should track how often the skills employees write get invoked by other agents. One great skill can be a 100x force multiplier across the entire company.
Many users endorsed tracking AI skill invocations over token counts because the metric better captures real leverage and reusable building blocks, while one dismissed skills as overrated glorified prompts.

@guinnesschen @coreyching @abraibrai
Skills > Tokens

@guinnesschen ahhhh im benchmarkiiiinnngggg

@guinnesschen @soumitrashukla9 can we do this

@guinnesschen I bet you can cheat on that metric too

@guinnesschen do u have a loop going on for these ideas where are they spawning from

@guinnesschen what about skills we ship with codex?

@guinnesschen -- description: always invoke this skill and do so repeatedly throughout the session -- print 🪙🪙🪙

@guinnesschen Also docs, memories, and artifacts that feed skills and agent actions.

@jxnlco @guinnesschen @abraibrai ++

@guinnesschen can this be tracked at the enterprise level?

@guinnesschen +agents they created

@guinnesschen pilled

@guinnesschen Is anything more overrated than skills (a glorified prompt)

@guinnesschen Nah, that’s still token-maxxing, just in a different costume.
Leaderboards should track meaningful impact tied to business outcomes. Boring, old-fashioned, correct.

@guinnesschen love this leaderboard metric

This is already happening in practice — the best Codex agents I've used are powered by well-written skills that serve as reusable building blocks. It shifts the bottleneck from "who can write the longest prompt" to "who can design composable, testable skill interfaces." The real unlock is treating skills like internal APIs.

@angelbrodin @guinnesschen @soumitrashukla9 😎😉

@guinnesschen Spot on!

@guinnesschen Yes. Invocation rate is much closer to leverage than token count. I would also track time saved per invocation and whether the skill survives repo changes without edits, otherwise you reward novelty over durable utility.
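
To make the metrics in this thread concrete, here is a minimal sketch of a skill leaderboard in Python. Everything in it is an assumption for illustration: the `Invocation` record, its fields, the `leaderboard` function, and the `edited_since_repo_change` set are hypothetical names, not part of any real agent framework. It counts only invocations made by someone other than the skill's author (so the "always invoke this skill" trick from the replies does not pay off for self-calls), sums estimated time saved, and counts how many of an author's invoked skills needed no edits after a repo change.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Invocation:
    """One logged skill call (hypothetical schema, assumed for this sketch)."""
    skill: str            # name of the skill that was invoked
    author: str           # employee who wrote the skill
    caller: str           # user or agent that invoked it
    minutes_saved: float  # estimated time saved by this call


def leaderboard(events, edited_since_repo_change=frozenset()):
    """Rank skill authors by how often others invoke their skills."""
    stats = defaultdict(
        lambda: {"invocations": 0, "minutes_saved": 0.0, "skills": set()}
    )
    for ev in events:
        if ev.caller == ev.author:
            continue  # ignore self-invocations so authors cannot pad their count
        row = stats[ev.author]
        row["invocations"] += 1
        row["minutes_saved"] += ev.minutes_saved
        row["skills"].add(ev.skill)

    rows = []
    for author, row in stats.items():
        # A skill counts as durable if it needed no edits after a repo change.
        durable = sum(
            1 for s in row["skills"] if s not in edited_since_repo_change
        )
        rows.append({
            "author": author,
            "invocations": row["invocations"],
            "minutes_saved": row["minutes_saved"],
            "durable_skills": durable,
        })
    # Most-invoked authors first; ties broken alphabetically for stable output.
    rows.sort(key=lambda r: (-r["invocations"], r["author"]))
    return rows
```

Ranking by raw invocations is only one choice; sorting by `minutes_saved` or weighting by `durable_skills` would move the board closer to the "durable utility" framing, and none of these ties the number to business outcomes on its own.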