/AI · 12h ago

Prime Intellect's kalomaze highlights an audio conditioning method that projects raw waveforms directly into transformers

The system takes in raw audio as 25 patches per second, with no STFT decomposition.


Original post

kalomaze (@kalomaze) · #836 in AI

it's BASED. they are linearly projecting raw samples into the transformer as patches for audio conditioning and it's working. no freq domain priors, all the redundant phase info still present at the input, not even hardcoded STFT decomposition. 25 patches per second

12:52 PM · Jun 3, 2026 · 14.1K Views
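
To make the idea in the post concrete, here is a minimal sketch of linearly projecting raw waveform patches into token embeddings. It is an illustration under stated assumptions, not the method's actual implementation: the 16 kHz sample rate, the 512-dimensional model width, the non-overlapping patches, and the names `patchify` and `project_patches` are all hypothetical. Only the rate of 25 patches per second comes from the post.

```python
import numpy as np

SAMPLE_RATE = 16_000      # ASSUMPTION: the post does not state a sample rate
PATCHES_PER_SECOND = 25   # from the post
D_MODEL = 512             # ASSUMPTION: transformer embedding width

# At the assumed sample rate, each patch covers 640 raw samples (40 ms).
PATCH_SIZE = SAMPLE_RATE // PATCHES_PER_SECOND


def patchify(waveform, patch_size=PATCH_SIZE):
    """Split a mono waveform into non-overlapping patches of raw samples.

    ASSUMPTION: patches do not overlap; a trailing partial patch is dropped.
    """
    n_patches = len(waveform) // patch_size
    return waveform[: n_patches * patch_size].reshape(n_patches, patch_size)


def project_patches(patches, weight, bias):
    """Apply one linear projection so each patch becomes one token embedding."""
    return patches @ weight + bias


rng = np.random.default_rng(0)
# Random stand-ins for weights that would be learned during training.
weight = rng.normal(0.0, PATCH_SIZE ** -0.5, size=(PATCH_SIZE, D_MODEL))
bias = np.zeros(D_MODEL)

# Two seconds of a 440 Hz sine wave as a placeholder input signal.
t = np.arange(2 * SAMPLE_RATE) / SAMPLE_RATE
waveform = np.sin(2 * np.pi * 440.0 * t)

tokens = project_patches(patchify(waveform), weight, bias)
print(tokens.shape)  # (50, 512): 2 s at 25 patches per second
```

No spectrogram or other frequency-domain transform appears anywhere in this path: the samples, phase information included, reach the projection unchanged, which is the property the post calls out.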

Sentiment

Commenters are excited that DeepMind is feeding raw audio patches directly into a transformer without a separate encoder. They see it as a serious step toward feasible droids, rather than a continued over-focus on agents.

Positive: 100.0%
Negative: 0.0%

2 comments with sentiment.
