AI · 12h ago

Prime Intellect's kalomaze highlights an audio conditioning method that projects raw waveforms directly into transformers

The audio is fed in as linearly projected patches of raw samples at 25 patches per second, with no STFT decomposition and no frequency-domain priors.

kalomaze (@kalomaze) · #836 in AI

it's BASED. they are linearly projecting raw samples into the transformer as patches for audio conditioning and its working. no freq domain priors, all the redundant phase info still present at the input, not even hardcoded STFT decomposition. 25 patches per second

12:52 PM · Jun 3, 2026 · 14.1K Views
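
For readers unfamiliar with the idea, here is a minimal sketch of what "linearly projecting raw samples into the transformer as patches" could look like. It is not the implementation the post refers to. The 16 kHz sample rate, the embedding width `D_MODEL`, the non-overlapping patch layout, and the helper names `patchify`, `init_projection`, and `project` are all assumptions made for illustration; only the 25 patches per second figure comes from the post. Random weights stand in for a learned linear layer.

```python
import math
import random

SAMPLE_RATE = 16_000      # assumed; the post does not state a sample rate
PATCHES_PER_SECOND = 25   # from the post
PATCH_SIZE = SAMPLE_RATE // PATCHES_PER_SECOND  # 640 raw samples per patch
D_MODEL = 8               # hypothetical embedding width, kept tiny for the demo


def patchify(waveform, patch_size=PATCH_SIZE):
    """Split a 1-D waveform into non-overlapping patches, dropping any remainder."""
    n_patches = len(waveform) // patch_size
    return [waveform[i * patch_size:(i + 1) * patch_size] for i in range(n_patches)]


def init_projection(patch_size=PATCH_SIZE, d_model=D_MODEL, seed=0):
    """Random weights standing in for a learned linear layer (d_model x patch_size)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(patch_size)
    weights = [[rng.gauss(0.0, scale) for _ in range(patch_size)] for _ in range(d_model)]
    bias = [0.0] * d_model
    return weights, bias


def project(patches, weights, bias):
    """Apply one linear map to every patch: token = W @ patch + b."""
    return [
        [sum(w * x for w, x in zip(row, patch)) + b for row, b in zip(weights, bias)]
        for patch in patches
    ]


# One second of a 440 Hz sine as stand-in audio (no STFT, no mel filterbank).
waveform = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
patches = patchify(waveform)
weights, bias = init_projection()
tokens = project(patches, weights, bias)
print(len(tokens), len(tokens[0]))  # prints: 25 8
```

Each output row is one conditioning token. Because the projection sees the raw samples, amplitude and phase information are both still present at the input, which is the point the post makes about skipping a hardcoded STFT front end.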
Reposts: 7 · Views: 14.1K · Likes: 163 · Bookmarks: 52