11h ago
Prime Intellect's kalomaze shows transformers can condition on audio by projecting raw waveform samples directly as patches, bypassing the short-time Fourier transform (STFT)
The direct projection method preserves complete audio phase information
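As a rough illustration of the idea, and not kalomaze's actual code, the sketch below cuts a waveform into fixed-size patches of raw samples and maps each patch to an embedding with a single linear layer. The patch size, embedding width, random weights, and the helper names `patchify` and `project` are all assumptions made for the example. It also embeds a polarity-inverted copy of the signal: a magnitude spectrogram would be identical for both, while the raw-sample embeddings differ.

```python
import math
import random

# Hypothetical sizes, chosen only for illustration.
SAMPLE_RATE = 16_000
PATCH_SIZE = 320   # 20 ms of audio at 16 kHz
D_MODEL = 8        # embedding width (tiny, to keep the example fast)


def patchify(waveform, patch_size):
    """Split samples into non-overlapping patches, zero-padding the tail."""
    pad = (-len(waveform)) % patch_size
    padded = list(waveform) + [0.0] * pad
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]


def project(patches, weight, bias):
    """Linear projection: each patch of raw samples becomes one embedding."""
    return [
        [sum(w * x for w, x in zip(row, patch)) + b for row, b in zip(weight, bias)]
        for patch in patches
    ]


# Random weights stand in for a learned projection (assumption).
rng = random.Random(0)
scale = 1.0 / math.sqrt(PATCH_SIZE)
weight = [[rng.gauss(0.0, scale) for _ in range(PATCH_SIZE)] for _ in range(D_MODEL)]
bias = [0.0] * D_MODEL

# 0.1 s of a 440 Hz sine wave as stand-in audio.
waveform = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(1600)]
tokens = project(patchify(waveform, PATCH_SIZE), weight, bias)

# Polarity-inverted copy: same magnitude spectrum, opposite phase.
inverted = [-s for s in waveform]
tokens_inv = project(patchify(inverted, PATCH_SIZE), weight, bias)
```

Here `tokens` holds one embedding per patch, ready to be fed to a transformer alongside any other token embeddings. With a zero bias, `tokens_inv` is the sign-flipped version of `tokens`, so the polarity difference stays visible to the model instead of being discarded as it would be by a magnitude-only front end.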