Gradium TTS is particularly good at transferring all the characteristics of a recording, including reverb, bandwidth (e.g. phone speech), and even typical podcast mic saturation on plosives (the "p" sound). All samples in the video were generated from just 10 samples, without any processing.
I'm just as surprised that nobody in AI voice tech has realized a voice needs background and environmental noise to sound realistic
Even @ElevenLabs, the leader in voice AI, cannot produce a voice with background noise or environmental reverb
AI voices are never going to pass as human if they don't have that
And it's only me and this other guy even talking about it

