Current Large Audio Language Models largely transcribe rather than listen (via crmne) — discussion
#ai #audio