Current Large Audio Language Models largely transcribe rather than listen (via crmne) — discussion

#ai #audio