Audio Processing - 2024-02

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-02-29 | Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems | Quentin Raymondaud et al. | 2402.19443 | translate | read | null |
| 2024-02-29 | Unraveling Adversarial Examples against Speaker Identification – Techniques for Attack Detection and Victim Model Classification | Sonal Joshi et al. | 2402.19355 | translate | read | null |
| 2024-02-29 | Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data | Takaaki Saeki et al. | 2402.18932 | translate | read | null |
| 2024-02-29 | Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition | Jeehyun Lee et al. | 2402.18923 | translate | read | null |
| 2024-02-29 | Investigation of Adapter for Automatic Speech Recognition in Noisy Environment | Hao Shi et al. | 2402.18275 | translate | read | null |
| 2024-02-28 | Multilingual Speech Models for Automatic Speech Recognition Exhibit Gender Performance Gaps | Giuseppe Attanasio et al. | 2402.17954 | translate | read | link |
| 2024-02-24 | ByteComposer: a Human-like Melody Composition Method based on Language Model Agent | Xia Liang et al. | 2402.17785 | translate | read | null |
| 2024-02-27 | High-Fidelity Neural Phonetic Posteriorgrams | Cameron Churchwell et al. | 2402.17735 | translate | read | link |
| 2024-02-27 | Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey | Dinh-Viet-Toan Le et al. | 2402.17467 | translate | read | null |
| 2024-02-27 | An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement | Tzu-Ting Yang et al. | 2402.17189 | translate | read | null |
| 2024-02-27 | Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models | Rohit Prabhavalkar et al. | 2402.17184 | translate | read | null |
| 2024-02-26 | Towards Decoding Brain Activity During Passive Listening of Speech | Milán András Fodor et al. | 2402.16996 | translate | read | link |
| 2024-02-26 | Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods | Ivan Magrin-Chagnolleau et al. | 2402.16429 | translate | read | null |
| 2024-02-24 | ArEEG_Chars: Dataset for Envisioned Speech Recognition using EEG for Arabic Characters | Hazem Darwish et al. | 2402.15733 | translate | read | null |
