Audio Processing - 2024-02
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-02-29 | Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems | Quentin Raymondaud et al. | 2402.19443 | translate | read | null |
| 2024-02-29 | Unraveling Adversarial Examples against Speaker Identification – Techniques for Attack Detection and Victim Model Classification | Sonal Joshi et al. | 2402.19355 | translate | read | null |
| 2024-02-29 | Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data | Takaaki Saeki et al. | 2402.18932 | translate | read | null |
| 2024-02-29 | Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition | Jeehyun Lee et al. | 2402.18923 | translate | read | null |
| 2024-02-29 | Investigation of Adapter for Automatic Speech Recognition in Noisy Environment | Hao Shi et al. | 2402.18275 | translate | read | null |
| 2024-02-28 | Multilingual Speech Models for Automatic Speech Recognition Exhibit Gender Performance Gaps | Giuseppe Attanasio et al. | 2402.17954 | translate | read | link |
| 2024-02-24 | ByteComposer: a Human-like Melody Composition Method based on Language Model Agent | Xia Liang et al. | 2402.17785 | translate | read | null |
| 2024-02-27 | High-Fidelity Neural Phonetic Posteriorgrams | Cameron Churchwell et al. | 2402.17735 | translate | read | link |
| 2024-02-27 | Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey | Dinh-Viet-Toan Le et al. | 2402.17467 | translate | read | null |
| 2024-02-27 | An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement | Tzu-Ting Yang et al. | 2402.17189 | translate | read | null |
| 2024-02-27 | Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models | Rohit Prabhavalkar et al. | 2402.17184 | translate | read | null |
| 2024-02-26 | Towards Decoding Brain Activity During Passive Listening of Speech | Milán András Fodor et al. | 2402.16996 | translate | read | link |
| 2024-02-26 | Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods | Ivan Magrin-Chagnolleau et al. | 2402.16429 | translate | read | null |
| 2024-02-24 | ArEEG_Chars: Dataset for Envisioned Speech Recognition using EEG for Arabic Characters | Hazem Darwish et al. | 2402.15733 | translate | read | null |