Multimodal - 2024-08 | Paper Arxiv Daily

| Publish Date | Title | Authors | PDF (arXiv ID) | Code |
|---|---|---|---|---|
| 2024-08-31 | Comparative Analysis of Modality Fusion Approaches for Audio-Visual Person Identification and Verification | Aref Farhadipour et al. | 2409.00562 | none |
| 2024-08-29 | Toward Robust Early Detection of Alzheimer’s Disease via an Integrated Multimodal Learning Approach | Yifei Chen et al. | 2408.16343 | link |
| 2024-08-28 | Meta-Learn Unimodal Signals with Weak Supervision for Multimodal Sentiment Analysis | Sijie Mai et al. | 2408.16029 | none |
| 2024-08-28 | ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation | Tiantian Feng et al. | 2408.15803 | none |
| 2024-08-28 | Visual Prompt Engineering for Medical Vision Language Models in Radiology | Stefan Denner et al. | 2408.15802 | none |
| 2024-08-27 | The Benefits of Balance: From Information Projections to Variance Reduction | Lang Liu et al. | 2408.15065 | none |
| 2024-08-27 | NeuralOOD: Improving Out-of-Distribution Generalization Performance with Brain-machine Fusion Learning Framework | Shuangchen Zhao et al. | 2408.14950 | none |
| 2024-08-25 | Multimodal Ensemble with Conditional Feature Fusion for Dysgraphia Diagnosis in Children from Handwriting Samples | Jayakanth Kunhoth et al. | 2408.13754 | none |
| 2024-08-24 | R2G: Reasoning to Ground in 3D Scenes | Yixuan Li et al. | 2408.13499 | none |
| 2024-08-23 | Ada2I: Enhancing Modality Balance for Multimodal Conversational Emotion Recognition | Cam-Van Thi Nguyen et al. | 2408.12895 | none |
| 2024-08-23 | Has Multimodal Learning Delivered Universal Intelligence in Healthcare? A Comprehensive Survey | Qika Lin et al. | 2408.12880 | link |
| 2024-08-23 | Grounding Fallacies Misrepresenting Scientific Publications in Evidence | Max Glockner et al. | 2408.12812 | none |
| 2024-08-22 | Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models | Jean Park et al. | 2408.12763 | none |
| 2024-08-22 | Mental-Perceiver: Audio-Textual Multimodal Learning for Mental Health Assessment | Jinghui Qin et al. | 2408.12088 | none |
| 2024-08-22 | Video Emotion Open-vocabulary Recognition Based on Multimodal Large Language Model | Mengying Ge et al. | 2408.11286 | none |
| 2024-08-21 | SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition | Zebang Cheng et al. | 2408.10500 | link |
| 2024-08-19 | Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation | Liu He et al. | 2408.10453 | none |
| 2024-08-18 | Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition | Qifei Li et al. | 2408.09438 | link |
| 2024-08-16 | Multi Teacher Privileged Knowledge Distillation for Multimodal Expression Recognition | Muhammad Haseeb Aslam et al. | 2408.09035 | link |
| 2024-08-14 | Modality Invariant Multimodal Learning to Handle Missing Modalities: A Single-Branch Approach | Muhammad Saad Saeed et al. | 2408.07445 | none |
| 2024-08-14 | Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration | Xiaogen Zhou et al. | 2408.07341 | link |
| 2024-08-14 | Enhancing Visual Question Answering through Ranking-Based Hybrid Training and Multimodal Fusion | Peiyuan Chen et al. | 2408.07303 | none |
| 2024-08-13 | Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning | Jieming Bian et al. | 2408.06549 | none |
| 2024-08-04 | Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion | Shaoxu Cheng et al. | 2408.02695 | none |
| 2024-08-06 | Infusing Environmental Captions for Long-Form Video Language Grounding | Hyogun Lee et al. | 2408.02336 | none |
| 2024-08-05 | REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models | Agneet Chatterjee et al. | 2408.02231 | none |
| 2024-08-04 | CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization | Xiang He et al. | 2408.01952 | link |
| 2024-08-02 | Multimodal Fusion via Hypergraph Autoencoder and Contrastive Learning for Emotion Recognition in Conversation | Zijian Yi et al. | 2408.00970 | link |
| 2024-08-01 | The Monetisation of Toxicity: Analysing YouTube Content Creators and Controversy-Driven Engagement | Thales Bertaglia et al. | 2408.00534 | none |
| 2024-08-02 | XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training | Biao Wu et al. | 2407.19546 | link |