Multimodal - 2024-04 | Paper Arxiv Daily

Publish Date	Title	Authors	arXiv ID	Code
2024-04-27	MediFact at MEDIQA-M3G 2024: Medical Question Answering in Dermatology with Multimodal Learning	Nadia Saeed et al.	2405.01583	null
2024-04-29	3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset	Xinyu Ma et al.	2404.18413	link
2024-04-28	LEGENT: Open Platform for Embodied Agents	Zhili Cheng et al.	2404.18243	null
2024-04-29	MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition	Zheng Lian et al.	2404.17113	link
2024-04-30	AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models	Zhiqiang Tang et al.	2404.16233	null
2024-04-23	Hidden in Plain Sight: Exploring the Intersections of Mental Health, Eating Disorders, and Content Moderation on TikTok	Charles Bickham et al.	2404.15457	null
2024-04-14	A Survey on Multimodal Wearable Sensor-based Human Action Recognition	Jianyuan Ni et al.	2404.15349	null
2024-04-23	Between Flat-Earthers and Fitness Coaches: Who is Citing Scientific Publications in YouTube Video Descriptions?	Olga Zagovora et al.	2404.15083	null
2024-04-19	Cooperative Sentiment Agents for Multimodal Sentiment Analysis	Shanmin Wang et al.	2404.12642	link
2024-04-18	Dynamic Modality and View Selection for Multimodal Emotion Recognition with Missing Modalities	Luciana Trinkaus Menon et al.	2404.12251	null
2024-04-19	TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content	Avinash Anand et al.	2404.10305	null
2024-04-15	AIGeN: An Adversarial Approach for Instruction Generation in VLN	Niyati Rawal et al.	2404.10054	null
2024-04-22	Neuro-Inspired Information-Theoretic Hierarchical Perception for Multimodal Learning	Xiongye Xiao et al.	2404.09403	link
2024-04-14	TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning	Quang Minh Dinh et al.	2404.09275	link
2024-04-13	MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild	Kateryna Chumachenko et al.	2404.09010	link
2024-04-12	OmniSat: Self-Supervised Modality Fusion for Earth Observation	Guillaume Astruc et al.	2404.08351	link
2024-04-11	Multimodal Emotion Recognition by Fusing Video Semantic in MOOC Learning Scenarios	Yuan Zhang et al.	2404.07484	null
2024-04-07	X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model	Jan Held et al.	2404.06332	null
2024-04-07	A Data-to-Product Multimodal Conceptual Framework to Achieve Automated Software Evolution for Context-rich Intelligent Applications	Songhui Yue et al.	2404.04821	null
2024-04-06	Interpretable Multimodal Learning for Cardiovascular Hemodynamics Assessment	Prasun C Tripathi et al.	2404.04718	link
2024-04-05	Mitigating Heterogeneity in Federated Multimodal Learning with Biomedical Vision-Language Pre-training	Zitao Shuai et al.	2404.03854	null
2024-04-02	On Stronger Computational Separations Between Multimodal and Unimodal Machine Learning	Ari Karchmer et al.	2404.02254	null
2024-04-01	iMD4GC: Incomplete Multimodal Data Integration to Advance Precise Treatment Response Prediction and Survival Analysis for Gastric Cancer	Fengtao Zhou et al.	2404.01192	link
2024-04-11	MIPS at SemEval-2024 Task 3: Multimodal Emotion-Cause Pair Extraction in Conversations with Multimodal Language Models	Zebang Cheng et al.	2404.00511	link
2024-04-02	Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition	R. Gnana Praveen et al.	2403.13659	null