Multimodal - 2024-03 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF	Translate	Read	Code
2024-03-30	UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause	Guimin Hu et al.	2404.00403	translate	read	null
2024-03-28	IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation	Jiacui Huang et al.	2403.19336	translate	read	null
2024-03-26	Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation	Abdelrhman Werby et al.	2403.17846	translate	read	null
2024-03-26	Project MOSLA: Recording Every Moment of Second Language Acquisition	Masato Hagiwara et al.	2403.17314	translate	read	null
2024-03-17	A Survey of IMU Based Cross-Modal Transfer Learning in Human Activity Recognition	Abhi Kamboj et al.	2403.15444	translate	read	null
2024-03-22	Contrastive Learning on Multimodal Analysis of Electronic Health Records	Tianxi Cai et al.	2403.14926	translate	read	null
2024-03-20	Grounding Spatial Relations in Text-Only Language Models	Gorka Azkune et al.	2403.13666	translate	read	link
2024-03-20	VL-Mamba: Exploring State Space Models for Multimodal Learning	Yanyuan Qiao et al.	2403.13600	translate	read	null
2024-03-17	From Pixels to Predictions: Spectrogram and Vision Transformer for Better Time Series Forecasting	Zhen Zeng et al.	2403.11047	translate	read	null
2024-03-26	Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity	Zhuo Zhi et al.	2403.09428	translate	read	link
2024-03-14	Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation	Daniel Honerkamp et al.	2403.08605	translate	read	link
2024-03-12	A Multimodal Intermediate Fusion Network with Manifold Learning for Stress Detection	Morteza Bodaghi et al.	2403.08077	translate	read	null
2024-03-10	WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs	Deshun Yang et al.	2403.07944	translate	read	null
2024-03-25	FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks	Muhammad Saif Ullah Khan et al.	2403.06904	translate	read	null
2024-03-11	DiaLoc: An Iterative Approach to Embodied Dialog Localization	Chao Zhang et al.	2403.06846	translate	read	null
2024-03-11	Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement	Che Liu et al.	2403.06659	translate	read	link
2024-03-07	A Modular End-to-End Multimodal Learning Method for Structured and Unstructured Data	Marco D'Alessandro et al.	2403.04866	translate	read	link
2024-03-05	JMI at SemEval 2024 Task 3: Two-step approach for multimodal ECAC using in-context learning with GPT and instruction-tuned Llama models	Arefa et al.	2403.04798	translate	read	link
2024-03-07	CLIP the Bias: How Useful is Balancing Data in Multimodal Learning?	Ibrahim Alabdulmohsin et al.	2403.04547	translate	read	null
2024-03-04	Reactive Programming without Functions	Bjarno Oeyen et al.	2403.02296	translate	read	null
2024-03-03	Hyperspectral Image Analysis in Single-Modal and Multimodal setting using Deep Learning Techniques	Shivam Pande et al.	2403.01546	translate	read	null
2024-03-02	ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation	Moran Yanuka et al.	2403.01306	translate	read	link
2024-03-02	Adversarial Testing for Visual Grounding via Image-Aware Property Reduction	Zhiyuan Chang et al.	2403.01118	translate	read	null
