Multimodal - 2024-05

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-05-31 | Ovis: Structural Embedding Alignment for Multimodal Large Language Model | Shiyin Lu et.al. | 2405.20797 | translate | read | null |
| 2024-05-31 | Visual Attention Analysis in Online Learning | Miriam Navarro et.al. | 2405.20091 | translate | read | null |
| 2024-05-29 | Thermodynamically Informed Multimodal Learning of High-Dimensional Free Energy Models in Molecular Coarse Graining | Blake R. Duschatko et.al. | 2405.19386 | translate | read | null |
| 2024-05-29 | LLMs Meet Multimodal Generation and Editing: A Survey | Yingqing He et.al. | 2405.19334 | translate | read | link |
| 2024-05-29 | Exploring Exotic Decays of the Higgs Boson to Multi-Photons at the LHC via Multimodal Learning Approaches | A. Hammad et.al. | 2405.18834 | translate | read | null |
| 2024-05-28 | RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives | Jaehong Yoon et.al. | 2405.18406 | translate | read | link |
| 2024-05-28 | MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance | Yake Wei et.al. | 2405.17730 | translate | read | link |
| 2024-05-27 | Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning | Zihua Zhao et.al. | 2405.16996 | translate | read | null |
| 2024-05-27 | Multilingual Diversity Improves Vision-Language Representations | Thao Nguyen et.al. | 2405.16915 | translate | read | null |
| 2024-05-27 | Hawk: Learning to Understand Open-World Video Anomalies | Jiaqi Tang et.al. | 2405.16886 | translate | read | link |
| 2024-05-24 | Shopping Queries Image Dataset (SQID): An Image-Enriched ESCI Dataset for Exploring Multimodal Learning in Product Search | Marie Al Ghossein et.al. | 2405.15190 | translate | read | link |
| 2024-05-23 | TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing | Teng Xu et.al. | 2405.14455 | translate | read | null |
| 2024-05-22 | Grounding Toxicity in Real-World Events across Languages | Wondimagegnhue Tsegaye Tufa et.al. | 2405.13754 | translate | read | link |
| 2024-05-21 | A Survey of Robotic Language Grounding: Tradeoffs Between Symbols and Embeddings | Vanya Cohen et.al. | 2405.13245 | translate | read | null |
| 2024-05-21 | Inconsistency-Aware Cross-Attention for Audio-Visual Fusion in Dimensional Emotion Recognition | R Gnana Praveen et.al. | 2405.12853 | translate | read | null |
| 2024-05-21 | Scientific discourse on YouTube: Motivations for citing research in comments | Sören Striewski et.al. | 2405.12798 | translate | read | null |
| 2024-05-21 | Amplifying Academic Research through YouTube: Engagement Metrics as Predictors of Citation Impact | Olga Zagovora et.al. | 2405.12734 | translate | read | null |
| 2024-05-21 | A Multimodal Learning-based Approach for Autonomous Landing of UAV | Francisco Neves et.al. | 2405.12681 | translate | read | null |
| 2024-05-21 | Mutual Information Analysis in Multimodal Learning Systems | Hadi Hadizadeh et.al. | 2405.12456 | translate | read | null |
| 2024-05-16 | Grounded 3D-LLM with Referent Tokens | Yilun Chen et.al. | 2405.10370 | translate | read | link |
| 2024-05-13 | Improving Multimodal Learning with Multi-Loss Gradient Modulation | Konstantinos Kontras et.al. | 2405.07930 | translate | read | link |
| 2024-05-13 | Generating Human Motion in 3D Scenes from Text Descriptions | Zhi Cen et.al. | 2405.07784 | translate | read | null |
| 2024-05-13 | An Efficient Multimodal Learning Framework to Comprehend Consumer Preferences Using BERT and Cross-Attention | Junichiro Niimi et.al. | 2405.07435 | translate | read | null |
| 2024-05-10 | A First Step in Using Machine Learning Methods to Enhance Interaction Analysis for Embodied Learning Environments | Joyce Fonteles et.al. | 2405.06203 | translate | read | null |
| 2024-05-09 | Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training | Sheng Yan et.al. | 2405.05523 | translate | read | null |
| 2024-05-08 | Empathy Through Multimodality in Conversational Interfaces | Mahyar Abbasian et.al. | 2405.04777 | translate | read | null |
| 2024-05-08 | All in One Framework for Multimodal Re-identification in the Wild | He Li et.al. | 2405.04741 | translate | read | null |
| 2024-05-07 | Interpretable Tensor Fusion | Saurabh Varshneya et.al. | 2405.04671 | translate | read | null |
| 2024-05-03 | Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum | Tao Meng et.al. | 2404.17862 | translate | read | null |

(<a href=../Multimodal.md>back to Multimodal</a>)