LLM - 2025-05

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-05-30 | MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning | Yiqing Liang et al. | 2505.24871 | link |
| 2025-05-30 | SiLVR: A Simple Language-based Video Reasoning Framework | Ce Zhang et al. | 2505.24869 | link |
| 2025-05-30 | ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | Mingjie Liu et al. | 2505.24864 | null |
| 2025-05-30 | MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning | Jingyan Shen et al. | 2505.24846 | null |
| 2025-05-30 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Wanyun Xie et al. | 2505.24844 | null |
| 2025-05-30 | Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck | Yuwen Tan et al. | 2505.24840 | null |
| 2025-05-30 | VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | Brandon Man et al. | 2505.24838 | link |
| 2025-05-30 | Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | Juraj Vladika et al. | 2505.24830 | null |
| 2025-05-30 | LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text | Li yunhan et al. | 2505.24826 | null |
| 2025-05-30 | PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | Yinggan Xu et al. | 2505.24823 | null |
| 2025-05-29 | Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought | Yunze Man et al. | 2505.23766 | null |
| 2025-05-29 | From Chat Logs to Collective Insights: Aggregative Question Answering | Wentao Zhang et al. | 2505.23765 | null |
| 2025-05-29 | MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence | Sihan Yang et al. | 2505.23764 | null |
| 2025-05-29 | Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch | Aneeshan Sain et al. | 2505.23763 | null |
| 2025-05-29 | Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint | Heekyung Lee et al. | 2505.23759 | link |
| 2025-05-29 | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | Ziyin Zhang et al. | 2505.23754 | link |
| 2025-05-29 | ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks | Akashah Shabbir et al. | 2505.23752 | link |
| 2025-05-29 | Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences? | Paul Gölz et al. | 2505.23749 | null |
| 2025-05-29 | Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence | Diankun Wu et al. | 2505.23747 | link |
| 2025-05-29 | Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time | Mohamad Chehade et al. | 2505.23729 | null |
| 2025-05-28 | Zero-Shot Vision Encoder Grafting via LLM Surrogates | Kaiyu Yue et al. | 2505.22664 | link |
| 2025-05-28 | AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models | Feng Luo et al. | 2505.22662 | null |
| 2025-05-28 | GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning | Qingchen Yu et al. | 2505.22661 | link |
| 2025-05-28 | 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model | Wenbo Hu et al. | 2505.22657 | null |
| 2025-05-28 | Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents | Michael Kirchhof et al. | 2505.22655 | null |
| 2025-05-28 | The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason | Ang Lv et al. | 2505.22653 | link |
| 2025-05-28 | Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese | Hanjia Lyu et al. | 2505.22645 | link |
| 2025-05-28 | Learning Composable Chains-of-Thought | Fangcong Yin et al. | 2505.22635 | null |
| 2025-05-28 | Spatial Knowledge Graph-Guided Multimodal Synthesis | Yida Xue et al. | 2505.22633 | null |
| 2025-05-28 | Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs | Ziling Cheng et al. | 2505.22630 | null |
| 2025-05-27 | Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making | Yihan Wang et al. | 2505.21503 | null |
| 2025-05-27 | Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment | Xiaojun Jia et al. | 2505.21494 | null |
| 2025-05-27 | Reinforcing General Reasoning without Verifiers | Xiangxin Zhou et al. | 2505.21493 | null |
| 2025-05-27 | Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming | Yang Yang et al. | 2505.21486 | null |
| 2025-05-27 | Are Language Models Consequentialist or Deontological Moral Reasoners? | Keenan Samway et al. | 2505.21479 | null |
| 2025-05-27 | Policy Optimized Text-to-Image Pipeline Design | Uri Gadot et al. | 2505.21478 | null |
| 2025-05-27 | Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration | Zijun Liu et al. | 2505.21471 | link |
| 2025-05-27 | Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance | Shintaro Ozaki et al. | 2505.21458 | null |
| 2025-05-27 | Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO | Muzhi Zhu et al. | 2505.21457 | null |
| 2025-05-27 | Can Large Reasoning Models Self-Train? | Sheikh Shafayat et al. | 2505.21444 | null |
| 2025-05-26 | Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs | Hanting Chen et al. | 2505.20155 | null |
| 2025-05-26 | UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models | Xueyan Zhang et al. | 2505.20154 | null |
| 2025-05-26 | MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | Ziming Wei et al. | 2505.20148 | null |
| 2025-05-26 | FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities | Jin Wang et al. | 2505.20147 | null |
| 2025-05-26 | StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs | Jialin Yang et al. | 2505.20139 | null |
| 2025-05-26 | Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers | Zhengliang Shi et al. | 2505.20128 | null |
| 2025-05-26 | Agentic AI Process Observability: Discovering Behavioral Variability | Fabiana Fournier et al. | 2505.20127 | null |
| 2025-05-26 | TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent | Dominik Meier et al. | 2505.20118 | null |
| 2025-05-26 | Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi’s Zibaldone | Cristian Santini et al. | 2505.20113 | null |
| 2025-05-26 | ResSVD: Residual Compensated SVD for Large Language Model Compression | Haolei Bai et al. | 2505.20112 | null |
| 2025-05-26 | Language-Agnostic Suicidal Risk Detection Using Large Language Models | June-Woo Kim et al. | 2505.20109 | null |
| 2025-05-26 | Adaptive Deep Reasoning: Triggering Deep Thinking When Needed | Yunhao Wang et al. | 2505.20101 | null |
| 2025-05-23 | Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs | Wafa Alghallabi et al. | 2505.18152 | null |
| 2025-05-23 | First Finish Search: Efficient Test-Time Scaling in Large Language Models | Aradhye Agarwal et al. | 2505.18149 | null |
| 2025-05-23 | Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find | Owen Bianchi et al. | 2505.18148 | null |
| 2025-05-23 | Gaming Tool Preferences in Agentic LLMs | Kazem Faghih et al. | 2505.18135 | link |
| 2025-05-23 | Reward Model Overoptimisation in Iterated RLHF | Lorenz Wolf et al. | 2505.18126 | null |
| 2025-05-23 | UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification | Poojah Ganesan et al. | 2505.18122 | null |
| 2025-05-23 | ProgRM: Build Better GUI Agents with Progress Rewards | Danyang Zhang et al. | 2505.18121 | null |
| 2025-05-23 | Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models | Jiongran Wu et al. | 2505.18120 | null |
| 2025-05-23 | Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM | Zinuo Li et al. | 2505.18110 | null |
| 2025-05-23 | ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework | Lisheng Huang et al. | 2505.18105 | null |
| 2025-05-22 | CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms | Shilin Yan et al. | 2505.17020 | link |
| 2025-05-22 | Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework | Chenhao Zhang et al. | 2505.17019 | link |
| 2025-05-22 | SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward | Kaixuan Fan et al. | 2505.17018 | link |
| 2025-05-22 | Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | Chengzhuo Tong et al. | 2505.17017 | link |
| 2025-05-22 | Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models | Runsen Xu et al. | 2505.17015 | link |
| 2025-05-22 | SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | Haoning Wu et al. | 2505.17012 | link |
| 2025-05-22 | R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | Huatong Song et al. | 2505.17005 | link |
| 2025-05-22 | Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? | Jin Jiang et al. | 2505.16998 | link |
| 2025-05-22 | DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization | Chao Zhang et al. | 2505.16995 | null |
| 2025-05-22 | Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | Runpeng Yu et al. | 2505.16990 | link |
| 2025-05-21 | The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation | Patrick Kahardipraja et al. | 2505.15807 | null |
| 2025-05-21 | Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Hwan Chang et al. | 2505.15805 | null |
| 2025-05-21 | STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs | Zongzhao Li et al. | 2505.15804 | null |
| 2025-05-21 | VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models | Yuchen Yan et al. | 2505.15801 | null |
| 2025-05-21 | Reverse Engineering Human Preferences with Reinforcement Learning | Lisa Alazraki et al. | 2505.15795 | null |
| 2025-05-21 | HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving | Zhiwen Chen et al. | 2505.15793 | null |
| 2025-05-21 | Large Language Models as Computable Approximations to Solomonoff Induction | Jun Wan et al. | 2505.15784 | null |
| 2025-05-21 | ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning | Changtai Zhu et al. | 2505.15776 | null |
| 2025-05-21 | Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention | Huanxuan Liao et al. | 2505.15774 | null |
| 2025-05-21 | MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | Cheng Yifan et al. | 2505.15772 | null |
| 2025-05-20 | Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning | Haolei Xu et al. | 2505.14684 | null |
| 2025-05-20 | UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation | Rui Tian et al. | 2505.14682 | null |
| 2025-05-20 | UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | Xiaojie Gu et al. | 2505.14679 | null |
| 2025-05-20 | Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning | Jiaer Xia et al. | 2505.14677 | null |
| 2025-05-20 | Reward Reasoning Model | Jiaxin Guo et al. | 2505.14674 | null |
| 2025-05-20 | Quartet: Native FP4 Training Can Be Optimal for Large Language Models | Roberto L. Castro et al. | 2505.14669 | null |
| 2025-05-20 | ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions | Bufang Yang et al. | 2505.14668 | null |
| 2025-05-20 | Beyond Words: Multimodal LLM Knows When to Speak | Zikai Liao et al. | 2505.14654 | null |
| 2025-05-20 | General-Reasoner: Advancing LLM Reasoning Across All Domains | Xueguang Ma et al. | 2505.14652 | null |
| 2025-05-20 | Think Only When You Need with Large Hybrid-Reasoning Models | Lingjie Jiang et al. | 2505.14631 | null |
| 2025-05-19 | CIE: Controlling Language Model Text Generations Using Continuous Signals | Vinay Samuel et al. | 2505.13448 | link |
| 2025-05-19 | Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards | Xiaoyuan Liu et al. | 2505.13445 | null |
| 2025-05-19 | Optimizing Anytime Reasoning via Budget Relative Policy Optimization | Penghui Qi et al. | 2505.13438 | link |
| 2025-05-19 | SMOTExT: SMOTE meets Large Language Models | Mateusz Bystroński et al. | 2505.13434 | null |
| 2025-05-19 | Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | Sifeng Shang et al. | 2505.13430 | null |
| 2025-05-19 | Understanding Complexity in VideoQA via Visual Program Generation | Cristobal Eyzaguirre et al. | 2505.13429 | null |
| 2025-05-19 | MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | Lingxiao Du et al. | 2505.13427 | link |
| 2025-05-19 | Learnware of Language Models: Specialized Small Language Models Can Do Big | Zhi-Hao Tan et al. | 2505.13425 | null |
| 2025-05-19 | Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard | Si-Yang Liu et al. | 2505.13421 | null |
| 2025-05-19 | FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning | Zhuozhao Hu et al. | 2505.13419 | link |
| 2025-05-16 | Modeling cognitive processes of natural reading with transformer-based Language Models | Bruno Bianchi et al. | 2505.11485 | null |
| 2025-05-16 | msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML | Zhaolan Huang et al. | 2505.11483 | null |
| 2025-05-16 | Improving Assembly Code Performance with Large Language Models via Reinforcement Learning | Anjiang Wei et al. | 2505.11480 | null |
| 2025-05-16 | HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages | Zhilin Wang et al. | 2505.11475 | null |
| 2025-05-16 | Disentangling Reasoning and Knowledge in Medical Large Language Models | Rahul Thapa et al. | 2505.11462 | null |
| 2025-05-16 | ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks | Zhixiong Zhuang et al. | 2505.11459 | null |
| 2025-05-16 | HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | Shaina Raza et al. | 2505.11454 | null |
| 2025-05-16 | LLMs unlock new paths to monetizing exploits | Nicholas Carlini et al. | 2505.11449 | null |
| 2025-05-16 | Is Compression Really Linear with Code Intelligence? | Xianzhen Luo et al. | 2505.11441 | null |
| 2025-05-16 | GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art | Chenkai Zhang et al. | 2505.11436 | null |
| 2025-05-15 | End-to-End Vision Tokenizer Tuning | Wenxuan Wang et al. | 2505.10562 | null |
| 2025-05-15 | Neural Thermodynamic Laws for Large Language Model Training | Ziming Liu et al. | 2505.10559 | null |
| 2025-05-15 | MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning | Ke Wang et al. | 2505.10557 | link |
| 2025-05-15 | Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data | Yiwen Liu et al. | 2505.10551 | link |
| 2025-05-15 | Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models | Annie Wong et al. | 2505.10543 | link |
| 2025-05-15 | Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis | Pengfei Wang et al. | 2505.10541 | link |
| 2025-05-15 | S3C2 Summit 2024-09: Industry Secure Software Supply Chain Summit | Imranur Rahman et al. | 2505.10538 | null |
| 2025-05-15 | RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs | Vibha Belavadi et al. | 2505.10495 | null |
| 2025-05-15 | Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective | Yutao Mou et al. | 2505.10494 | link |
| 2025-05-15 | CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning | Shaohan Wang et al. | 2505.10493 | null |
| 2025-05-14 | Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors | Nicolas Dupuis et al. | 2505.09610 | null |
| 2025-05-14 | Adversarial Suffix Filtering: a Defense Pipeline for LLMs | David Khachaturov et al. | 2505.09602 | null |
| 2025-05-14 | How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference | Nidhal Jegham et al. | 2505.09598 | null |
| 2025-05-14 | WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models | Abdullah Mushtaq et al. | 2505.09595 | null |
| 2025-05-14 | Variational Visual Question Answering | Tobias Jan Wieczorek et al. | 2505.09591 | null |
| 2025-05-14 | Beyond Likes: How Normative Feedback Complements Engagement Signals on Social Media | Yuchen Wu et al. | 2505.09583 | null |
| 2025-05-14 | Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach | Shannon Lodoen et al. | 2505.09576 | null |
| 2025-05-14 | MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8 | Linbo Liu et al. | 2505.09569 | null |
| 2025-05-14 | PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Zongqian Li et al. | 2505.09519 | null |
| 2025-05-14 | Layered Unlearning for Adversarial Relearning | Timothy Qian et al. | 2505.09500 | link |
| 2025-05-13 | CodePDE: An Inference Framework for LLM-driven PDE Solver Generation | Shanda Li et al. | 2505.08783 | null |
| 2025-05-13 | HealthBench: Evaluating Large Language Models Towards Improved Human Health | Rahul K. Arora et al. | 2505.08775 | link |
| 2025-05-14 | Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology | Yatai Ji et al. | 2505.08765 | null |
| 2025-05-13 | AC-Reason: Towards Theory-Guided Actual Causality Reasoning with Large Language Models | Yanxi Zhang et al. | 2505.08750 | null |
| 2025-05-13 | DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models | Xiaoyang Chen et al. | 2505.08744 | link |
| 2025-05-13 | Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies | Xiaoliang Luo et al. | 2505.08739 | null |
| 2025-05-13 | NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context | Ben Yao et al. | 2505.08734 | null |
| 2025-05-13 | Securing RAG: A Risk Assessment and Mitigation Framework | Lukas Ammann et al. | 2505.08728 | null |
| 2025-05-13 | PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | Yang Su et al. | 2505.08719 | null |
| 2025-05-13 | LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs | K M Sajjadul Islam et al. | 2505.08704 | null |
| 2025-05-12 | A Comparative Analysis of Static Word Embeddings for Hungarian | Máté Gedeon et al. | 2505.07809 | null |
| 2025-05-12 | Learning Dynamics in Continual Pre-Training for Large Language Models | Xingjin Wang et al. | 2505.07796 | null |
| 2025-05-12 | Domain Regeneration: How well do LLMs match syntactic properties of text domains? | Da Ju et al. | 2505.07784 | null |
| 2025-05-12 | Relative Overfitting and Accept-Reject Framework | Yanxin Liu et al. | 2505.07783 | null |
| 2025-05-12 | MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | Rushi Qiang et al. | 2505.07782 | null |
| 2025-05-12 | Must Read: A Systematic Survey of Computational Persuasion | Nimet Beyza Bozdag et al. | 2505.07775 | null |
| 2025-05-12 | Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | Xinji Mai et al. | 2505.07773 | link |
| 2025-05-12 | Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Yifeng Di et al. | 2505.07768 | null |
| 2025-05-12 | Assessing the Chemical Intelligence of Large Language Models | Nicholas T. Runcie et al. | 2505.07735 | null |
| 2025-05-12 | Spoken Language Understanding on Unseen Tasks With In-Context Learning | Neeraj Agrawal et al. | 2505.07731 | null |
| 2025-05-09 | From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling | Vahid Rahimzadeh et al. | 2505.06184 | null |
| 2025-05-09 | A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows | Linjiang Cao et al. | 2505.06178 | null |
| 2025-05-09 | MonetGPT: Solving Puzzles Enhances MLLMs’ Image Retouching Skills | Niladri Shekhar Dutt et al. | 2505.06176 | null |
| 2025-05-09 | Turbo-ICL: In-Context Learning-Based Turbo Equalization | Zihang Song et al. | 2505.06175 | null |
| 2025-05-09 | A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets | Ryan Lagasse et al. | 2505.06150 | null |
| 2025-05-09 | Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study | Faeze Ghorbanpour et al. | 2505.06149 | null |
| 2025-05-09 | LLMs Get Lost In Multi-Turn Conversation | Philippe Laban et al. | 2505.06120 | link |
| 2025-05-09 | Multimodal Sentiment Analysis on CMU-MOSEI Dataset using Transformer-based Models | Jugal Gajjar et al. | 2505.06110 | null |
| 2025-05-09 | LLMs Outperform Experts on Challenging Biology Benchmarks | Lennart Justen et al. | 2505.06108 | null |
| 2025-05-09 | Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs | Sam Bush et al. | 2505.06096 | null |
| 2025-05-08 | Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation | Chao Liao et al. | 2505.05472 | null |
| 2025-05-08 | Flow-GRPO: Training Flow Matching Models via Online RL | Jie Liu et al. | 2505.05470 | link |
| 2025-05-08 | Generating Physically Stable and Buildable LEGO Designs from Text | Ava Pun et al. | 2505.05469 | link |
| 2025-05-08 | StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant | Haibo Wang et al. | 2505.05467 | null |
| 2025-05-08 | ComPO: Preference Alignment via Comparison Oracles | Peter Chen et al. | 2505.05465 | null |
| 2025-05-08 | Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging | Shiqi Chen et al. | 2505.05464 | link |
| 2025-05-08 | UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections | Fatima Haouari et al. | 2505.05459 | null |
| 2025-05-08 | SITE: towards Spatial Intelligence Thorough Evaluation | Wenqi Wang et al. | 2505.05456 | null |
| 2025-05-08 | Conversational Process Model Redesign | Nataliia Klievtsova et al. | 2505.05453 | null |
| 2025-05-08 | clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations | Chalamalasetti Kranti et al. | 2505.05445 | null |
| 2025-05-07 | EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning | Zhenghao Xing et al. | 2505.04623 | null |
| 2025-05-07 | On Path to Multimodal Generalist: General-Level and General-Bench | Hao Fei et al. | 2505.04620 | link |
| 2025-05-07 | OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution | Lianghong Guo et al. | 2505.04606 | null |
| 2025-05-08 | MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection | Zhihao Zhang et al. | 2505.04594 | null |
| 2025-05-07 | ZeroSearch: Incentivize the Search Capability of LLMs without Searching | Hao Sun et al. | 2505.04588 | link |
| 2025-05-07 | SlideItRight: Using AI to Find Relevant Slides and Provide Feedback for Open-Ended Questions | Chloe Qianhui Zhao et al. | 2505.04584 | null |
| 2025-05-07 | Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization | Wenjun Cao et al. | 2505.04578 | null |
| 2025-05-07 | Comparative Analysis of Carbon Footprint in Manual vs. LLM-Assisted Code Development | Kuen Sum Cheung et al. | 2505.04521 | null |
| 2025-05-07 | Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs | Yehui Tang et al. | 2505.04519 | null |
| 2025-05-07 | CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation | Jiahao Li et al. | 2505.04481 | null |
| 2025-05-06 | VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model | Zuwei Long et al. | 2505.03739 | link |
| 2025-05-06 | Graph Drawing for LLMs: An Empirical Evaluation | Walter Didimo et al. | 2505.03678 | null |
| 2025-05-06 | Binding threshold units with artificial oscillatory neurons | Vladimir Fanaskov et al. | 2505.03648 | null |
| 2025-05-06 | PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing | Yiping Xie et al. | 2505.03621 | null |
| 2025-05-06 | A Unifying Bias-aware Multidisciplinary Framework for Investigating Socio-Technical Issues | Sacha Hasan et al. | 2505.03593 | null |
| 2025-05-06 | BCause: Human-AI collaboration to improve hybrid mapping and ideation in argumentation-grounded deliberation | Lucas Anastasiou et al. | 2505.03584 | null |
| 2025-05-06 | DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes | Sergey Linok et al. | 2505.03581 | link |
| 2025-05-06 | LlamaFirewall: An open source guardrail system for building secure AI agents | Sahana Chennabasappa et al. | 2505.03574 | null |
| 2025-05-06 | Say It Another Way: A Framework for User-Grounded Paraphrasing | Cléa Chataigner et al. | 2505.03563 | null |
| 2025-05-06 | A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges | Feibo Jiang et al. | 2505.03556 | null |
| 2025-05-05 | Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation | Lu Ling et al. | 2505.02836 | null |
| 2025-05-05 | R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning | Yi-Fan Zhang et al. | 2505.02835 | link |
| 2025-05-05 | ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations | Dmitriy Shopkhoev et al. | 2505.02819 | link |
| 2025-05-05 | Towards Quantifying the Hessian Structure of Neural Networks | Zhaorui Dong et al. | 2505.02809 | null |
| 2025-05-05 | Generating HomeAssistant Automations Using an LLM-based Chatbot | Mathyas Giudici et al. | 2505.02802 | null |
| 2025-05-05 | HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models | Zheng Lin et al. | 2505.02795 | null |
| 2025-05-05 | Beyond the Monitor: Mixed Reality Visualization and AI for Enhanced Digital Pathology Workflow | Jai Prakash Veerla et al. | 2505.02780 | null |
| 2025-05-05 | Giving Simulated Cells a Voice: Evolving Prompt-to-Intervention Models for Cellular Control | Nam H. Le et al. | 2505.02766 | null |
| 2025-05-05 | Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models | Matthew Dahl et al. | 2505.02763 | null |
| 2025-05-05 | Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation | Pons Gerard et al. | 2505.02737 | null |
| 2025-05-02 | Helping Big Language Models Protect Themselves: An Enhanced Filtering and Summarization System | Sheikh Samit Muhaimin et al. | 2505.01315 | null |
| 2025-05-02 | Enhancing SPARQL Query Rewriting for Complex Ontology Alignments | Anicet Lepetit Ondo et al. | 2505.01309 | null |
| 2025-05-02 | Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments | Regan Bolton et al. | 2505.01307 | null |
| 2025-05-02 | FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing | Gaoxiang Cong et al. | 2505.01263 | null |
| 2025-05-02 | Digital Pathway Curation (DPC): a comparative pipeline to assess the reproducibility, consensus and accuracy across Gemini, PubMed, and scientific reviewers in biomedical research | Flavio Lichtenstein et al. | 2505.01259 | null |
| 2025-05-02 | CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning | Tsai-Ning Wang et al. | 2505.01199 | null |
| 2025-05-02 | LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures | Francisco Aguilera-Martínez et al. | 2505.01177 | null |
| 2025-05-02 | Methodological Foundations for AI-Driven Survey Question Generation | Ted K. Mburu et al. | 2505.01150 | null |
| 2025-05-02 | Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications | Jiawei He et al. | 2505.01146 | null |
| 2025-05-02 | MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning | Murtadha Ahmed et al. | 2505.01110 | null |
| 2025-05-01 | T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | Dongzhi Jiang et al. | 2505.00703 | link |
| 2025-05-01 | Steering Large Language Models with Register Analysis for Arbitrary Style Transfer | Xinchen Yang et al. | 2505.00679 | null |
| 2025-05-01 | Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions | Yiming Du et al. | 2505.00675 | link |
| 2025-05-01 | DeepCritic: Deliberate Critique with Large Language Models | Wenkai Yang et al. | 2505.00662 | link |
| 2025-05-01 | On the generalization of language models from in-context learning and finetuning: a controlled study | Andrew K. Lampinen et al. | 2505.00661 | null |
| 2025-05-01 | Large Language Models Understanding: an Inherent Ambiguity Barrier | Daniel N. Nissani et al. | 2505.00654 | null |
| 2025-05-01 | Open-Source LLM-Driven Federated Transformer for Predictive IoV Management | Yazan Otoum et al. | 2505.00651 | null |
| 2025-05-01 | Investigating Task Arithmetic for Zero-Shot Information Retrieval | Marco Braga et al. | 2505.00649 | null |
| 2025-05-01 | The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them) | Zihao Wang et al. | 2505.00626 | null |
| 2025-05-01 | FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation | Chaitali Bhattacharyya et al. | 2505.00624 | null |
