LLM - 2025-03

Publish Date Title Authors PDF Translate Read Code
2025-03-31 Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Shengqiong Wu et.al. 2503.24379 translate read link
2025-03-31 Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models Rui Wang et.al. 2503.24377 translate read link
2025-03-31 Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 Yi Chen et.al. 2503.24376 translate read link
2025-03-31 Effectively Controlling Reasoning Models through Thinking Intervention Tong Wu et.al. 2503.24370 translate read null
2025-03-31 ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion Rana Muhammad Shahroz Khan et.al. 2503.24354 translate read null
2025-03-31 BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models Alok Abhishek et.al. 2503.24310 translate read null
2025-03-31 A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG Arshia Kermani et.al. 2503.24307 translate read null
2025-03-31 Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning Jiacheng Lin et.al. 2503.24289 translate read link
2025-03-31 Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality Sewoong Lee et.al. 2503.24277 translate read link
2025-03-31 Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation Dun Yuan et.al. 2503.24245 translate read null
2025-03-28 Q-Insight: Understanding Image Quality via Visual Reinforcement Learning Weiqi Li et.al. 2503.22679 translate read link
2025-03-28 QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? Belinda Z. Li et.al. 2503.22674 translate read link
2025-03-28 Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers Francesca Pezzuti et.al. 2503.22672 translate read link
2025-03-28 Unicorn: Text-Only Data Synthesis for Vision Language Model Training Xiaomin Yu et.al. 2503.22655 translate read link
2025-03-28 Sentiment Classification of Thai Central Bank Press Releases Using Supervised Learning Stefano Grassi et.al. 2503.22629 translate read null
2025-03-28 Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users Antonia Karamolegkou et.al. 2503.22610 translate read null
2025-03-28 On the Alignment of Post-Publication Reviews & Bibliometric and Altmetric Impact – A Case Study on Expert Statements from the Science Media Center Germany Dirk Tunger et.al. 2503.22594 translate read null
2025-03-28 LLM-enabled Instance Model Generation Fengjunjie Pan et.al. 2503.22587 translate read null
2025-03-28 Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish Kevin Cohen et.al. 2503.22585 translate read link
2025-03-28 Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation Sarubi Thillainathan et.al. 2503.22582 translate read null
2025-03-27 Video-R1: Reinforcing Video Reasoning in MLLMs Kaituo Feng et.al. 2503.21776 translate read link
2025-03-27 LOCORE: Image Re-ranking with Long-Context Sequence Modeling Zilin Xiao et.al. 2503.21772 translate read link
2025-03-27 MemInsight: Autonomous Memory Augmentation for LLM Agents Rana Salama et.al. 2503.21760 translate read null
2025-03-27 Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck Adrian Bulat et.al. 2503.21757 translate read null
2025-03-27 LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis Shitian Zhao et.al. 2503.21749 translate read link
2025-03-27 CTRL-O: Language-Controllable Object-Centric Visual Representation Learning Aniket Didolkar et.al. 2503.21747 translate read null
2025-03-27 GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics Arsham Gholamzadeh Khoee et.al. 2503.21735 translate read null
2025-03-27 Effective Skill Unlearning through Intervention and Abstention Yongce Li et.al. 2503.21730 translate read link
2025-03-27 Collab: Controlled Decoding using Mixture of Agents for LLM Alignment Souradip Chakraborty et.al. 2503.21720 translate read null
2025-03-27 Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs Boyang Yang et.al. 2503.21710 translate read null
2025-03-26 Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark Sondos Mahmoud Bsharat et.al. 2503.20786 translate read link
2025-03-26 Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields Shijie Zhou et.al. 2503.20776 translate read null
2025-03-26 MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams Yanpeng Sun et.al. 2503.20745 translate read null
2025-03-26 Dynamic Motion Blending for Versatile Motion Editing Nan Jiang et.al. 2503.20724 translate read null
2025-03-26 From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models Nikita Neveditsin et.al. 2503.20715 translate read null
2025-03-27 Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy Yinan Sun et.al. 2503.20673 translate read null
2025-03-26 TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews Huimin Xu et.al. 2503.20666 translate read null
2025-03-26 Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging Han Wu et.al. 2503.20641 translate read link
2025-03-26 Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions Alessandro Maisto et.al. 2503.20623 translate read null
2025-03-26 What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond Wenchao Gu et.al. 2503.20589 translate read null
2025-03-25 CoLLM: A Large Language Model for Composed Image Retrieval Chuong Huynh et.al. 2503.19910 translate read link
2025-03-25 A Multi-Agent Framework Integrating Large Language Models and Generative AI for Accelerated Metamaterial Design Jie Tian et.al. 2503.19889 translate read null
2025-03-25 CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation Nengbo Wang et.al. 2503.19878 translate read null
2025-03-25 SLA-Awareness for AI-assisted coding Kishanthan Thangarajah et.al. 2503.19876 translate read null
2025-03-25 Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking Xiaoyu Tian et.al. 2503.19855 translate read link
2025-03-25 Towards Online Multi-Modal Social Interaction Understanding Xinpeng Li et.al. 2503.19851 translate read null
2025-03-25 FALCONEye: Finding Answers and Localizing Content in ONE-hour-long videos with multi-modal LLMs Carlos Plou et.al. 2503.19850 translate read null
2025-03-25 A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950 Zhao Fang et.al. 2503.19844 translate read null
2025-03-25 SemEval-2025 Task 9: The Food Hazard Detection Challenge Korbinian Randl et.al. 2503.19800 translate read null
2025-03-25 PAVE: Patching and Adapting Video Large Language Models Zhuoming Liu et.al. 2503.19794 translate read link
2025-03-24 SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding Mingze Xu et.al. 2503.18943 translate read null
2025-03-24 Video-T1: Test-Time Scaling for Video Generation Fangfu Liu et.al. 2503.18942 translate read link
2025-03-24 Exploring Training and Inference Scaling Laws in Generative Retrieval Hongru Cai et.al. 2503.18941 translate read null
2025-03-24 Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training Brian R. Bartoldson et.al. 2503.18929 translate read link
2025-03-24 FFN Fusion: Rethinking Sequential Computation in Large Language Models Akhiad Bercovich et.al. 2503.18908 translate read null
2025-03-24 xKV: Cross-Layer SVD for KV-Cache Compression Chi-Chih Chang et.al. 2503.18893 translate read link
2025-03-24 AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration Zhexuan Wang et.al. 2503.18891 translate read null
2025-03-24 Toward building next-generation Geocoding systems: a systematic review Zhengcong Yin et.al. 2503.18888 translate read null
2025-03-24 I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Andrey Galichin et.al. 2503.18878 translate read link
2025-03-24 Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design Rui Xie et.al. 2503.18869 translate read null
2025-03-21 Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique Yansi Li et.al. 2503.17363 translate read null
2025-03-21 OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement Yihe Deng et.al. 2503.17352 translate read link
2025-03-21 Capturing Individual Human Preferences with Reward Features André Barreto et.al. 2503.17338 translate read null
2025-03-21 Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs Reem Gody et.al. 2503.17336 translate read null
2025-03-21 CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities Yuxuan Zhu et.al. 2503.17332 translate read link
2025-03-21 LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language Kun Chu et.al. 2503.17309 translate read null
2025-03-21 Bugdar: AI-Augmented Secure Code Review for GitHub Pull Requests John Naulty et.al. 2503.17302 translate read null
2025-03-21 CASE – Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement Gaifan Zhang et.al. 2503.17279 translate read null
2025-03-21 SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging Aladin Djuhera et.al. 2503.17239 translate read null
2025-03-21 FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs Albert Sawczyn et.al. 2503.17229 translate read null
2025-03-20 Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Yang Sui et.al. 2503.16419 translate read link
2025-03-20 The Emperor’s New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination Yifan Sun et.al. 2503.16402 translate read null
2025-03-20 Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them Guanyu Chen et.al. 2503.16401 translate read null
2025-03-20 Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation Yijia Luo et.al. 2503.16385 translate read link
2025-03-20 LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images Leyang Wang et.al. 2503.16376 translate read null
2025-03-20 CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners Yunzhi Yao et.al. 2503.16356 translate read link
2025-03-20 LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates Ying Shen et.al. 2503.16334 translate read null
2025-03-20 OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence Long Yuan et.al. 2503.16326 translate read null
2025-03-20 Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 Peiran Gu et.al. 2503.16304 translate read null
2025-03-20 Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens Shuqi Lu et.al. 2503.16278 translate read link
2025-03-19 SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks Yifei Zhou et.al. 2503.15478 translate read link
2025-03-19 Cube: A Roblox View of 3D Intelligence Foundation AI Team et.al. 2503.15475 translate read link
2025-03-19 From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment Jia-Nan Li et.al. 2503.15463 translate read null
2025-03-19 Visual Position Prompt for MLLM based Visual Grounding Wei Tang et.al. 2503.15426 translate read link
2025-03-19 Probing the topology of the space of tokens with structured prompts Michael Robinson et.al. 2503.15421 translate read null
2025-03-19 EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models Yinan Liang et.al. 2503.15369 translate read null
2025-03-19 SemEval-2025 Task 1: AdMIRe – Advancing Multimodal Idiomaticity Representation Thomas Pickard et.al. 2503.15358 translate read null
2025-03-19 SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models I-Fan Lin et.al. 2503.15351 translate read null
2025-03-19 TruthLens: A Training-Free Paradigm for DeepFake Detection Ritabrata Chakraborty et.al. 2503.15342 translate read null
2025-03-19 Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs Yuqi Zhu et.al. 2503.15341 translate read null
2025-03-18 Aligning Multimodal LLM with Human Preference: A Survey Tao Yu et.al. 2503.14504 translate read null
2025-03-18 Engineering Scientific Assistants using Interactive Structured Induction of Programs Shraddha Surana et.al. 2503.14488 translate read null
2025-03-18 Gricean Norms as a Basis for Effective Collaboration Fardin Saad et.al. 2503.14484 translate read null
2025-03-18 Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM Xinyu Fang et.al. 2503.14478 translate read link
2025-03-18 EnvBench: A Benchmark for Automated Environment Setup Aleksandra Eliseeva et.al. 2503.14443 translate read link
2025-03-18 LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers Nikhil Abhyankar et.al. 2503.14434 translate read link
2025-03-18 PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play Wei Fang et.al. 2503.14432 translate read null
2025-03-18 Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models Siwei Zhang et.al. 2503.14411 translate read null
2025-03-18 Large Language Models for Virtual Human Gesture Selection Parisa Ghanad Torshizi et.al. 2503.14408 translate read null
2025-03-18 From “Hallucination” to “Suture”: Insights from Language Philosophy to Enhance Large Language Models Qiantong Wang et.al. 2503.14392 translate read null
2025-03-17 MetaScale: Test-Time Scaling with Evolving Meta-Thoughts Qin Liu et.al. 2503.13447 translate read null
2025-03-17 Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance Noah Y. Siegel et.al. 2503.13445 translate read null
2025-03-17 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning Ye Liu et.al. 2503.13444 translate read null
2025-03-17 xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference Maximilian Beck et.al. 2503.13427 translate read null
2025-03-17 A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives Weiqiang Jin et.al. 2503.13415 translate read null
2025-03-17 DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective Dengyun Peng et.al. 2503.13413 translate read null
2025-03-17 Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis Alexander Ku et.al. 2503.13401 translate read null
2025-03-17 MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research James Burgess et.al. 2503.13399 translate read null
2025-03-17 Scale Efficient Training for Large Datasets Qing Zhou et.al. 2503.13385 translate read null
2025-03-17 Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning Mengyao Lyu et.al. 2503.13383 translate read null
2025-03-14 ASMA-Tune: Unlocking LLMs’ Assembly Code Comprehension via Structural-Semantic Instruction Tuning Xinyi Wang et.al. 2503.11617 translate read null
2025-03-14 Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space Zhiliang Chen et.al. 2503.11586 translate read null
2025-03-14 Synthesizing Access Control Policies using Large Language Models Adarsh Vatsa et.al. 2503.11573 translate read null
2025-03-14 Implicit Bias-Like Patterns in Reasoning Models Messi H. J. Lee et.al. 2503.11572 translate read null
2025-03-14 VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity Jing Bi et.al. 2503.11557 translate read null
2025-03-14 Potential of large language model-powered nudges for promoting daily water and energy conservation Zonghan Li et.al. 2503.11531 translate read null
2025-03-14 HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models Ziqin Zhou et.al. 2503.11513 translate read null
2025-03-14 V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning Zixu Cheng et.al. 2503.11495 translate read null
2025-03-14 A Review of DeepSeek Models’ Key Innovative Techniques Chengen Wang et.al. 2503.11486 translate read null
2025-03-14 T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation Seyed Mohammad Hadi Hosseini et.al. 2503.11481 translate read null
2025-03-13 GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Rongyao Fang et.al. 2503.10639 translate read link
2025-03-13 HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model Jiaming Liu et.al. 2503.10631 translate read null
2025-03-13 UniGoal: Towards Universal Zero-shot Goal-oriented Navigation Hang Yin et.al. 2503.10630 translate read null
2025-03-13 DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding Ayesha Ishaq et.al. 2503.10621 translate read link
2025-03-13 From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM Kshitij Ambilduke et.al. 2503.10620 translate read null
2025-03-13 Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search Andy Zhou et.al. 2503.10619 translate read null
2025-03-13 Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models Andy Zhou et.al. 2503.10617 translate read null
2025-03-13 R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization Yi Yang et.al. 2503.10615 translate read link
2025-03-13 CoSTA$\ast$: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Advait Gupta et.al. 2503.10613 translate read link
2025-03-13 TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention Jinhao Duan et.al. 2503.10602 translate read link
2025-03-12 MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System Jihao Zhao et.al. 2503.09600 translate read null
2025-03-12 How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation Ruohao Guo et.al. 2503.09598 translate read null
2025-03-12 SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment Katrin Renz et.al. 2503.09594 translate read null
2025-03-12 BIMBA: Selective-Scan Compression for Long-Range Video Question Answering Md Mohaiminul Islam et.al. 2503.09590 translate read null
2025-03-12 Cost-Optimal Grouped-Query Attention for Long-Context LLMs Yingfa Chen et.al. 2503.09579 translate read link
2025-03-12 Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks Lutfi Eren Erdogan et.al. 2503.09572 translate read null
2025-03-12 Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models Qiguang Chen et.al. 2503.09567 translate read null
2025-03-12 Large Language Models for Multi-Facility Location Mechanism Design Nguyen Thach et.al. 2503.09533 translate read null
2025-03-12 Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Bowen Jin et.al. 2503.09516 translate read null
2025-03-12 ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning Ziyu Wan et.al. 2503.09501 translate read null
2025-03-11 Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs Ariba Khan et.al. 2503.08688 translate read null
2025-03-11 OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models Jialv Zou et.al. 2503.08686 translate read null
2025-03-11 Self-Taught Self-Correction for Small Language Models Viktor Moskvoretskii et.al. 2503.08681 translate read null
2025-03-11 Exploring the Word Sense Disambiguation Capabilities of Large Language Models Pierpaolo Basile et.al. 2503.08662 translate read null
2025-03-11 LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization Xianfeng Wu et.al. 2503.08619 translate read null
2025-03-11 EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments Dongping Li et.al. 2503.08604 translate read null
2025-03-11 NSF-SciFy: Mining the NSF Awards Database for Scientific Claims Delip Rao et.al. 2503.08600 translate read null
2025-03-11 HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding Shehreen Azad et.al. 2503.08585 translate read null
2025-03-11 RAG-Adapter: A Plug-and-Play RAG-enhanced Framework for Long Video Understanding Xichen Tan et.al. 2503.08576 translate read null
2025-03-11 DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process Minjun Zhu et.al. 2503.08569 translate read null
2025-03-10 Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru Dunant Cusipuma et.al. 2503.07587 translate read null
2025-03-10 Talking to GDELT Through Knowledge Graphs Audun Myers et.al. 2503.07584 translate read null
2025-03-10 AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning Yangzhe Kong et.al. 2503.07557 translate read null
2025-03-10 Junior Software Developers’ Perspectives on Adopting LLMs for Software Engineering: a Systematic Literature Review Samuel Ferino et.al. 2503.07556 translate read null
2025-03-10 KSOD: Knowledge Supplement for LLMs On Demand Haoran Li et.al. 2503.07550 translate read null
2025-03-10 Bi-Directional Mental Model Reconciliation for Human-Robot Interaction with Large Language Models Nina Moorman et.al. 2503.07547 translate read null
2025-03-10 Queueing, Predictions, and LLMs: Challenges and Open Problems Michael Mitzenmacher et.al. 2503.07545 translate read null
2025-03-10 XIFBench: Evaluating Large Language Models on Multilingual Instruction Following Zhenyu Li et.al. 2503.07539 translate read null
2025-03-10 TokenButler: Token Importance is Predictable Yash Akhauri et.al. 2503.07518 translate read null
2025-03-10 Language Models Fail to Introspect About Their Knowledge of Language Siyuan Song et.al. 2503.07513 translate read null
2025-03-10 LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition? Bangyan Li et.al. 2503.07487 translate read null
2025-03-10 GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models Ryugo Morita et.al. 2503.07463 translate read null
2025-03-10 MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning Xiangru Tang et.al. 2503.07459 translate read null
2025-03-10 LLMs syntactically adapt their language use to their conversational partner Florian Kandra et.al. 2503.07457 translate read null
2025-03-10 From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development – An Opinion Paper Sargam Yadav et.al. 2503.07450 translate read null
2025-03-10 From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics Jaewook Lee et.al. 2503.07429 translate read null
2025-03-10 RePO: ReLU-based Preference Optimization Junkang Wu et.al. 2503.07426 translate read null
2025-03-10 REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding Yan Tai et.al. 2503.07413 translate read link
2025-03-10 Revisiting Noise in Natural Language Processing for Computational Social Science Nadav Borenstein et.al. 2503.07395 translate read null
2025-03-10 Process-Supervised LLM Recommenders via Flow-guided Tuning Chongming Gao et.al. 2503.07377 translate read null
2025-03-07 Understanding the Limits of Lifelong Knowledge Editing in LLMs Lukas Thede et.al. 2503.05683 translate read null
2025-03-07 A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval Yu Zhang et.al. 2503.05659 translate read null
2025-03-07 Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings Xuanqing Liu et.al. 2503.05620 translate read null
2025-03-07 A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models Dong Shu et.al. 2503.05613 translate read null
2025-03-07 R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning Huatong Song et.al. 2503.05592 translate read null
2025-03-07 Evaluating open-source Large Language Models for automated fact-checking Nicolo’ Fontana et.al. 2503.05565 translate read null
2025-03-07 Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance Bryan Etzine et.al. 2503.05551 translate read null
2025-03-07 Leveraging Approximate Caching for Faster Retrieval-Augmented Generation Shai Bergman et.al. 2503.05530 translate read null
2025-03-07 PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs Roberto Cerina et.al. 2503.05529 translate read null
2025-03-07 Cognitive Bias Detection Using Advanced Prompt Engineering Frederic Lemieux et.al. 2503.05516 translate read null
2025-03-06 L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling Zhuo Chen et.al. 2503.04725 translate read null
2025-03-06 Shifting Long-Context LLMs Research from Input to Output Yuhao Wu et.al. 2503.04723 translate read null
2025-03-06 Enough Coin Flips Can Make LLMs Act Bayesian Ritwik Gupta et.al. 2503.04722 translate read null
2025-03-06 Predictable Scale: Part I – Optimal Hyperparameter Scaling Law in Large Language Model Pretraining Houyi Li et.al. 2503.04715 translate read null
2025-03-06 Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size Alireza Behtash et.al. 2503.04704 translate read null
2025-03-06 UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets Wenyu Wang et.al. 2503.04693 translate read null
2025-03-06 Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases Pengcheng Qiu et.al. 2503.04691 translate read null
2025-03-06 LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue Sangyeop Kim et.al. 2503.04675 translate read null
2025-03-06 RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining Tengfei Zhang et.al. 2503.04653 translate read null
2025-03-06 Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment Wen Yang et.al. 2503.04647 translate read null
2025-03-05 The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems Richard Ren et.al. 2503.03750 translate read null
2025-03-05 Process-based Self-Rewarding Language Models Shimao Zhang et.al. 2503.03746 translate read null
2025-03-05 Towards Understanding Distilled Reasoning Models: A Representational Approach David D. Baek et.al. 2503.03730 translate read null
2025-03-05 Improving LLM Safety Alignment with Dual-Objective Optimization Xuandong Zhao et.al. 2503.03710 translate read null
2025-03-05 Effective LLM Knowledge Learning via Model Generalization Mingkang Zhu et.al. 2503.03705 translate read null
2025-03-05 A Practical Memory Injection Attack against LLM Agents Shen Dong et.al. 2503.03704 translate read null
2025-03-05 Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models Jiyue Jiang et.al. 2503.03702 translate read null
2025-03-05 Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks Zihao Zhao et.al. 2503.03687 translate read null
2025-03-05 Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models Bar Karov et.al. 2503.03669 translate read null
2025-03-05 Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction Gustaw Opiełka et.al. 2503.03666 translate read null
2025-03-04 Wikipedia in the Era of LLMs: Evolution and Risks Siming Huang et.al. 2503.02879 translate read null
2025-03-04 The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models Ke Ji et.al. 2503.02875 translate read null
2025-03-04 Prompting Generative AI with Interaction-Augmented Instructions Leixian Shen et.al. 2503.02874 translate read null
2025-03-04 FairSense-AI: Responsible AI Meets Sustainability Shaina Raza et.al. 2503.02865 translate read null
2025-03-04 Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework Ziang Zhou et.al. 2503.02863 translate read null
2025-03-04 Privacy and Accuracy-Aware AI/ML Model Deduplication Hong Guan et.al. 2503.02862 translate read null
2025-03-04 Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs’ Decoding Layers Zicong He et.al. 2503.02851 translate read null
2025-03-04 Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs Yuzhe Gu et.al. 2503.02846 translate read null
2025-03-04 AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation Songming Zhang et.al. 2503.02832 translate read null
2025-03-04 Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression Nathan Godey et.al. 2503.02812 translate read null
2025-03-03 ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer Omer Goldman et.al. 2502.21228 translate read null
