LLM - 2026-02

Publish Date Title Authors PDF Translate Read Code
2026-02-28 Learning Nested Named Entity Recognition from Flat Annotations Igor Rozhkov et.al. 2603.00840 translate read null
2026-02-28 Constitutional Black-Box Monitoring for Scheming in LLM Agents Simon Storf et.al. 2603.00829 translate read null
2026-02-28 A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations Hossein Javidnia et.al. 2603.00824 translate read null
2026-02-28 A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction Ruihao Pan et.al. 2603.00823 translate read null
2026-02-28 ContextCov: Deriving and Enforcing Executable Constraints from Agent Instruction Files Reshabh K Sharma et.al. 2603.00822 translate read null
2026-02-28 From Dyads to Groups: Rethinking Emotional Support with Conversational AI Yuqing Hu et.al. 2603.00797 translate read null
2026-02-28 Identifying the Geographic Foci of US Local News Gangani Ariyarathne et.al. 2603.00787 translate read null
2026-02-28 Structure Matters: Evaluating Multi-Agents Orchestration in Generative Therapeutic Chatbots Sina Elahimanesh et.al. 2603.00774 translate read null
2026-02-28 LLM-Powered Automatic Theorem Proving and Synthesis for Hybrid Systems and Game Aditi Kabra et.al. 2603.00737 translate read null
2026-02-28 RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models Andrew Zhuoer Feng et.al. 2603.00724 translate read null
2026-02-28 MARS: Harmonizing Multimodal Convergence via Adaptive Rank Search Minkyoung Cho et.al. 2603.00720 translate read null
2026-02-28 DRIV-EX: Counterfactual Explanations for Driving LLMs Amaia Cardiel et.al. 2603.00696 translate read null
2026-02-28 Wild-Drive: Off-Road Scene Captioning and Path Planning via Robust Multi-modal Routing and Efficient Large Language Model Zihang Wang et.al. 2603.00694 translate read null
2026-02-28 RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis Andrew Zhuoer Feng et.al. 2603.00686 translate read null
2026-02-28 Stateful Cross-layer Vision Modulation Ying Liu et.al. 2603.00655 translate read null
2026-02-28 Historian: Reducing Manual Validation in APR Benchmarking via Evidence-Based Assessment Sahand Moslemi et.al. 2603.00649 translate read null
2026-02-28 RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation Jin Zeng et.al. 2603.00638 translate read null
2026-02-28 TraceSIR: A Multi-Agent Framework for Structured Analysis and Reporting of Agentic Execution Traces Shu-Xun Yang et.al. 2603.00623 translate read null
2026-02-28 PlantWhisperer: Designing Conversational AI to Support Plant Care Daniel Mejer Christensen et.al. 2603.00598 translate read null
2026-02-28 UNICBench: UNIfied Counting Benchmark for MLLM Chenggang Rong et.al. 2603.00595 translate read null
2026-02-28 Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs Yiran Zhao et.al. 2603.00590 translate read null
2026-02-28 Energy-Efficient Information Representation in MNIST Classification Using Biologically Inspired Learning Patrick Stricker et.al. 2603.00588 translate read null
2026-02-28 Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research Yubo Dong et.al. 2603.00582 translate read null
2026-02-28 CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging Jie Cao et.al. 2603.00573 translate read null
2026-02-28 MIDAS: Multi-Image Dispersion and Semantic Reconstruction for Jailbreaking MLLMs Yilian Liu et.al. 2603.00565 translate read null
2026-02-28 Advancing Multimodal Judge Models through a Capability-Oriented Benchmark and MCTS-Driven Data Generation Zeyu Chen et.al. 2603.00546 translate read null
2026-02-28 LOGIGEN: Logic-Driven Generation of Verifiable Agentic Tasks Yucheng Zeng et.al. 2603.00540 translate read null
2026-02-28 Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement Haolin Jin et.al. 2603.00539 translate read null
2026-02-28 CaptionFool: Universal Image Captioning Model Attacks Swapnil Parekh et.al. 2603.00529 translate read null
2026-02-28 ProtegoFed: Backdoor-Free Federated Instruction Tuning with Interspersed Poisoned Data Haodong Zhao et.al. 2603.00516 translate read null
2026-02-28 MLLM-4D: Towards Visual-based Spatial-Temporal Intelligence Xingyilang Yin et.al. 2603.00515 translate read null
2026-02-28 Multimodal Adaptive Retrieval Augmented Generation through Internal Representation Learning Ruoshuang Du et.al. 2603.00511 translate read null
2026-02-28 What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models Yingqi Fan et.al. 2603.00510 translate read null
2026-02-28 M$^2$: Dual-Memory Augmentation for Long-Horizon Web Agents via Trajectory Summarization and Insight Retrieval Dawei Yan et.al. 2603.00503 translate read null
2026-02-28 WirelessAgent++: Automated Agentic Workflow Design and Benchmarking for Wireless Networks Jingwen Tong et.al. 2603.00501 translate read null
2026-02-28 Zero-Shot Robotic Manipulation via 3D Gaussian Splatting-Enhanced Multimodal Retrieval-Augmented Generation Zilong Xie et.al. 2603.00500 translate read null
2026-02-28 Antibody: Strengthening Defense Against Harmful Fine-Tuning for Large Language Models via Attenuating Harmful Gradient Influence Quoc Minh Nguyen et.al. 2603.00498 translate read null
2026-02-28 LifeEval: A Multimodal Benchmark for Assistive AI in Egocentric Daily Life Tasks Hengjian Gao et.al. 2603.00490 translate read null
2026-02-28 Does My README File Need To Be Updated? Exploring LLM-Based README Maintenance Haoyu Gao et.al. 2603.00489 translate read null
2026-02-28 Wireless Power Control Based on Large Language Models Jiacheng Wang et.al. 2603.00474 translate read null
2026-02-28 Optimizing In-Context Demonstrations for LLM-based Automated Grading Yucheng Chu et.al. 2603.00465 translate read null
2026-02-28 MED-COPILOT: A Medical Assistant Powered by GraphRAG and Similar Patient Case Retrieval Shuheng Chen et.al. 2603.00460 translate read null
2026-02-28 Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training Xi Wang et.al. 2603.00454 translate read null
2026-02-28 Confusion-Aware Rubric Optimization for LLM-based Automated Grading Yucheng Chu et.al. 2603.00451 translate read null
2026-02-28 SesaHand: Enhancing 3D Hand Reconstruction via Controllable Generation with Semantic and Structural Alignment Zhuoran Zhao et.al. 2603.00443 translate read null
2026-02-28 ROKA: Robust Knowledge Unlearning against Adversaries Jinmyeong Shin et.al. 2603.00436 translate read null
2026-02-28 Personalities at Play: Probing Alignment in AI Teammates Mohammad Amin Samadi et.al. 2603.00429 translate read null
2026-02-28 LLM-Bootstrapped Targeted Finding Guidance for Factual MLLM-based Medical Report Generation Cunyuan Yang et.al. 2603.00426 translate read null
2026-02-28 SSR: Pushing the Limit of Spatial Intelligence with Structured Scene Reasoning Yi Zhang et.al. 2603.00409 translate read null
2026-02-28 A Data-Driven Analysis for Engineering Conferences: The Institute of Industrial and Systems Engineering (IISE) Annual Conference Proceedings (2002-2005) H. Sinan Bank et.al. 2603.00399 translate read null
2026-02-26 MediX-R1: Open Ended Medical Reinforcement Learning Sahal Shaji Mullappilly et.al. 2602.23363 translate read null
2026-02-26 Utilizing LLMs for Industrial Process Automation Salim Fares et.al. 2602.23331 translate read null
2026-02-26 Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks Kunihiro Miyazaki et.al. 2602.23330 translate read null
2026-02-26 LLM Novice Uplift on Dual-Use, In Silico Biology Tasks Chen Bo Calvin Zhang et.al. 2602.23329 translate read null
2026-02-26 Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction Rafael R. Baptista et.al. 2602.23312 translate read null
2026-02-26 ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding Yiran Guan et.al. 2602.23306 translate read null
2026-02-26 A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations Soumya Dutta et.al. 2602.23300 translate read null
2026-02-26 CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays Hyungyung Lee et.al. 2602.23276 translate read null
2026-02-26 Mitigating Legibility Tax with Decoupled Prover-Verifier Games Yegon Kim et.al. 2602.23248 translate read null
2026-02-26 Agency and Architectural Limits: Why Optimization-Based Systems Cannot Be Norm-Responsive Radha Sarma et.al. 2602.23239 translate read null
2026-02-26 MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction Yizhi Li et.al. 2602.23228 translate read null
2026-02-26 STELLAR: Storage Tuning Engine Leveraging LLM Autonomous Reasoning for High Performance Parallel File Systems Chris Egersdoerfer et.al. 2602.23220 translate read null
2026-02-26 InnerQ: Hardware-aware Tuning-free Quantization of KV Cache for Large Language Models Sayed Mohammadreza Tayaranian Hosseini et.al. 2602.23200 translate read null
2026-02-26 SC-Arena: A Natural Language Benchmark for Single-Cell Reasoning with Knowledge-Augmented Evaluation Jiahao Zhao et.al. 2602.23199 translate read null
2026-02-26 Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models Chungpa Lee et.al. 2602.23197 translate read null
2026-02-26 ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering Elzo Brito dos Santos Filho et.al. 2602.23193 translate read null
2026-02-26 MTRAG-UN: A Benchmark for Open Challenges in Multi-Turn RAG Conversations Sara Rosenthal et.al. 2602.23184 translate read null
2026-02-26 A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring Usman Anwar et.al. 2602.23163 translate read null
2026-02-26 Multi-Agent Large Language Model Based Emotional Detoxification Through Personalized Intensity Control for Consumer Protection Keito Inoshita et.al. 2602.23123 translate read null
2026-02-26 Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design Zhuoliang Xie et.al. 2602.23092 translate read null
2026-02-26 Cytoarchitecture in Words: Weakly Supervised Vision-Language Modeling for Human Brain Microscopy Matthew Sutton et.al. 2602.23088 translate read null
2026-02-26 Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent Boyang Zhang et.al. 2602.23079 translate read null
2026-02-26 CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery Mengze Hong et.al. 2602.23075 translate read null
2026-02-26 TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment Trung Dang et.al. 2602.23068 translate read null
2026-02-26 LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer Kunpeng Zhang et.al. 2602.23065 translate read null
2026-02-26 Toward Automatic Filling of Case Report Forms: A Case Study on Data from an Italian Emergency Department Gabriela Anna Kaczmarek et.al. 2602.23062 translate read null
2026-02-26 CL4SE: A Context Learning Benchmark For Software Engineering Tasks Haichuan Hu et.al. 2602.23047 translate read null
2026-02-26 LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure Jaehong Cho et.al. 2602.23036 translate read null
2026-02-26 WISER: Wider Search, Deeper Thinking, and Adaptive Fusion for Training-Free Zero-Shot Composed Image Retrieval Tianyue Wang et.al. 2602.23029 translate read null
2026-02-26 Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization Zeyuan Liu et.al. 2602.23008 translate read null
2026-02-26 Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search Xun Huang et.al. 2602.22983 translate read null
2026-02-26 Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots Dimitrios P. Panagoulias et.al. 2602.22973 translate read null
2026-02-26 SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy Peiyao Xiao et.al. 2602.22971 translate read null
2026-02-26 Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression Yifeng Guan et.al. 2602.22967 translate read null
2026-02-26 FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning Zehao Li et.al. 2602.22963 translate read null
2026-02-26 Can Agents Distinguish Visually Hard-to-Separate Diseases in a Zero-Shot Setting? A Pilot Study Zihao Zhao et.al. 2602.22959 translate read null
2026-02-26 ClawMobile: Rethinking Smartphone-Native Agentic Systems Hongchao Du et.al. 2602.22942 translate read null
2026-02-26 MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding Wenhui Tan et.al. 2602.22932 translate read null
2026-02-26 SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress Yang Yu et.al. 2602.22913 translate read null
2026-02-26 PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA Yunpeng Hong et.al. 2602.22903 translate read null
2026-02-26 Towards LLM-Empowered Knowledge Tracing via LLM-Student Hierarchical Behavior Alignment in Hyperbolic Space Xingcheng Fu et.al. 2602.22879 translate read null
2026-02-26 Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching Roy Miles et.al. 2602.22871 translate read null
2026-02-26 Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference Yushi Ye et.al. 2602.22868 translate read null
2026-02-26 TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought Jianmin Li et.al. 2602.22828 translate read null
2026-02-26 TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models Reihaneh Iranmanesh et.al. 2602.22827 translate read null
2026-02-26 Hierarchy-of-Groups Policy Optimization for Long-Horizon Agentic Tasks Shuo He et.al. 2602.22817 translate read null
2026-02-26 MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks Shiqian Su et.al. 2602.22808 translate read null
2026-02-26 Natural Language Declarative Prompting (NLD-P): A Modular Governance Method for Prompt Design Under Model Drift Hyunwoo Kim et.al. 2602.22790 translate read null
2026-02-26 Probing for Knowledge Attribution in Large Language Models Ivo Brink et.al. 2602.22787 translate read null
2026-02-26 ClinDet-Bench: Beyond Abstention, Evaluating Judgment Determinability of LLMs in Clinical Decision-Making Yusuke Watanabe et.al. 2602.22771 translate read null
2026-02-26 AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications Yujie Zhao et.al. 2602.22769 translate read null
2026-02-26 Imagination Helps Visual Reasoning, But Not Yet in Latent Space You Li et.al. 2602.22766 translate read null
2026-02-26 Towards Better RL Training Data Utilization via Second-Order Rollout Zhe Yang et.al. 2602.22765 translate read null
2026-02-26 Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study Philipp Wiesner et.al. 2602.22760 translate read null
2026-02-26 Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction Nils Schwager et.al. 2602.22752 translate read null
2026-02-26 Generative Recommendation for Large-Scale Advertising Ben Xue et.al. 2602.22732 translate read null
2026-02-26 Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks Jakub Šmíd et.al. 2602.22730 translate read null
2026-02-26 AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification Tian Zhang et.al. 2602.22724 translate read null
2026-02-26 Replacing Multi-Step Assembly of Data Preparation Pipelines with One-Step LLM Pipeline Generation for Table QA Fengyu Li et.al. 2602.22721 translate read null
2026-02-26 RLHFless: Serverless Computing for Efficient RLHF Rui Wei et.al. 2602.22718 translate read null
2026-02-26 SoPE: Spherical Coordinate-Based Positional Embedding for Enhancing Spatial Perception of 3D LVLMs Guanting Ye et.al. 2602.22716 translate read null
2026-02-26 LLM-driven discovery for carbon allotropes with bond-network entropy Yuzhou Hao et.al. 2602.22706 translate read null
2026-02-26 IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation Yanpei Guo et.al. 2602.22700 translate read null
2026-02-26 Tokenization, Fusion and Decoupling: Bridging the Granularity Mismatch Between Large Language Models and Knowledge Graphs Siyue Su et.al. 2602.22698 translate read null
2026-02-26 Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue Ning Gao et.al. 2602.22697 translate read null
2026-02-26 SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses Zhuohang Jiang et.al. 2602.22683 translate read null
2026-02-26 Accelerating LLM Pre-Training through Flat-Direction Dynamics Enhancement Shuchen Zhu et.al. 2602.22681 translate read null
2026-02-26 Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions Yue Xu et.al. 2602.22680 translate read null
2026-02-26 Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning Qin-Wen Luo et.al. 2602.22642 translate read null
2026-02-26 MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios Zhiheng Song et.al. 2602.22638 translate read null
2026-02-26 Fine-grained Semantics Integration for Large Language Model-based Recommendation Jiawen Feng et.al. 2602.22632 translate read null
2026-02-26 Instruction-based Image Editing with Planning, Reasoning, and Generation Liya Ji et.al. 2602.22624 translate read null
2026-02-26 Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA Hai Huang et.al. 2602.22617 translate read null
2026-02-26 Transformers converge to invariant algorithmic cores Joshua S. Schiffman et.al. 2602.22600 translate read null
2026-02-26 FLYING SERVING: On-the-Fly Parallelism Switching for Large Language Model Serving Shouwei Gao et.al. 2602.22593 translate read null
2026-02-26 pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training Wenzheng Zhang et.al. 2602.22592 translate read null
2026-02-26 Where Relevance Emerges: A Layer-Wise Study of Internal Attention for Zero-Shot Re-Ranking Haodong Chen et.al. 2602.22591 translate read null
2026-02-26 Search-P1: Path-Centric Reward Shaping for Stable and Efficient Agentic RAG Training Tianle Xia et.al. 2602.22576 translate read null
2026-02-26 Addressing Climate Action Misperceptions with Generative AI Miriam Remshard et.al. 2602.22564 translate read null
2026-02-26 Layer-Targeted Multilingual Knowledge Erasure in Large Language Models Taoran Li et.al. 2602.22562 translate read null
2026-02-26 CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety Umid Suleymanov et.al. 2602.22557 translate read null
2026-02-26 Autoregressive Visual Decoding from EEG Signals Sicheng Dai et.al. 2602.22555 translate read null
2026-02-26 Multilingual Safety Alignment Via Sparse Weight Editing Jiaming Liang et.al. 2602.22554 translate read null
2026-02-26 Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention Zhiming Wang et.al. 2602.22546 translate read null
2026-02-26 Ruyi2 Technical Report Huan Song et.al. 2602.22543 translate read null
2026-02-26 Agentic AI for Intent-driven Optimization in Cell-free O-RAN Mohammad Hossein Shokouhi et.al. 2602.22539 translate read null
2026-02-26 Generative Agents Navigating Digital Libraries Saber Zerhoudi et.al. 2602.22529 translate read null
2026-02-26 Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o Samay Bhojwani et.al. 2602.22524 translate read null
2026-02-26 Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents Ryan Liu et.al. 2602.22523 translate read null
2026-02-26 Pix2Key: Controllable Open-Vocabulary Retrieval with Semantic Decomposition and Self-Supervised Visual Dictionary Learning Guoyizhe Wei et.al. 2602.22510 translate read null
2026-02-26 Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models Ik-hwan Kim et.al. 2602.22508 translate read null
2026-02-26 Mapping the Landscape of Artificial Intelligence in Life Cycle Assessment Using Large Language Models Anastasija Mensikova et.al. 2602.22500 translate read null
2026-02-26 Reinforcement-aware Knowledge Distillation for LLM Reasoning Zhaoyang Zhang et.al. 2602.22495 translate read null
2026-02-25 Importance of Prompt Optimisation for Error Detection in Medical Notes Using Language Models Craig Myles et.al. 2602.22483 translate read null
2026-02-25 Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models Binchi Zhang et.al. 2602.22475 translate read null
2026-02-25 ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization Joseph Tso et.al. 2602.22465 translate read null
2026-02-25 CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling Dong Xu et.al. 2602.22457 translate read null
2026-02-25 Automating the Detection of Requirement Dependencies Using Large Language Models Ikram Darif et.al. 2602.22456 translate read null
2026-02-25 Exploring Multimodal LMMs for Online Episodic Memory Question Answering on the Edge Giuseppe Lando et.al. 2602.22455 translate read null
2026-02-25 CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines Chayan Banerjee et.al. 2602.22452 translate read null
2026-02-25 Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace Qianlong Lan et.al. 2602.22450 translate read null
2026-02-25 A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines Gaoyuan Du et.al. 2602.22442 translate read null
2026-02-25 HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems Idan Habler et.al. 2602.22427 translate read null
2026-02-25 SimpleOCR: Rendering Visualized Questions to Teach MLLMs to Read Yibo Peng et.al. 2602.22426 translate read null
2026-02-25 Causality $\neq$ Invariance: Function and Concept Vectors in LLMs Gustaw Opiełka et.al. 2602.22424 translate read null
2026-02-25 Seeing Graphs Like Humans: Benchmarking Computational Measures and MLLMs for Similarity Assessment Seokweon Jung et.al. 2602.22416 translate read null
2026-02-25 Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents Cosmo Santoni et.al. 2602.22402 translate read null
2026-02-25 VoiceAlign: A Shimming Layer for Enhancing the Usability of Legacy Voice User Interface Systems Md Ehtesham-Ul-Haque et.al. 2602.22374 translate read null
2026-02-25 EyeLayer: Integrating Human Attention Patterns into LLM-Based Code Summarization Jiahao Zhang et.al. 2602.22368 translate read null
2026-02-25 E3VA: Enhancing Emotional Expressiveness in Virtual Conversational Agents Abhishek Kulkarni et.al. 2602.22362 translate read null
2026-02-25 Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts Arno Simons et.al. 2602.22359 translate read null
2026-02-25 STILTS-NLI: A Natural Language Interface for STILTS R. A. Shaw et.al. 2602.22357 translate read null
2026-02-25 Decoder-based Sense Knowledge Distillation Qitong Wang et.al. 2602.22351 translate read null
2026-02-25 Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory Davide Ettori et.al. 2602.22345 translate read null
2026-02-25 Conversational Successes and Breakdowns in Everyday Non-Display Smart Glasses Use Xiuqi Tommy Zhu et.al. 2602.22340 translate read null
2026-02-25 Decoding the Hook: A Multimodal LLM Framework for Analyzing the Hooking Period of Video Ads Kunpeng Zhang et.al. 2602.22299 translate read null
2026-02-25 UpSkill: Mutual Information Skill Learning for Structured Response Diversity in LLMs Devan Shah et.al. 2602.22296 translate read null
2026-02-25 Manifold of Failure: Behavioral Attraction Basins in Language Models Sarthak Munshi et.al. 2602.22291 translate read null
2026-02-25 OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data Yan Zhao et.al. 2602.22286 translate read null
2026-02-25 BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning Mingi Kim et.al. 2602.22284 translate read null
2026-02-25 Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion Md. Tahsin Amin et.al. 2602.22280 translate read null
2026-02-25 RETLLM: Training and Data-Free MLLMs for Multimodal Information Retrieval Dawei Su et.al. 2602.22278 translate read null
2026-02-25 EmpiRE-Compass: A Neuro-Symbolic Dashboard for Sustainable and Dynamic Knowledge Exploration, Synthesis, and Reuse Oliver Karras et.al. 2602.22276 translate read null
2026-02-25 Sustainable LLM Inference using Context-Aware Model Switching Yuvarani et.al. 2602.22261 translate read null
2026-02-24 A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM Morteza Eskandarian et.al. 2602.22250 translate read null
2026-02-24 Accelerating Incident Response: A Hybrid Approach for Data Breach Reporting Aurora Arrus et.al. 2602.22244 translate read null
2026-02-24 Analysis of LLMs Against Prompt Injection and Jailbreak Attacks Piyush Jaiswal et.al. 2602.22242 translate read null
2026-02-24 From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation Linus Bantel et.al. 2602.22240 translate read null
2026-02-23 CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction Rabeya Tus Sadia et.al. 2602.22236 translate read null
2026-02-25 Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets Hanna Yukhymenko et.al. 2602.22207 translate read null
2026-02-25 A Taxonomy of Human–MLLM Interaction in Early-Stage Sketch-Based Design Ideation Weiayn Shi et.al. 2602.22171 translate read null
2026-02-25 LLMTailor: A Layer-wise Tailoring Tool for Efficient Checkpointing of Large Language Models Minqiu Sun et.al. 2602.22158 translate read null
2026-02-25 Dynamic Personality Adaptation in Large Language Models via State Machines Leon Pielage et.al. 2602.22157 translate read null
2026-02-25 Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual Yining Li et.al. 2602.22146 translate read null
2026-02-25 When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models Satyam Kumar Navneet et.al. 2602.22145 translate read null
2026-02-25 WeaveTime: Stream from Earlier Frames into Emergent Memory in VideoLLMs Yulin Zhang et.al. 2602.22142 translate read null
2026-02-25 Confidence-Driven Multi-Scale Model Selection for Cost-Efficient Inference Bo-Wei Chen et.al. 2602.22090 translate read null
2026-02-25 ViSTAR: Virtual Skill Training with Augmented Reality with 3D Avatars and LLM coaching agent Chunggi Lee et.al. 2602.22077 translate read null
2026-02-25 Understanding Artificial Theory of Mind: Perturbed Tasks and Reasoning in Large Language Models Christian Nickel et.al. 2602.22072 translate read null
2026-02-25 Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts Jessica Y. Bo et.al. 2602.22070 translate read null
2026-02-25 DLT-Corpus: A Large-Scale Text Collection for the Distributed Ledger Technology Domain Walter Hernandez Cruz et.al. 2602.22045 translate read null
2026-02-25 RT-RMOT: A Dataset and Framework for RGB-Thermal Referring Multi-Object Tracking Yanqiu Yu et.al. 2602.22033 translate read null
2026-02-25 Enhancing LLM-Based Test Generation by Eliminating Covered Code WeiZhe Xu et.al. 2602.21997 translate read null
2026-02-25 CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models Miyu Oba et.al. 2602.21978 translate read null
2026-02-25 Global-Aware Edge Prioritization for Pose Graph Initialization Tong Wei et.al. 2602.21963 translate read null
2026-02-25 Global-Local Dual Perception for MLLMs in High-Resolution Text-Rich Image Translation Junxin Lu et.al. 2602.21956 translate read null
2026-02-25 RADAR: Reasoning as Discrimination with Aligned Representations for LLM-based Knowledge Graph Reasoning Bo Xue et.al. 2602.21951 translate read null
2026-02-25 MEDSYN: Benchmarking Multi-EviDence SYNthesis in Complex Clinical Cases for Multimodal Large Language Models Boqi Chen et.al. 2602.21950 translate read null
2026-02-25 Large Language Models are Algorithmically Blind Sohan Venkatesh et.al. 2602.21947 translate read null
2026-02-25 Hidden Topics: Measuring Sensitive AI Beliefs with List Experiments Maxim Chupilkin et.al. 2602.21939 translate read null
2026-02-25 Small Wins Big: Comparing Large Language Models and Domain Fine-Tuned Models for Sarcasm Detection in Code-Mixed Hinglish Text Bitan Majumder et.al. 2602.21933 translate read null
2026-02-25 EmoOmni: Bridging Emotional Understanding and Expression in Omni-Modal LLMs Wenjie Tian et.al. 2602.21900 translate read null
2026-02-25 APFuzz: Towards Automatic Greybox Protocol Fuzzing Yu Wang et.al. 2602.21892 translate read null
2026-02-25 How to Take a Memorable Picture? Empowering Users with Actionable Feedback Francesco Laiti et.al. 2602.21877 translate read null
2026-02-25 Personalized Graph-Empowered Large Language Model for Proactive Information Access Chia Cheng Chang et.al. 2602.21862 translate read null
2026-02-25 ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices Dezhi Kong et.al. 2602.21858 translate read null
2026-02-25 FewMMBench: A Benchmark for Multimodal Few-Shot Learning Mustafa Dogan et.al. 2602.21854 translate read null
2026-02-25 From Restructuring to Stabilization: A Large-Scale Experiment on Iterative Code Readability Refactoring with Large Language Models Norman Peitek et.al. 2602.21833 translate read null
2026-02-25 A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios Kimberly T. Mai et.al. 2602.21831 translate read null
2026-02-25 SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Guibin Chen et.al. 2602.21818 translate read null
2026-02-25 Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem Heejin Jo et.al. 2602.21814 translate read null
2026-02-25 An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention Madhusudan Ghosh et.al. 2602.21800 translate read null
2026-02-25 DHP: Efficient Scaling of MLLM Training with Dynamic Hybrid Parallelism Yifan Niu et.al. 2602.21788 translate read null
2026-02-25 D-COT: Disciplined Chain-of-Thought Learning for Efficient Reasoning in Small Language Models Shunsuke Ubukata et.al. 2602.21786 translate read null
2026-02-25 Therapist-Robot-Patient Physical Interaction is Worth a Thousand Words: Enabling Intuitive Therapist Guidance via Remote Haptic Control Beatrice Luciani et.al. 2602.21783 translate read null
2026-02-25 Generalisation of RLHF under Reward Shift and Clipped KL Regularisation Kenton Tang et.al. 2602.21765 translate read null
2026-02-25 Improving Implicit Discourse Relation Recognition with Natural Language Explanations from LLMs Heng Wang et.al. 2602.21763 translate read null
2026-02-25 Offline Reasoning for Efficient Recommendation: LLM-Empowered Persona-Profiled Item Indexing Deogyong Kim et.al. 2602.21756 translate read null
2026-02-25 From Words to Amino Acids: Does the Curse of Depth Persist? Aleena Siji et.al. 2602.21750 translate read null
2026-02-25 Enhancing Multi-Modal LLMs Reasoning via Difficulty-Aware Group Normalization Jinghan Li et.al. 2602.21743 translate read null
2026-02-25 Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling Shiqi Yan et.al. 2602.21728 translate read null
2026-02-25 TranX-Adapter: Bridging Artifacts and Semantics within MLLMs for Robust AI-generated Image Detection Wenbin Wang et.al. 2602.21716 translate read null
2026-02-25 Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach Xu Yang et.al. 2602.21715 translate read null
2026-02-25 EditFlow: Benchmarking and Optimizing Code Edit Recommendation Systems via Reconstruction of Developer Flows Chenyan Liu et.al. 2602.21697 translate read null
2026-02-25 Hierarchical LLM-Based Multi-Agent Framework with Prompt Optimization for Multi-Robot Task Planning Tomoya Kawabe et.al. 2602.21670 translate read null
2026-02-25 DWA-KD: Dual-Space Weighting and Time-Warped Alignment for Cross-Tokenizer Knowledge Distillation Duc Trung Vu et.al. 2602.21669 translate read null
2026-02-25 CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning Zhijiang Tang et.al. 2602.21655 translate read null
2026-02-25 Irresponsible Counselors: Large Language Models and the Loneliness of Modern Humans Abas Bertina et.al. 2602.21653 translate read null
2026-02-25 Sparsity Induction for Accurate Post-Training Pruning of Large Language Models Minhao Jiang et.al. 2602.21652 translate read null
2026-02-25 Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration Tangsang Chongbang et.al. 2602.21647 translate read null
2026-02-25 Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion Yexing Du et.al. 2602.21646 translate read null
2026-02-25 RuCL: Stratified Rubric-Based Curriculum Learning for Multimodal Large Language Model Reasoning Yukun Chen et.al. 2602.21628 translate read null
2026-02-25 Multi-Layer Scheduling for MoE-Based LLM Reasoning Yifan Sun et.al. 2602.21626 translate read null
2026-02-25 Structurally Aligned Subtask-Level Memory for Software Engineering Agents Kangning Shen et.al. 2602.21611 translate read null
2026-02-25 MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification Kazi Samin Yasar Alam et.al. 2602.21608 translate read null
2026-02-25 Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation Qiange Wang et.al. 2602.21604 translate read null
2026-02-25 AQR-HNSW: Accelerating Approximate Nearest Neighbor Search via Density-aware Quantization and Multi-stage Re-ranking Ganap Ashit Tewary et.al. 2602.21600 translate read null
2026-02-25 SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints Hyungmin Kim et.al. 2602.21595 translate read null
2026-02-25 Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection Zheng Gao et.al. 2602.21593 translate read null
2026-02-25 Revisiting RAG Retrievers: An Information Theoretic Benchmark Wenqing Zheng et.al. 2602.21553 translate read null
2026-02-25 RAC: Relation-Aware Cache Replacement for Large Language Models Yuchong Wu et.al. 2602.21547 translate read null
2026-02-25 Muon+: Towards Better Muon via One Additional Normalization Step Ruijie Zhang et.al. 2602.21545 translate read null
2026-02-25 Reasoning-Driven Design of Single Atom Catalysts via a Multi-Agent Large Language Model Framework Dong Hyeon Mok et.al. 2602.21533 translate read null
2026-02-25 One Brain, Omni Modalities: Towards Unified Non-Invasive Brain Decoding with Large Language Models Changli Tang et.al. 2602.21522 translate read null
2026-02-25 Beyond Refusal: Probing the Limits of Agentic Self-Correction for Semantic Sensitive Information Umid Suleymanov et.al. 2602.21496 translate read null
2026-02-25 GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning Ningyuan Yang et.al. 2602.21492 translate read null
2026-02-25 Evaluating the Usage of African-American Vernacular English in Large Language Models Deja Dunlap et.al. 2602.21485 translate read null
2026-02-25 The Design Space of Tri-Modal Masked Diffusion Models Louis Bethune et.al. 2602.21472 translate read null
2026-02-25 iMiGUE-Speech: A Spontaneous Speech Dataset for Affective Analysis Sofoklis Kakouros et.al. 2602.21464 translate read null
2026-02-25 Revisiting Text Ranking in Deep Research Chuan Meng et.al. 2602.21456 translate read null
2026-02-24 MINAR: Mechanistic Interpretability for Neural Algorithmic Reasoning Jesse He et.al. 2602.21442 translate read null
2026-02-24 Causal Decoding for Hallucination-Resistant Multimodal Large Language Models Shiwei Tan et.al. 2602.21441 translate read null
2026-02-24 Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning Yuanda Xu et.al. 2602.21420 translate read null
2026-02-24 MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection Xuan Chen et.al. 2602.21394 translate read null
2026-02-24 Interleaved Head Attention Sai Surya Duvvuri et.al. 2602.21371 translate read null
2026-02-24 A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives Dmitrii Pantiukhin et.al. 2602.21351 translate read null
2026-02-24 Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment Mengxuan Hu et.al. 2602.21346 translate read null
2026-02-24 Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data Emre Can Acikgoz et.al. 2602.21320 translate read null
2026-02-24 Shared Nature, Unique Nurture: PRISM for Pluralistic Reasoning via In-context Structure Modeling Guancheng Tu et.al. 2602.21317 translate read null
2026-02-24 Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space Wang Zixian et.al. 2602.21269 translate read null
2026-02-24 Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models Sasha Robinson et.al. 2602.21262 translate read null
2026-02-23 Structured Prompt Language: Declarative Context Management for LLMs Wen G. Gong et.al. 2602.21257 translate read null
2026-02-23 A General Equilibrium Theory of Orchestrated AI Agent Systems Jean-Philippe Garnier et.al. 2602.21255 translate read null
2026-02-24 On Data Engineering for Scaling LLM Terminal Capabilities Renjie Pi et.al. 2602.21193 translate read null
2026-02-24 Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training Anas Barakat et.al. 2602.21189 translate read null
2026-02-24 Seeing Through Words: Controlling Visual Retrieval Quality with Language Models Jianglin Lu et.al. 2602.21175 translate read null
2026-02-24 PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data Samah Fodeh et.al. 2602.21165 translate read null
2026-02-24 ActionReasoning: Robot Action Reasoning in 3D Space with LLM for Robotic Brick Stacking Guangming Wang et.al. 2602.21161 translate read null
2026-02-24 SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards Dengjia Zhang et.al. 2602.21158 translate read null
2026-02-24 Scaling State-Space Models on Multiple GPUs with Tensor Parallelism Anurag Dutt et.al. 2602.21144 translate read null
2026-02-24 A Benchmark for Deep Information Synthesis Debjit Paul et.al. 2602.21143 translate read null
2026-02-24 SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery David Anugraha et.al. 2602.21136 translate read null
2026-02-24 “Are You Sure?”: An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems Xinfeng Li et.al. 2602.21127 translate read null
2026-02-24 Turning Semantics into Topology: LLM-Driven Attribute Augmentation for Collaborative Filtering Junjie Meng et.al. 2602.21099 translate read null
2026-02-24 Can Interest-Bearing Positions Solve the Long-Horizon Problem in Prediction Markets? Caleb Maresca et.al. 2602.21091 translate read null
2026-02-24 Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification Vishal Patil et.al. 2602.21082 translate read null
2026-02-24 An Expert Schema for Evaluating Large Language Model Errors in Scholarly Question-Answering Systems Anna Martin-Boyle et.al. 2602.21059 translate read null
2026-02-24 PaperTrail: A Claim-Evidence Interface for Grounding Provenance in LLM-based Scholarly Q&A Anna Martin-Boyle et.al. 2602.21045 translate read null
2026-02-24 LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification Yanrui Wu et.al. 2602.21044 translate read null
2026-02-24 Generative Pseudo-Labeling for Pre-Ranking with LLMs Junyu Bi et.al. 2602.20995 translate read null
2026-02-24 CrystaL: Spontaneous Emergence of Visual Latents in MLLMs Yang Zhang et.al. 2602.20980 translate read null
2026-02-24 Evaluating Proactive Risk Awareness of Large Language Models Xuan Luo et.al. 2602.20976 translate read null
2026-02-24 Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving Yuliang Ji et.al. 2602.20973 translate read null
2026-02-24 Are Multimodal Large Language Models Good Annotators for Image Tagging? Ming-Kun Xie et.al. 2602.20972 translate read null
2026-02-24 Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models Paola Merlo et.al. 2602.20966 translate read null
2026-02-24 The Art of Efficient Reasoning: Data, Reward, and Optimization Taiqiang Wu et.al. 2602.20945 translate read null
2026-02-24 Extending $μ$P: Spectral Conditions for Feature Learning Across Optimizers Akshita Gupta et.al. 2602.20937 translate read null
2026-02-24 Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence ChengYou Li et.al. 2602.20934 translate read null
2026-02-24 HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG Yuqi Huang et.al. 2602.20926 translate read null
2026-02-24 Predicting Sentence Acceptability Judgments in Multimodal Contexts Hyewon Jang et.al. 2602.20918 translate read null
2026-02-24 LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding Jihao Qiu et.al. 2602.20913 translate read null
2026-02-24 TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering Hanshen Zhu et.al. 2602.20903 translate read null
2026-02-24 SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models Yuechen Xie et.al. 2602.20901 translate read null
2026-02-24 Exa-PSD: a new Persian sentiment analysis dataset on Twitter Seyed Himan Ghaderi et.al. 2602.20892 translate read null
2026-02-24 Diagnosing Causal Reasoning in Vision-Language Models via Structured Relevance Graphs Dhita Putri Pratama et.al. 2602.20878 translate read null
2026-02-24 MUSE: Harnessing Precise and Diverse Semantics for Few-Shot Whole Slide Image Classification Jiahao Xu et.al. 2602.20873 translate read null
2026-02-24 Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset Jia-Rui Lin et.al. 2602.20812 translate read null
2026-02-24 Unseen-Codebases-Domain Data Synthesis and Training Based on Code Graphs Guangsheng Ou et.al. 2602.20799 translate read null
2026-02-24 SPP-SCL: Semi-Push-Pull Supervised Contrastive Learning for Image-Text Sentiment Analysis and Beyond Jiesheng Wu et.al. 2602.20767 translate read null
2026-02-24 Overton Pluralistic Reinforcement Learning for Large Language Models Yu Fu et.al. 2602.20759 translate read null
2026-02-24 Balancing Multiple Objectives in Urban Traffic Control with Reinforcement Learning from AI Feedback Chenyang Zhao et.al. 2602.20728 translate read null
2026-02-24 ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition Xindian Ma et.al. 2602.20727 translate read null
2026-02-24 Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning Xu Wan et.al. 2602.20722 translate read null
2026-02-24 AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs Che Wang et.al. 2602.20720 translate read null
2026-02-24 PackMonitor: Enabling Zero Package Hallucinations Through Decoding-Time Monitoring Xiting Liu et.al. 2602.20717 translate read null
2026-02-24 ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction Che Wang et.al. 2602.20708 translate read null
2026-02-24 PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding Baolong Bi et.al. 2602.20696 translate read null
2026-02-24 Grid-Mind: An LLM-Orchestrated Multi-Fidelity Agent for Automated Connection Impact Assessment Mohamed Shamseldein et.al. 2602.20683 translate read null
2026-02-24 CAMEL: Confidence-Gated Reflection for Reward Modeling Zirui Zhu et.al. 2602.20670 translate read null
2026-02-24 ICSSPulse: A Modular LLM-Assisted Platform for Industrial Control System Penetration Testing Michail Takaronis et.al. 2602.20663 translate read null
2026-02-24 TOM: A Ternary Read-only Memory Accelerator for LLM-powered Edge Intelligence Hongyi Guan et.al. 2602.20662 translate read null
2026-02-24 CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models Anqi Li et.al. 2602.20648 translate read null
2026-02-24 An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation Fida Khandaker Safa et.al. 2602.20644 translate read null
2026-02-24 Grounding LLMs in Scientific Discovery via Embodied Actions Bo Zhang et.al. 2602.20639 translate read null
2026-02-24 QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs Santiago Gonzalez et.al. 2602.20629 translate read null
2026-02-24 Physics-based phenomenological characterization of cross-modal bias in multimodal models Hyeongmo Kim et.al. 2602.20624 translate read null
2026-02-24 SpecMind: Cognitively Inspired, Interactive Multi-Turn Framework for Postcondition Inference Cuong Chi Le et.al. 2602.20610 translate read null
2026-02-24 Efficient and Explainable End-to-End Autonomous Driving via Masked Vision-Language-Action Diffusion Jiaru Zhang et.al. 2602.20577 translate read null
2026-02-24 From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production Yucheng Shi et.al. 2602.20558 translate read null
2026-02-24 Standard Transformers Achieve the Minimax Rate in Nonparametric Regression with $C^{s,λ}$ Targets Yanming Lai et.al. 2602.20555 translate read null
2026-02-24 What Drives Students’ Use of AI Chatbots? Technology Acceptance in Conversational AI Griffin Pitts et.al. 2602.20547 translate read null
2026-02-24 Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training Zhengyao Gu et.al. 2602.20532 translate read null
2026-02-24 FAST-Prefill: FPGA Accelerated Sparse Attention for Long Context LLM Prefill Rakshith Jayanth et.al. 2602.20515 translate read null
2026-02-24 From Performance to Purpose: A Sociotechnical Taxonomy for Evaluating Large Language Model Utility Gavin Levinson et.al. 2602.20513 translate read null
2026-02-24 AWCP: A Workspace Delegation Protocol for Deep-Engagement Collaboration across Remote Agents Xiaohang Nie et.al. 2602.20493 translate read null
2026-02-24 Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA Nuocheng Yang et.al. 2602.20492 translate read null
2026-02-24 Application of Large Language Models for Container Throughput Forecasting: Incorporating Contextual Information in Port Logistics Minseop Kim et.al. 2602.20489 translate read null
2026-02-24 Hybrid LLM-Embedded Dialogue Agents for Learner Reflection: Designing Responsive and Theory-Driven Interactions Paras Sharma et.al. 2602.20486 translate read null
2026-02-24 Oracle-Robust Online Alignment for Large Language Models Zimeng Li et.al. 2602.20457 translate read null
2026-02-23 Emergent Manifold Separability during Reasoning in Large Language Models Alexandre Polo et.al. 2602.20338 translate read null
2026-02-23 DMCD: Semantic-Statistical Framework for Causal Discovery Samarth KaPatel et.al. 2602.20333 translate read null
2026-02-23 No One Size Fits All: QueryBandits for Hallucination Mitigation Nicole Cho et.al. 2602.20332 translate read null
2026-02-23 An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models Cathy Shyr et.al. 2602.20324 translate read null
2026-02-23 What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance William Watson et.al. 2602.20300 translate read null
2026-02-23 InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation Yu Li et.al. 2602.20294 translate read null
2026-02-23 PhantomRun: Auto Repair of Compilation Errors in Embedded Open Source Software Han Fu et.al. 2602.20284 translate read null
2026-02-23 The Truthfulness Spectrum Hypothesis Zhuofan Josh Ying et.al. 2602.20273 translate read null
2026-02-23 HieraMAS: Optimizing Intra-Node LLM Mixtures and Inter-Node Topology for Multi-Agent Systems Tianjun Yao et.al. 2602.20229 translate read null
2026-02-23 Exploring Anti-Aging Literature via ConvexTopics and Large Language Models Lana E. Yeganova et.al. 2602.20224 translate read null
2026-02-23 An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction Guanting Shen et.al. 2602.20219 translate read null
2026-02-23 CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions Jingwei Shi et.al. 2602.20213 translate read null
2026-02-22 Golden Layers and Where to Find Them: Improved Knowledge Editing for Large Language Models Via Layer Gradient Analysis Shrestha Datta et.al. 2602.20207 translate read null
2026-02-22 Mitigating “Epistemic Debt” in Generative AI-Scaffolded Novice Programming using Metacognitive Scripts Sreecharan Sankaranarayanan et.al. 2602.20206 translate read null
2026-02-22 OTPrune: Distribution-Aligned Visual Token Pruning via Optimal Transport Xiwen Chen et.al. 2602.20205 translate read null
2026-02-22 Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study Jeel Piyushkumar Khatiwala et.al. 2602.20202 translate read null
2026-02-22 Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning Zhuoxu Huang et.al. 2602.20197 translate read null
2026-02-23 Do Large Language Models Understand Data Visualization Rules? Martin Sinnona et.al. 2602.20137 translate read null
2026-02-23 KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration Mohammad Amanlou et.al. 2602.20135 translate read null
2026-02-23 AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization Mert Cemri et.al. 2602.20133 translate read null
2026-02-23 To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering Zaifu Zhan et.al. 2602.20130 translate read null
2026-02-23 NanoKnow: How to Know What Your Language Model Knows Lingwei Gu et.al. 2602.20122 translate read null
2026-02-23 BarrierSteer: LLM Safety via Learning Barrier Steering Thanh Q. Tran et.al. 2602.20102 translate read null
2026-02-23 CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching Yuzhe Wang et.al. 2602.20094 translate read null
2026-02-23 How Retrieved Context Shapes Internal Representations in RAG Samuel Yeh et.al. 2602.20091 translate read null
2026-02-23 Do Large Language Models Understand Data Visualization Principles? Martin Sinnona et.al. 2602.20084 translate read null
2026-02-23 Multilingual Large Language Models do not comprehend all natural languages to equal degrees Natalia Moskvina et.al. 2602.20065 translate read null
2026-02-23 The LLMbda Calculus: AI Agents, Conversations, and Information Flow Zac Garby et.al. 2602.20064 translate read null
2026-02-23 Can You Tell It’s AI? Human Perception of Synthetic Voices in Vishing Scenarios Zoha Hayat Bhatti et.al. 2602.20061 translate read null
2026-02-23 Entropy in Large Language Models Marco Scharringhausen et.al. 2602.20052 translate read null
2026-02-23 Closing the gap in multimodal medical representation alignment Eleonora Grassucci et.al. 2602.20046 translate read null
2026-02-23 Let There Be Claws: An Early Social Network Analysis of AI Agents on Moltbook H. C. W. Price et.al. 2602.20044 translate read null
2026-02-23 Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously Han Bao et.al. 2602.20042 translate read null
2026-02-23 AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization Fahmida Liza Piya et.al. 2602.20040 translate read null
2026-02-23 gencat: Generative computerized adaptive testing Wanyong Feng et.al. 2602.20020 translate read null
2026-02-23 ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting Yuxing Tian et.al. 2602.19969 translate read null
2026-02-23 Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval Yibo Yan et.al. 2602.19961 translate read null
2026-02-23 Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming Ian Steenstra et.al. 2602.19948 translate read null
2026-02-23 A Replicate-and-Quantize Strategy for Plug-and-Play Load Balancing of Sparse Mixture-of-Experts LLMs Zijie Liu et.al. 2602.19938 translate read null
2026-02-23 BeamVLM for Low-altitude Economy: Generative Beam Prediction via Vision-language Models Chenran Kou et.al. 2602.19929 translate read null
2026-02-23 Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models Jin Liu et.al. 2602.19926 translate read null
2026-02-23 DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning Zhongwei Wan et.al. 2602.19895 translate read null
2026-02-23 SHIELD: Semantic Heterogeneity Integrated Embedding for Latent Discovery in Clinical Trial Safety Signals Francois Vandenhende et.al. 2602.19855 translate read null
2026-02-23 LLM-enabled Applications Require System-Level Threat Monitoring Yedi Zhang et.al. 2602.19844 translate read null
2026-02-23 SAMAS: A Spectrum-Guided Multi-Agent System for Achieving Style Fidelity in Literary Translation Jingzhuo Wu et.al. 2602.19840 translate read null
2026-02-23 An Explainable Memory Forensics Approach for Malware Analysis Silvia Lucia Sanna et.al. 2602.19831 translate read null
2026-02-23 TextShield-R1: Reinforced Reasoning for Tampered Text Detection Chenfan Qu et.al. 2602.19828 translate read null
2026-02-23 Universal Pose Pretraining for Generalizable Vision-Language-Action Policies Haitao Lin et.al. 2602.19710 translate read null
2026-02-23 “The explanation makes sense”: An Empirical Study on LLM Performance in News Classification and its Influence on Judgment in Human-AI Collaborative Annotation Qile Wang et.al. 2602.19690 translate read null
2026-02-23 KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge Alex Robertson et.al. 2602.19643 translate read null
2026-02-23 Evaluating the Impact of Data Anonymization on Image Retrieval Marvin Chen et.al. 2602.19641 translate read null
2026-02-23 Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering Chih-Hong Cheng et.al. 2602.19614 translate read null
2026-02-23 Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning Borisiuk Anna et.al. 2602.19612 translate read null
2026-02-23 CLCR: Cross-Level Semantic Collaborative Representation for Multimodal Learning Chunlei Meng et.al. 2602.19605 translate read null
2026-02-23 Tri-Subspaces Disentanglement for Multimodal Sentiment Analysis Chunlei Meng et.al. 2602.19585 translate read null
2026-02-23 CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment Hanwen Liu et.al. 2602.19574 translate read null
2026-02-23 Identifying, Explaining, and Correcting Ableist Language with AI Kynnedy Simone Smith et.al. 2602.19560 translate read null
2026-02-23 Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains Xiaochong Jiang et.al. 2602.19555 translate read null
2026-02-23 Vinedresser3D: Agentic Text-guided 3D Editing Yankuan Chi et.al. 2602.19542 translate read null
2026-02-23 Large Language Model-Assisted UAV Operations and Communications: A Multifaceted Survey and Tutorial Yousef Emami et.al. 2602.19534 translate read null
2026-02-23 Ada-RS: Adaptive Rejection Sampling for Selective Thinking Yirou Ge et.al. 2602.19519 translate read null
2026-02-23 Anticipate, Adapt, Act: A Hybrid Framework for Task Planning Nabanita Dash et.al. 2602.19518 translate read null
2026-02-23 Classroom Final Exam: An Instructor-Tested Reasoning Benchmark Chongyang Gao et.al. 2602.19517 translate read null
2026-02-23 Pixel2Phys: Distilling Governing Laws from Visual Dynamics Ruikun Li et.al. 2602.19516 translate read null
2026-02-23 Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference Arindam Khaled et.al. 2602.19509 translate read null
2026-02-23 Conversational AI for Automated Patient Questionnaire Completion: Development Insights and Design Principles David Fraile Navarro et.al. 2602.19507 translate read null
2026-02-23 Test-Time Computing for Referring Multimodal Large Language Models Mingrui Wu et.al. 2602.19505 translate read null
2026-02-23 MICON-Bench: Benchmarking and Enhancing Multi-Image Context Image Generation in Unified Multimodal Models Mingrui Wu et.al. 2602.19497 translate read null
2026-02-23 Botson: An Accessible and Low-Cost Platform for Social Robotics Research Samuel Bellaire et.al. 2602.19491 translate read null
2026-02-23 Can Large Language Models Replace Human Coders? Introducing ContentBench Michael Haman et.al. 2602.19467 translate read null
2026-02-23 SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning Zelin He et.al. 2602.19455 translate read null
2026-02-23 Red-Teaming Claude Opus and ChatGPT-based Security Advisors for Trusted Execution Environments Kunal Mukherjee et.al. 2602.19450 translate read null
2026-02-23 Hepato-LLaVA: An Expert MLLM with Sparse Topo-Pack Attention for Hepatocellular Pathology Analysis on Whole Slide Images Yuxuan Yang et.al. 2602.19424 translate read null
2026-02-23 AuditoryHuM: Auditory Scene Label Generation and Clustering using Human-MLLM Collaboration Henry Zhong et.al. 2602.19409 translate read null
2026-02-23 Multi-CoLoR: Context-Aware Localization and Reasoning across Multi-Language Codebases Indira Vats et.al. 2602.19407 translate read null
2026-02-23 Personalized Prediction of Perceived Message Effectiveness Using Large Language Model Based Digital Twins Jasmin Han et.al. 2602.19403 translate read null
2026-02-23 Hiding in Plain Text: Detecting Concealed Jailbreaks via Activation Disentanglement Amirhossein Farzam et.al. 2602.19396 translate read null
2026-02-22 LLMs Can Learn to Reason Via Off-Policy RL Daniel Ritter et.al. 2602.19362 translate read null
2026-02-22 Compliance Management for Federated Data Processing Natallia Kokash et.al. 2602.19360 translate read null
2026-02-22 Smooth Gate Functions for Soft Advantage Policy Optimization Egor Denisov et.al. 2602.19345 translate read null
2026-02-22 Soft Sequence Policy Optimization Svetlana Glazyrina et.al. 2602.19327 translate read null
2026-02-22 Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations Dongming Jiang et.al. 2602.19320 translate read null
2026-02-22 A Power Market Model with Hyperscalers and Modular Datacenters Yihsu Chen et.al. 2602.19310 translate read null
2026-02-22 Scaling Inference-Time Computation via Opponent Simulation: Enabling Online Strategic Adaptation in Repeated Negotiation Xiangyu Liu et.al. 2602.19309 translate read null
2026-02-22 The Path to Conversational AI Tutors: Integrating Tutoring Best Practices and Targeted Technologies to Produce Scalable AI Agents Kirk Vanacore et.al. 2602.19303 translate read null
2026-02-22 Automated Generation of Microfluidic Netlists using Large Language Models Jasper Davidson et.al. 2602.19297 translate read null
2026-02-22 Towards Automated Page Object Generation for Web Testing using Large Language Models Betül Karagöz et.al. 2602.19294 translate read null
2026-02-22 Limited Reasoning Space: The cage of long-horizon reasoning in LLMs Zhenyu Li et.al. 2602.19281 translate read null
2026-02-22 ComUICoder: Component-based Reusable UI Code Generation for Complex Websites via Semantic Segmentation and Element-wise Feedback Jingyu Xiao et.al. 2602.19276 translate read null
2026-02-22 KUDA: Knowledge Unlearning by Deviating Representation for Large Language Models Ce Fang et.al. 2602.19275 translate read null
2026-02-22 No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection Zunkai Dai et.al. 2602.19248 translate read null
2026-02-22 Topology of Reasoning: Retrieved Cell Complex-Augmented Generation for Textual Graph Question Answering Sen Zhao et.al. 2602.19240 translate read null
2026-02-22 Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations Ahmed Karim et.al. 2602.19239 translate read null
2026-02-22 Knowledge-aware Visual Question Generation for Remote Sensing Images Siran Li et.al. 2602.19224 translate read null
2026-02-22 Gecko: A Simulation Environment with Stateful Feedback for Refining Agent Tool Calls Zeyu Zhang et.al. 2602.19218 translate read null
2026-02-22 Questions beyond Pixels: Integrating Commonsense Knowledge in Visual Question Generation for Remote Sensing Siran Li et.al. 2602.19217 translate read null
2026-02-22 Statistical Measures for Explainable Aspect-Based Sentiment Analysis: A Case Study on Environmental Discourse in Reddit Luisa Stracqualursi et.al. 2602.19216 translate read null
2026-02-22 How to Allocate, How to Learn? Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization Yangyi Fang et.al. 2602.19208 translate read null
2026-02-22 PositionOCR: Augmenting Positional Awareness in Multi-Modal Models via Hybrid Specialist Integration Chen Duan et.al. 2602.19188 translate read null
2026-02-22 Next Reply Prediction X Dataset: Linguistic Discrepancies in Naively Generated Content Simon Münker et.al. 2602.19177 translate read null
2026-02-22 TurkicNLP: An NLP Toolkit for Turkic Languages Sherzod Hakimov et.al. 2602.19174 translate read null
2026-02-22 Reasoning Capabilities of Large Language Models. Lessons Learned from General Game Playing Maciej Świechowski et.al. 2602.19160 translate read null
2026-02-22 DoAtlas-1: A Causal Compilation Paradigm for Clinical AI Yulong Li et.al. 2602.19158 translate read null
2026-02-22 Facet-Level Persona Control by Trait-Activated Routing with Contrastive SAE for Role-Playing LLMs Wenqiu Tang et.al. 2602.19157 translate read null
2026-02-22 A Dataset for Named Entity Recognition and Relation Extraction from Art-historical Image Descriptions Stefanie Schneider et.al. 2602.19133 translate read null
2026-02-22 K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model Shiyi Cao et.al. 2602.19128 translate read null
2026-02-22 AgenticRAGTracer: A Hop-Aware Benchmark for Diagnosing Multi-Step Retrieval Reasoning in Agentic RAG Qijie You et.al. 2602.19127 translate read null
2026-02-22 Dark and Bright Side of Participatory Red-Teaming with Targets of Stereotyping for Eliciting Harmful Behaviors from Large Language Models Sieun Kim et.al. 2602.19124 translate read null
2026-02-22 How Do LLMs Encode Scientific Quality? An Empirical Study Using Monosemantic Features from Sparse Autoencoders Michael McCoubrey et.al. 2602.19115 translate read null
2026-02-22 Universal 3D Shape Matching via Coarse-to-Fine Language Guidance Qinfeng Xiao et.al. 2602.19112 translate read null
2026-02-22 Astra: Activation-Space Tail-Eigenvector Low-Rank Adaptation of Large Language Models Kainan Liu et.al. 2602.19111 translate read null
2026-02-22 Value Entanglement: Conflation Between Different Kinds of Good In (Some) Large Language Models Seong Hah Cho et.al. 2602.19101 translate read null
2026-02-22 CREM: Compression-Driven Representation Enhancement for Multimodal Retrieval and Comprehension Lihao Liu et.al. 2602.19091 translate read null
2026-02-22 TriTopic: Tri-Modal Graph-Based Topic Modeling with Iterative Refinement and Archetypes Roman Egger et.al. 2602.19079 translate read null
2026-02-22 Evaluation and Benchmarking Suite for Financial Large Language Models and Agents Shengyuan Lin et.al. 2602.19073 translate read null
2026-02-22 IDLM: Inverse-distilled Diffusion Language Models David Li et.al. 2602.19066 translate read null
2026-02-22 Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents Chanjin Park et.al. 2602.19065 translate read null
2026-02-22 Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer Chenhang Cui et.al. 2602.19058 translate read null
2026-02-22 IAPO: Information-Aware Policy Optimization for Token-Efficient Reasoning Yinhan He et.al. 2602.19049 translate read null
2026-02-22 Uncovering Context Reliance in Unstructured Knowledge Editing Zisheng Zhou et.al. 2602.19043 translate read null
2026-02-22 Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning Jiahao Zhang et.al. 2602.19041 translate read null
2026-02-05 Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning Xuejun Zhang et.al. 2602.06041 translate read null
2026-02-05 SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs Jintao Tong et.al. 2602.06040 translate read link
2026-02-05 DyTopo: Dynamic Topology Routing for Multi-Agent Reasoning via Semantic Matching Yuxing Lu et.al. 2602.06039 translate read null
2026-02-05 Thinking with Geometry: Active Geometry Integration for Spatial Reasoning Haoyuan Li et.al. 2602.06037 translate read link
2026-02-05 DFlash: Block Diffusion for Flash Speculative Decoding Jian Chen et.al. 2602.06036 translate read link
2026-02-05 V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval Dongyang Chen et.al. 2602.06034 translate read link
2026-02-05 PhysicsAgentABM: Physics-Guided Generative Agent-Based Modeling Kavana Venkatesh et.al. 2602.06030 translate read null
2026-02-05 Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory Haozhen Zhang et.al. 2602.06025 translate read null
2026-02-05 Correctness-Optimized Residual Activation Lens (CORAL): Transferrable and Calibration-Aware Inference-Time Steering Miranda Muqing Miao et.al. 2602.06022 translate read null
2026-02-05 A Systematic Evaluation of Large Language Models for PTSD Severity Estimation: The Role of Contextual Knowledge and Modeling Strategies Panagiotis Kaliosis et.al. 2602.06015 translate read null
2026-02-05 AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions Xianyang Liu et.al. 2602.06008 translate read null
2026-02-05 VisRefiner: Learning from Visual Differences for Screenshot-to-Code Generation Jie Deng et.al. 2602.05998 translate read null
2026-02-05 DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs Lizhuo Luo et.al. 2602.05992 translate read link
2026-02-05 Layer-wise LoRA fine-tuning: a similarity metric approach Keith Ando Ogawa et.al. 2602.05988 translate read null
2026-02-05 From Human-Human Collaboration to Human-Agent Collaboration: A Vision, Design Philosophy, and an Empirical Framework for Achieving Successful Partnerships Between Humans and LLM Agents Bingsheng Yao et.al. 2602.05987 translate read null
2026-02-05 Inverse Depth Scaling From Most Layers Being Similar Yizhou Liu et.al. 2602.05970 translate read null
2026-02-05 Orthogonal Model Merging Sihan Yang et.al. 2602.05943 translate read null
2026-02-05 Polyglots or Multitudes? Multilingual LLM Answers to Value-laden Multiple-Choice Questions Léo Labat et.al. 2602.05932 translate read null
2026-02-05 Compound Deception in Elite Peer Review: A Failure Mode Taxonomy of 100 Fabricated Citations at NeurIPS 2025 Samar Ansari et.al. 2602.05930 translate read null
2026-02-05 KV-CoRE: Benchmarking Data-Dependent Low-Rank Compressibility of KV-Caches in LLMs Jian Chen et.al. 2602.05929 translate read null
2026-02-05 Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences Siquan Li et.al. 2602.05927 translate read null
2026-02-05 CLIP-Map: Structured Matrix Mapping for Parameter-Efficient CLIP Compression Kangjie Zhang et.al. 2602.05909 translate read null
2026-02-05 Codified Finite-state Machines for Role-playing Letian Peng et.al. 2602.05905 translate read null
2026-02-05 Regularized Calibration with Successive Rounding for Post-Training Quantization Seohyeon Cha et.al. 2602.05902 translate read null
2026-02-05 Stop Rewarding Hallucinated Steps: Faithfulness-Aware Step-Level Reinforcement Learning for Small Reasoning Models Shuo Nie et.al. 2602.05897 translate read null
2026-02-05 When Elo Lies: Hidden Biases in Codeforces-Based Evaluation of Large Language Models Shenyu Zheng et.al. 2602.05891 translate read null
2026-02-05 A Guide to Large Language Models in Modeling and Simulation: From Core Techniques to Critical Challenges Philippe J. Giabbanelli et.al. 2602.05883 translate read null
2026-02-05 EuroLLM-22B: Technical Report Miguel Moura Ramos et.al. 2602.05879 translate read null
2026-02-05 Agent2Agent Threats in Safety-Critical LLM Assistants: A Human-Centric Taxonomy Lukas Stappen et.al. 2602.05877 translate read null
2026-02-05 xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection Adrián Girón et.al. 2602.05874 translate read null
2026-02-05 DLM-Scope: Mechanistic Interpretability of Diffusion Language Models via Sparse Autoencoders Xu Wang et.al. 2602.05859 translate read null
2026-02-05 BABE: Biology Arena BEnchmark Junting Zhou et.al. 2602.05857 translate read null
2026-02-05 “It Talks Like a Patient, But Feels Different”: Co-Designing AI Standardized Patients with Medical Learners Zhiqi Gao et.al. 2602.05856 translate read null
2026-02-05 RRAttention: Dynamic Block Sparse Attention via Per-Head Round-Robin Shifts for Long-Context Inference Siran Liu et.al. 2602.05853 translate read null
2026-02-05 OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions Fangzhi Xu et.al. 2602.05843 translate read null
2026-02-05 Reinforcement World Model Learning for LLM-based Agents Xiao Yu et.al. 2602.05842 translate read null
2026-02-05 Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation Hai Zhang et.al. 2602.05827 translate read null
2026-02-05 Whispers of the Butterfly: A Research-through-Design Exploration of In-Situ Conversational AI Guidance in Large-Scale Outdoor MR Exhibitions Dongyijie Primo Pan et.al. 2602.05826 translate read null
2026-02-05 ToMigo: Interpretable Design Concept Graphs for Aligning Generative AI with Creative Intent Lena Hegemann et.al. 2602.05825 translate read null
2026-02-05 Authorship Drift: How Self-Efficacy and Trust Evolve During LLM-Assisted Writing Yeon Su Park et.al. 2602.05819 translate read null
2026-02-05 TKG-Thinker: Towards Dynamic Reasoning over Temporal Knowledge Graphs via Agentic Reinforcement Learning Zihao Jiang et.al. 2602.05818 translate read null
2026-02-05 Where Does Warm-Up Come From? Adaptive Scheduling for Norm-Constrained Optimizers Artem Riabinin et.al. 2602.05813 translate read null
2026-02-05 NEX: Neuron Explore-Exploit Scoring for Label-Free Chain-of-Thought Selection and Model Ranking Kang Chen et.al. 2602.05805 translate read null
2026-02-05 Task-Oriented Robot-Human Handovers on Legged Manipulators Andreea Tulbure et.al. 2602.05760 translate read null
2026-02-05 Towards Green AI: Decoding the Energy of LLM Inference in Software Development Lola Solovyeva et.al. 2602.05712 translate read null
2026-02-05 Determining Energy Efficiency Sweet Spots in Production LLM Inference Hiari Pizzini Cavagna et.al. 2602.05695 translate read null
2026-02-05 Consensus-Aligned Neuron Efficient Fine-Tuning Large Language Models for Multi-Domain Machine Translation Shuting Jiang et.al. 2602.05694 translate read null
2026-02-05 MedErrBench: A Fine-Grained Multilingual Benchmark for Medical Error Detection and Correction with Clinical Expert Annotations Congbo Ma et.al. 2602.05692 translate read null
2026-02-05 Exploring AI-Augmented Sensemaking of Patient-Generated Health Data: A Mixed-Method Study with Healthcare Professionals in Cardiac Risk Reduction Pavithren V S Pakianathan et.al. 2602.05687 translate read null
2026-02-05 Graph-based Agent Memory: Taxonomy, Techniques, and Applications Chang Yang et.al. 2602.05665 translate read null
2026-02-05 Alignment Verifiability in Large Language Models: Normative Indistinguishability under Behavioral Evaluation Igor Santos-Grueiro et.al. 2602.05656 translate read null
2026-02-05 Generative Ontology: When Structured Knowledge Learns to Create Benny Cheung et.al. 2602.05636 translate read null
2026-02-05 CASTLE: A Comprehensive Benchmark for Evaluating Student-Tailored Personalized Safety in Large Language Models Rui Jia et.al. 2602.05633 translate read null
2026-02-05 Rewards as Labels: Revisiting RLVR from a Classification Perspective Zepeng Zhai et.al. 2602.05630 translate read null
2026-02-05 AI chatbots versus human healthcare professionals: a systematic review and meta-analysis of empathy in patient care Alastair Howcroft et.al. 2602.05628 translate read null
2026-02-05 Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents Stephen Pilli et.al. 2602.05597 translate read null
2026-02-05 Multi-Task GRPO: Reliable LLM Reasoning Across Tasks Shyam Sundhar Ramesh et.al. 2602.05547 translate read null
2026-02-05 Reasoning-guided Collaborative Filtering with Language Models for Explainable Recommendation Fahad Anwaar et.al. 2602.05544 translate read null
2026-02-05 Split Personality Training: Revealing Latent Knowledge Through Alternate Personalities Florian Dietz et.al. 2602.05532 translate read null
2026-02-05 AI Agent Systems for Supply Chains: Structured Decision Prompts and Memory Retrieval Konosuke Yoshizato et.al. 2602.05524 translate read null
2026-02-05 Capture the Flags: Family-Based Evaluation of Agentic LLMs via Semantics-Preserving Transformations Shahin Honarvar et.al. 2602.05523 translate read null
2026-02-05 A Human-in-the-Loop, LLM-Centered Architecture for Knowledge-Graph Question Answering Larissa Pusch et.al. 2602.05512 translate read null
2026-02-05 Relying on LLMs: Student Practices and Instructor Norms are Changing in Computer Science Education Xinrui Lin et.al. 2602.05506 translate read null
2026-02-05 SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration Hanyu Wei et.al. 2602.05499 translate read null
2026-02-05 Transport and Merge: Cross-Architecture Merging for Large Language Models Chenhang Cui et.al. 2602.05495 translate read null
2026-02-05 A Unified Framework for Rethinking Policy Divergence Measures in GRPO Qingyuan Wu et.al. 2602.05494 translate read null
2026-02-05 LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation Bingru Li et.al. 2602.05493 translate read null
2026-02-05 Fine-Tuning Large Language Models for Automatic Detection of Sexually Explicit Content in Spanish-Language Song Lyrics Dolores Zamacola Sánchez de Lamadrid et.al. 2602.05485 translate read null
2026-02-05 Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection Takashi Koide et.al. 2602.05484 translate read null
2026-02-05 LMMRec: LLM-driven Motivation-aware Multimodal Recommendation Yicheng Di et.al. 2602.05474 translate read null
2026-02-05 ALIVE: Awakening LLM Reasoning via Adversarial Learning and Instructive Verbal Evaluation Yiwen Duan et.al. 2602.05472 translate read null
2026-02-05 Can We Classify Flaky Tests Using Only Test Code? An LLM-Based Empirical Study Alexander Berndt et.al. 2602.05465 translate read null
2026-02-05 DistillER: Knowledge Distillation in Entity Resolution with Large Language Models Alexandros Zeakis et.al. 2602.05452 translate read null
2026-02-05 BLITZRANK: Principled Zero-shot Ranking Agents with Tournament Graphs Sheshansh Agrawal et.al. 2602.05448 translate read null
2026-02-05 Structured Context Engineering for File-Native Agentic Systems: Evaluating Schema Accuracy, Format Effectiveness, and Multi-File Navigation at Scale Damon McMillan et.al. 2602.05447 translate read null
2026-02-05 DiLLS: Interactive Diagnosis of LLM-based Multi-agent Systems via Layered Summary of Agent Behaviors Rui Sheng et.al. 2602.05446 translate read null
2026-02-05 Causal Front-Door Adjustment for Robust Jailbreak Attacks on LLMs Yao Zhou et.al. 2602.05444 translate read null
2026-02-05 SciDef: Automating Definition Extraction from Academic Literature with Large Language Models Filip Kučera et.al. 2602.05413 translate read null
2026-02-05 BadTemplate: A Training-Free Backdoor Attack via Chat Template Against Large Language Models Zihan Wang et.al. 2602.05401 translate read null
2026-02-05 OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Shaobo Wang et.al. 2602.05400 translate read null
2026-02-05 Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better Ji Zhao et.al. 2602.05393 translate read null
2026-02-05 Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening Zhenxiong Yu et.al. 2602.05386 translate read null
2026-02-05 IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models Tao Liu et.al. 2602.05385 translate read null
2026-02-05 Clinical Validation of Medical-based Large Language Model Chatbots on Ophthalmic Patient Queries with LLM-based Evaluation Ting Fang Tan et.al. 2602.05381 translate read null
2026-02-05 Cross-Lingual Empirical Evaluation of Large Language Models for Arabic Medical Tasks Chaimae Abouzahir et.al. 2602.05374 translate read null
2026-02-05 Speech-XL: Towards Long-Form Speech Understanding in Large Speech Language Models Haoqin Sun et.al. 2602.05373 translate read null
2026-02-05 PACE: Defying the Scaling Hypothesis of Exploration in Iterative Alignment for Mathematical Reasoning Jun Rao et.al. 2602.05370 translate read null
2026-02-05 RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs Youngcheon You et.al. 2602.05367 translate read null
2026-02-05 Multi-Field Tool Retrieval Yichen Tang et.al. 2602.05366 translate read null
2026-02-05 Multimodal Latent Reasoning via Hierarchical Visual Cues Injection Yiming Zhang et.al. 2602.05359 translate read null
2026-02-05 AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction Ruijie Shi et.al. 2602.05353 translate read null
2026-02-05 SynAT: Enhancing Security Knowledge Bases via Automatic Synthesizing Attack Tree from Crowd Discussions Ziyou Jiang et.al. 2602.05329 translate read null
2026-02-05 ProAct: Agentic Lookahead in Interactive Environments Yangbin Yu et.al. 2602.05327 translate read null
2026-02-05 ORACL: Optimized Reasoning for Autoscaling via Chain of Thought with LLMs for Microservices Haoyu Bai et.al. 2602.05292 translate read null
2026-02-05 Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science Jingru Fan et.al. 2602.05289 translate read null
2026-02-05 Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities Pengyi Li et.al. 2602.05281 translate read null
2026-02-05 Hallucination-Resistant Security Planning with a Large Language Model Kim Hammar et.al. 2602.05279 translate read null
2026-02-05 Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs Qi Li et.al. 2602.05275 translate read null
2026-02-05 PatchGuru: Patch Oracle Inference from Natural Language Artifacts with Large Language Models Thanh Le-Cong et.al. 2602.05270 translate read null
2026-02-05 Hybrid Gated Flow (HGF): Stabilizing 1.58-bit LLMs via Selective Low-Rank Correction David Alejandro Trejo Pizzo et.al. 2602.05269 translate read null
2026-02-05 Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR Fanfan Liu et.al. 2602.05261 translate read null
2026-02-05 CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs Haoran Li et.al. 2602.05258 translate read null
2026-02-05 EGSS: Entropy-guided Stepwise Scaling for Reliable Software Engineering Chenhui Mao et.al. 2602.05242 translate read null
2026-02-05 FedMosaic: Federated Retrieval-Augmented Generation via Parametric Adapters Zhilin Liang et.al. 2602.05235 translate read null
2026-02-05 Surgery: Mitigating Harmful Fine-Tuning for Large Language Models via Attention Sink Guozhi Liu et.al. 2602.05228 translate read null
2026-02-05 E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching Jiahao Nie et.al. 2602.05215 translate read null
2026-02-05 Aligning Large Language Model Behavior with Human Citation Preferences Kenichiro Ando et.al. 2602.05205 translate read null
2026-02-05 Double-P: Hierarchical Top-P Sparse Attention for Long-Context LLMs Wentao Ni et.al. 2602.05191 translate read null
2026-02-05 Are Open-Weight LLMs Ready for Social Media Moderation? A Comparative Study on Bluesky Hsuan-Yu Chou et.al. 2602.05189 translate read null
2026-02-05 Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning John Yan et.al. 2602.05183 translate read null
2026-02-05 EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization Kevin Han et.al. 2602.05165 translate read null
2026-02-05 GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek Yang Zhang et.al. 2602.05150 translate read null
2026-02-05 CoSA: Compressed Sensing-Based Adaptation of Large Language Models Songtao Wei et.al. 2602.05148 translate read null
2026-02-04 HugRAG: Hierarchical Causal Knowledge Graph Design for RAG Nengbo Wang et.al. 2602.05143 translate read null
2026-02-04 SemPipes – Optimizable Semantic Data Operators for Tabular Machine Learning Pipelines Olga Ovcharenko et.al. 2602.05134 translate read null
2026-02-04 SocialVeil: Probing Social Intelligence of Language Agents under Communication Barriers Keyang Xuan et.al. 2602.05115 translate read null
2026-02-04 Understanding LLM Evaluator Behavior: A Structured Multi-Evaluator Framework for Merchant Risk Assessment Liang Wang et.al. 2602.05110 translate read null
2026-02-04 GAMMS: Graph based Adversarial Multiagent Modeling Simulator Rohan Patil et.al. 2602.05105 translate read null
2026-02-04 VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health Kate H. Bentley et.al. 2602.05088 translate read null
2026-02-04 Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents Changdae Oh et.al. 2602.05073 translate read null
2026-02-04 Evaluating Large Language Models on Solved and Unsolved Problems in Graph Theory: Implications for Computing Education Adithya Kulkarni et.al. 2602.05059 translate read null
2026-02-04 DeepRead: Document Structure-Aware Reasoning to Enhance Agentic Search Zhanli Li et.al. 2602.05014 translate read null
2026-02-04 Private PoEtry: Private In-Context Learning via Product of Experts Rob Romijnders et.al. 2602.05012 translate read null
2026-02-04 CoWork-X: Experience-Optimized Co-Evolution for Multi-Agent Collaboration System Zexin Lin et.al. 2602.05004 translate read null
2026-02-04 Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning Yu-Ang Lee et.al. 2602.04998 translate read null
2026-02-04 BioACE: An Automated Framework for Biomedical Answer and Citation Evaluations Deepak Gupta et.al. 2602.04982 translate read null
2026-02-04 Learning Context Matters: Measuring and Diagnosing Personalization Gaps in LLM-Based Instructional Design Johaun Hatchett et.al. 2602.04972 translate read null
2026-02-04 Large Language Models in Software Documentation and Modeling: A Literature Review and Findings Lukas Radosky et.al. 2602.04938 translate read null
2026-02-04 Linear Model Merging Unlocks Simple and Scalable Multimodal Data Mixture Optimization Davide Berasi et.al. 2602.04937 translate read null
2026-02-04 Depth-Wise Emergence of Prediction-Centric Geometry in Large Language Models Shahar Haim et.al. 2602.04931 translate read null
2026-02-04 TurboBoA: Faster and Exact Attention-aware Quantization without Backpropagation Junhan Kim et.al. 2602.04929 translate read link
2026-02-04 PriMod4AI: Lifecycle-Aware Privacy Threat Modeling for AI Systems using LLM Gautam Savaliya et.al. 2602.04927 translate read null
2026-02-04 Knowing When to Answer: Adaptive Confidence Refinement for Reliable Audio-Visual Question Answering Dinh Phu Tran et.al. 2602.04924 translate read null
2026-02-04 Gradually Compacting Large Language Models for Reasoning Like a Boiling Frog Yiran Zhao et.al. 2602.04919 translate read null
2026-02-04 Simulated Adoption: Decoupling Magnitude and Direction in LLM In-Context Conflict Resolution Long Zhang et.al. 2602.04918 translate read null
2026-02-04 AFD-INSTRUCTION: A Comprehensive Antibody Instruction Dataset with Functional Annotations for LLM-Based Understanding and Design Ling Luo et.al. 2602.04916 translate read null
2026-02-04 From Literature to Lab: Closed-Loop Advancement of Perovskite Solar Cells via Domain Knowledge Guided LLM Penglei Sun et.al. 2602.04914 translate read null
2026-02-04 A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model Xiaolin Hu et.al. 2602.04913 translate read null
2026-02-04 Reducing the Costs of Proof Synthesis on Rust Systems by Scaling Up a Seed Training Set Nongyu Di et.al. 2602.04910 translate read null
2026-02-04 Learning Where It Matters: Geometric Anchoring for Robust Preference Alignment Youngjae Cho et.al. 2602.04909 translate read null
2026-02-03 Evaluating Kubernetes Performance for GenAI Inference: From Automatic Speech Recognition to LLM Summarization Sai Sindhur Malleni et.al. 2602.04900 translate read null
2026-02-03 Steering Externalities: Benign Activation Steering Unintentionally Increases Jailbreak Risk for Large Language Models Chen Xiong et.al. 2602.04896 translate read null
2026-02-04 Reinforced Attention Learning Bangzheng Li et.al. 2602.04884 translate read null
2026-02-04 Rethinking the Trust Region in LLM Reinforcement Learning Penghui Qi et.al. 2602.04879 translate read null
2026-02-04 Multi-Head LatentMoE and Head Parallel: Communication-Efficient and Deterministic MoE Parallelism Chenwei Cui et.al. 2602.04870 translate read null
2026-02-04 Subliminal Effects in Your Data: A General Mechanism via Log-Linearity Ishaq Aden-Ali et.al. 2602.04863 translate read null
2026-02-04 CoT is Not the Chain of Truth: An Empirical Internal Analysis of Reasoning LLMs for Fake News Generation Zhao Tong et.al. 2602.04856 translate read null
2026-02-04 Decomposed Prompting Does Not Fix Knowledge Gaps, But Helps Models Say “I Don’t Know” Dhruv Madhwal et.al. 2602.04853 translate read null
2026-02-04 Horizon-LM: A RAM-Centric Architecture for LLM Training Zhengqing Yuan et.al. 2602.04816 translate read link
2026-02-04 Agentic AI in Healthcare & Medicine: A Seven-Dimensional Taxonomy for Empirical Evaluation of LLM-based Agents Shubham Vatsal et.al. 2602.04813 translate read null
2026-02-04 OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models Yue Ding et.al. 2602.04804 translate read null
2026-02-04 Team, Then Trim: An Assembly-Line LLM Framework for High-Quality Tabular Data Generation Congjing Zhang et.al. 2602.04785 translate read null
2026-02-04 NeuroCanvas: VLLM-Powered Robust Seizure Detection by Reformulating Multichannel EEG as Image Yan Chen et.al. 2602.04769 translate read null
2026-02-04 Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation Luis Frentzen Salim et.al. 2602.04764 translate read null
2026-02-04 When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond? Xinyu Zhou et.al. 2602.04755 translate read null
2026-02-04 Decomposing Query-Key Feature Interactions Using Contrastive Covariances Andrew Lee et.al. 2602.04752 translate read null
2026-02-04 Exploiting contextual information to improve stance detection in informal political discourse with LLMs Arman Engin Sucu et.al. 2602.04750 translate read null
2026-02-04 Inference-Time Reasoning Selectively Reduces Implicit Social Bias in Large Language Models Molly Apsel et.al. 2602.04742 translate read null
2026-02-04 Alignment Drift in Multimodal LLMs: A Two-Phase, Longitudinal Evaluation of Harm Across Eight Model Releases Casey Ford et.al. 2602.04739 translate read null
2026-02-04 From Data to Behavior: Predicting Unintended Model Behaviors Before Training Mengru Wang et.al. 2602.04735 translate read link
2026-02-04 Less Finetuning, Better Retrieval: Rethinking LLM Adaptation for Biomedical Retrievers via Synthetic Data and Model Merging Sameh Khattab et.al. 2602.04731 translate read null
2026-02-04 “Be My Cheese?”: Cultural Nuance Benchmarking for Machine Translation in Multilingual LLMs Madison Van Doren et.al. 2602.04729 translate read null
2026-02-04 Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation Marian Kica et.al. 2602.04726 translate read null
2026-02-04 SAR-RAG: ATR Visual Question Answering by Semantic Search, Retrieval, and MLLM Generation David F. Ramirez et.al. 2602.04712 translate read null
2026-02-04 LinGO: A Linguistic Graph Optimization Framework with LLMs for Interpreting Intents of Online Uncivil Discourse Yuan Zhang et.al. 2602.04693 translate read null
2026-02-04 UniAudio 2.0: A Unified Audio Language Model with Text-Aligned Factorized Audio Tokenization Dongchao Yang et.al. 2602.04683 translate read link
2026-02-04 Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility Eun Cheol Choi et.al. 2602.04674 translate read null
2026-02-04 Relational Scene Graphs for Object Grounding of Natural Language Commands Julia Kuhn et.al. 2602.04635 translate read null
2026-02-04 WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning Zelai Xu et.al. 2602.04634 translate read link
2026-02-04 Disentangling meaning from language in LLM-based machine translation Théo Lasnier et.al. 2602.04613 translate read null
2026-02-04 Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection Junhao Liu et.al. 2602.04607 translate read null
2026-02-04 Automated Extraction of Multicomponent Alloy Data Using Large Language Models for Sustainable Design Aravindan Kamatchi Sundaram et.al. 2602.04602 translate read null
2026-02-04 Harmonia: Algorithm-Hardware Co-Design for Memory- and Compute-Efficient BFP-based LLM Inference Xinyu Wang et.al. 2602.04595 translate read null
2026-02-04 AIANO: Enhancing Information Retrieval with AI-Augmented Annotation Sameh Khattab et.al. 2602.04579 translate read null
2026-02-04 Semantic Self-Distillation for Language Model Uncertainty Edward Phillips et.al. 2602.04577 translate read null
2026-02-04 Can LLMs capture stable human-generated sentence entropy measures? Estrella Pivel-Villanueva et.al. 2602.04570 translate read null
2026-02-04 LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding Gang Lin et.al. 2602.04541 translate read null
2026-02-04 HoliAntiSpoof: Audio LLM for Holistic Speech Anti-Spoofing Xuenan Xu et.al. 2602.04535 translate read null
2026-02-04 Landscape-aware Automated Algorithm Design: An Efficient Framework for Real-world Optimization Haoran Yin et.al. 2602.04529 translate read null
2026-02-04 OSCAgent: Accelerating the Discovery of Organic Solar Cells with LLM Agents Zhaolin Hu et.al. 2602.04510 translate read null
2026-02-04 Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models Hyeontaek Hwang et.al. 2602.04509 translate read null
2026-02-04 ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control Zhentao Tang et.al. 2602.04496 translate read null
2026-02-04 PersoDPO: Scalable Preference Optimization for Instruction-Adherent, Persona-Grounded Dialogue via Multi-LLM Evaluation Saleh Afzoon et.al. 2602.04493 translate read null
2026-02-04 The Supportiveness-Safety Tradeoff in LLM Well-Being Agents Himanshi Lalwani et.al. 2602.04487 translate read null
2026-02-04 Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition Jinlong Ma et.al. 2602.04486 translate read null
2026-02-04 Vision-aligned Latent Reasoning for Multi-modal Large Language Model Byungwoo Jeon et.al. 2602.04476 translate read null
2026-02-04 LLM-Empowered Cooperative Content Caching in Vehicular Fog Caching-Assisted Platoon Networks Bowen Tan et.al. 2602.04471 translate read null
2026-02-04 DOS: Dual-Flow Orthogonal Semantic IDs for Recommendation in Meituan Junwei Yin et.al. 2602.04460 translate read null
2026-02-04 Growth First, Care Second? Tracing the Landscape of LLM Value Preferences in Everyday Dilemmas Zhiyi Chen et.al. 2602.04456 translate read null
2026-02-04 Seg-ReSearch: Segmentation with Interleaved Reasoning and External Search Tianming Liang et.al. 2602.04454 translate read link
2026-02-04 SDR-CIR: Semantic Debias Retrieval Framework for Training-Free Zero-Shot Composed Image Retrieval Yi Sun et.al. 2602.04451 translate read null
2026-02-04 What’s in a Benchmark? The Case of SWE-Bench in Automated Program Repair Matias Martinez et.al. 2602.04449 translate read null
2026-02-04 Fine-Grained Activation Steering: Steering Less, Achieving More Zijian Feng et.al. 2602.04428 translate read null
2026-02-04 Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning Heqing Yang et.al. 2602.04419 translate read null
2026-02-04 EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL Lunjun Zhang et.al. 2602.04417 translate read null
2026-02-04 History-Guided Iterative Visual Reasoning with Self-Correction Xinglong Yang et.al. 2602.04413 translate read null
2026-02-04 Bi-directional Bias Attribution: Debiasing Large Language Models without Modifying Prompts Yujie Lin et.al. 2602.04398 translate read null
2026-02-04 Evaluating the Presence of Sex Bias in Clinical Reasoning by Large Language Models Isabel Tsintsiper et.al. 2602.04392 translate read null
2026-02-04 Beyond Rejection Sampling: Trajectory Fusion for Scaling Mathematical Reasoning Jie Deng et.al. 2602.04391 translate read null
2026-02-04 On the use of LLMs to generate a dataset of Neural Networks Nadia Daoudi et.al. 2602.04388 translate read null
2026-02-04 Multi-scale hypergraph meets LLMs: Aligning large language models for time series analysis Zongjiang Shang et.al. 2602.04369 translate read null
2026-02-04 EXaMCaP: Subset Selection with Entropy Gain Maximization for Probing Capability Gains of Large Chart Understanding Training Sets Jiapeng Liu et.al. 2602.04365 translate read null
2026-02-04 Generative AI in Systems Engineering: A Framework for Risk Assessment of Large Language Models Stefan Otten et.al. 2602.04358 translate read null
2026-02-04 Can Vision Replace Text in Working Memory? Evidence from Spatial n-Back in Vision-Language Models Sichu Liang et.al. 2602.04355 translate read null
2026-02-04 UnMaskFork: Test-Time Scaling for Masked Diffusion via Deterministic Action Branching Kou Misaki et.al. 2602.04344 translate read null
2026-02-04 From Assumptions to Actions: Turning LLM Reasoning into Uncertainty-Aware Planning for Embodied Agents SeungWon Seo et.al. 2602.04326 translate read null
2026-02-04 A Domain-Specific Curated Benchmark for Entity and Document-Level Relation Extraction Marco Martinelli et.al. 2602.04320 translate read null
2026-02-04 DeFrame: Debiasing Large Language Models Against Framing Effects Kahee Lim et.al. 2602.04306 translate read null
2026-02-04 Revisiting Prompt Sensitivity in Large Language Models for Text Classification: The Role of Prompt Underspecification Branislav Pecher et.al. 2602.04297 translate read null
2026-02-04 ProxyWar: Dynamic Assessment of LLM Code Generation in Game Arenas Wenjun Peng et.al. 2602.04296 translate read link
2026-02-04 How Few-shot Demonstrations Affect Prompt-based Defenses Against LLM Jailbreak Attacks Yanshu Wang et.al. 2602.04294 translate read null
2026-02-04 Disentangling Causal Importance from Emergent Structure in Multi-Expert Orchestration Sudipto Ghosh et.al. 2602.04291 translate read null
2026-02-04 Guided Verifier: Collaborative Multimodal Reasoning via Dynamic Process Supervision Lingzhuang Sun et.al. 2602.04290 translate read null
2026-02-04 Contextual Drag: How Errors in the Context Affect LLM Reasoning Yun Cheng et.al. 2602.04288 translate read null
2026-02-04 ECG-R1: Protocol-Guided and Modality-Agnostic MLLM for Reliable ECG Interpretation Jiarui Jin et.al. 2602.04279 translate read link
2026-02-04 MiniRec: Data-Efficient Reinforcement Learning for LLM-based Recommendation Lin Wang et.al. 2602.04278 translate read null
2026-02-04 KVSmooth: Mitigating Hallucination in Multi-modal Large Language Models through Key-Value Smoothing Siyu Jiang et.al. 2602.04268 translate read null
2026-02-04 Thickening-to-Thinning: Reward Shaping via Human-Inspired Learning Dynamics for LLM Reasoning Wenze Lin et.al. 2602.04265 translate read null
2026-02-04 Data Agents: Levels, State of the Art, and Open Problems Yuyu Luo et.al. 2602.04261 translate read null
2026-02-04 Scaling Agentic Verifier for Competitive Coding Zeyao Ma et.al. 2602.04254 translate read null
2026-02-04 Empirical-MCTS: Continuous Agent Evolution via Dual-Experience Monte Carlo Tree Search Hao Lu et.al. 2602.04248 translate read null
2026-02-04 CoLT: Reasoning with Chain of Latent Tool Calls Fangwei Zhu et.al. 2602.04246 translate read null
2026-02-04 On the Uncertainty of Large Language Model-Based Multi-Agent Systems Yuxuan Zhao et.al. 2602.04234 translate read null
2026-02-04 Following the TRAIL: Predicting and Explaining Tomorrow’s Hits with a Fine-Tuned LLM Yinan Zhang et.al. 2602.04225 translate read null
2026-02-04 Language Models Struggle to Use Representations Learned In-Context Michael A. Lepori et.al. 2602.04212 translate read null
2026-02-04 Steering LLMs via Scalable Interactive Oversight Enyu Zhou et.al. 2602.04210 translate read null
2026-02-04 Enforcing Monotonic Progress in Legal Cross-Examination: Preventing Long-Horizon Stagnation in LLM-Based Inquiry Hsien-Jyh Liao et.al. 2602.04206 translate read null
2026-02-04 Semantic Consensus Decoding: Backdoor Defense for Verilog Code Generation Guang Yang et.al. 2602.04195 translate read null
2026-02-04 SOGPTSpotter: Detecting ChatGPT-Generated Answers on Stack Overflow Suyu Ma et.al. 2602.04185 translate read null
2026-02-04 I Can’t Believe It’s Not a Valid Exploit Derin Gezgin et.al. 2602.04165 translate read null
2026-02-04 BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models Junyu Chen et.al. 2602.04163 translate read null
2026-02-04 Paint by Odor: An Exploration of Odor Visualization through Large Language Model and Generative AI Gang Yu et.al. 2602.04159 translate read null
2026-02-04 A Modern System Recipe for Situated Embodied Human-Robot Conversation with Real-Time Multimodal LLMs and Tool-Calling Dong Won Lee et.al. 2602.04157 translate read null
2026-02-04 JSynFlow: Japanese Synthesised Flowchart Visual Question Answering Dataset built with Large Language Models Hiroshi Sasaki et.al. 2602.04142 translate read null
2026-02-04 Semantic Pilot Design for Data-Aided Channel Estimation Using a Large Language Model Sojeong Park et.al. 2602.04126 translate read null
2026-02-04 Making Videos Accessible for Blind and Low Vision Users Using a Multimodal Agent Video Player Adriana Olmos et.al. 2602.04104 translate read null
2026-02-04 Rethinking Perplexity: Revealing the Impact of Input Length on Perplexity Evaluation in LLMs Letian Cheng et.al. 2602.04099 translate read null
2026-02-03 Scaling In-Context Online Learning Capability of LLMs via Cross-Episode Meta-RL Xiaofeng Lin et.al. 2602.04089 translate read null
2026-02-03 Abstraction Induces the Brain Alignment of Language and Speech Models Emily Cheng et.al. 2602.04081 translate read null
2026-02-03 Stroke Lesions as a Rosetta Stone for Language Model Interpretability Julius Fridriksson et.al. 2602.04074 translate read null
2026-02-03 Data Verification is the Future of Quantum Computing Copilots Junhao Song et.al. 2602.04072 translate read null
2026-02-03 Exploring the Potential of Large Language Models in Simulink-Stateflow Mutant Generation Pablo Valle et.al. 2602.04066 translate read null
2026-02-03 The CitizenQuery Benchmark: A Novel Dataset and Evaluation Pipeline for Measuring LLM Performance in Citizen Query Tasks Neil Majithia et.al. 2602.04064 translate read null
2026-02-03 RareCollab – An Agentic System Diagnosing Mendelian Disorders with Integrated Phenotypic and Molecular Evidence Guantong Qi et.al. 2602.04058 translate read null
2026-02-03 Evaluating the Vulnerability Landscape of LLM-Generated Smart Contracts Hoang Long Do et.al. 2602.04039 translate read null
2026-02-03 On the Credibility of Evaluating LLMs using Survey Questions Jindřich Libovický et.al. 2602.04033 translate read null
2026-02-03 Understanding and Guiding Layer Placement in Parameter-Efficient Fine-Tuning of Large Language Models Yichen Xu et.al. 2602.04019 translate read null
2026-02-03 Chaplains’ Reflections on the Design and Usage of AI for Conversational Care Joel Wester et.al. 2602.04017 translate read null
2026-02-03 PromptSplit: Revealing Prompt-Level Disagreement in Generative Models Mehdi Lotfian et.al. 2602.04009 translate read null
2026-02-03 StraTyper: Automated Semantic Type Discovery and Multi-Type Annotation for Dataset Collections Christos Koutras et.al. 2602.04004 translate read null
2026-02-03 When AI Persuades: Adversarial Explanation Attacks on Human Trust in AI-Assisted Decision Making Shutong Fan et.al. 2602.04003 translate read null
2026-02-03 After Talking with 1,000 Personas: Learning Preference-Aligned Proactive Assistants From Large-Scale Persona Interactions Ziyi Xuan et.al. 2602.04000 translate read null
2026-02-03 When Chains of Thought Don’t Matter: Causal Bypass in Large Language Models Anish Sathyanarayanan et.al. 2602.03994 translate read null
2026-02-03 Likelihood-Based Reward Designs for General LLM Reasoning Ariel Kwiatkowski et.al. 2602.03979 translate read null
2026-02-03 Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure Shuhui Qu et.al. 2602.03975 translate read null
2026-02-03 Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem Shama Magnur et.al. 2602.03969 translate read null
2026-02-03 Automatic Classification of Pedagogical Materials against CS Curriculum Guidelines Erik Saule et.al. 2602.03962 translate read null
2026-02-03 AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent Yinyi Luo et.al. 2602.03955 translate read link
2026-02-03 SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild? Azmine Toushik Wasi et.al. 2602.03916 translate read link
2026-02-03 Knowledge Model Prompting Increases LLM Performance on Planning Tasks Erik Goh et.al. 2602.03900 translate read null
2026-02-03 Audit After Segmentation: Reference-Free Mask Quality Assessment for Language-Referred Audio-Visual Segmentation Jinxing Zhou et.al. 2602.03892 translate read null
2026-02-03 4DPC$^2$hat: Towards Dynamic Point Cloud Understanding with Failure-Aware Bootstrapping Xindan Zhang et.al. 2602.03890 translate read null
2026-02-03 Understanding and Exploiting Weight Update Sparsity for Communication-Efficient Distributed RL Erfan Miahi et.al. 2602.03839 translate read null
2026-02-03 Accelerating Scientific Research with Gemini: Case Studies and Common Techniques David P. Woodruff et.al. 2602.03837 translate read null
2026-02-03 Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning Dingkun Zhang et.al. 2602.03815 translate read null
2026-02-03 Conformal Thinking: Risk Control for Reasoning on a Compute Budget Xi Wang et.al. 2602.03814 translate read null
2026-02-03 Antidistillation Fingerprinting Yixuan Even Xu et.al. 2602.03812 translate read null
2026-02-03 Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation Ziru Chen et.al. 2602.03806 translate read link
2026-02-03 Context Compression via Explicit Information Transmission Jiangnan Ye et.al. 2602.03784 translate read null
2026-02-03 Efficient Estimation of Kernel Surrogate Models for Task Attribution Zhenshuo Zhang et.al. 2602.03783 translate read null
2026-02-03 QVLA: Not All Channels Are Equal in Vision-Language-Action Model’s Quantization Yuhao Xu et.al. 2602.03782 translate read null
2026-02-03 A Scene Graph Backed Approach to Open Set Semantic Mapping Martin Günther et.al. 2602.03781 translate read null
2026-02-03 An Empirical Study of Collective Behaviors and Social Dynamics in Large Language Model Agents Farnoosh Hashemi et.al. 2602.03775 translate read null
2026-02-03 Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL Ian Wu et.al. 2602.03773 translate read null
2026-02-03 UniGeM: Unifying Data Mixing and Selection via Geometric Exploration and Mining Changhao Wang et.al. 2602.03772 translate read null
2026-02-03 Training Multi-Turn Search Agent via Contrastive Dynamic Branch Sampling Yubao Zhao et.al. 2602.03719 translate read null
2026-02-03 SWE-Refactor: A Repository-Level Benchmark for Real-World LLM-Based Code Refactoring Yisen Xu et.al. 2602.03712 translate read null
2026-02-03 No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding Vynska Amalia Permadi et.al. 2602.03709 translate read null
2026-02-03 Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States Ximing Dong et.al. 2602.03708 translate read null
2026-02-03 Cognitively Diverse Multiple-Choice Question Generation: A Hybrid Multi-Agent Framework with Large Language Models Yu Tian et.al. 2602.03704 translate read null
2026-02-03 Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging Alexandru Meterez et.al. 2602.03702 translate read null
2026-02-03 Conflict-Resolving and Sharpness-Aware Minimization for Generalized Knowledge Editing with Multiple Updates Duy Nguyen et.al. 2602.03696 translate read null
2026-02-03 LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization Zishi Zhang et.al. 2602.03690 translate read null
2026-02-03 Universal One-third Time Scaling in Learning Peaked Distributions Yizhou Liu et.al. 2602.03685 translate read null
2026-02-03 Instruction Anchors: Dissecting the Causal Dynamics of Modality Arbitration Yu Zhang et.al. 2602.03677 translate read null
2026-02-03 Mitigating Conversational Inertia in Multi-Turn Agents Yang Wan et.al. 2602.03664 translate read null
2026-02-03 Reinforcement Fine-Tuning for History-Aware Dense Retriever in RAG Yicheng Zhang et.al. 2602.03645 translate read null
2026-02-03 TRE: Encouraging Exploration in the Trust Region Chao Huang et.al. 2602.03635 translate read link
2026-02-03 Can LLMs Do Rocket Science? Exploring the Limits of Complex Reasoning with GTOC 12 Iñaki del Campo et.al. 2602.03630 translate read null
2026-02-03 Toward a new AI winter? How diffusion of technological innovation on networks leads to chaotic boom-bust cycles Sabin Roman et.al. 2602.03620 translate read null
2026-02-03 Controlling Output Rankings in Generative Engines for LLM-based Search Haibo Jin et.al. 2602.03608 translate read null
2026-02-03 Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation Haichao Jiang et.al. 2602.03595 translate read null
2026-02-03 SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM Ming Nie et.al. 2602.03589 translate read null
2026-02-03 $V_0$: A Generalist Value Model for Any Policy at State Zero Yi-Kai Zhang et.al. 2602.03584 translate read null
2026-02-03 Don’t believe everything you read: Understanding and Measuring MCP Behavior under Misleading Tool Descriptions Zhihao Li et.al. 2602.03580 translate read null
2026-02-03 Use Graph When It Needs: Efficiently and Adaptively Integrating Retrieval-Augmented Generation with Graphs Su Dong et.al. 2602.03578 translate read null
2026-02-03 EHRWorld: A Patient-Centric Medical World Model for Long-Horizon Clinical Trajectories Linjie Mu et.al. 2602.03569 translate read null
2026-02-03 CoGenCast: A Coupled Autoregressive-Flow Generative Framework for Time Series Forecasting Yaguo Liu et.al. 2602.03564 translate read null
2026-02-03 Scaling Test-Driven Code Generation from Functions to Classes: An Empirical Study Yunhao Liang et.al. 2602.03557 translate read null
2026-02-03 When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs Bogdan Zagribelnyy et.al. 2602.03554 translate read null
2026-02-03 Assessing the Impact of Typological Features on Multilingual Machine Translation in the Age of Large Language Models Vitalii Hirak et.al. 2602.03551 translate read null
2026-02-03 SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue Yuqin Dai et.al. 2602.03548 translate read link
2026-02-03 Persona Generators: Generating Diverse Synthetic Personas at Scale Davide Paglieri et.al. 2602.03545 translate read null
2026-02-03 Can Large Language Models Generalize Procedures Across Representations? Fangru Lin et.al. 2602.03542 translate read null
2026-02-03 PnP-U3D: Plug-and-Play 3D Framework Bridging Autoregression and Diffusion for Unified Understanding and Generation Yongwei Chen et.al. 2602.03533 translate read null
2026-02-03 Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning Zixiang Di et.al. 2602.03516 translate read null
2026-02-03 Learning to Reason Faithfully through Step-Level Faithfulness Maximization Runquan Gui et.al. 2602.03507 translate read null
2026-02-03 Lookahead Path Likelihood Optimization for Diffusion LLMs Xuejie Liu et.al. 2602.03496 translate read null
2026-02-03 IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning Haohao Luo et.al. 2602.03468 translate read null
2026-02-03 Quantum Circuit Generation via test-time learning with large language models Adriano Macarone-Palmieri et.al. 2602.03466 translate read null
2026-02-03 RAL-Bench: Benchmarking for Application-Level Functional Correctness and Non-Functional Quality Attributes Ruwei Pan et.al. 2602.03462 translate read null
2026-02-03 Contextualized Visual Personalization in Vision-Language Models Yeongtak Oh et.al. 2602.03454 translate read null
2026-02-03 Beyond Variance: Prompt-Efficient RLVR via Rare-Event Amplification and Bidirectional Pairing Xin Sheng et.al. 2602.03452 translate read null
2026-02-03 Ontology-to-tools compilation for executable semantic constraint enforcement in LLM agents Xiaochi Zhou et.al. 2602.03439 translate read null
2026-02-03 When control meets large language models: From words to dynamics Komeil Nosrati et.al. 2602.03433 translate read null
2026-02-03 ProAct: A Benchmark and Multimodal Framework for Structure-Aware Proactive Response Xiaomeng Zhu et.al. 2602.03430 translate read null
2026-02-03 DiscoverLLM: From Executing Intents to Discovering Them Tae Soo Kim et.al. 2602.03429 translate read null
2026-02-03 RankSteer: Activation Steering for Pointwise LLM Ranking Yumeng Wang et.al. 2602.03422 translate read null
2026-02-03 SWE-World: Building Software Engineering Agents in Docker-Free Environments Shuang Sun et.al. 2602.03419 translate read link
2026-02-03 Socratic-Geo: Synthetic Data Generation and Geometric Reasoning via Multi-Agent Interaction Zhengbo Jiao et.al. 2602.03414 translate read null
2026-02-03 Verified Critical Step Optimization for LLM Agents Mukai Li et.al. 2602.03412 translate read null
2026-02-03 Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility Mengxuan Wang et.al. 2602.03402 translate read null
2026-02-03 Precision in Practice: Knowledge Guided Code Summarizing Grounded in Industrial Expectations Jintai Li et.al. 2602.03400 translate read null
2026-02-03 Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective Hao Fang et.al. 2602.03396 translate read null
2026-02-03 On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models Shumin Wang et.al. 2602.03392 translate read null
2026-02-03 Pursuing Best Industrial Practices for Retrieval-Augmented Generation in the Medical Domain Wei Zhu et.al. 2602.03368 translate read null
2026-02-03 MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling Ning Ding et.al. 2602.03359 translate read null
2026-02-03 MentalSeek-Dx: Towards Progressive Hypothetico-Deductive Reasoning for Real-world Psychiatric Diagnosis Xiao Sun et.al. 2602.03340 translate read null
2026-02-03 The Personality Trap: How LLMs Embed Bias When Generating Human-Like Personas Jacopo Amidei et.al. 2602.03334 translate read null
2026-02-03 MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning Shengyuan Liu et.al. 2602.03320 translate read null
2026-02-03 MIRROR: A Multi-Agent Framework with Iterative Adaptive Revision and Hierarchical Retrieval for Optimization Modeling in Operations Research Yifan Shi et.al. 2602.03318 translate read null
2026-02-03 Multi-Level Testing of Conversational AI Systems Elena Masserini et.al. 2602.03311 translate read null
2026-02-03 Entropy-Gated Selective Policy Optimization: Token-Level Gradient Allocation for Hybrid Training of Large Language Models Yuelin Hu et.al. 2602.03309 translate read null
2026-02-03 medR: Reward Engineering for Clinical Offline Reinforcement Learning via Tri-Drive Potential Functions Qianyi Xu et.al. 2602.03305 translate read null
2026-02-03 R1-SyntheticVL: Is Synthetic Data from Generative Models Ready for Multimodal Large Language Model? Jingyi Zhang et.al. 2602.03300 translate read null
2026-02-03 POP: Prefill-Only Pruning for Efficient Large Model Inference Junhui He et.al. 2602.03295 translate read null
2026-02-03 Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis Zhengbo Jiao et.al. 2602.03279 translate read null
2026-02-03 LogicScan: An LLM-driven Framework for Detecting Business Logic Vulnerabilities in Smart Contracts Jiaqi Gao et.al. 2602.03271 translate read null
2026-02-03 Beyond Suffixes: Token Position in GCG Adversarial Attacks on Large Language Models Hicham Eddoubi et.al. 2602.03265 translate read null
2026-02-03 CSR-Bench: A Benchmark for Evaluating the Cross-modal Safety and Reliability of MLLMs Yuxuan Liu et.al. 2602.03263 translate read null
2026-02-03 The Necessity of a Unified Framework for LLM-Based Agent Evaluation Pengyu Zhu et.al. 2602.03238 translate read null
2026-02-03 Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations Yuxuan Yao et.al. 2602.03237 translate read null
2026-02-03 EventFlash: Towards Efficient MLLMs for Event-Based Vision Shaoyu Liu et.al. 2602.03230 translate read null
2026-02-03 Spiral RoPE: Rotate Your Rotary Positional Embeddings in the 2D Plane Haoyu Liu et.al. 2602.03227 translate read null
2026-02-03 ATACompressor: Adaptive Task-Aware Compression for Efficient Long-Context Processing in LLMs Xuancheng Li et.al. 2602.03226 translate read null
2026-02-03 Beyond Quantity: Trajectory Diversity Scaling for Code Agents Guhong Chen et.al. 2602.03219 translate read null
2026-02-03 Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection Dongwon Jo et.al. 2602.03216 translate read null
2026-02-03 ForesightKV: Optimizing KV Cache Eviction for Reasoning Models by Learning Long-Term Contribution Zican Dong et.al. 2602.03203 translate read null
2026-02-03 Reinforcement Learning with Promising Tokens for Large Language Models Jing-Cheng Pang et.al. 2602.03195 translate read null
2026-02-03 Prompt Augmentation Scales up GRPO Training on Mathematical Reasoning Wenquan Lu et.al. 2602.03190 translate read null
2026-02-03 DynSplit-KV: Dynamic Semantic Splitting for KVCache Compression in Efficient Long-Context LLM Inference Jiancai Ye et.al. 2602.03184 translate read null
2026-02-03 Privasis: Synthesizing the Largest “Public” Private Dataset from Scratch Hyunwoo Kim et.al. 2602.03183 translate read null
2026-02-03 VALUEFLOW: Toward Pluralistic and Steerable Value-based Alignment in Large Language Models Woojin Kim et.al. 2602.03160 translate read null
2026-02-03 PAMAS: Self-Adaptive Multi-Agent System with Perspective Aggregation for Misinformation Detection Zongwei Wang et.al. 2602.03158 translate read null
2026-02-03 Is It Possible to Make Chatbots Virtuous? Investigating a Virtue-Based Design Methodology Applied to LLMs Matthew P. Lad et.al. 2602.03155 translate read null
2026-02-03 FASA: Frequency-aware Sparse Attention Yifei Wang et.al. 2602.03152 translate read null
2026-02-03 Internet of Agentic AI: Incentive-Compatible Distributed Teaming and Workflow Ya-Ting Yang et.al. 2602.03145 translate read null
2026-02-03 Self-Hinting Language Models Enhance Reinforcement Learning Baohao Liao et.al. 2602.03143 translate read null
2026-02-03 Contrastive Concept-Tree Search for LLM-Assisted Algorithm Discovery Timothee Leleu et.al. 2602.03132 translate read null
2026-02-03 Understanding Multi-Agent LLM Frameworks: A Unified Benchmark and Experimental Analysis Abdelghny Orogat et.al. 2602.03128 translate read null
2026-02-03 Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost Yinggan Xu et.al. 2602.03120 translate read null
2026-02-03 Digital Lifelong Learning in the Age of AI: Trends and Insights Geeta Puri et.al. 2602.03114 translate read null
2026-02-03 ChemPro: A Progressive Chemistry Benchmark for Large Language Models Aaditya Baranwal et.al. 2602.03108 translate read null
2026-02-03 The Mask of Civility: Benchmarking Chinese Mock Politeness Comprehension in Large Language Models Yitong Zhang et.al. 2602.03107 translate read null
2026-02-03 Task–Specificity Score: Measuring How Much Instructions Really Matter for Supervision Pritam Kadasi et.al. 2602.03103 translate read null
2026-02-03 Consensus Group Relative Policy Optimization for Text Generation Yuki Ichihara et.al. 2602.03102 translate read null
2026-02-03 Risky-Bench: Probing Agentic Safety Risks under Real-World Deployment Jingnan Zheng et.al. 2602.03100 translate read null
2026-02-03 De-conflating Preference and Qualification: Constrained Dual-Perspective Reasoning for Job Recommendation with Large Language Models Bryce Kan et.al. 2602.03097 translate read null
2026-02-03 Test-time Recursive Thinking: Self-Improvement without External Feedback Yufan Zhuang et.al. 2602.03094 translate read null
2026-02-03 AERO: Autonomous Evolutionary Reasoning Optimization via Endogenous Dual-Loop Feedback Zhitao Gao et.al. 2602.03084 translate read null
2026-02-03 ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution Junjie Huang et.al. 2602.03075 translate read null
2026-02-03 TMS: Trajectory-Mixed Supervision for Reward-Free, On-Policy SFT Rana Muhammad Shahroz Khan et.al. 2602.03073 translate read null
2026-02-03 ProOPF: Benchmarking and Improving LLMs for Professional-Grade Power Systems Optimization Modeling Chao Shen et.al. 2602.03070 translate read null
2026-02-03 Skill-Based Autonomous Agents for Material Creep Database Construction Yue Wu et.al. 2602.03069 translate read null
2026-02-03 ALPBench: A Benchmark for Attribution-level Long-term Personal Behavior Understanding Lu Ren et.al. 2602.03056 translate read null
2026-02-03 MAS-ProVe: Understanding the Process Verification of Multi-Agent Systems Vishal Venkataramani et.al. 2602.03053 translate read null
2026-02-03 SAES-SVD: Self-Adaptive Suppression of Accumulated and Local Errors for SVD-based LLM Compression Xing Hu et.al. 2602.03051 translate read null
2026-02-03 Clarify Before You Draw: Proactive Agents for Robust Text-to-CAD Generation Bo Yuan et.al. 2602.03045 translate read null
2026-02-03 LatentMem: Customizing Latent Memory for Multi-Agent Systems Muxin Fu et.al. 2602.03036 translate read null
2026-02-03 Generalizable and Interpretable RF Fingerprinting with Shapelet-Enhanced Large Language Models Tianya Zhao et.al. 2602.03035 translate read null
2026-02-03 RC-GRPO: Reward-Conditioned Group Relative Policy Optimization for Multi-Turn Tool Calling Agents Haitian Zhong et.al. 2602.03025 translate read null
2026-02-03 Rethinking Music Captioning with Music Metadata LLMs Irmak Bukey et.al. 2602.03023 translate read null
2026-02-03 STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models Jiliang Ni et.al. 2602.03022 translate read null
2026-02-03 FedKRSO: Communication and Memory Efficient Federated Fine-Tuning of Large Language Models Guohao Yang et.al. 2602.03019 translate read null
2026-02-03 VOILA: Value-of-Information Guided Fidelity Selection for Cost-Aware Multimodal Question Answering Rahul Atul Bhope et.al. 2602.03007 translate read null
2026-02-03 Distilling LLM Reasoning into Graph of Concept Predictors Ziyang Yu et.al. 2602.03006 translate read null
2026-02-03 Methods and Open Problems in Differentiable Social Choice: Learning Mechanisms, Decisions, and Alignment Zhiyu An et.al. 2602.03003 translate read null
2026-02-03 Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation Jiaze Li et.al. 2602.02994 translate read null
2026-02-03 Large Language Models Can Take False First Steps at Inference-time Planning Haijiang Yan et.al. 2602.02991 translate read null
2026-02-03 NLI: Non-uniform Linear Interpolation Approximation of Nonlinear Operations for Efficient LLMs Inference Jiangyong Yu et.al. 2602.02988 translate read null
2026-02-03 Large-Scale LLM Inference with Heterogeneous Workloads: Prefill-Decode Contention and Asymptotically Optimal Control Ruihan Lin et.al. 2602.02987 translate read null
2026-02-03 Are LLMs Biased Like Humans? Causal Reasoning as a Function of Prior Knowledge, Irrelevant Information, and Reasoning Budget Hanna M. Dettki et.al. 2602.02983 translate read null
2026-02-03 CPMobius: Iterative Coach-Player Reasoning for Data-Free Reinforcement Learning Ran Li et.al. 2602.02979 translate read null
2026-02-03 Where Norms and References Collide: Evaluating LLMs on Normative Reasoning Mitchell Abrams et.al. 2602.02975 translate read null
2026-02-03 Testing Framework Migration with Large Language Models Altino Alves et.al. 2602.02964 translate read null
2026-02-03 Generative Engine Optimization: A VLM and Agent Framework for Pinterest Acquisition Growth Faye Zhang et.al. 2602.02961 translate read null
2026-02-03 Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning Yihong Huang et.al. 2602.02951 translate read null
2026-02-03 Equal Access, Unequal Interaction: A Counterfactual Audit of LLM Fairness Alireza Amiri-Margavi et.al. 2602.02932 translate read null
2026-02-02 FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights Zhen Wang et.al. 2602.02905 translate read null
2026-02-02 Failure-Aware Enhancements for Large Language Model (LLM) Code Generation: An Empirical Study on Decision Framework Jianru Shen et.al. 2602.02896 translate read null
