LLM - 2026-03

Publish Date Title Authors PDF Translate Read Code
2026-03-31 Reward-Based Online LLM Routing via NeuralUCB Ming-Hua Tsai et.al. 2603.30035 translate read null
2026-03-31 The Triadic Cognitive Architecture: Bounding Autonomous Action via Spatio-Temporal and Epistemic Friction Davide Di Gioia et.al. 2603.30031 translate read null
2026-03-31 Can Commercial LLMs Be Parliamentary Political Companions? Comparing LLM Reasoning Against Romanian Legislative Expuneri de Motive Iulian Lucău et.al. 2603.30028 translate read null
2026-03-31 ContextClaim: A Context-Driven Paradigm for Verifiable Claim Detection Yufeng Li et.al. 2603.30025 translate read null
2026-03-31 Hybrid Framework for Robotic Manipulation: Integrating Reinforcement Learning and Large Language Models Md Saad et.al. 2603.30022 translate read null
2026-03-31 Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks Chong Xiang et.al. 2603.30016 translate read null
2026-03-31 Performative Scenario Optimization Quanyan Zhu et.al. 2603.29982 translate read null
2026-03-31 SurgTEMP: Temporal-Aware Surgical Video Question Answering with Text-guided Visual Memory for Laparoscopic Cholecystectomy Shi Li et.al. 2603.29962 translate read null
2026-03-31 Think Anywhere in Code Generation Xue Jiang et.al. 2603.29957 translate read null
2026-03-31 EC-Bench: Enumeration and Counting Benchmark for Ultra-Long Videos Fumihiko Tsuchiya et.al. 2603.29943 translate read null
2026-03-31 Bethe Ansatz with a Large Language Model Balázs Pozsgay et.al. 2603.29932 translate read null
2026-03-31 SISA: A Scale-In Systolic Array for GEMM Acceleration Luigi Altamura et.al. 2603.29913 translate read null
2026-03-31 C-TRAIL: A Commonsense World Framework for Trajectory Planning in Autonomous Driving Zhihong Cui et.al. 2603.29908 translate read null
2026-03-31 ATP-Bench: Towards Agentic Tool Planning for MLLM Interleaved Generation Yinuo Liu et.al. 2603.29902 translate read null
2026-03-31 ShapE-GRPO: Shapley-Enhanced Reward Allocation for Multi-Candidate LLM Training Rui Ai et.al. 2603.29871 translate read null
2026-03-31 SNEAK: Evaluating Strategic Communication and Information Leakage in Large Language Models Adar Avsian et.al. 2603.29846 translate read null
2026-03-31 Compiling Code LLMs into Lightweight Executables Jieke Shi et.al. 2603.29813 translate read null
2026-03-31 ENEIDE: A High Quality Silver Standard Dataset for Named Entity Recognition and Linking in Historical Italian Cristian Santini et.al. 2603.29801 translate read null
2026-03-31 Training-Free Dynamic Upcycling of Expert Language Models Eros Fanì et.al. 2603.29765 translate read null
2026-03-31 One-for-All: A Lightweight Stabilized and Parameter-Efficient Pre-trained LLM for Time Series Forecasting Prasanjit Dey et.al. 2603.29756 translate read null
2026-03-31 AI-Programmable Wireless Connectivity: Challenges and Research Directions Toward Interactive and Immersive Industry Haris Gacanin et.al. 2603.29752 translate read null
2026-03-31 Spontaneous Functional Differentiation in Large Language Models: A Brain-Like Intelligence Economy Junjie Zhang et.al. 2603.29735 translate read null
2026-03-31 Measuring the metacognition of AI Richard Servajean et.al. 2603.29693 translate read null
2026-03-31 KEditVis: A Visual Analytics System for Knowledge Editing of Large Language Models Zhenning Chen et.al. 2603.29689 translate read null
2026-03-31 Beyond the Steeper Curve: AI-Mediated Metacognitive Decoupling and the Limits of the Dunning-Kruger Metaphor Christopher Koch et.al. 2603.29681 translate read null
2026-03-31 Agenda-based Narrative Extraction: Steering Pathfinding Algorithms with Large Language Models Brian Felipe Keith-Norambuena et.al. 2603.29661 translate read null
2026-03-31 An Empirical Study of Multi-Agent Collaboration for Automated Research Yang Shen et.al. 2603.29632 translate read null
2026-03-31 BigEarthNet.txt: A Large-Scale Multi-Sensor Image-Text Dataset and Benchmark for Earth Observation Johann-Ludwig Herzog et.al. 2603.29630 translate read null
2026-03-31 Enhancing LLM-Based Bug Reproduction for Android Apps via Pre-Assessment of Visual Effects Xiangyang Xiao et.al. 2603.29623 translate read null
2026-03-31 Learning Diagnostic Reasoning for Decision Support in Toxicology Nico Oberländer et.al. 2603.29608 translate read null
2026-03-31 When Can We Trust LLM Graders? Calibrating Confidence for Automated Assessment Robinson Ferrer et.al. 2603.29559 translate read null
2026-03-31 Can LLM Agents Identify Spoken Dialects like a Linguist? Tobias Bystrich et.al. 2603.29541 translate read null
2026-03-31 Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction L. Ghiringhelli et.al. 2603.29529 translate read null
2026-03-31 LLM Probe: Evaluating LLMs for Low-Resource Languages Hailay Kidu Teklehaymanot et.al. 2603.29517 translate read null
2026-03-31 Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries Luoxin Chen et.al. 2603.29500 translate read null
2026-03-31 Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models Gabriel Loiseau et.al. 2603.29497 translate read null
2026-03-31 MemFactory: Unified Inference & Training Framework for Agent Memory Ziliang Guo et.al. 2603.29493 translate read null
2026-03-31 CXLRAMSim v1.0: System-Level Exploration of CXL Memory Expander Cards Karan Pathak et.al. 2603.29483 translate read null
2026-03-31 M-MiniGPT4: Multilingual VLLM Alignment via Translated Data Seung Hun Han et.al. 2603.29467 translate read null
2026-03-31 An Isotropic Approach to Efficient Uncertainty Quantification with Gradient Norms Nils Grünefeld et.al. 2603.29466 translate read null
2026-03-31 Authorship Impersonation via LLM Prompting does not Evade Authorship Verification Methods Baoyi Zeng et.al. 2603.29454 translate read null
2026-03-31 SeGPruner: Semantic-Geometric Visual Token Pruner for 3D Question Answering Wenli Li et.al. 2603.29437 translate read null
2026-03-31 Adversarial Prompt Injection Attack on Multimodal Large Language Models Meiwen Ding et.al. 2603.29418 translate read null
2026-03-31 PRISM: PRIor from corpus Statistics for topic Modeling Tal Ishon et.al. 2603.29406 translate read null
2026-03-31 ELT-Bench-Verified: Benchmark Quality Issues Underestimate AI Agent Capabilities Christopher Zanoli et.al. 2603.29399 translate read null
2026-03-31 Is my model perplexed for the right reason? Contrasting LLMs’ Benchmark Behavior with Token-Level Perplexity Zoë Prins et.al. 2603.29396 translate read null
2026-03-31 Assessing Multimodal Chronic Wound Embeddings with Expert Triplet Agreement Fabian Kabus et.al. 2603.29376 translate read null
2026-03-31 Beyond Idealized Patients: Evaluating LLMs under Challenging Patient Behaviors in Medical Consultations Yahan Li et.al. 2603.29373 translate read null
2026-03-31 AI-Generated Prior Authorization Letters: Strong Clinical Content, Weak Administrative Scaffolding Moiz Sadiq Awan et.al. 2603.29366 translate read null
2026-03-31 Self-Improving Code Generation via Semantic Entropy and Behavioral Consensus Huan Zhang et.al. 2603.29292 translate read null
2026-03-31 MELT: Improve Composed Image Retrieval via the Modification Frequentation-Rarity Balance Network Guozhi Qiu et.al. 2603.29291 translate read null
2026-03-31 Sima AIunty: Caste Audit in LLM-Driven Matchmaking Atharva Naik et.al. 2603.29288 translate read null
2026-03-31 Customer Analysis and Text Generation for Small Retail Stores Using LLM-Generated Marketing Presence Shiori Nakamura et.al. 2603.29273 translate read null
2026-03-31 Aligning Multimodal Sequential Recommendations via Robust Direct Preference Optimization with Sparse MoE Hejin Huang et.al. 2603.29259 translate read null
2026-03-31 Scaling the Long Video Understanding of Multimodal Large Language Models via Visual Memory Mechanism Tao Chen et.al. 2603.29252 translate read null
2026-03-31 Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs Zhuowen Liang et.al. 2603.29232 translate read null
2026-03-31 Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition Lukuang Dong et.al. 2603.29217 translate read null
2026-03-31 Software Vulnerability Detection Using a Lightweight Graph Neural Network Miles Farmer et.al. 2603.29216 translate read null
2026-03-31 Route-Induced Density and Stability (RIDE): Controlled Intervention and Mechanism Analysis of Routing-Style Meta Prompts on LLM Internal States Dianxing Zhang et.al. 2603.29206 translate read null
2026-03-31 BiMoE: Brain-Inspired Experts for EEG-Dominant Affective State Recognition Hongyu Zhu et.al. 2603.29205 translate read null
2026-03-31 Developing Adaptive Context Compression Techniques for Large Language Models (LLMs) in Long-Running Interactions Payal Fofadiya et.al. 2603.29193 translate read null
2026-03-31 Webscraper: Leverage Multimodal Large Language Models for Index-Content Web Scraping Guan-Lun Huang et.al. 2603.29161 translate read null
2026-03-31 SimMOF: AI agent for Automated MOF Simulations Jaewoong Lee et.al. 2603.29152 translate read null
2026-03-31 Knowledge database development by large language models for countermeasures against viruses and marine toxins Hung N. Do et.al. 2603.29149 translate read null
2026-03-31 REFINE: Real-world Exploration of Interactive Feedback and Student Behaviour Fares Fawzi et.al. 2603.29142 translate read null
2026-03-31 Modernizing Ground Truth: Four Shifts Toward Improving Reliability and Validity in AI in Education Danielle R. Thomas et.al. 2603.29141 translate read null
2026-03-31 SciVisAgentBench: A Benchmark for Evaluating Scientific Data Analysis and Visualization Agents Kuangshi Ai et.al. 2603.29139 translate read null
2026-03-31 GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification Iordanis Fostiropoulos et.al. 2603.29112 translate read null
2026-03-31 VueBuds: Visual Intelligence with Wireless Earbuds Maruchi Kim et.al. 2603.29095 translate read null
2026-03-31 WybeCoder: Verified Imperative Code Generation Fabian Gloeckle et.al. 2603.29088 translate read null
2026-03-30 HandX: Scaling Bimanual Motion and Interaction Generation Zimu Zhang et.al. 2603.28766 translate read null
2026-03-30 Adaptive Block-Scaled Data Types Jack Cook et.al. 2603.28765 translate read null
2026-03-30 Rethinking Language Model Scaling under Transferable Hypersphere Optimization Liliang Ren et.al. 2603.28743 translate read null
2026-03-30 SAGAI-MID: A Generative AI-Driven Middleware for Dynamic Runtime Interoperability Oliver Aleksander Larsen et.al. 2603.28731 translate read null
2026-03-30 EpiScreen: Early Epilepsy Detection from Electronic Health Records with Large Language Models Shuang Zhou et.al. 2603.28698 translate read null
2026-03-30 AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding Haozhe Qi et.al. 2603.28696 translate read null
2026-03-30 C2RustXW: Program-Structure-Aware C-to-Rust Translation via Program Analysis and LLM Yanyan Yan et.al. 2603.28686 translate read null
2026-03-30 A Techno-Economic Framework for Cost Modeling and Revenue Opportunities in Open and Programmable AI-RAN Gabriele Gemmi et.al. 2603.28680 translate read null
2026-03-30 Safeguarding LLMs Against Misuse and AI-Driven Malware Using Steganographic Canaries Md Raz et.al. 2603.28655 translate read null
2026-03-30 BACE: LLM-based Code Generation through Bayesian Anchored Co-Evolution of Code and Test Populations Kaushitha Silva et.al. 2603.28653 translate read null
2026-03-30 The Ultimate Tutorial for AI-driven Scale Development in Generative Psychometrics: Releasing AIGENIE from its Bottle Lara Russell-Lasalandra et.al. 2603.28643 translate read null
2026-03-30 Seeing with You: Perception-Reasoning Coevolution for Multimodal Reasoning Ziqi Miao et.al. 2603.28618 translate read null
2026-03-30 ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning Huanxuan Liao et.al. 2603.28610 translate read null
2026-03-30 One stout to rule them all: Reconciling artificial intelligence, data science and malted alcoholic beverages Dmitrii Usynin et.al. 2603.28607 translate read null
2026-03-30 Unsafe2Safe: Controllable Image Anonymization for Downstream Utility Mih Dinh et.al. 2603.28605 translate read null
2026-03-30 Moving Beyond Review: Applying Language Models to Planning and Translation in Reflection Seyed Parsa Neshaei et.al. 2603.28596 translate read null
2026-03-30 MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Models Han Wang et.al. 2603.28590 translate read null
2026-03-30 Towards a Medical AI Scientist Hongtao Wu et.al. 2603.28589 translate read null
2026-03-30 Tiered Super-Moore’s Law: Price Evolution, Production Frontiers, and Market Competition in Large Language Model Inference Services Mingdeng Du et.al. 2603.28576 translate read null
2026-03-30 CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments Yi Yu et.al. 2603.28569 translate read null
2026-03-30 XSPA: Crafting Imperceptible X-Shaped Sparse Adversarial Perturbations for Transferable Attacks on VLMs Chengyin Hu et.al. 2603.28568 translate read null
2026-03-30 Fine-Tuning Large Language Models for Cooperative Tactical Deconfliction of Small Unmanned Aerial Systems Iman Sharifi et.al. 2603.28561 translate read null
2026-03-30 EarlySciRev: A Dataset of Early-Stage Scientific Revisions Extracted from LaTeX Writing Traces Léane Jourdan et.al. 2603.28515 translate read null
2026-03-30 Generalizable Detection of AI Generated Images with Large Models and Fuzzy Decision Tree Fei Wu et.al. 2603.28508 translate read null
2026-03-30 Courtroom-Style Multi-Agent Debate with Progressive RAG and Role-Switching for Controversial Claim Verification Masnun Nuha Chowdhury et.al. 2603.28488 translate read null
2026-03-30 CiQi-Agent: Aligning Vision, Tools and Aesthetics in Multimodal Agent for Cultural Reasoning on Chinese Porcelains Wenhan Wang et.al. 2603.28474 translate read null
2026-03-30 Evolutionary Discovery of Reinforcement Learning Algorithms via Large Language Models Alkis Sygkounas et.al. 2603.28416 translate read null
2026-03-30 Within the MDT Room: Situated in Multidisciplinary Team-Grounded Agent Debate for Clinical Diagnosis Peng Kuai et.al. 2603.28393 translate read null
2026-03-30 COvolve: Adversarial Co-Evolution of Large-Language-Model-Generated Policies and Environments via Two-Player Zero-Sum Game Alkis Sygkounas et.al. 2603.28386 translate read null
2026-03-30 Using Games to Learn How Large Language Models Work Allison Chen et.al. 2603.28374 translate read null
2026-03-30 Coherent Without Grounding, Grounded Without Success: Observability and Epistemic Failure Camilo Chacón Sartori et.al. 2603.28371 translate read null
2026-03-30 AutoCut: End-to-end advertisement video editing based on multimodal discretization and controllable generation Milton Zhou et.al. 2603.28366 translate read null
2026-03-30 SEA: Evaluating Sketch Abstraction Efficiency via Element-level Commonsense Visual Question Answering Jiho Park et.al. 2603.28363 translate read null
2026-03-30 Deep Research of Deep Research: From Transformer to Agent, From AI to AI for Science Yipeng Yu et.al. 2603.28361 translate read null
2026-03-30 A Multi-Agent Rhizomatic Pipeline for Non-Linear Literature Analysis Julio C. Serrano, Joonas Kevari et.al. 2603.28336 translate read null
2026-03-30 Integrating Multimodal Large Language Model Knowledge into Amodal Completion Heecheol Yun et.al. 2603.28333 translate read null
2026-03-30 Building evidence-based knowledge graphs from full-text literature for disease-specific biomedical reasoning Chang Zong et.al. 2603.28325 translate read null
2026-03-30 VulnScout-C: A Lightweight Transformer for C Code Vulnerability Detection Aymen Lassoued et.al. 2603.28309 translate read null
2026-03-30 Evaluating LLMs for Answering Student Questions in Introductory Programming Courses Thomas Van Mullem et.al. 2603.28295 translate read null
2026-03-30 Merge and Conquer: Instructing Multilingual Models by Adding Target Language Weights Eneko Valero et.al. 2603.28263 translate read null
2026-03-30 Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries Jon-Paul Cacioli et.al. 2603.28258 translate read null
2026-03-30 DiffAttn: Diffusion-Based Drivers’ Visual Attention Prediction with LLM-Enhanced Semantic Reasoning Weimin Liu et.al. 2603.28251 translate read null
2026-03-30 Versteasch du mi? Computational and Socio-Linguistic Perspectives on GenAI, LLMs, and Non-Standard Language Verena Platzgummer et.al. 2603.28213 translate read null
2026-03-30 Beyond Cosine Similarity: Zero-Initialized Residual Complex Projection for Aspect-Based Sentiment Analysis Yijin Wang et.al. 2603.28205 translate read null
2026-03-30 ERPO: Token-Level Entropy-Regulated Policy Optimization for Large Reasoning Models Song Yu et.al. 2603.28204 translate read null
2026-03-30 EpiPersona: Persona Projection and Episode Coupling for Pluralistic Preference Modeling Yujie Zhang et.al. 2603.28197 translate read null
2026-03-30 DongYuan: An LLM-Based Framework for Integrative Chinese and Western Medicine Spleen-Stomach Disorders Diagnosis Hua Li et.al. 2603.28191 translate read null
2026-03-30 PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and Decision Zehua Han et.al. 2603.28183 translate read null
2026-03-30 From Reviews to Requirements: Can LLMs Generate Human-Like User Stories? Shadman Sakib et.al. 2603.28163 translate read null
2026-03-30 Reducing Mental Workload through On-Demand Human Assistance for Physical Action Failures in LLM-based Multi-Robot Coordination Shoichi Hasegawa et.al. 2603.28156 translate read null
2026-03-30 ORACAL: A Robust and Explainable Multimodal Framework for Smart Contract Vulnerability Detection with Causal Graph Enrichment Tran Duong Minh Dai et.al. 2603.28128 translate read null
2026-03-30 Compressing Code Context for LLM-based Issue Resolution Haoxiang Jia et.al. 2603.28119 translate read null
2026-03-30 InconLens: Interactive Visual Diagnosis of Behavioral Inconsistencies in LLM-based Agentic Systems Shuo Yan et.al. 2603.28106 translate read null
2026-03-30 DELTA: A DAG-aware Efficient OCS Logical Topology Optimization Framework for AIDCs Niangen Ye et.al. 2603.28096 translate read null
2026-03-30 Can Large Language Models be a Cardinality Estimator? An Empirical study Liangzu Liu et.al. 2603.28080 translate read null
2026-03-30 SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptation in AI Tutoring Yuang Wei et.al. 2603.28062 translate read null
2026-03-30 DAInfer+: Neurosymbolic Inference of API Specifications from Documentation via Embedding Models Maryam Masoudian et.al. 2603.28060 translate read null
2026-03-30 Is One-Shot In-Context Learning Helpful for Data Selection in Task-Specific Fine-Tuning of Multimodal LLMs? Xiao An et.al. 2603.28058 translate read null
2026-03-30 Meta-Harness: End-to-End Optimization of Model Harnesses Yoonho Lee et.al. 2603.28052 translate read null
2026-03-30 Beyond the Answer: Decoding the Behavior of LLMs as Scientific Reasoners Rohan Pandey et.al. 2603.28038 translate read null
2026-03-30 Low-Latency Edge LLM Handover via Joint KV Cache Transfer and Token Prefill Seunghun Lee et.al. 2603.28018 translate read null
2026-03-30 Progressive Prompt-Guided Cross-Modal Reasoning for Referring Image Segmentation Jiachen Li et.al. 2603.27993 translate read null
2026-03-30 ViviDoc: Generating Interactive Documents through Human-Agent Collaboration Yinghao Tang et.al. 2603.27991 translate read null
2026-03-30 Principal Prototype Analysis on Manifold for Interpretable Reinforcement Learning Bodla Krishna Vamshi et.al. 2603.27971 translate read null
2026-03-30 CARV: A Diagnostic Benchmark for Compositional Analogical Reasoning in Multimodal LLMs Yongkang Du et.al. 2603.27958 translate read null
2026-03-30 Artificial Intelligence in Science: Returns, Reallocation, and Reorganization Moh Hosseinioun et.al. 2603.27956 translate read null
2026-03-30 EnsemJudge: Enhancing Reliability in Chinese LLM-Generated Text Detection through Diverse Model Ensembles Zhuoshang Wang et.al. 2603.27949 translate read null
2026-03-30 JaWildText: A Benchmark for Vision-Language Models on Japanese Scene Text Understanding Koki Maeda et.al. 2603.27942 translate read null
2026-03-30 GEAKG: Generative Executable Algorithm Knowledge Graphs Camilo Chacón Sartori et.al. 2603.27922 translate read null
2026-03-30 Adversarial Attacks on Multimodal Large Language Models: A Comprehensive Survey Bhavuk Jain et.al. 2603.27918 translate read null
2026-03-30 ITQ3_S: High-Fidelity 3-bit LLM Inference via Interleaved Ternary Quantization with Rotation-Domain Smoothing Edward J. Yoon et.al. 2603.27914 translate read null
2026-03-25 Vibe Coding XR: Accelerating AI + XR Prototyping with XR Blocks and Gemini Ruofei Du et.al. 2603.24591 translate read null
2026-03-25 MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination Zhuo Li et.al. 2603.24579 translate read null
2026-03-25 LensWalk: Agentic Video Understanding by Planning How You See in Videos Keliang Li et.al. 2603.24558 translate read null
2026-03-25 Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents Samuel Taiwo et.al. 2603.24556 translate read null
2026-03-25 Representation Learning to Study Temporal Dynamics in Tutorial Scaffolding Conrad Borchers et.al. 2603.24535 translate read null
2026-03-25 UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Zichuan Lin et.al. 2603.24533 translate read null
2026-03-25 Video-Only ToM: Enhancing Theory of Mind in Multimodal Large Language Models Siqi Liu et.al. 2603.24484 translate read null
2026-03-25 Mechanic: Sorrifier-Driven Formal Decomposition Workflow for Automated Theorem Proving Ruichen Qiu et.al. 2603.24465 translate read null
2026-03-25 PINGALA: Prosody-Aware Decoding for Sanskrit Poetry Generation Manoj Balaji Jagadeeshan et.al. 2603.24413 translate read null
2026-03-25 AI-Supervisor: Autonomous AI Research Supervision via a Persistent Research World Model Yunbo Long et.al. 2603.24402 translate read null
2026-03-25 3D-Mix for VLA: A Plug-and-Play Module for Integrating VGGT-based 3D Information into Vision-Language-Action Models Bin Yu et.al. 2603.24393 translate read null
2026-03-25 When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools Xingming Li et.al. 2603.24389 translate read null
2026-03-25 MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization Xiangsen Chen et.al. 2603.24382 translate read null
2026-03-25 LATS: Large Language Model Assisted Teacher-Student Framework for Multi-Agent Reinforcement Learning in Traffic Signal Control Yifeng Zhang et.al. 2603.24361 translate read null
2026-03-25 Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level dropin & Neuroplasticity Mechanisms Yupei Li et.al. 2603.24343 translate read null
2026-03-25 Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning Dogan Urgun et.al. 2603.24324 translate read null
2026-03-25 Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition Aleix Sant et.al. 2603.24242 translate read null
2026-03-25 UniScale: Synergistic Entire Space Data and Model Scaling for Search Ranking Liren Yu et.al. 2603.24226 translate read null
2026-03-25 Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing Michael Somma et.al. 2603.24221 translate read null
2026-03-25 Who Benefits from RAG? The Role of Exposure, Utility and Attribution Bias Mahdi Dehghan et.al. 2603.24218 translate read null
2026-03-25 SumRank: Aligning Summarization Models for Long-Document Listwise Reranking Jincheng Feng et.al. 2603.24204 translate read null
2026-03-25 Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search Yulin Shen et.al. 2603.24203 translate read null
2026-03-25 A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula Cansu Sancaktar et.al. 2603.24202 translate read null
2026-03-25 RefReward-SR: LR-Conditioned Reward Modeling for Preference-Aligned Super-Resolution Yushuai Song et.al. 2603.24198 translate read null
2026-03-25 Unlocking Few-Shot Capabilities in LVLMs via Prompt Conditioning and Head Selection Adhemar de Senneville et.al. 2603.24181 translate read null
2026-03-25 Towards Automated Crowdsourced Testing via Personified-LLM Shengcheng Yu et.al. 2603.24160 translate read null
2026-03-25 Linking Global Science Funding to Research Publications Jacob Aarup Dalsgaard et.al. 2603.24147 translate read null
2026-03-25 Sequence-aware Large Language Models for Explainable Recommendation Gangyi Zhang et.al. 2603.24136 translate read null
2026-03-25 MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare Shubham Kumar Nigam et.al. 2603.24132 translate read null
2026-03-25 Alignment Reduces Expressed but Not Encoded Gender Bias: A Unified Framework and Study Nour Bouchouchi et.al. 2603.24125 translate read null
2026-03-25 Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization Fei Bai et.al. 2603.24093 translate read null
2026-03-25 When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm Ye Leng et.al. 2603.24079 translate read null
2026-03-25 ConceptKT: A Benchmark for Concept-Level Deficiency Prediction in Knowledge Tracing Yu-Chen Kang et.al. 2603.24073 translate read null
2026-03-25 Enhanced Mycelium of Thought (EMoT): A Bio-Inspired Hierarchical Reasoning Architecture with Strategic Dormancy and Mnemonic Encoding Florian Odi Stummer et.al. 2603.24065 translate read null
2026-03-25 SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation Zhuoran Li et.al. 2603.24060 translate read null
2026-03-25 FinToolSyn: A forward synthesis Framework for Financial Tool-Use Dialogue Data with Dynamic Tool Retrieval Caishuang Huang et.al. 2603.24051 translate read null
2026-03-25 ACAVCaps: Enabling large-scale training for fine-grained and diverse audio understanding Yadong Niu et.al. 2603.24038 translate read null
2026-03-25 A^3: Towards Advertising Aesthetic Assessment Kaiyuan Ji et.al. 2603.24037 translate read null
2026-03-25 Decompose and Transfer: CoT-Prompting Enhanced Alignment for Open-Vocabulary Temporal Action Detection Sa Zhu et.al. 2603.24030 translate read null
2026-03-25 Thinking with Tables: Enhancing Multi-Modal Tabular Understanding via Neuro-Symbolic Reasoning Kun-Yang Yu et.al. 2603.24004 translate read null
2026-03-25 Forensic Implications of Localized AI: Artifact Analysis of Ollama, LM Studio, and llama.cpp Shariq Murtuza et.al. 2603.23996 translate read null
2026-03-25 Understanding the Challenges in Iterative Generative Optimization with LLMs Allen Nie et.al. 2603.23994 translate read null
2026-03-25 From Untamed Black Box to Interpretable Pedagogical Orchestration: The Ensemble of Specialized LLMs Architecture for Adaptive Tutoring Nizam Kadir et.al. 2603.23990 translate read null
2026-03-25 CoCR-RAG: Enhancing Retrieval-Augmented Generation in Web Q&A via Concept-oriented Context Reconstruction Kaize Shi et.al. 2603.23989 translate read null
2026-03-25 Can we generate portable representations for clinical time series data using LLMs? Zongliang Ji et.al. 2603.23987 translate read null
2026-03-25 Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score Jimyung Hong et.al. 2603.23985 translate read null
2026-03-25 BRIDG-Q: Barren-Plateau-Resilient Initialisation with Data-Aware LLM-Generated Quantum Circuits Ngoc Nhi Nguyen et.al. 2603.23979 translate read null
2026-03-25 SilLang: Improving Gait Recognition with Silhouette Language Encoding Ruiyi Zhan et.al. 2603.23976 translate read null
2026-03-25 Grounding Arabic LLMs in the Doha Historical Dictionary: Retrieval-Augmented Understanding of Quran and Hadith Somaya Eltanbouly et.al. 2603.23972 translate read null
2026-03-25 Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage Rishikesh Sahay et.al. 2603.23966 translate read null
2026-03-25 From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments Lijing Luo et.al. 2603.23964 translate read null
2026-03-25 PointRFT: Explicit Reinforcement Fine-tuning for Point Cloud Few-shot Learning Yankai Wang et.al. 2603.23957 translate read null
2026-03-25 Towards Energy-aware Requirements Dependency Classification: Knowledge-Graph vs. Vector-Retrieval Augmented Inference with SLMs Shreyas Patil et.al. 2603.23954 translate read null
2026-03-25 VOLMO: Versatile and Open Large Models for Ophthalmology Zhenyue Qin et.al. 2603.23953 translate read null
2026-03-25 Argument Mining as a Text-to-Text Generation Task Masayuki Kawarada et.al. 2603.23949 translate read null
2026-03-25 Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development Zongliang Ji et.al. 2603.23937 translate read null
2026-03-25 Self-Distillation for Multi-Token Prediction Guoliang Zhao et.al. 2603.23911 translate read null
2026-03-25 AnalogAgent: Self-Improving Analog Circuit Design Automation with LLM Agents Zhixuan Bao et.al. 2603.23910 translate read null
2026-03-25 DUPLEX: Agentic Dual-System Planning via LLM-Driven Information Extraction Keru Hua et.al. 2603.23909 translate read null
2026-03-25 SiftMoE: Similarity-Aware Energy-Efficient Expert Selection for Wireless Distributed MoE Inference Qian Chen et.al. 2603.23888 translate read null
2026-03-25 Towards Real-World Document Parsing via Realistic Scene Synthesis and Document-Aware Training Gengluo Li et.al. 2603.23885 translate read null
2026-03-25 POSIM: A Multi-Agent Simulation Framework for Social Media Public Opinion Evolution and Governance Yongmao Zhang et.al. 2603.23884 translate read null
2026-03-25 ProcureGym: A Multi-Agent Markov Game Framework for Modeling National Volume-based Drug Procurement Jia Wang et.al. 2603.23880 translate read null
2026-03-25 Self-Evolving Multi-Agent Framework for Efficient Decision Making in Real-Time Strategy Scenarios Li Ma et.al. 2603.23875 translate read null
2026-03-25 HDPO: Hybrid Distillation Policy Optimization via Privileged Self-Distillation Ken Ding et.al. 2603.23871 translate read null
2026-03-25 APISENSOR: Robust Discovery of Web API from Runtime Traffic Logs Yanjing Yang et.al. 2603.23852 translate read null
2026-03-25 VILLA: Versatile Information Retrieval From Scientific Literature Using Large LAnguage Models Blessy Antony et.al. 2603.23849 translate read null
2026-03-25 PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay Rohan Khetan et.al. 2603.23841 translate read null
2026-03-25 Bridging the Interpretation Gap in Accessibility Testing: Empathetic and Legal-Aware Bug Report Generation via Large Language Models Ryoya Koyama et.al. 2603.23828 translate read null
2026-03-25 How Vulnerable Are Edge LLMs? Ao Ding et.al. 2603.23822 translate read null
2026-03-25 How are AI agents used? Evidence from 177,000 MCP tools Merlin Stein et.al. 2603.23802 translate read null
2026-03-25 Human, AI, and Hybrid Ensembles for Detection of Adaptive, RL-based Social Bots Valerio La Gatta et.al. 2603.23796 translate read null
2026-03-24 Sparse Autoencoders for Interpretable Medical Image Representation Learning Philipp Wesp et.al. 2603.23794 translate read null
2026-03-24 The Cognitive Firewall: Securing Browser Based AI Agents Against Indirect Prompt Injection Via Hybrid Edge Cloud Defense Qianlong Lan et.al. 2603.23791 translate read null
2026-03-24 Leveraging Large Language Models for Trustworthiness Assessment of Web Applications Oleksandr Yarotskyi et.al. 2603.23781 translate read null
2026-03-24 Lightweight Fairness for LLM-Based Recommendations via Kernelized Projection and Gated Adapters Nan Cui et.al. 2603.23780 translate read null
2026-03-24 AI-driven Intent-Based Networking Approach for Self-configuration of Next Generation Networks Md. Kamrul Hossain et.al. 2603.23772 translate read null
2026-03-24 IslamicMMLU: A Benchmark for Evaluating LLMs on Islamic Knowledge Ali Abdelaal et.al. 2603.23750 translate read null
2026-03-24 Exploring Self-Tracking Practices of Older Adults with CVD to Inform the Design of LLM-Enabled Health Data Sensemaking Duosi Dai et.al. 2603.23733 translate read null
2026-03-24 LLMs Do Not Grade Essays Like Humans Jerin George Mathew et.al. 2603.23714 translate read null
2026-03-24 The Diminishing Returns of Early-Exit Decoding in Modern LLMs Rui Wei et.al. 2603.23701 translate read null
2026-03-24 Towards Leveraging LLMs to Generate Abstract Penetration Test Cases from Software Architecture Mahdi Jafari et.al. 2603.23698 translate read null
2026-03-24 Assessment Design in the AI Era: A Method for Identifying Items Functioning Differentially for Humans and Chatbots Licol Zeinfeld et.al. 2603.23682 translate read null
2026-03-24 PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation Manjushree B. Aithal et.al. 2603.23678 translate read null
2026-03-24 Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models Mohammad Saleh Vahdatpour et.al. 2603.23668 translate read null
2026-03-24 GTO Wizard Benchmark Marc-Antoine Provost et.al. 2603.23660 translate read null
2026-03-24 Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges Weilun Xu et.al. 2603.23659 translate read null
2026-03-24 Swiss-Bench SBP-002: A Frontier Model Comparison on Swiss Legal and Regulatory Tasks Fatih Uenal et.al. 2603.23646 translate read null
2026-03-24 LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustained Load Pranay Tummalapalli et.al. 2603.23640 translate read null
2026-03-24 Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments Yi Han et.al. 2603.23638 translate read null
2026-03-24 Detect–Repair–Verify for LLM-Generated Code: A Multi-Language, Multi-Granularity Empirical Study Cheng Cheng et.al. 2603.23633 translate read null
2026-03-24 Ukrainian Visual Word Sense Disambiguation Benchmark Yurii Laba et.al. 2603.23627 translate read null
2026-03-24 A Theory of LLM Information Susceptibility Zhuo-Yang Song et.al. 2603.23626 translate read null
2026-03-24 Revisiting Real-Time Digging-In Effects: No Evidence from NP/Z Garden-Paths Amani Maina-Kilaas et.al. 2603.23624 translate read null
2026-03-24 LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops Ravin Ravi et.al. 2603.23613 translate read null
2026-03-24 LLMORPH: Automated Metamorphic Testing of Large Language Models Steven Cho et.al. 2603.23611 translate read null
2026-03-24 Environment Maps: Structured Environmental Representations for Long-Horizon Agents Yenchia Feng et.al. 2603.23610 translate read null
2026-03-24 The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations Long Zhang et.al. 2603.23577 translate read null
2026-03-24 APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs Meriem Bouzouad et.al. 2603.23575 translate read null
2026-03-23 Mixture of Demonstrations for Textual Graph Understanding and Question Answering Yukun Wu et.al. 2603.23554 translate read null
2026-03-24 MedObvious: Exposing the Medical Moravec’s Paradox in VLMs via Clinical Triage Ufaq Khan et.al. 2603.23501 translate read null
2026-03-24 Failure of contextual invariance in gender inference with large language models Sagar Kumar et.al. 2603.23485 translate read null
2026-03-24 SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Haoyu Huang et.al. 2603.23483 translate read null
2026-03-24 ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains Muhammad Khalid et.al. 2603.23482 translate read null
2026-03-24 UniFunc3D: Unified Active Spatial-Temporal Grounding for 3D Functionality Segmentation Jiaying Lin et.al. 2603.23478 translate read null
2026-03-24 Evidence of political bias in search engines and language models before major elections Íris Damião et.al. 2603.23474 translate read null
2026-03-24 ConceptCoder: Improve Code Reasoning via Concept Learning Md Mahbubur Rahman et.al. 2603.23470 translate read null
2026-03-24 3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding Yiping Chen et.al. 2603.23447 translate read null
2026-03-24 Evaluating LLM-Based Test Generation Under Software Evolution Sabaat Haroon et.al. 2603.23443 translate read null
2026-03-24 SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling Yiqi Zhang et.al. 2603.23414 translate read null
2026-03-24 Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies Hanzhong Zhang et.al. 2603.23406 translate read null
2026-03-24 Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning Jiacheng Hua et.al. 2603.23404 translate read null
2026-03-24 Off-Policy Value-Based Reinforcement Learning for Large Language Models Peng-Yuan Wang et.al. 2603.23355 translate read null
2026-03-24 Leveraging LLMs and Social Media to Understand User Perception of Smartphone-Based Earthquake Early Warnings Hanjing Wang et.al. 2603.23322 translate read null
2026-03-24 ARGENT: Adaptive Hierarchical Image-Text Representations Chuong Huynh et.al. 2603.23311 translate read null
2026-03-24 Curriculum-Driven 3D CT Report Generation via Language-Free Visual Grafting and Zone-Constrained Compression V. K. Cody Bumgardner et.al. 2603.23308 translate read null
2026-03-24 Designing Agentic AI-Based Screening for Portfolio Investment Mehmet Caner et.al. 2603.23300 translate read null
2026-03-24 Emergence of Fragility in LLM-based Social Networks: the Case of Moltbook Luca Sodano et.al. 2603.23279 translate read null
2026-03-24 A Multimodal Framework for Human-Multi-Agent Interaction Shaid Hasan et.al. 2603.23271 translate read null
2026-03-24 Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs Wenyu Chen et.al. 2603.23269 translate read null
2026-03-24 SafeSeek: Universal Attribution of Safety Circuits in Language Models Miao Yu et.al. 2603.23268 translate read null
2026-03-24 Is AI Catching Up to Human Expression? Exploring Emotion, Personality, Authorship, and Linguistic Style in English and Arabic with Six Large Language Models Nasser A Alsadhan et.al. 2603.23251 translate read null
2026-03-24 MemCollab: Cross-Agent Memory Collaboration via Contrastive Trajectory Distillation Yurui Chang et.al. 2603.23234 translate read null
2026-03-24 PERMA: Benchmarking Personalized Memory Agents via Event-Driven Preference and Realistic Task Environments Shuochen Liu et.al. 2603.23231 translate read null
2026-03-24 I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes Shijia Zhou et.al. 2603.23229 translate read null
2026-03-24 Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics? Nasser A Alsadhan et.al. 2603.23219 translate read null
2026-03-24 Sparser, Faster, Lighter Transformer Language Models Edoardo Cetin et.al. 2603.23198 translate read null
2026-03-24 ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting Yeonkyung Lee et.al. 2603.23186 translate read null
2026-03-24 Robust Safety Monitoring of Language Models via Activation Watermarking Toluwani Aremu et.al. 2603.23171 translate read null
2026-03-24 Describe-Then-Act: Proactive Agent Steering via Distilled Language-Action World Models Massimiliano Pappa et.al. 2603.23149 translate read null
2026-03-24 Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy Shushanta Pudasaini et.al. 2603.23146 translate read null
2026-03-24 Can Language Models Pass Software Testing Certification Exams? a case study Fitash Ul Haq et.al. 2603.23142 translate read null
2026-03-24 HGNet: Scalable Foundation Model for Automated Knowledge Graph Generation from Scientific Literature Devvrat Joshi et.al. 2603.23136 translate read null
2026-03-24 InterDyad: Interactive Dyadic Speech-to-Video Generation by Querying Intermediate Visual Guidance Dongwei Pan et.al. 2603.23132 translate read null
2026-03-24 SMSP: A Plug-and-Play Strategy of Multi-Scale Perception for MLLMs to Perceive Visual Illusions Jinzhe Tu et.al. 2603.23118 translate read null
2026-03-24 AgentFoX: LLM Agent-Guided Fusion with eXplainability for AI-Generated Image Detection Yangxin Yu et.al. 2603.23115 translate read null
2026-03-24 When Language Models Lose Their Mind: The Consequences of Brain Misalignment Gabriele Merlin et.al. 2603.23091 translate read null
2026-03-24 Good for the Planet, Bad for Me? Intended and Unintended Consequences of AI Energy Consumption Disclosure Michael Klesel et.al. 2603.23075 translate read null
2026-03-24 Can an LLM Detect Instances of Microservice Infrastructure Patterns? Carlos Eduardo Duarte et.al. 2603.23073 translate read null
2026-03-24 MLLM-HWSI: A Multimodal Large Language Model for Hierarchical Whole Slide Image Understanding Basit Alawode et.al. 2603.23067 translate read null
2026-03-24 Post-Selection Distributional Model Evaluation Amirmohammad Farzaneh et.al. 2603.23055 translate read null
2026-03-24 DBAutoDoc: Automated Discovery and Documentation of Undocumented Database Schemas via Statistical Analysis and Iterative LLM Refinement Amith Nagarajan et.al. 2603.23050 translate read null
2026-03-24 PCR: A Prefetch-Enhanced Cache Reuse System for Low-Latency RAG Serving Wenfeng Wang et.al. 2603.23049 translate read null
2026-03-24 Parametric Knowledge and Retrieval Behavior in RAG Fine-Tuning for Electronic Design Automation Julian Oestreich et.al. 2603.23047 translate read null
2026-03-24 Cog3DMap: Multi-View Vision-Language Reasoning with 3D Cognitive Maps Chanyoung Gwak et.al. 2603.23023 translate read null
2026-03-24 Can Large Language Models Reason and Optimize Under Constraints? Fabien Bernier et.al. 2603.23004 translate read null
2026-03-24 JFTA-Bench: Evaluate LLM’s Ability of Tracking and Analyzing Malfunctions Using Fault Trees Yuhui Wang et.al. 2603.22978 translate read null
2026-03-24 Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy Weijun Li et.al. 2603.22968 translate read null
2026-03-24 Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees Ye Li et.al. 2603.22966 translate read null
2026-03-24 Caption Generation for Dongba Paintings via Prompt Learning and Semantic Fusion Shuangwu Qian et.al. 2603.22946 translate read null
2026-03-24 From Morality Installation in LLMs to LLMs in Morality-as-a-System Gunter Bombaerts et.al. 2603.22944 translate read null
2026-03-24 Optimizing Small Language Models for NL2SQL via Chain-of-Thought Fine-Tuning Anshul Solanki et.al. 2603.22942 translate read null
2026-03-24 Ran Score: a LLM-based Evaluation Score for Radiology Report Generation Ran Zhang et.al. 2603.22935 translate read null
2026-03-24 ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning Xiangyu Yin et.al. 2603.22934 translate read null
2026-03-24 SoK: The Attack Surface of Agentic AI – Tools, and Autonomy Ali Dehghantanha et.al. 2603.22928 translate read null
2026-03-24 Quality Over Clicks: Intrinsic Quality-Driven Iterative Reinforcement Learning for Cold-Start E-Commerce Query Suggestion Qi Sun et.al. 2603.22922 translate read null
2026-03-24 EVA: Efficient Reinforcement Learning for End-to-End Video Agent Yaolun Zhang et.al. 2603.22918 translate read null
2026-03-24 ForestPrune: High-ratio Visual Token Compression for Video Multimodal Large Language Models via Spatial-Temporal Forest Modeling Shaobo Ju et.al. 2603.22911 translate read null
2026-03-24 EchoKV: Efficient KV Cache Compression via Similarity-Based Reconstruction Yixuan Wang et.al. 2603.22910 translate read null
2026-03-24 Separating Diagnosis from Control: Auditable Policy Adaptation in Agent-Based Simulations with LLM-Based Diagnostics Shaoxin Zhong et.al. 2603.22904 translate read null
2026-03-24 VLGOR: Visual-Language Knowledge Guided Offline Reinforcement Learning for Generalizable Agents Pengsen Liu et.al. 2603.22892 translate read null
2026-03-24 TreeTeaming: Autonomous Red-Teaming of Vision-Language Models via Hierarchical Strategy Exploration Chunxiao Li et.al. 2603.22882 translate read null
2026-03-24 ForeSea: AI Forensic Search with Multi-modal Queries for Video Surveillance Hyojin Park et.al. 2603.22872 translate read null
2026-03-24 Dynamical Systems Theory Behind a Hierarchical Reasoning Model Vasiliy A. Es’kin et.al. 2603.22871 translate read null
2026-03-24 Chain-of-Authorization: Internalizing Authorization into Large Language Models via Reasoning Trajectories Yang Li et.al. 2603.22869 translate read null
2026-03-24 Aerial Agentic AI: Synergizing LLM and SLM for Low-Altitude Wireless Networks Li Dong et.al. 2603.22866 translate read null
2026-03-24 The Evolution of Tool Use in LLM Agents: From Single-Tool Call to Multi-Tool Orchestration Haoyuan Xu et.al. 2603.22862 translate read null
2026-03-24 Who Sits Where? Automated Detection of Director Interlocks in Indian Companies Prateek Sancheti et.al. 2603.22860 translate read null
2026-03-24 Retrieval-Guided Photovoltaic Inventory Estimation from Satellite Imagery for Distribution Grid Planning Muhao Guo et.al. 2603.22856 translate read null
2026-03-24 Analysing LLM Persona Generation and Fairness Interpretation in Polarised Geopolitical Contexts Maida Aizaz et.al. 2603.22837 translate read null
2026-03-24 Improving Safety Alignment via Balanced Direct Preference Optimization Shiji Zhao et.al. 2603.22829 translate read null
2026-03-24 Focus, Don’t Prune: Identifying Instruction-Relevant Regions for Information-Rich Image Understanding Mincheol Kwon et.al. 2603.22815 translate read null
2026-03-24 Efficient Hallucination Detection: Adaptive Bayesian Estimation of Semantic Entropy with Guided Semantic Exploration Qiyao Sun et.al. 2603.22812 translate read null
2026-03-24 Span Modeling for Idiomaticity and Figurative Language Detection with Span Contrastive Loss Blake Matheny et.al. 2603.22799 translate read null
2026-03-24 Caterpillar of Thoughts: The Optimal Test-Time Algorithm for Large Language Models Amir Azarmehr et.al. 2603.22784 translate read null
2026-03-24 Know3D: Prompting 3D Generation with Knowledge from Vision-Language Models Wenyue Chen et.al. 2603.22782 translate read null
2026-03-24 KARMA: Knowledge-Action Regularized Multimodal Alignment for Personalized Search at Taobao Zhi Sun et.al. 2603.22779 translate read null
2026-03-24 AgriPestDatabase-v1.0: A Structured Insect Dataset for Training Agricultural Large Language Model Yagizhan Bilal Durak et.al. 2603.22777 translate read null
2026-03-24 Characterizing CPU-Induced Slowdowns in Multi-GPU LLM Inference Euijun Chung et.al. 2603.22774 translate read null
2026-03-24 DALDALL: Data Augmentation for Lexical and Semantic Diverse in Legal Domain by leveraging LLM-Persona Janghyeok Choi et.al. 2603.22765 translate read null
2026-03-24 ENC-Bench: A Benchmark for Evaluating Multimodal Large Language Models in Electronic Navigational Chart Understanding Ao Cheng et.al. 2603.22763 translate read null
2026-03-24 MVPBench: A Multi-Video Perception Evaluation Benchmark for Multi-Modal Video Understanding Purui Bai et.al. 2603.22756 translate read null
2026-03-24 PRISM: A Dual View of LLM Reasoning through Semantic Flow and Latent Computation Ruidi Chang et.al. 2603.22754 translate read null
2026-03-24 CIPL: A Target-Independent Framework for Channel-Inversion Privacy Leakage in Agents Tao Huang et.al. 2603.22751 translate read null
2026-03-24 Beyond Binary Correctness: Scaling Evaluation of Long-Horizon Agents on Subjective Enterprise Tasks Abhishek Chandwani et.al. 2603.22744 translate read null
2026-03-24 Explanation Generation for Contradiction Reconciliation with LLMs Jason Chan et.al. 2603.22735 translate read null
2026-03-24 HyFI: Hyperbolic Feature Interpolation for Brain-Vision Alignment Sangmin Jo et.al. 2603.22721 translate read null
2026-03-24 Does Teaming-Up LLMs Improve Secure Code Generation? A Comprehensive Evaluation with Multi-LLMSecCodeEval Bushra Sabir et.al. 2603.22717 translate read null
2026-03-24 Detecting Non-Membership in LLM Training Data via Rank Correlations Pranav Shetty et.al. 2603.22707 translate read null
2026-03-24 Synthetic or Authentic? Building Mental Patient Simulators from Longitudinal Evidence Baihan Li et.al. 2603.22704 translate read null
2026-03-24 GeoTikzBridge: Advancing Multimodal Code Generation for Geometric Perception and Reasoning Jiayin Sun et.al. 2603.22687 translate read null
2026-03-24 Improving LLM Predictions via Inter-Layer Structural Encoders Tom Ulanovski et.al. 2603.22665 translate read null
2026-03-24 Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies Siddhant Kulkarni et.al. 2603.22651 translate read null
2026-03-23 AwesomeLit: Towards Hypothesis Generation with Agent-Supported Literature Research Zefei Xie et.al. 2603.22648 translate read null
2026-03-23 Multi-Method Validation of Large Language Model Medical Translation Across High- and Low-Resource Languages Chukwuebuka Anyaegbuna et.al. 2603.22642 translate read null
2026-03-23 LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation Hailay Teklehaymanot et.al. 2603.22629 translate read null
2026-03-23 To Agree or To Be Right? The Grounding-Sycophancy Tradeoff in Medical Vision-Language Models OFM Riaz Rahman Aranya et.al. 2603.22623 translate read null
2026-03-23 Emotional Support with Conversational AI: Talking to Machines About Life Olivia Yan Huang et.al. 2603.22618 translate read null
2026-03-23 BioShield: A Context-Aware Firewall for Securing Bio-LLMs Protiva Das et.al. 2603.22612 translate read null
2026-03-23 Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance Count and Context Length Jingxuan Chen et.al. 2603.22608 translate read null
2026-03-23 Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models? Richard J. Young et.al. 2603.22582 translate read null
2026-03-23 STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving James Hugglestone et.al. 2603.22577 translate read null
2026-03-23 CAPITU: A Benchmark for Evaluating Instruction-Following in Brazilian Portuguese with Literary Context Giovana Kerche Bonás et.al. 2603.22576 translate read null
2026-03-23 TrustTrade: Human-Inspired Selective Consensus Reduces Decision Uncertainty in LLM Trading Agents Minghan Li et.al. 2603.22567 translate read null
2026-03-23 Reddit After Roe: A Computational Analysis of Abortion Narratives and Barriers in the Wake of Dobbs Aria Pessianzadeh et.al. 2603.22566 translate read null
2026-03-23 Privacy-Preserving Reinforcement Learning from Human Feedback via Decoupled Reward Modeling Young Hyun Cho et.al. 2603.22563 translate read null
2026-03-23 GraphRAG for Engineering Diagrams: ChatP&ID Enables LLM Interaction with P&IDs Achmad Anggawirya Alimin et.al. 2603.22528 translate read null
2026-03-23 LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface Michael Hind et.al. 2603.22519 translate read null
2026-03-23 Generating and Evaluating Sustainable Procurement Criteria for the Swiss Public Sector using In-Context Prompting with Large Language Models Yingqiang Gao et.al. 2603.22513 translate read null
2026-03-23 Do Large Language Models Reduce Research Novelty? Evidence from Information Systems Journals Ali Safari et.al. 2603.22510 translate read null
2026-03-23 Rashid: A Cipher-Based Framework for Exploring In-Context Language Learning Niyati Bafna et.al. 2603.22497 translate read null
2026-03-23 Tiny Inference-Time Scaling with Latent Verifiers Davide Bucciarelli et.al. 2603.22492 translate read null
2026-03-23 From Brittle to Robust: Improving LLM Annotations for SE Optimization Lohith Senthilkumar et.al. 2603.22474 translate read null
2026-03-23 LLM-guided headline rewriting for clickability enhancement without clickbait Yehudit Aperstein et.al. 2603.22459 translate read null
2026-03-23 Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs Haoming Meng et.al. 2603.22446 translate read null
2026-03-23 From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents Ling Yue et.al. 2603.22386 translate read null
2026-03-23 FAAR: Format-Aware Adaptive Rounding for NVFP4 Hanglin Li et.al. 2603.22370 translate read null
2026-03-23 Reasoner-Executor-Synthesizer: Scalable Agentic Architecture with Static O(1) Context Window Ivan Dobrovolskyi et.al. 2603.22367 translate read null
2026-03-22 Demystifying Low-Rank Knowledge Distillation in Large Language Models: Convergence, Generalization, and Information-Theoretic Guarantees Alberlucia Rafael Soarez et.al. 2603.22355 translate read null
2026-03-21 Errors in AI-Assisted Retrieval of Medical Literature: A Comparative Study Jenny Gao et.al. 2603.22344 translate read null
2026-03-21 T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search Hyomin Lee et.al. 2603.22341 translate read null
2026-03-21 Causal Direct Preference Optimization for Distributionally Robust Generative Recommendation Chu Zhao et.al. 2603.22335 translate read null
2026-03-20 Large Language Models for Missing Data Imputation: Understanding Behavior, Hallucination Effects, and Control Mechanisms Arthur Dantas Mangussi et.al. 2603.22332 translate read null
2026-03-23 VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding Ruoliu Yang et.al. 2603.22285 translate read null
2026-03-23 3D-Layout-R1: Structured Reasoning for Language-Instructed Spatial Editing Haoyu Zhen et.al. 2603.22279 translate read null
2026-03-23 The Dual Mechanisms of Spatial Reasoning in Vision-Language Models Kelly Cui et.al. 2603.22278 translate read null
2026-03-23 Greater accessibility can amplify discrimination in generative AI Carolin Holtermann et.al. 2603.22260 translate read null
2026-03-23 RotorMap and Quantum Fingerprints of DNA Sequences via Rotary Position Embeddings Danylo Yakymenko et.al. 2603.22245 translate read null
2026-03-23 Gumbel Distillation for Parallel Text Generation Chi Zhang et.al. 2603.22216 translate read null
2026-03-23 Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models Tom Biskupski et.al. 2603.22214 translate read null
2026-03-23 SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection Kexian Tang et.al. 2603.22213 translate read null
2026-03-23 Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement Junrong Guo et.al. 2603.22187 translate read null
2026-03-23 Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation Ireh Kim et.al. 2603.22186 translate read null
2026-03-23 Revisiting Quantum Code Generation: Where Should Domain Knowledge Live? Oscar Novo et.al. 2603.22184 translate read null
2026-03-23 Closed-Loop Verbal Reinforcement Learning for Task-Level Robotic Planning Dmitrii Plotnikov et.al. 2603.22169 translate read null
2026-03-23 Causal Evidence that Language Models use Confidence to Drive Behavior Dharshan Kumaran et.al. 2603.22161 translate read null
2026-03-23 Multimodal Survival Analysis with Locally Deployable Large Language Models Moritz Gögl et.al. 2603.22158 translate read null
2026-03-23 On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Kexin Huang et.al. 2603.22117 translate read null
2026-03-23 Lemma Discovery in Agentic Program Verification Huan Zhao et.al. 2603.22114 translate read null
2026-03-23 SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning Byungwoo Jeon et.al. 2603.22057 translate read null
2026-03-23 Dual-Space Knowledge Distillation with Key-Query Matching for Large Language Models with Vocabulary Mismatch Stella Eva Tsiapali et.al. 2603.22056 translate read null
2026-03-23 Dynamic analysis enhances issue resolution Mingwei Liu et.al. 2603.22048 translate read null
2026-03-23 AdditiveLLM2: A Multi-modal Large Language Model for Additive Manufacturing Peter Pak et.al. 2603.22017 translate read null
2026-03-23 ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention Xinyan Wang et.al. 2603.22016 translate read null
2026-03-23 SecureBreak – A dataset towards safe and secure models Marco Arazzi et.al. 2603.21975 translate read null
2026-03-23 Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe Xixi Wu et.al. 2603.21972 translate read null
2026-03-23 Parameter-Efficient Fine-Tuning for Medical Text Summarization: A Comparative Study of Lora, Prompt Tuning, and Full Fine-Tuning Ulugbek Shernazarov et.al. 2603.21970 translate read null
2026-03-23 Unified Spatiotemporal Token Compression for Video-LLMs at Ultra-Low Retention Junhao Du et.al. 2603.21957 translate read null
2026-03-23 Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection Youbin Kim et.al. 2603.21944 translate read null
2026-03-23 ADaFuSE: Adaptive Diffusion-generated Image and Text Fusion for Interactive Text-to-Image Retrieval Zhuocheng Zhang et.al. 2603.21886 translate read null
2026-03-23 P^2O: Joint Policy and Prompt Optimization Xinyu Lu et.al. 2603.21877 translate read null
2026-03-23 Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization Weilin Wan et.al. 2603.21862 translate read null
2026-03-23 Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models Aryan Kasat et.al. 2603.21854 translate read null
2026-03-23 Asymmetric Dynamics of Partisan Warriors in YouTube Comments Keyeun Lee et.al. 2603.21776 translate read null
2026-03-23 The Presupposition Problem in Representation Genesis Yiling Wu et.al. 2603.21745 translate read null
2026-03-23 EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning Andreas Sauter et.al. 2603.21728 translate read null
2026-03-23 CurvZO: Adaptive Curvature-Guided Sparse Zeroth-Order Optimization for Efficient LLM Fine-Tuning Shuo Wang et.al. 2603.21725 translate read null
2026-03-23 Can a Robot Walk the Robotic Dog: Triple-Zero Collaborative Navigation for Heterogeneous Multi-Agent Systems Yaxuan Wang et.al. 2603.21723 translate read null
2026-03-23 SemEval-2026 Task 12: Abductive Event Reasoning: Towards Real-World Event Causal Inference for Large Language Models Pengfei Cao et.al. 2603.21720 translate read null
2026-03-23 Probing How Scalable Table Data Enhances General Long-Context Reasoning Huaibing Xie et.al. 2603.21719 translate read null
2026-03-23 Compensating Visual Insufficiency with Stratified Language Guidance for Long-Tail Class Incremental Learning Xi Wang et.al. 2603.21708 translate read null
2026-03-23 Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs Tian Xia et.al. 2603.21705 translate read null
2026-03-23 Rethinking Token Reduction for Large Vision-Language Models Yi Wang et.al. 2603.21701 translate read null
2026-03-23 Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models Rui Yang Tan et.al. 2603.21697 translate read null
2026-03-23 Deterministic Hallucination Detection in Medical VQA via Confidence-Evidence Bayesian Gain Mohammad Asadi et.al. 2603.21693 translate read null
2026-03-23 AI Token Futures Market: Commoditization of Compute and Derivatives Contract Design Yicai Xing et.al. 2603.21690 translate read null
2026-03-23 Is AI Ready for Multimodal Hate Speech Detection? A Comprehensive Dataset and Benchmark Evaluation Rui Xing et.al. 2603.21686 translate read null
2026-03-23 Optimizing Multi-Agent Weather Captioning via Text Gradient Descent: A Training-Free Approach with Consensus-Aware Gradient Fusion Shixu Liu et.al. 2603.21673 translate read null
2026-03-23 HumanOmni-Speaker: Identifying Who said What and When Detao Bai et.al. 2603.21664 translate read null
2026-03-23 TAMTRL: Teacher-Aligned Reward Reshaping for Multi-Turn Reinforcement Learning in Long-Context Compression Li Wang et.al. 2603.21663 translate read null
2026-03-23 OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging Meilin Liu et.al. 2603.21660 translate read null
2026-03-23 Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks Yanming Mu et.al. 2603.21654 translate read null
2026-03-23 Auditing MCP Servers for Over-Privileged Tool Capabilities Charoes Huang et.al. 2603.21641 translate read null
2026-03-23 Silicon Bureaucracy and AI Test-Oriented Education: Contamination Sensitivity and Score Confidence in LLM Benchmarks Yiliang Song et.al. 2603.21636 translate read null
2026-03-23 AgenticRec: End-to-End Tool-Integrated Policy Optimization for Ranking-Oriented Recommender Agents Tianyi Li et.al. 2603.21613 translate read null
2026-03-23 Riemannian Geometry Speaks Louder Than Words: From Graph Foundation Model to Next-Generation Graph Intelligence Philip S. Yu et.al. 2603.21601 translate read null
2026-03-23 SSAM: Singular Subspace Alignment for Merging Multimodal Large Language Models Md Kaykobad Reza et.al. 2603.21584 translate read null
2026-03-23 Overview of TREC 2025 Biomedical Generative Retrieval (BioGen) Track Deepak Gupta et.al. 2603.21582 translate read null
2026-03-23 Mind over Space: Can Multimodal Large Language Models Mentally Navigate? Qihui Zhu et.al. 2603.21577 translate read null
2026-03-23 Adaptive Robust Estimator for Multi-Agent Reinforcement Learning Zhongyi Li et.al. 2603.21574 translate read null
2026-03-23 DATASHI: A Parallel English-Tashlhiyt Corpus for Orthography Normalization and Low-Resource Language Processing Nasser-Eddine Monir et.al. 2603.21571 translate read null
2026-03-23 Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy Andrii Shportko et.al. 2603.21567 translate read null
2026-03-23 Counterfactual Credit Policy Optimization for Multi-Agent Collaboration Zhongyi Li et.al. 2603.21563 translate read null
2026-03-23 AI In Cybersecurity Education – Scalable Agentic CTF Design Principles and Educational Outcomes Haoran Xi et.al. 2603.21551 translate read null
2026-03-23 LLM-Based Test Case Generation in DBMS through Monte Carlo Tree Search Yujia Chen et.al. 2603.21530 translate read null
2026-03-23 SynSym: A Synthetic Data Generation Framework for Psychiatric Symptom Identification Migyeong Kang et.al. 2603.21529 translate read null
2026-03-23 VIGIL: Part-Grounded Structured Reasoning for Generalizable Deepfake Detection Xinghan Li et.al. 2603.21526 translate read null
2026-03-23 CatRAG: Functor-Guided Structural Debiasing with Retrieval Augmentation for Fair LLMs Ravi Ranjan et.al. 2603.21524 translate read null
2026-03-23 SafePilot: A Framework for Assuring LLM-enabled Cyber-Physical Systems Weizhe Xu et.al. 2603.21523 translate read null
2026-03-23 Efficient Failure Management for Multi-Agent Systems with Reasoning Trace Representation Lingzhe Zhang et.al. 2603.21522 translate read null
2026-03-23 Generalizable Self-Evolving Memory for Automatic Prompt Optimization Guanbao Liang et.al. 2603.21520 translate read null
2026-03-23 Triangulating Temporal Dynamics in Multilingual Swiss Online News Bros Victor et.al. 2603.21519 translate read null
2026-03-23 Learning Inflation Narratives from Reddit: How Lightweight LLMs Reveal Forward-Looking Economic Signals Ryuichi Saito et.al. 2603.21501 translate read null
2026-03-23 Agentic Automation of BT-RADS Scoring: End-to-End Multi-Agent System for Standardized Brain Tumor Follow-up Assessment Mohamed Sobhi Jabal et.al. 2603.21494 translate read null
2026-03-23 Learning Trajectory-Aware Multimodal Large Language Models for Video Reasoning Segmentation Jingnan Luo et.al. 2603.21488 translate read null
2026-03-23 TagLLM: A Fine-Grained Tag Generation Approach for Note Recommendation Zhijian Chen et.al. 2603.21481 translate read null
2026-03-23 Beyond Correlation: Refutation-Validated Aspect-Based Sentiment Analysis for Explainable Energy Market Returns Wihan van der Heever et.al. 2603.21473 translate read null
2026-03-23 DRTriton: Large-Scale Synthetic Data Reinforcement Learning for Triton Kernel Generation Siqi Guo et.al. 2603.21465 translate read null
2026-03-22 Deliberative multi-agent large language models improve clinical reasoning in ophthalmology Ehsan Misaghi et.al. 2603.21447 translate read null
2026-03-22 KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning Shuai Wang et.al. 2603.21440 translate read null
2026-03-22 DomAgent: Leveraging Knowledge Graphs and Case-Based Reasoning for Domain-Specific Code Generation Shuai Wang et.al. 2603.21430 translate read null
2026-03-22 Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models Jingchen Sun et.al. 2603.21426 translate read null
2026-03-22 Efficient Fine-Tuning Methods for Portuguese Question Answering: A Comparative Study of PEFT on BERTimbau and Exploratory Evaluation of Generative LLMs Mariela M. Nina et.al. 2603.21418 translate read null
2026-03-22 Enterprise Sales Copilot: Enabling Real-Time AI Support with Automatic Information Retrieval in Live Sales Calls Jielin Qiu et.al. 2603.21416 translate read null
2026-03-22 Silent Commitment Failure in Instruction-Tuned Language Models: Evidence of Governability Divergence Across Architectures Gregory M. Ruddell et.al. 2603.21415 translate read null
2026-03-22 Multi-Perspective LLM Annotations for Valid Analyses in Subjective Tasks Navya Mehrotra et.al. 2603.21404 translate read null
2026-03-22 Persona Vectors in Games: Measuring and Steering Strategies via Activation Vectors Johnathan Sun et.al. 2603.21398 translate read null
2026-03-22 Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models Jinghan Cao et.al. 2603.21389 translate read null
2026-03-22 PLR: Plackett-Luce for Reordering In-Context Learning Examples Pawel Batorski et.al. 2603.21373 translate read null
2026-03-22 TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference Jaber Jaber et.al. 2603.21365 translate read null
2026-03-22 Benchmarking Bengali Dialectal Bias: A Multi-Stage Framework Integrating RAG-Based Translation and Human-Augmented RLAIF K. M. Jubair Sami et.al. 2603.21359 translate read null
2026-03-22 RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models Dongyoung Kim et.al. 2603.21341 translate read null
2026-03-22 COINBench: Moving Beyond Individual Perspectives to Collective Intent Understanding Xiaozhe Li et.al. 2603.21329 translate read null
2026-03-22 Improving Coherence and Persistence in Agentic AI for System Optimization Pantea Karimi et.al. 2603.21321 translate read null
2026-03-22 Enhancing reasoning accuracy in large language models during inference time Vinay Sharma et.al. 2603.21301 translate read null
2026-03-22 When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning Zhengxian Wu et.al. 2603.21289 translate read null
2026-03-22 When the Chain Breaks: Interactive Diagnosis of LLM Chain-of-Thought Reasoning Errors Shiwei Chen et.al. 2603.21286 translate read null
2026-03-22 WARBENCH: A Comprehensive Benchmark for Evaluating LLMs in Military Decision-Making Zongjie Li et.al. 2603.21280 translate read null
2026-03-22 Conversation Tree Architecture: A Structured Framework for Context-Aware Multi-Branch LLM Conversations Pranav Hemanth et.al. 2603.21278 translate read null
2026-03-22 Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity Zihan Fang et.al. 2603.21276 translate read null
2026-03-22 Graph of States: Solving Abductive Tasks with Large Language Models Yu Luo et.al. 2603.21250 translate read null
2026-03-22 Graph Fusion Across Languages using Large Language Models Kaung Myat Kyaw et.al. 2603.21248 translate read null
2026-03-22 ConsRoute: Consistency-Aware Adaptive Query Routing for Cloud-Edge-Device Large Language Models Haoyu Qiao et.al. 2603.21237 translate read null
2026-03-22 QMoP: Query Guided Mixture-of-Projector for Efficient Visual Token Compression Zhongyang Li et.al. 2603.21232 translate read null
2026-03-22 Context Selection for Hypothesis and Statistical Evidence Extraction from Full-Text Scientific Articles Sai Koneru et.al. 2603.21193 translate read null
2026-03-22 DS2SC-Agent: A Multi-Agent Automated Pipeline for Rapid Chiplet Model Generation Yiwei Wu et.al. 2603.21190 translate read null
2026-03-22 GIDE: Unlocking Diffusion LLMs for Precise Training-Free Image Editing Zifeng Zhu et.al. 2603.21176 translate read null
2026-03-22 Reward Sharpness-Aware Fine-Tuning for Diffusion Models Kwanyoung Kim et.al. 2603.21175 translate read null
2026-03-22 Explainable Semantic Textual Similarity via Dissimilar Span Detection Diego Miguel Lozano et.al. 2603.21174 translate read null
2026-03-22 Many Dialects, Many Languages, One Cultural Lens: Evaluating Multilingual VLMs for Bengali Culture Understanding Across Historically Linked Languages and Regional Dialects Nurul Labib Sayeedi et.al. 2603.21165 translate read null
2026-03-22 Revisiting Tree Search for LLMs: Gumbel and Sequential Halving for Budget-Scalable Reasoning Leonid Ugadiarov et.al. 2603.21162 translate read null
2026-03-22 Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs Zihui Chen et.al. 2603.21155 translate read null
2026-03-22 TRACE: A Multi-Agent System for Autonomous Physical Reasoning in Seismological Feng Liu et.al. 2603.21152 translate read null
2026-03-22 ORACLE: Optimizing Reasoning Abilities of Large Language Models via Constraint-Led Synthetic Data Elicitation Zhuojie Yang et.al. 2603.21140 translate read null
2026-03-22 CVT-Bench: Counterfactual Viewpoint Transformations Reveal Unstable Spatial Representations in Multimodal LLMs Shanmukha Vellamcheti et.al. 2603.21114 translate read null
2026-03-22 Evaluating Reasoning-Based Scaffolds for Human-AI Co-Annotation: The ReasonAlign Annotation Protocol Smitha Muthya Sudheendra et.al. 2603.21094 translate read null
2026-03-22 CoVFT: Context-aware Visual Fine-tuning for Multimodal Large Language Models Nan Zhou et.al. 2603.21077 translate read null
2026-03-22 When Minor Edits Matter: LLM-Driven Prompt Attack for Medical VLM Robustness in Ultrasound Yasamin Medghalchi et.al. 2603.21047 translate read null
2026-03-22 Left Behind: Cross-Lingual Transfer as a Bridge for Low-Resource Languages in Large Language Models Abdul-Salem Beibitkhan et.al. 2603.21036 translate read null
2026-03-22 KLDrive: Fine-Grained 3D Scene Reasoning for Autonomous Driving based on Knowledge Graph Ye Tian et.al. 2603.21029 translate read null
2026-03-22 SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration Zihan Guo et.al. 2603.21019 translate read null
2026-03-22 Mitigating Selection Bias in Large Language Models via Permutation-Aware GRPO Jinquan Zheng et.al. 2603.21016 translate read null
2026-03-22 CLT-Forge: A Scalable Library for Cross-Layer Transcoders and Attribution Graphs Florent Draye et.al. 2603.21014 translate read null
2026-03-22 ECI: Effective Contrastive Information to Evaluate Hard-Negatives Aarush Sinha et.al. 2603.20990 translate read null
2026-03-22 Can we automatize scientific discovery in the cognitive sciences? Akshay K. Jagadish et.al. 2603.20988 translate read null
2026-03-21 Detection of adversarial intent in Human-AI teams using LLMs Abed K. Musaffar et.al. 2603.20976 translate read null
2026-03-21 Learning to Aggregate Zero-Shot LLM Agents for Corporate Disclosure Classification Kemal Kirtac et.al. 2603.20965 translate read null
2026-03-21 Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models Xinyue Liu et.al. 2603.20957 translate read null
2026-03-21 User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction Yuren Hao et.al. 2603.20939 translate read null
2026-03-21 AC4A: Access Control for Agents Reshabh K Sharma et.al. 2603.20933 translate read null
2026-03-21 Do LLM-Driven Agents Exhibit Engagement Mechanisms? Controlled Tests of Information Load, Descriptive Norms, and Popularity Cues Tai-Quan Peng et.al. 2603.20911 translate read null
2026-03-21 LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models Amirmohammad Ziaei Bideh et.al. 2603.20910 translate read null
2026-03-21 Mitigating Shortcut Reasoning in Language Models: A Gradient-Aware Training Approach Hongyu Cao et.al. 2603.20899 translate read null
2026-03-21 AcoustEmo: Open-Vocabulary Emotion Reasoning via Utterance-Aware Acoustic Q-Former Liyun Zhang et.al. 2603.20894 translate read null
2026-03-21 RubricRAG: Towards Interpretable and Reliable LLM Evaluation via Domain Knowledge Retrieval for Rubric Generation Kaustubh D. Dhole et.al. 2603.20882 translate read null
2026-03-21 Engineering Pitfalls in AI Coding Tools: An Empirical Study of Bugs in Claude Code, Codex, and Gemini CLI Ruixin Zhang et.al. 2603.20847 translate read null
2026-03-21 Predictive Regularization Against Visual Representation Degradation in Multimodal Large Language Models Enguang Wang et.al. 2603.20808 translate read null
2026-03-21 BenchBench: Benchmarking Automated Benchmark Generation Yandan Zheng et.al. 2603.20807 translate read null
2026-03-21 RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution Kaiyuan Li et.al. 2603.20799 translate read null
2026-03-21 The Anatomy of an Edit: Mechanism-Guided Activation Steering for Knowledge Editing Yuan Cao et.al. 2603.20795 translate read null
2026-03-21 Code-MIE: A Code-style Model for Multimodal Information Extraction with Scene Graph and Entity Attribute Knowledge Enhancement Jiang Liu et.al. 2603.20781 translate read null
2026-03-21 SATTC: Structure-Aware Label-Free Test-Time Calibration for Cross-Subject EEG-to-Image Retrieval Qunjie Huang et.al. 2603.20738 translate read null
2026-03-21 MzansiText and MzansiLM: An Open Corpus and Decoder-Only Language Model for South African Languages Anri Lombard et.al. 2603.20732 translate read null
2026-03-21 Premier: Personalized Preference Modulation with Learnable User Embedding in Text-to-Image Generation Zihao Wang et.al. 2603.20725 translate read null
2026-03-21 Cross-modal Fuzzy Alignment Network for Text-Aerial Person Retrieval and A Large-scale Benchmark Yifei Deng et.al. 2603.20721 translate read null
2026-03-21 NDT: Non-Differential Transformer and Its Application to Sentiment Analysis Soudeep Ghoshal et.al. 2603.20704 translate read null
2026-03-21 Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs Huan Zheng et.al. 2603.20698 translate read null
2026-03-21 AI-Driven Multi-Agent Simulation of Stratified Polyamory Systems: A Computational Framework for Optimizing Social Reproductive Efficiency Yicai Xing et.al. 2603.20678 translate read null
2026-03-21 Towards Intelligent Geospatial Data Discovery: a knowledge graph-driven multi-agent framework powered by large language models Ruixiang Liu et.al. 2603.20670 translate read null
2026-03-21 WWW.Serve: Interconnecting Global LLM Services through Decentralization Huanyu Wang et.al. 2603.20661 translate read null
2026-03-21 A Multihead Continual Learning Framework for Fine-Grained Fashion Image Retrieval with Contrastive Learning and Exponential Moving Average Distillation Ling Xiao et.al. 2603.20648 translate read null
2026-03-21 Hear Both Sides: Efficient Multi-Agent Debate via Diversity-Aware Message Retention Manh Nguyen et.al. 2603.20640 translate read null
2026-03-21 OmniCodec: Low Frame Rate Universal Audio Codec with Semantic-Acoustic Disentanglement Jingbin Hu et.al. 2603.20638 translate read null
2026-03-21 AEGIS: From Clues to Verdicts – Graph-Guided Deep Vulnerability Reasoning via Dialectics and Meta-Auditing Sen Fang et.al. 2603.20637 translate read null
2026-03-21 A Modular LLM Framework for Explainable Price Outlier Detection Shadi Sartipi et.al. 2603.20636 translate read null
2026-03-21 Optimal low-rank stochastic gradient estimation for LLM training Zehao Li et.al. 2603.20632 translate read null
2026-03-21 Evaluating LLM-generated code for domain-specific languages: molecular dynamics with LAMMPS Ethan Holbrook et.al. 2603.20630 translate read null
2026-03-21 The Art of Midwifery in LLMs: Optimizing Role Personas for Large Language Models as Moral Assistants Yangyi Wu et.al. 2603.20626 translate read null
2026-03-21 JUBAKU: An Adversarial Benchmark for Exposing Culturally Grounded Stereotypes in Japanese LLMs Taihei Shiotani et.al. 2603.20581 translate read null
2026-03-21 Context Cartography: Toward Structured Governance of Contextual Space in Large Language Model Systems Zihua Wu et.al. 2603.20578 translate read null
2026-03-21 LJ-Bench: Ontology-Based Benchmark for U.S. Crime Hung Yun Tseng et.al. 2603.20572 translate read null
2026-03-20 Permutation-Consensus Listwise Judging for Robust Factuality Evaluation Tianyi Huang et.al. 2603.20562 translate read null
2026-03-20 Understanding Behavior Cloning with Action Quantization Haoqun Cao et.al. 2603.20538 translate read null
2026-03-20 RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization Shenyang Deng et.al. 2603.20527 translate read null
2026-03-20 Evaluating Large Language Models on Historical Health Crisis Knowledge in Resource-Limited Settings: A Hybrid Multi-Metric Study Mohammed Rakibul Hasan et.al. 2603.20514 translate read null
2026-03-20 AE-LLM: Adaptive Efficiency Optimization for Large Language Models Kaito Tanaka et.al. 2603.20492 translate read null
2026-03-20 Developing an ESG-Oriented Large Language Model through ESG Practices Gabriel Assis et.al. 2603.20480 translate read null
2026-03-20 Diffutron: A Masked Diffusion Language Model for Turkish Language Şuayp Talha Kocabay et.al. 2603.20466 translate read null
2026-03-20 Solver-Aided Verification of Policy Compliance in Tool-Augmented LLM Agents Cailin Winston et.al. 2603.20449 translate read null
2026-03-20 A Training-Free Regeneration Paradigm: Contrastive Reflection Memory Guided Self-Verification and Self-Improvement Yuran Li et.al. 2603.20441 translate read null
2026-03-20 Deep reflective reasoning in interdependence constrained structured data extraction from clinical notes for digital health Jingwei Huang et.al. 2603.20435 translate read null
2026-03-20 Coding Agents are Effective Long-Context Processors Weili Cao et.al. 2603.20432 translate read null
2026-03-20 KV Cache Optimization Strategies for Scalable and Efficient LLM Inference Yichun Xu et.al. 2603.20397 translate read null
2026-03-20 The production of meaning in the processing of natural language Christopher J. Agostino et.al. 2603.20381 translate read null
2026-03-20 LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation Jiazheng Xing et.al. 2603.20192 translate read null
2026-03-20 IndoorR2X: Indoor Robot-to-Everything Coordination with LLM-Driven Planning Fan Yang et.al. 2603.20182 translate read null
2026-03-20 AI Agents Can Already Autonomously Perform Experimental High Energy Physics Eric A. Moreno et.al. 2603.20179 translate read null
2026-03-20 Learning Dynamic Belief Graphs for Theory-of-mind Reasoning Ruxiao Chen et.al. 2603.20170 translate read null
2026-03-20 Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models Qi Cao et.al. 2603.20161 translate read null
2026-03-20 Enhancing Hyperspace Analogue to Language (HAL) Representations via Attention-Based Pooling for Text Classification Ali Sakour et.al. 2603.20149 translate read null
2026-03-12 MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning Haozhan Shen et.al. 2603.12266 translate read null
2026-03-12 Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously Yiran Guan et.al. 2603.12262 translate read null
2026-03-12 Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing Baifeng Shi et.al. 2603.12254 translate read null
2026-03-12 EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models Xuanlang Dai et.al. 2603.12252 translate read null
2026-03-12 Language Model Teams as Distributed Systems Elizabeth Mieczkowski et.al. 2603.12229 translate read null
2026-03-12 Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration Priyanka Kargupta et.al. 2603.12226 translate read null
2026-03-12 ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models Yingxin Lai et.al. 2603.12208 translate read null
2026-03-12 CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks Alexandre Le Mercier et.al. 2603.12206 translate read null
2026-03-12 IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse Yushi Bai et.al. 2603.12201 translate read null
2026-03-12 Long-Context Encoder Models for Polish Language Understanding Sławomir Dadas et.al. 2603.12191 translate read null
2026-03-12 LatentGeo: Learnable Auxiliary Constructions in Latent Space for Multimodal Geometric Reasoning Haiying Xu et.al. 2603.12166 translate read null
2026-03-12 LifeSim: Long-Horizon User Life Simulator for Personalized Assistant Evaluation Feiyu Duan et.al. 2603.12152 translate read null
2026-03-12 IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL Zhoujun Cheng et.al. 2603.12151 translate read null
2026-03-12 Linking Perception, Confidence and Accuracy in MLLMs Yuetian Du et.al. 2603.12149 translate read null
2026-03-12 EgoIntent: An Egocentric Step-level Benchmark for Understanding What, Why, and Next Ye Pan et.al. 2603.12147 translate read null
2026-03-12 TopoBench: Benchmarking LLMs on Hard Topological Reasoning Mayug Maniparambil et.al. 2603.12133 translate read null
2026-03-12 Hoi3DGen: Generating High-Quality Human-Object-Interactions in 3D Agniv Sharma et.al. 2603.12126 translate read null
2026-03-12 Cross-Context Review: Improving LLM Output Quality by Separating Production and Review Sessions Tae-Eun Song et.al. 2603.12123 translate read null
2026-03-12 SommBench: Assessing Sommelier Expertise of Language Models William Brach et.al. 2603.12117 translate read null
2026-03-12 On Information Self-Locking in Reinforcement Learning for Active Reasoning of LLM agents Deyu Zou et.al. 2603.12109 translate read null
2026-03-12 EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation Yan Li et.al. 2603.12108 translate read null
2026-03-12 To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times Thomas Hikaru Clark et.al. 2603.12105 translate read null
2026-03-12 Human-Centred LLM Privacy Audits: Findings and Frictions Dimitri Staufer et.al. 2603.12094 translate read null
2026-03-12 Resource-Efficient Iterative LLM-Based NAS with Feedback Memory Xiaojie Gu et.al. 2603.12091 translate read null
2026-03-12 Cascade: Composing Software-Hardware Attack Gadgets for Adversarial Threat Amplification in Compound AI Systems Sarbartha Banerjee et.al. 2603.12023 translate read null
2026-03-12 BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs Ilias Aarab et.al. 2603.11991 translate read null
2026-03-12 LABSHIELD: A Multimodal Benchmark for Safety-Critical Reasoning and Planning in Scientific Laboratories Qianpu Sun et.al. 2603.11987 translate read null
2026-03-12 CHiL(L)Grader: Calibrated Human-in-the-Loop Short-Answer Grading Pranav Raikote et.al. 2603.11957 translate read null
2026-03-12 PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents Minjia Wang et.al. 2603.11955 translate read null
2026-03-12 MobileKernelBench: Can LLMs Write Efficient Kernels for Mobile Devices? Xingze Zou et.al. 2603.11935 translate read null
2026-03-12 Chem4DLLM: 4D Multimodal LLMs for Chemical Dynamics Understanding Xinyu Li et.al. 2603.11924 translate read null
2026-03-12 CoMMET: To What Extent Can LLMs Perform Theory of Mind Tasks? Ruirui Chen et.al. 2603.11915 translate read null
2026-03-12 Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks Junjie Chu et.al. 2603.11914 translate read null
2026-03-12 Think While Watching: Online Streaming Segment-Level Memory for Multi-Turn Video Reasoning in Multimodal Large Language Models Lu Wang et.al. 2603.11896 translate read null
2026-03-12 QUARE: Multi-Agent Negotiation for Balancing Quality Attributes in Requirements Engineering Haowei Cheng et.al. 2603.11890 translate read null
2026-03-12 Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language Remigiusz Kinas et.al. 2603.11881 translate read null
2026-03-12 Silent Speech Interfaces in the Era of Large Language Models: A Comprehensive Taxonomy and Systematic Review Kele Xu et.al. 2603.11877 translate read null
2026-03-12 AdaFuse: Accelerating Dynamic Adapter Inference via Token-Level Pre-Gating and Fused Kernel Optimization Qiyang Li et.al. 2603.11873 translate read null
2026-03-12 ZeroSense: How Vision matters in Long Context Compression Yonghan Gao et.al. 2603.11846 translate read null
2026-03-12 DatedGPT: Preventing Lookahead Bias in Large Language Models with Time-Aware Pretraining Yutong Yan et.al. 2603.11838 translate read null
2026-03-12 Towards High-Fidelity CAD Generation via LLM-Driven Program Generation and Text-Based B-Rep Primitive Grounding Jiahao Li et.al. 2603.11831 translate read null
2026-03-12 Large language models for optical network O&M: Agent-embedded workflow for automation Shengnan Li et.al. 2603.11828 translate read null
2026-03-12 OMNIA: Closing the Loop by Leveraging LLMs for Knowledge Graph Completion Frédéric Ieng et.al. 2603.11820 translate read null
2026-03-12 RADAR: Closed-Loop Robotic Data Generation via Semantic Planning and Autonomous Causal Environment Reset Yongzhong Wang et.al. 2603.11811 translate read null
2026-03-12 Automating Skill Acquisition through Large-Scale Mining of Open-Source Agentic Repositories: A Framework for Multi-Agent Procedural Knowledge Extraction Shuzhen Bi et.al. 2603.11808 translate read null
2026-03-12 DocSage: An Information Structuring Agent for Multi-Doc Multi-Entity Question Answering Teng Lin et.al. 2603.11798 translate read null
2026-03-12 Language Generation with Replay: A Learning-Theoretic View of Model Collapse Giorgio Racca et.al. 2603.11784 translate read null
2026-03-12 Large Language Models for Biomedical Article Classification Jakub Proboszcz et.al. 2603.11780 translate read null
2026-03-12 Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents Yaocong Li et.al. 2603.11772 translate read null
2026-03-12 Governing Evolving Memory in LLM Agents: Risks, Mechanisms, and the Stability and Safety Governed Memory (SSGM) Framework Chingkwun Lam et.al. 2603.11768 translate read null
2026-03-12 Gender Bias in Generative AI-assisted Recruitment Processes Martina Ullasci et.al. 2603.11736 translate read null
2026-03-12 When OpenClaw Meets Hospital: Toward an Agentic Operating System for Dynamic Clinical Workflows Wenxian Yang et.al. 2603.11721 translate read null
2026-03-12 Scaling Laws for Educational AI Agents Mengsong Wu et.al. 2603.11709 translate read null
2026-03-12 OSCBench: Benchmarking Object State Change in Text-to-Video Generation Xianjing Han et.al. 2603.11698 translate read null
2026-03-12 Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks Mei Chee Leong et.al. 2603.11689 translate read null
2026-03-12 SemBench: A Universal Semantic Framework for LLM Evaluation Mikel Zubillaga et.al. 2603.11687 translate read null
2026-03-12 From Control to Foresight: Simulation as a New Paradigm for Human-Agent Collaboration Gaole He et.al. 2603.11677 translate read null
2026-03-12 Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge Junjie Wu et.al. 2603.11665 translate read null
2026-03-12 Resonate: Reinforcing Text-to-Audio Generation via Online Feedback from Large Audio Language Models Xiquan Li et.al. 2603.11661 translate read null
2026-03-12 Tokenization Allows Multimodal Large Language Models to Understand, Generate and Edit Architectural Floor Plans Sizhong Qin et.al. 2603.11640 translate read null
2026-03-12 VisDoT: Enhancing Visual Reasoning through Human-Like Interpretation Grounding and Decomposition of Thought Eunsoo Lee et.al. 2603.11631 translate read null
2026-03-12 Sema: A High-performance System for LLM-based Semantic Query Processing Kangkang Qi et.al. 2603.11622 translate read null
2026-03-12 Taming OpenClaw: Security Analysis and Mitigation of Autonomous LLM Agent Threats Xinhao Deng et.al. 2603.11619 translate read null
2026-03-12 LaMoGen: Language to Motion Generation Through LLM-Guided Symbolic Inference Junkun Jiang et.al. 2603.11605 translate read null
2026-03-12 Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese Masataka Kawai et.al. 2603.11597 translate read null
2026-03-12 Leveraging Large Language Models and Survival Analysis for Early Prediction of Chemotherapy Outcomes Muhammad Faisal Shahid et.al. 2603.11594 translate read null
2026-03-12 UtilityMax Prompting: A Formal Framework for Multi-Objective Large Language Model Optimization Ofir Marom et.al. 2603.11583 translate read null
2026-03-12 Streaming Translation and Transcription Through Speech-to-Text Causal Alignment Roman Koshkin et.al. 2603.11578 translate read null
2026-03-12 Where Matters More Than What: Decoding-aligned KV Cache Compression via Position-aware Pseudo Queries Zhenxu Tian et.al. 2603.11564 translate read null
2026-03-12 AI Knows What’s Wrong But Cannot Fix It: Helicoid Dynamics in Frontier LLMs Under High-Stakes Decisions Alejandro R Jadad et.al. 2603.11559 translate read null
2026-03-12 FBCIR: Balancing Cross-Modal Focuses in Composed Image Retrieval Chenchen Zhao et.al. 2603.11520 translate read null
2026-03-12 Multi-Agent Collaboration for Automated Design Exploration on High Performance Computing Systems Harshitha Menon et.al. 2603.11515 translate read null
2026-03-12 KEPo: Knowledge Evolution Poison on Graph-based Retrieval-Augmented Generation Qizhi Chen et.al. 2603.11501 translate read null
2026-03-12 Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs Kunfeng Chen et.al. 2603.11495 translate read null
2026-03-12 PRMB: Benchmarking Reward Models in Long-Horizon CBT-based Counseling Dialogue Yougen Zhou et.al. 2603.11494 translate read null
2026-03-12 AutoVeriFix+: High-Correctness RTL Generation via Trace-Aware Causal Fix and Semantic Redundancy Pruning Yan Tan et.al. 2603.11489 translate read null
2026-03-12 Quantized Inference for OneRec-V2 Yi Su et.al. 2603.11486 translate read null
2026-03-12 INFACT: A Diagnostic Benchmark for Induced Faithfulness and Factuality Hallucinations in Video-LLMs Junqi Yang et.al. 2603.11481 translate read null
2026-03-12 Deep Learning Network-Temporal Models For Traffic Prediction Yufeng Xin et.al. 2603.11475 translate read null
2026-03-12 CoViLLM: An Adaptive Human-Robot Collaborative Assembly Framework Using Large Language Models for Manufacturing Jiabao Zhao et.al. 2603.11461 translate read null
2026-03-12 LLM-Assisted Causal Structure Disambiguation and Factor Extraction for Legal Judgment Prediction Yuzhi Liang et.al. 2603.11446 translate read null
2026-03-12 BLooP: Zero-Shot Abstractive Summarization using Large Language Models with Bigram Lookahead Promotion Varun Iyer et.al. 2603.11415 translate read null
2026-03-12 MaterialFigBENCH: benchmark dataset with figures for evaluating college-level materials science problem-solving abilities of multimodal large language models Michiko Yoshitake et.al. 2603.11414 translate read null
2026-03-12 Algorithmic Consequences of Particle Filters for Sentence Processing: Amplified Garden-Paths and Digging-In Effects Amani Maina-Kilaas et.al. 2603.11412 translate read null
2026-03-12 Speak or Stay Silent: Context-Aware Turn-Taking in Multi-Party Dialogue Kratika Bhagtani et.al. 2603.11409 translate read null
2026-03-12 Beyond Polarity: Multi-Dimensional LLM Sentiment Signals for WTI Crude Oil Futures Return Prediction Dehao Dai et.al. 2603.11408 translate read null
2026-03-12 Stop Listening to Me! How Multi-turn Conversations Can Degrade Diagnostic Reasoning Kevin H. Guo et.al. 2603.11394 translate read null
2026-03-12 To Believe or Not To Believe: Comparing Supporting Information Tools to Aid Human Judgments of AI Veracity Jessica Irons et.al. 2603.11393 translate read null
2026-03-12 Agentic AI for Embodied-enhanced Beam Prediction in Low-Altitude Economy Networks Min Hao et.al. 2603.11392 translate read null
2026-03-12 BEACON: Budget-Aware Entity Matching Across Domains (Extended Technical Report) Nicholas Pulsone et.al. 2603.11391 translate read null
2026-03-12 Deactivating Refusal Triggers: Understanding and Mitigating Overrefusal in Safety Alignment Zhiyu Xue et.al. 2603.11388 translate read null
2026-03-11 DriveXQA: Cross-modal Visual Question Answering for Adverse Driving Scene Understanding Mingzhe Tao et.al. 2603.11380 translate read null
2026-03-11 Resolving Java Code Repository Issues with iSWE Agent Jatin Ganhotra et.al. 2603.11356 translate read null
2026-03-11 Novelty Adaptation Through Hybrid Large Language Model (LLM)-Symbolic Planning and LLM-guided Reinforcement Learning Hong Lu et.al. 2603.11351 translate read null
2026-03-11 FinRule-Bench: A Benchmark for Joint Reasoning over Financial Tables and Principles Arun Vignesh Malarkkan et.al. 2603.11339 translate read null
2026-03-11 LLM-Augmented Digital Twin for Policy Evaluation in Short-Video Platforms Haoting Zhang et.al. 2603.11333 translate read null
2026-03-11 Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover Indranil Halder et.al. 2603.11331 translate read null
2026-03-11 Bridging the Cognitive Gap: Co-Designing and Evaluating a Voice-Enabled Community Chatbot for Older Adults Feng Chen et.al. 2603.11303 translate read null
2026-03-11 Counterweights and Complementarities: The Convergence of AI and Blockchain Powering a Decentralized Future Yibai Li et.al. 2603.11299 translate read null
2026-03-11 Temporal Text Classification with Large Language Models Nishat Raihan et.al. 2603.11295 translate read null
2026-03-11 AI Psychometrics: Evaluating the Psychological Reasoning of Large Language Models with Psychometric Validities Yibai Li et.al. 2603.11279 translate read null
2026-03-11 COMPASS: The explainable agentic framework for Sovereignty, Sustainability, Compliance, and Ethics Jean-Sébastien et.al. 2603.11277 translate read null
2026-03-11 The Unlearning Mirage: A Dynamic Framework for Evaluating LLM Unlearning Raj Sanjay Shah et.al. 2603.11266 translate read null
2026-03-11 Artificial Intelligence for Sentiment Analysis of Persian Poetry Arash Zargar et.al. 2603.11254 translate read null
2026-03-11 LLMs Can Infer Political Alignment from Online Conversations Byunghwee Lee et.al. 2603.11253 translate read null
2026-03-11 Reversible Lifelong Model Editing via Semantic Routing-Based LoRA Haihua Luo et.al. 2603.11239 translate read null
2026-03-11 Markovian Generation Chains in Large Language Models Mingmeng Geng et.al. 2603.11228 translate read null
2026-03-11 Security-by-Design for LLM-Based Code Generation: Leveraging Internal Representations for Concept-Driven Steering Mechanisms Maximilian Wendlinger et.al. 2603.11212 translate read null
2026-03-11 Can LLMs Help Localize Fake Words in Partially Fake Speech? Lin Zhang et.al. 2603.11205 translate read null
2026-03-11 DeReason: A Difficulty-Aware Curriculum Improves Decoupled SFT-then-RL Training for General Reasoning Hanxu Hu et.al. 2603.11193 translate read null
2026-03-11 Systematic Scaling Analysis of Jailbreak Attacks in Large Language Models Xiangwen Wang et.al. 2603.11149 translate read null
2026-03-11 H2LooP Spark Preview: Continual Pretraining of Large Language Models for Low-Level Embedded Systems Code Amit Singh et.al. 2603.11139 translate read null
2026-03-11 Enhancing Value Alignment of LLMs with Multi-agent system and Combinatorial Fusion Yuanhong Wu et.al. 2603.11126 translate read null
2026-03-11 Uni-ASR: Unified LLM-Based Architecture for Non-Streaming and Streaming Automatic Speech Recognition Yinfeng Xia et.al. 2603.11123 translate read null
2026-03-11 Task-Conditioned Routing Signatures in Sparse Mixture-of-Experts Transformers Mynampati Sri Ranganadha Avinash et.al. 2603.11114 translate read null
2026-03-11 Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining Zhiyuan Zeng et.al. 2603.11103 translate read null
2026-03-11 Graph Tokenization for Bridging Graphs and Transformers Zeyuan Guo et.al. 2603.11099 translate read null
2026-03-11 The Attack and Defense Landscape of Agentic AI: A Comprehensive Survey Juhee Kim et.al. 2603.11088 translate read null
2026-03-10 Quality-Driven Agentic Reasoning for LLM-Assisted Software Design: Questions-of-Thoughts (QoT) as a Time-Series Self-QA Chain Yen-Ku Liu et.al. 2603.11082 translate read null
2026-03-10 CR-Bench: Evaluating the Real-World Utility of AI Code Review Agents Kristen Pereira et.al. 2603.11078 translate read null
2026-03-10 Summarize Before You Speak with ARACH: A Training-Free Inference-Time Plug-In for Enhancing LLMs via Global Attention Reallocation Jingtao Wang et.al. 2603.11067 translate read null
2026-03-11 LLMGreenRec: LLM-Based Multi-Agent Recommender System for Sustainable E-Commerce Hao N. Nguyen et.al. 2603.11025 translate read null
2026-03-11 Does AI See like Art Historians? Interpreting How Vision Language Models Recognize Artistic Style Marvin Limpijankit et.al. 2603.11024 translate read null
2026-03-11 Leech Lattice Vector Quantization for Efficient LLM Compression Tycho F. A. van der Ouderaa et.al. 2603.11021 translate read null
2026-03-11 A Systematic Study of Pseudo-Relevance Feedback with LLMs Nour Jedidi et.al. 2603.11008 translate read null
2026-03-11 TOSSS: a CVE-based Software Security Benchmark for Large Language Models Marc Damie et.al. 2603.10969 translate read null
2026-03-11 LLM2Vec-Gen: Generative Embeddings from Large Language Models Parishad BehnamGhader et.al. 2603.10913 translate read null
2026-03-11 When Fine-Tuning Fails and when it Generalises: Role of Data Diversity and Mixed Training in LLM-based TTS Anupam Purwar et.al. 2603.10904 translate read null
2026-03-11 LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation Jinwoo Ahn et.al. 2603.10899 translate read null
2026-03-11 A Hybrid Knowledge-Grounded Framework for Safety and Traceability in Prescription Verification Yichi Zhu et.al. 2603.10891 translate read null
2026-03-11 Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models Yixiu Mao et.al. 2603.10887 translate read null
2026-03-11 Exploring Indicators of Developers’ Sentiment Perceptions in Student Software Projects Martin Obaidi et.al. 2603.10864 translate read null
2026-03-11 Beyond Sequential Distance: Inter-Modal Distance Invariant Position Encoding Lin Chen et.al. 2603.10863 translate read null
2026-03-11 OSUM-Pangu: An Open-Source Multidimension Speech Understanding Foundation Model Built upon OpenPangu on Ascend NPUs Yujie Liao et.al. 2603.10862 translate read null
2026-03-11 Towards Cold-Start Drafting and Continual Refining: A Value-Driven Memory Approach with Application to NPU Kernel Synthesis Yujie Zheng et.al. 2603.10846 translate read null
2026-03-11 PivotAttack: Rethinking the Search Trajectory in Hard-Label Text Attacks via Pivot Words Yuzhi Liang et.al. 2603.10842 translate read null
2026-03-11 Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation Thomas Thebaud et.al. 2603.10827 translate read null
2026-03-11 Nurture-First Agent Development: Building Domain-Expert AI Agents Through Conversational Knowledge Crystallization Linghao Zhang et.al. 2603.10808 translate read null
2026-03-11 Risk-Adjusted Harm Scoring for Automated Red Teaming for LLMs in Financial Services Fabrizio Dimino et.al. 2603.10807 translate read null
2026-03-11 Semantic Satellite Communications for Synchronized Audiovisual Reconstruction Fangyu Liu et.al. 2603.10791 translate read null
2026-03-11 Taking Shortcuts for Categorical VQA Using Super Neurons Pierre Musacchio et.al. 2603.10781 translate read null
2026-03-11 Large Language Models as Annotators for Machine Translation Quality Estimation Sidi Wang et.al. 2603.10775 translate read null
2026-03-11 Word Recovery in Large Language Models Enables Character-Level Tokenization Robustness Zhipeng Yang et.al. 2603.10771 translate read null
2026-03-11 mAceReason-Math: A Dataset of High-Quality Multilingual Math Problems Ready For RLVR Konstantin Dobler et.al. 2603.10767 translate read null
2026-03-11 CodePercept: Code-Grounded Visual STEM Perception for MLLMs Tongkun Guan et.al. 2603.10757 translate read null
2026-03-11 CacheSolidarity: Preventing Prefix Caching Side Channels in Multi-tenant LLM Serving Systems Panagiotis Georgios Pennas et.al. 2603.10726 translate read null
2026-03-11 UAV traffic scene understanding: A cross-spectral guided approach and a unified benchmark Yu Zhang et.al. 2603.10722 translate read null
2026-03-11 Prism-$Δ$: Differential Subspace Steering for Prompt Highlighting in Large Language Models Yuyao Ge et.al. 2603.10705 translate read null
2026-03-11 Breaking User-Centric Agency: A Tri-Party Framework for Agent-Based Recommendation Yaxin Gong et.al. 2603.10673 translate read null
2026-03-11 ESG Reporting Lifecycle Management with Large Language Models and AI Agents Thong Hoang et.al. 2603.10646 translate read null
2026-03-11 Making Bielik LLM Reason (Better): A Field Report Adam Trybus et.al. 2603.10640 translate read null
2026-03-11 Reinforcement Learning with Conditional Expectation Reward Changyi Xiao et.al. 2603.10624 translate read null
2026-03-11 Disentangling Similarity and Relatedness in Topic Models Hanlin Xiao et.al. 2603.10619 translate read null
2026-03-11 MUNIChus: Multilingual News Image Captioning Benchmark Yuji Chen et.al. 2603.10613 translate read null
2026-03-11 Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning Zhaowei Zhang et.al. 2603.10588 translate read null
2026-03-11 Distilling LLM Semantic Priors into Encoder-Only Multi-Talker ASR with Talker-Count Routing Hao Shi et.al. 2603.10587 translate read null
2026-03-11 End-to-End Chatbot Evaluation with Adaptive Reasoning and Uncertainty Filtering Nhi Dang et.al. 2603.10570 translate read null
2026-03-11 Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents Yuanhao Li et.al. 2603.10564 translate read null
2026-03-11 PET-F2I: A Comprehensive Benchmark and Parameter-Efficient Fine-Tuning of LLMs for PET/CT Report Impression Generation Yuchen Liu et.al. 2603.10560 translate read null
2026-03-11 Automatic End-to-End Data Integration using Large Language Models Aaron Steiner et.al. 2603.10547 translate read null
2026-03-11 Resource-constrained Amazons chess decision framework integrating large language models and graph attention Tianhao Qian et.al. 2603.10512 translate read null
2026-03-11 IMTBench: A Multi-Scenario Cross-Modal Collaborative Evaluation Benchmark for In-Image Machine Translation Jiahao Lyu et.al. 2603.10495 translate read null
2026-03-11 Human-AI Co-reasoning for Clinical Diagnosis with Evidence-Integrated Language Agent Zhongzhen Huang et.al. 2603.10492 translate read null
2026-03-11 PEEM: Prompt Engineering Evaluation Metrics for Interpretable Joint Evaluation of Prompts and Responses Minki Hong et.al. 2603.10477 translate read null
2026-03-11 Learning to Negotiate: Multi-Agent Deliberation for Collective Value Alignment in LLMs Panatchakorn Anantaprayoon et.al. 2603.10476 translate read null
2026-03-11 Aligning Large Language Models with Searcher Preferences Wei Wu et.al. 2603.10473 translate read null
2026-03-11 Fighting Hallucinations with Counterfactuals: Diffusion-Guided Perturbations for LVLM Hallucination Suppression Hamidreza Dastmalchi et.al. 2603.10470 translate read null
2026-03-11 The Curse and Blessing of Mean Bias in FP4-Quantized LLM Training Hengjie Cao et.al. 2603.10444 translate read null
2026-03-11 Designing Service Systems from Textual Evidence Ruicheng Ao et.al. 2603.10400 translate read null
2026-03-11 Verbalizing LLM’s Higher-order Uncertainty via Imprecise Probabilities Anita Yang et.al. 2603.10396 translate read null
2026-03-11 Don’t Let the Claw Grip Your Hand: A Security Analysis and Defense Framework for OpenClaw Zhengyang Shan et.al. 2603.10387 translate read null
2026-03-11 Speech Codec Probing from Semantic and Phonetic Perspectives Xuan Shi et.al. 2603.10371 translate read null
2026-03-11 GeoSense: Internalizing Geometric Necessity Perception for Multimodal Reasoning Ruiheng Liu et.al. 2603.10370 translate read null
2026-03-11 Utility Function is All You Need: LLM-based Congestion Control Neta Rozen-Schiff et.al. 2603.10357 translate read null
2026-03-11 S-HPLB: Efficient LLM Attention Serving via Sparsity-Aware Head Parallelism Load Balance Di Liu et.al. 2603.10353 translate read null
2026-03-11 Mitigating Translationese Bias in Multilingual LLM-as-a-Judge via Disentangled Information Bottleneck Hongbin Zhang et.al. 2603.10351 translate read null
2026-03-11 Multi-Modal Intelligent Channel Modeling: From Fine-tuned LLMs to Pre-trained Foundation Models Lu Bai et.al. 2603.10343 translate read null
2026-03-11 AgentServe: Algorithm-System Co-Design for Efficient Agentic AI Serving on a Consumer-Grade GPU Yuning Zhang et.al. 2603.10342 translate read null
2026-03-11 Large language models can disambiguate opioid slang on social media Kristy A. Carpenter et.al. 2603.10313 translate read null
2026-03-11 Is this Idea Novel? An Automated Benchmark for Judgment of Research Ideas Tim Schopf et.al. 2603.10303 translate read null
2026-03-11 Regime-aware financial volatility forecasting via in-context learning Saba Asaad et.al. 2603.10299 translate read null
2026-03-11 GaLoRA: Parameter-Efficient Graph-Aware LLMs for Node Classification Mayur Choudhary et.al. 2603.10298 translate read null
2026-03-11 Simulation-in-the-Reasoning (SiR): A Conceptual Framework for Empirically Grounded AI in Autonomous Transportation Wuping Xin et.al. 2603.10294 translate read null
2026-03-11 Conversational AI-Enhanced Exploration System to Query Large-Scale Digitised Collections of Natural History Museums Yiyuan Wang et.al. 2603.10285 translate read null
2026-03-10 SpecOps: A Fully Automated AI Agent Testing Framework in Real-World GUI Environments Syed Yusuf Ahmed et.al. 2603.10268 translate read null
2026-03-10 GR-SAP: Generative Replay for Safety Alignment Preservation during Fine-Tuning Zhouxiang Fang et.al. 2603.10243 translate read null
2026-03-10 S-GRADES – Studying Generalization of Student Response Assessments in Diverse Evaluative Settings Tasfia Seuti et.al. 2603.10233 translate read null
2026-03-10 Hierarchical Task Model Predictive Control for Sequential Mobile Manipulation Tasks Xintong Du et.al. 2603.10232 translate read null
2026-03-10 Paladin: A Policy Framework for Securing Cloud APIs by Combining Application Context with Generative AI Shriti Priya et.al. 2603.10228 translate read null
2026-03-10 Rethinking the Harmonic Loss via Non-Euclidean Distance Layers Maxwell Miller-Golub et.al. 2603.10225 translate read null
2026-03-10 Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models Eric Yocam et.al. 2603.10195 translate read null
2026-03-10 MCP-in-SoS: Risk assessment framework for open-source MCP servers Pratyay Kumar et.al. 2603.10194 translate read null
2026-03-10 Calibration-Reasoning Framework for Descriptive Speech Quality Assessment Elizaveta Kostenok et.al. 2603.10175 translate read null
2026-03-10 Omics Data Discovery Agents Alexandre Hutton et.al. 2603.10161 translate read null
2026-03-10 Social Knowledge for Cross-Domain User Preference Modeling Nir Lotan et.al. 2603.10148 translate read null
2026-03-10 Reason and Verify: A Framework for Faithful Retrieval-Augmented Generation Eeham Khan et.al. 2603.10143 translate read null
2026-03-10 The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory Romain Peyrichou et.al. 2603.10139 translate read null
2026-03-10 CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR Sijia Cui et.al. 2603.10101 translate read null
2026-03-10 Code-Space Response Oracles: Generating Interpretable Multi-Agent Policies with Large Language Models Daniel Hennes et.al. 2603.10098 translate read null
2026-03-10 Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference Fan Yang et.al. 2603.10091 translate read null
2026-03-10 ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping Zijian Zhu et.al. 2603.10088 translate read null
2026-03-10 Pooling Engram Conditional Memory in Large Language Models using CXL Ruiyang Ma et.al. 2603.10087 translate read null
2026-03-10 KernelSkill: A Multi-Agent Framework for GPU Kernel Optimization Qitong Sun et.al. 2603.10085 translate read null
2026-03-10 Amnesia: Adversarial Semantic Layer Specific Activation Steering in Large Language Models Ali Raza et.al. 2603.10080 translate read null
2026-03-10 Why LLMs Fail: A Failure Analysis and Partial Success Measurement for Automated Security Patch Generation Amir Al-Maamari et.al. 2603.10072 translate read null
2026-03-10 ADVERSA: Measuring Multi-Turn Guardrail Degradation and Judge Reliability in Large Language Models Harry Owiredu-Ashley et.al. 2603.10068 translate read null
2026-03-09 Training Language Models via Neural Cellular Automata Dan Lee et.al. 2603.10055 translate read null
2026-03-08 Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction Brian Freeman et.al. 2603.10047 translate read null
2026-03-10 Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People Jazmin Collins et.al. 2603.09964 translate read null
2026-03-10 Think Before You Lie: How Reasoning Improves Honesty Ann Yuan et.al. 2603.09957 translate read null
2026-03-10 Towards a Neural Debugger for Python Maximilian Beck et.al. 2603.09951 translate read null
2026-03-10 PathMem: Toward Cognition-Aligned Memory Transformation for Pathology MLLMs Jinyue Li et.al. 2603.09943 translate read null
2026-03-10 Model Merging in the Era of Large Language Models: Methods, Applications, and Future Directions Mingyang Song et.al. 2603.09938 translate read null
2026-03-10 WikiCLIP: An Efficient Contrastive Baseline for Open-domain Visual Entity Recognition Shan Ning et.al. 2603.09921 translate read null
2026-03-10 MSSR: Memory-Aware Adaptive Replay for Continual LLM Fine-Tuning Yiyang Lu et.al. 2603.09892 translate read null
2026-03-10 Influencing LLM Multi-Agent Dialogue via Policy-Parameterized Prompts Hongbo Bo et.al. 2603.09890 translate read null
2026-03-10 Benchmarking Political Persuasion Risks Across Frontier Large Language Models Zhongren Chen et.al. 2603.09884 translate read null
2026-03-10 Do What I Say: A Spoken Prompt Dataset for Instruction-Following Maike Züfle et.al. 2603.09881 translate read null
2026-03-10 InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing Changyao Tian et.al. 2603.09877 translate read null
2026-03-10 MissBench: Benchmarking Multimodal Affective Analysis under Imbalanced Missing Modalities Tien Anh Pham et.al. 2603.09874 translate read null
2026-03-10 GAST: Gradient-aligned Sparse Tuning of Large Language Models with Data-layer Selection Kai Yao et.al. 2603.09865 translate read null
2026-03-10 SCENEBench: An Audio Understanding Benchmark Grounded in Assistive and Industrial Use Cases Laya Iyer et.al. 2603.09853 translate read null
2026-03-10 RecThinker: An Agentic Framework for Tool-Augmented Reasoning in Recommendation Haobo Zhang et.al. 2603.09843 translate read null
2026-03-10 One-Eval: An Agentic System for Automated and Traceable LLM Evaluation Chengyu Shen et.al. 2603.09821 translate read null
2026-03-10 Good Reasoning Makes Good Demonstrations: Implicit Reasoning Quality Supervision via In-Context Reinforcement Learning Tiehua Mei et.al. 2603.09803 translate read null
2026-03-10 MITRA: An AI Assistant for Knowledge Retrieval in Physics Collaborations Abhishikth Mallampalli et.al. 2603.09800 translate read null
2026-03-10 Quantifying the Necessity of Chain of Thought through Opaque Serial Depth Jonah Brown-Cohen et.al. 2603.09786 translate read null
2026-03-10 LogoDiffuser: Training-Free Multilingual Logo Generation and Stylization via Letter-Aware Attention Control Mingyu Kang et.al. 2603.09759 translate read null
2026-03-10 Beyond Fine-Tuning: Robust Food Entity Linking under Ontology Drift with FoodOntoRAG Jan Drole et.al. 2603.09758 translate read null
2026-03-10 Epistemic Closure: Autonomous Mechanism Completion for Physically Consistent Simulation Yue Wua et.al. 2603.09756 translate read null
2026-03-10 Let’s Reward Step-by-Step: Step-Aware Contrastive Alignment for Vision-Language Navigation in Continuous Environments Haoyuan Li et.al. 2603.09740 translate read null
2026-03-10 FetalAgents: A Multi-Agent System for Fetal Ultrasound Image and Video Analysis Xiaotian Hu et.al. 2603.09733 translate read null
2026-03-10 EXPLORE-Bench: Egocentric Scene Prediction with Long-Horizon Reasoning Chengjun Yu et.al. 2603.09731 translate read null
2026-03-10 WVA: A Global Optimization Control Plane for llmd Abhishek Malvankar et.al. 2603.09730 translate read null
2026-03-10 RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation Sihong Wu et.al. 2603.09723 translate read null
2026-03-10 OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences Ming Wen et.al. 2603.09706 translate read null
2026-03-10 Evaluation of LLMs in retrieving food and nutritional context for RAG systems Maks Požarnik Vavken et.al. 2603.09704 translate read null
2026-03-10 An Empirical Study of Interaction Smells in Multi-Turn Human-LLM Collaborative Code Generation Binquan Zhang et.al. 2603.09701 translate read null
2026-03-10 ActiveUltraFeedback: Efficient Preference Data Generation using Active Learning Davit Melikidze et.al. 2603.09692 translate read null
2026-03-10 ESAinsTOD: A Unified End-to-End Schema-Aware Instruction-Tuning Framework for Task-Oriented Dialog Modeling Dechuan Teng et.al. 2603.09691 translate read null
2026-03-10 AutoViVQA: A Large-Scale Automatically Constructed Dataset for Vietnamese Visual Question Answering Nguyen Anh Tuong et.al. 2603.09689 translate read null
2026-03-10 Automatic Cardiac Risk Management Classification using large-context Electronic Patients Health Records Jacopo Vitale et.al. 2603.09685 translate read null
2026-03-10 EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages Aman Sharma et.al. 2603.09678 translate read null
2026-03-10 MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants Zuhao Zhang et.al. 2603.09652 translate read null
2026-03-10 Tracking Cancer Through Text: Longitudinal Extraction From Radiology Reports Using Open-Source Large Language Models Luc Builtjes et.al. 2603.09638 translate read null
2026-03-10 Grounding Synthetic Data Generation With Vision and Language Models Ümit Mert Çağlar et.al. 2603.09625 translate read null
2026-03-10 Compartmentalization-Aware Automated Program Repair Jia Hu et.al. 2603.09544 translate read null
2026-03-10 Dynamic Multimodal Expression Generation for LLM-Driven Pedagogical Agents: From User Experience Perspective Ninghao Wan et.al. 2603.09536 translate read null
2026-03-10 Enhancing Debunking Effectiveness through LLM-based Personality Adaptation Pietro Dell’Oglio et.al. 2603.09533 translate read null
2026-03-10 EmbC-Test: How to Speed Up Embedded Software Testing Using LLMs and RAG Maximilian Harnot et.al. 2603.09497 translate read null
2026-03-10 GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models Andrew Murray et.al. 2603.09481 translate read null
2026-03-10 CyberThreat-Eval: Can Large Language Models Automate Real-World Threat Research? Xiangsen Chen et.al. 2603.09452 translate read null
2026-03-10 AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems Athanasios Davvetas et.al. 2603.09435 translate read null
2026-03-10 Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs Saugata Purkayastha et.al. 2603.09434 translate read null
2026-03-10 Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health Trung Hieu Ngo et.al. 2603.09416 translate read null
2026-03-10 Quantifying and extending the coverage of spatial categorization data sets Wanchun Li et.al. 2603.09373 translate read null
2026-03-10 The Virtuous Cycle: AI-Powered Vector Search and Vector Search-Augmented AI Jiuqi Wei et.al. 2603.09347 translate read null
2026-03-10 TaSR-RAG: Taxonomy-guided Structured Reasoning for Retrieval-Augmented Generation Jiashuo Sun et.al. 2603.09341 translate read null
2026-03-10 Beyond Scaling: Assessing Strategic Reasoning and Rapid Decision-Making Capability of LLMs in Zero-sum Environments Yang Li et.al. 2603.09337 translate read null
2026-03-10 Can ChatGPT Generate Realistic Synthetic System Requirement Specifications? Results of a Case Study Alex R. Mattukat et.al. 2603.09335 translate read null
2026-03-10 OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in Multimodal Large Language Models Tengjin Weng et.al. 2603.09326 translate read null
2026-03-10 Curveball Steering: The Right Direction To Steer Isn’t Always Linear Shivam Raval et.al. 2603.09313 translate read null
2026-03-10 Investor risk profiles of large language models Hanyong Cho et.al. 2603.09303 translate read null
2026-03-10 Constructing a Portfolio Optimization Benchmark Framework for Evaluating Large Language Models Hanyong Cho et.al. 2603.09301 translate read null
2026-03-10 TA-Mem: Tool-Augmented Autonomous Memory Retrieval for LLM in Long-Term Conversational QA Mengwei Yuan et.al. 2603.09297 translate read null
2026-03-10 ToolRosetta: Bridging Open-Source Repositories and Large Language Model Agents through Automated Tool Standardization Shimin Di et.al. 2603.09290 translate read null
2026-03-10 Evoking User Memory: Personalizing LLM via Recollection-Familiarity Adaptive Retrieval Yingyi Zhang et.al. 2603.09250 translate read null
2026-03-10 Social-R1: Towards Human-like Social Reasoning in LLMs Jincenzi Wu et.al. 2603.09249 translate read null
2026-03-10 Cognitively Layered Data Synthesis for Domain Adaptation of LLMs to Space Situational Awareness Ding Linghu et.al. 2603.09231 translate read null
2026-03-10 TubeMLLM: A Foundation Model for Topology Knowledge Exploration in Vessel-like Anatomy Yaoyu Liu et.al. 2603.09217 translate read null
2026-03-10 PIM-SHERPA: Software Method for On-device LLM Inference by Resolving PIM Memory Attribute and Layout Inconsistencies Sunjung Lee et.al. 2603.09216 translate read null
2026-03-10 Acoustic and Semantic Modeling of Emotion in Spoken Language Soumya Dutta et.al. 2603.09212 translate read null
2026-03-10 MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data Zongxia Li et.al. 2603.09206 translate read null
2026-03-10 Emotion is Not Just a Label: Latent Emotional Factors in LLM Processing Benjamin Reichman et.al. 2603.09205 translate read null
2026-03-10 The Reasoning Trap – Logical Reasoning as a Mechanistic Pathway to Situational Awareness Subramanyam Sahoo et.al. 2603.09200 translate read null
2026-03-10 DEO: Training-Free Direct Embedding Optimization for Negation-Aware Retrieval Taegyeong Lee et.al. 2603.09185 translate read null
2026-03-10 Evaluating the Practical Effectiveness of LLM-Driven Index Tuning with Microsoft Database Tuning Advisor Xiaoying Wang et.al. 2603.09181 translate read null
2026-03-10 Point Cloud as a Foreign Language for Multi-modal Large Language Model Sneha Paul et.al. 2603.09173 translate read null
2026-03-10 Wrong Code, Right Structure: Learning Netlist Representations from Imperfect LLM-Generated RTL Siyang Cai et.al. 2603.09161 translate read null
2026-03-10 RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning Tzu-Heng Huang et.al. 2603.09160 translate read null
2026-03-10 Real-Time Trust Verification for Safe Agentic Actions using TrustBench Tavishi Sharma et.al. 2603.09157 translate read null
2026-03-10 Bioalignment: Measuring and Improving LLM Disposition Toward Biological Systems for AI Safety Trent R Northen et.al. 2603.09154 translate read null
2026-03-10 DataFactory: Collaborative Multi-Agent Framework for Advanced Table Question Answering Tong Wang et.al. 2603.09152 translate read null
2026-03-10 Deep Tabular Research via Continual Experience-Driven Execution Junnan Dong et.al. 2603.09151 translate read null
2026-03-10 QUSR: Quality-Aware and Uncertainty-Guided Image Super-Resolution Diffusion Model Junjie Yin et.al. 2603.09125 translate read null
2026-03-10 Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards Zhengzhao Ma et.al. 2603.09117 translate read null
2026-03-10 Progressive Representation Learning for Multimodal Sentiment Analysis with Incomplete Modalities Jindi Bao et.al. 2603.09111 translate read null
2026-03-10 VIVID-Med: LLM-Supervised Structured Pretraining for Deployable Medical ViTs Xiyao Wang et.al. 2603.09109 translate read null
2026-03-10 Composed Vision-Language Retrieval for Skin Cancer Case Search via Joint Alignment of Global and Local Representations Yuheng Wang et.al. 2603.09108 translate read null
2026-03-10 Class Model Generation from Requirements using Large Language Models Jackson Nguyen et.al. 2603.09100 translate read null
2026-03-10 Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs Kaiser Sun et.al. 2603.09095 translate read null
2026-03-10 Chain of Event-Centric Causal Thought for Physically Plausible Video Generation Zixuan Wang et.al. 2603.09094 translate read null
2026-03-10 Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting Alvaro Paredes Amorin et.al. 2603.09085 translate read null
2026-03-10 Learning Adaptive LLM Decoding Chloe H. Su et.al. 2603.09065 translate read null
2026-03-10 FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation Yinpeng Wu et.al. 2603.09046 translate read null
2026-03-09 Automating Detection and Root-Cause Analysis of Flaky Tests in Quantum Software Janakan Sivaloganathan et.al. 2603.09029 translate read null
2026-03-09 The Missing Memory Hierarchy: Demand Paging for LLM Context Windows Tony Mason et.al. 2603.09023 translate read null
2026-03-09 Meissa: Multi-modal Medical Agentic Intelligence Yixiong Chen et.al. 2603.09018 translate read null
2026-03-09 Learning When to Sample: Confidence-Aware Self-Consistency for Efficient LLM Chain-of-Thought Reasoning Juming Xiong et.al. 2603.08999 translate read null
2026-03-09 MAPLE: Elevating Medical Reasoning from Statistical Consensus to Process-Led Alignment Kailong Fan et.al. 2603.08987 translate read null
2026-03-09 GenAI Is No Silver Bullet for Qualitative Research in Software Engineering Neil A. Ernst et.al. 2603.08951 translate read null
2026-03-09 AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem Rui Liu et.al. 2603.08938 translate read null
2026-03-09 VoxEmo: Benchmarking Speech Emotion Recognition with Speech LLMs Hezhao Zhang et.al. 2603.08936 translate read null
2026-03-09 PathoScribe: Transforming Pathology Data into a Living Library with a Unified LLM-Driven Framework for Semantic Retrieval and Clinical Integration Abdul Rehman Akbar et.al. 2603.08935 translate read null
2026-03-09 MEGC2026: Micro-Expression Grand Challenge on Visual Question Answering Xinqi Fan et.al. 2603.08927 translate read null
2026-03-09 ConFu: Contemplate the Future for Better Speculative Sampling Zongyue Qin et.al. 2603.08899 translate read null
2026-03-09 A Decentralized Frontier AI Architecture Based on Personal Instances, Synthetic Data, and Collective Context Synchronization Jacek Małecki et.al. 2603.08893 translate read null
2026-03-09 LLM-Agent Interactions on Markets with Information Asymmetries Alexander Erlei et.al. 2603.08853 translate read null
2026-03-09 Investigating the Effects of LLM Use on Critical Thinking Under Time Constraints: Access Timing and Time Availability Jiayin Zhi et.al. 2603.08849 translate read null
2026-03-09 HMR-1: Hierarchical Massage Robot with Vision-Language-Model for Embodied Healthcare Rongtao Xu et.al. 2603.08817 translate read null
2026-03-09 Scale-Plan: Scalable Language-Enabled Task Planning for Heterogeneous Multi-Robot Teams Piyush Gupta et.al. 2603.08814 translate read null
2026-03-09 Large Language Model-Assisted Superconducting Qubit Experiments Shiheng Li et.al. 2603.08801 translate read null
2026-03-09 Granulon: Awakening Pixel-Level Visual Encoders with Adaptive Multi-Granularity Semantics for MLLM Junyuan Mao et.al. 2603.08800 translate read null
2026-03-09 Agentic Critical Training Weize Liu et.al. 2603.08706 translate read null
2026-03-09 Evaluating Financial Intelligence in Large Language Models: Benchmarking SuperInvesting AI with LLM Engines Akshay Gulati et.al. 2603.08704 translate read null
2026-03-09 UNBOX: Unveiling Black-box visual models with Natural-language Simone Carnemolla et.al. 2603.08639 translate read null
2026-03-09 Boosting MLLM Spatial Reasoning with Geometrically Referenced 3D Scene Representations Jiangye Yuan et.al. 2603.08592 translate read null
2026-03-09 RetroAgent: From Solving to Evolving via Retrospective Dual Intrinsic Feedback Xiaoying Zhang et.al. 2603.08561 translate read null
2026-03-09 SecAgent: Efficient Mobile GUI Agent with Semantic Context Yiping Xie et.al. 2603.08533 translate read null
2026-03-09 SCAFFOLD-CEGIS: Preventing Latent Security Degradation in LLM-Driven Iterative Code Refinement Yi Chen et.al. 2603.08520 translate read null
2026-03-09 AtomVLA: Scalable Post-Training for Robotic Manipulation via Predictive Latent World Models Xiaoquan Sun et.al. 2603.08519 translate read null
2026-03-09 Fanar-Sadiq: A Multi-Agent Architecture for Grounded Islamic QA Ummar Abbas et.al. 2603.08501 translate read null
2026-03-09 Visual Self-Fulfilling Alignment: Shaping Safety-Oriented Personas via Threat-Related Images Qishun Yang et.al. 2603.08486 translate read null
2026-03-09 Behavioral Generative Agents for Power Dispatch and Auction Shaoze Li et.al. 2603.08477 translate read null
2026-03-09 R2F: Repurposing Ray Frontiers for LLM-free Object Navigation Francesco Argenziano et.al. 2603.08475 translate read null
2026-03-09 LycheeCluster: Efficient Long-Context Inference with Structure-Aware Chunking and Hierarchical KV Indexing Dongfang Li et.al. 2603.08453 translate read null
2026-03-09 A prospective clinical feasibility study of a conversational diagnostic AI in an ambulatory primary care clinic Peter Brodeur et.al. 2603.08448 translate read null
2026-03-09 LLM-Driven Online Aggregation for Unstructured Text Analytics Chao Hui et.al. 2603.08443 translate read null
2026-03-09 Sandpiper: Orchestrated AI-Annotation for Educational Discourse at Scale Daryl Hedley et.al. 2603.08406 translate read null
2026-03-09 Revealing Behavioral Plasticity in Large Language Models: A Token-Conditional Perspective Liyuan Mao et.al. 2603.08398 translate read null
2026-03-09 COACH meets QUORUM: A Framework and Pipeline for Aligning User, Expert and Developer Perspectives in LLM-generated Health Counselling Yee Man Ng et.al. 2603.08392 translate read null
2026-03-09 AULLM++: Structural Reasoning with Large Language Models for Micro-Expression Recognition Zhishu Liu et.al. 2603.08387 translate read null
2026-03-09 M$^3$-ACE: Rectifying Visual Perception in Multimodal Math Reasoning via Multi-Agentic Context Engineering Peijin Xie et.al. 2603.08369 translate read null
2026-03-09 SPD-RAG: Sub-Agent Per Document Retrieval-Augmented Generation Yagiz Can Akay et.al. 2603.08329 translate read null
2026-03-09 Agentic Neurosymbolic Collaboration for Mathematical Discovery: A Case Study in Combinatorial Design Hai Xia et.al. 2603.08322 translate read null
2026-03-09 CORE-Acu: Structured Reasoning Traces and Knowledge Graph Safety Verification for Acupuncture Clinical Decision Support Liuyi Xu et.al. 2603.08321 translate read null
2026-03-09 AdaCultureSafe: Adaptive Cultural Safety Grounded by Cultural Knowledge in Large Language Models Hankun Kang et.al. 2603.08275 translate read null
2026-03-09 How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms JV Roig et.al. 2603.08274 translate read null
2026-03-09 Towards a more efficient bias detection in financial language models Firas Hadj Kacem et.al. 2603.08267 translate read null
2026-03-09 FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use Jiaxuan Lu et.al. 2603.08262 translate read null
2026-03-09 NCL-UoR at SemEval-2026 Task 5: Embedding-Based Methods, Fine-Tuning, and LLMs for Word Sense Plausibility Rating Tong Wu et.al. 2603.08256 translate read null
2026-03-09 Fibration Policy Optimization Chang Li et.al. 2603.08239 translate read null
2026-03-09 The Struggle Between Continuation and Refusal: A Mechanistic Analysis of the Continuation-Triggered Jailbreak in LLMs Yonghong Deng et.al. 2603.08234 translate read null
2026-03-09 Supporting Workflow Reproducibility by Linking Bioinformatics Tools across Papers and Executable Code Clémence Sebe et.al. 2603.08195 translate read null
2026-03-09 SERQ: Saliency-Aware Low-Rank Error Reconstruction for LLM Quantization Yeonsik Park et.al. 2603.08185 translate read null
2026-03-09 TildeOpen LLM: Leveraging Curriculum Learning to Achieve Equitable Language Representation Toms Bergmanis et.al. 2603.08182 translate read null
2026-03-09 AutoAdapt: An Automated Domain Adaptation Framework for LLMs Sidharth Sinha et.al. 2603.08181 translate read null
2026-03-09 MERLIN: Building Low-SNR Robust Multimodal LLMs for Electromagnetic Signals Junyu Shen et.al. 2603.08174 translate read null
2026-03-09 RexDrug: Reliable Multi-Drug Combination Extraction through Reasoning-Enhanced LLMs Zhijun Wang et.al. 2603.08166 translate read null
2026-03-09 The Differential Effects of Agreeableness and Extraversion on Older Adults’ Perceptions of Conversational AI Explanations in Assistive Settings Niharika Mathur et.al. 2603.08164 translate read null
2026-03-09 Gender Bias in MT for a Genderless Language: New Benchmarks for Basque Amaia Murillo et.al. 2603.08153 translate read null
2026-03-09 Gradually Excavating External Knowledge for Implicit Complex Question Answering Chang Liu et.al. 2603.08148 translate read null
2026-03-09 EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery Yougang Lyu et.al. 2603.08127 translate read null
2026-03-09 SAMoE-VLA: A Scene Adaptive Mixture-of-Experts Vision-Language-Action Model for Autonomous Driving Zihan You et.al. 2603.08113 translate read null
2026-03-09 Invisible Safety Threat: Malicious Finetuning for LLM via Steganography Guangnian Wan et.al. 2603.08104 translate read null
2026-03-09 Toward Robust LLM-Based Judges: Taxonomic Bias Evaluation and Debiasing Optimization Hongli Zhou et.al. 2603.08091 translate read null
2026-03-09 EAGLE-Pangu: Accelerator-Safe Tree Speculative Decoding on Ascend NPUs Chang Han et.al. 2603.08088 translate read null
2026-03-09 From Reactive to Map-Based AI: Tuned Local LLMs for Semantic Zone Inference in Object-Goal Navigation Yudai Noda et.al. 2603.08086 translate read null
2026-03-09 The AI Amplifier Effect: Defining Human-AI Intimacy and Romantic Relationships with Conversational AI Ching Christie Pang et.al. 2603.08084 translate read null
2026-03-09 High-Fidelity Pruning for Large Language Models Yijun Zhu et.al. 2603.08083 translate read null
2026-03-09 Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval Matei Benescu et.al. 2603.08077 translate read null
2026-03-09 Synthetic Defect Image Generation for Power Line Insulator Inspection Using Multimodal Large Language Models Xuesong Wang et.al. 2603.08069 translate read null
2026-03-09 In-Context Reinforcement Learning for Tool Use in Large Language Models Yaoqi Ye et.al. 2603.08068 translate read null
2026-03-09 Deterministic Differentiable Structured Pruning for Large Language Models Weiyu Huang et.al. 2603.08065 translate read null
2026-03-09 CinemaWorld: Generative Augmented Reality with LLMs and 3D Scene Generation for Movie Augmentation Keiichi Ihara et.al. 2603.08060 translate read null
2026-03-09 Stabilized Fine-Tuning with LoRA in Federated Learning: Mitigating the Side Effect of Client Size and Rank via the Scaling Factor Jiayu Huang et.al. 2603.08058 translate read null
2026-03-09 S2S-FDD: Bridging Industrial Time Series and Natural Language for Explainable Zero-shot Fault Diagnosis Baoxue Li et.al. 2603.08048 translate read null
2026-03-09 CDRRM: Contrast-Driven Rubric Generation for Reliable and Interpretable Reward Modeling Dengcan Liu et.al. 2603.08035 translate read null
2026-03-09 ConflictBench: Evaluating Human-AI Conflict via Interactive and Visually Grounded Environments Weixiang Zhao et.al. 2603.08024 translate read null
2026-03-09 Capacity-Aware Mixture Law Enables Efficient LLM Data Optimization Jingwei Li et.al. 2603.08022 translate read null
2026-03-09 Missing No More: Dictionary-Guided Cross-Modal Image Fusion under Missing Infrared Yafei Zhang et.al. 2603.08018 translate read null
2026-03-09 FedMomentum: Preserving LoRA Training Momentum in Federated Fine-Tuning Peishen Yan et.al. 2603.08014 translate read null
2026-03-09 PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents Yuxiang Chai et.al. 2603.08013 translate read null
2026-03-09 SmartThinker: Progressive Chain-of-Thought Length Calibration for Efficient Large Language Model Reasoning Chenzhi Hu et.al. 2603.08000 translate read null
2026-03-09 CMMR-VLN: Vision-and-Language Navigation via Continual Multimodal Memory Retrieval Haozhou Li et.al. 2603.07997 translate read null
2026-03-09 AutoTraces: Autoregressive Trajectory Forecasting via Multimodal Large Language Models Teng Wang et.al. 2603.07989 translate read null
2026-03-09 Adaptive Collaboration with Humans: Metacognitive Policy Optimization for Multi-Agent LLMs with Continual Learning Wei Yang et.al. 2603.07972 translate read null
2026-03-09 GOMA: Geometrically Optimal Mapping via Analytical Modeling for Spatial Accelerators Wulve Yang et.al. 2603.07962 translate read null
2026-03-09 SGG-R$^{\rm 3}$: From Next-Token Prediction to End-to-End Unbiased Scene Graph Generation Jiaye Feng et.al. 2603.07961 translate read null
2026-03-09 ELLMob: Event-Driven Human Mobility Generation with Self-Aligned LLM Framework Yusong Wang et.al. 2603.07946 translate read null
2026-03-09 AI Agents, Language, Deep Learning and the Next Revolution in Science Ke Li et.al. 2603.07940 translate read null
2026-03-09 Text to Automata Diagrams: Comparing TikZ Code Generation with Direct Image Synthesis Ethan Young et.al. 2603.07936 translate read null
2026-03-09 BRIDGE: Benchmark for multi-hop Reasoning In long multimodal Documents with Grounded Evidence Biao Xiang et.al. 2603.07931 translate read null
2026-03-09 SWE-Fuse: Empowering Software Agents via Issue-free Trajectory Learning and Entropy-aware RLVR Training Xin-Cheng Wen et.al. 2603.07927 translate read null
2026-03-09 LeJOT-AutoML: LLM-Driven Feature Engineering for Job Execution Time Prediction in Databricks Cost Optimization Lizhi Ma et.al. 2603.07897 translate read null
2026-03-09 Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference Noah Golowich et.al. 2603.07887 translate read null
2026-03-09 CCR-Bench: A Comprehensive Benchmark for Evaluating LLMs on Complex Constraints, Control Flows, and Real-World Cases Xiaona Xue et.al. 2603.07886 translate read null
2026-03-09 What Do AI Agents Talk About? Emergent Communication Structure in the First AI-Only Social Network Taksch Dube et.al. 2603.07880 translate read null
2026-03-09 Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision-Language Models Jeongwoo Lee et.al. 2603.07868 translate read null
2026-03-08 An Efficient and Effective Evaluator for Text2SQL Models on Unseen and Unlabeled Data Trinh Pham et.al. 2603.07841 translate read null
2026-03-08 AI Steerability 360: A Toolkit for Steering Large Language Models Erik Miehling et.al. 2603.07837 translate read null
2026-03-08 AI Misuse in Education Is a Measurement Problem: Toward a Learning Visibility Framework Eduardo Davalos et.al. 2603.07834 translate read null
2026-03-08 Benchmarking Large Language Models for Quebec Insurance: From Closed-Book to Retrieval-Augmented Generation David Beauchemin et.al. 2603.07825 translate read null
2026-03-08 Reasoning Knowledge-Gap in Drone Planning via LLM-based Active Elicitation Zeyu Fang et.al. 2603.07824 translate read null
2026-03-08 Temperature-Aware Scheduling of LLM Inference in Large-Scale Geo-Distributed Edge Data Centers with Distributed Optimization Arash Khalatbarisoltani et.al. 2603.07810 translate read null
2026-03-08 Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context Ashish Pandey et.al. 2603.07792 translate read null
2026-03-08 ArcLight: A Lightweight LLM Inference Architecture for Many-Core CPUs Yuzhuang Xu et.al. 2603.07770 translate read null
2026-03-08 MedQ-Deg: A Multidimensional Benchmark for Evaluating MLLMs Across Medical Image Quality Degradations Jiyao Liu et.al. 2603.07769 translate read null
2026-03-08 QuadAI at SemEval-2026 Task 3: Ensemble Learning of Hybrid RoBERTa and LLMs for Dimensional Aspect-Based Sentiment Analysis A. J. W. de Vink et.al. 2603.07766 translate read null
2026-03-08 3ViewSense: Spatial and Mental Perspective Reasoning from Orthographic Views in Vision-Language Models Shaoxiong Zhan et.al. 2603.07751 translate read null
2026-03-02 Symbol-Equivariant Recurrent Reasoning Models Richard Freinschlag et.al. 2603.02193 translate read null
2026-03-02 Multi-Head Low-Rank Attention Songtao Liu et.al. 2603.02188 translate read null
2026-03-02 How Small Can 6G Reason? Scaling Tiny Language Models for AI-Native Networks Mohamed Amine Ferrag et.al. 2603.02156 translate read null
2026-03-02 Zero- and Few-Shot Named-Entity Recognition: Case Study and Dataset in the Crime Domain (CrimeNER) Miguel Lopez-Duran et.al. 2603.02150 translate read null
2026-03-02 LongRLVR: Long-Context Reinforcement Learning Requires Verifiable Context Rewards Guanzheng Chen et.al. 2603.02146 translate read null
2026-03-02 LLMs as Strategic Actors: Behavioral Alignment, Risk Calibration, and Argumentation Framing in Geopolitical Simulations Veronika Solopova et.al. 2603.02128 translate read null
2026-03-02 Pencil Puzzle Bench: A Benchmark for Multi-Step Verifiable Reasoning Justin Waugh et.al. 2603.02119 translate read null
2026-03-02 Recursive Think-Answer Process for LLMs and VLMs Byung-Kwan Lee et.al. 2603.02099 translate read null
2026-03-02 OmniRet: Efficient and High-Fidelity Omni Modality Retrieval Chuong Huynh et.al. 2603.02098 translate read null
2026-03-02 ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels Xiang Zheng et.al. 2603.02097 translate read null
2026-03-02 Adam Converges Without Any Modification On Update Rules Yushun Zhang et.al. 2603.02092 translate read null
2026-03-02 Learning from Synthetic Data Improves Multi-hop Reasoning Anmol Kabra et.al. 2603.02091 translate read null
2026-03-02 GenDB: The Next Generation of Query Processing – Synthesized, Not Engineered Jiale Lao et.al. 2603.02081 translate read null
2026-03-02 Trident: Adaptive Scheduling for Heterogeneous Multimodal Data Pipelines Ding Pan et.al. 2603.02075 translate read null
2026-03-02 Exploring Plan Space through Conversation: An Agentic Framework for LLM-Mediated Explanations in Planning Guilhem Fouilhé et.al. 2603.02070 translate read null
2026-03-02 Beyond Microservices: Testing Web-Scale RCA Methods on GPU-Driven LLM Workloads Dominik Scheinert et.al. 2603.02057 translate read null
2026-03-02 Expanding LLM Agent Boundaries with Strategy-Guided Exploration Andrew Szot et.al. 2603.02045 translate read null
2026-03-02 EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training Aleksei Dorkin et.al. 2603.02041 translate read null
2026-03-02 LAD-Drive: Bridging Language and Trajectory with Action-Aware Diffusion Transformers Fabian Schmidt et.al. 2603.02035 translate read null
2026-03-02 MetaRCA: A Generalizable Root Cause Analysis Framework for Cloud-Native Systems Powered by Meta Causal Knowledge Shuai Liang et.al. 2603.02032 translate read null
2026-03-02 Learning to Read Where to Look: Disease-Aware Vision-Language Pretraining for 3D CT Simon Ging et.al. 2603.02026 translate read null
2026-03-02 MMR-Life: Piecing Together Real-life Scenes for Multimodal Multi-image Reasoning Jiachun Li et.al. 2603.02024 translate read link
2026-03-02 CharacterFlywheel: Scaling Iterative Improvement of Engaging and Steerable LLMs in Production Yixin Nie et.al. 2603.01973 translate read null
2026-03-02 LiveCultureBench: a Multi-Agent, Multi-Cultural Benchmark for Large Language Models in Dynamic Social Simulations Viet-Thanh Pham et.al. 2603.01952 translate read null
2026-03-02 When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation Thibault Prouteau et.al. 2603.01945 translate read null
2026-03-02 Ignore All Previous Instructions: Jailbreaking as a de-escalatory peace building practise to resist LLM social media bots Huw Day et.al. 2603.01942 translate read null
2026-03-02 Real Money, Fake Models: Deceptive Model Claims in Shadow APIs Yage Zhang et.al. 2603.01919 translate read null
2026-03-02 AdaPonderLM: Gated Pondering Language Models with Token-Wise Adaptive Depth Shixiang Song et.al. 2603.01914 translate read null
2026-03-02 Efficient RLVR Training via Weighted Mutual Information Data Selection Xinyu Zhou et.al. 2603.01907 translate read null
2026-03-02 VietSuperSpeech: A Large-Scale Vietnamese Conversational Speech Dataset for ASR Fine-Tuning in Chatbot, Customer Support, and Call Center Applications Loan Do et.al. 2603.01894 translate read null
2026-03-02 KDFlow: A User-Friendly and Efficient Knowledge Distillation Framework for Large Language Models Songming Zhang et.al. 2603.01875 translate read null
2026-03-02 Let the Agent Search: Autonomous Exploration Beats Rigid Workflows in Temporal Question Answering Xufei Lv et.al. 2603.01853 translate read null
2026-03-02 Probing Materials Knowledge in LLMs: From Latent Embeddings to Reliable Predictions Vineeth Venugopal et.al. 2603.01834 translate read null
2026-03-02 OpenAutoNLU: Open Source AutoML Library for NLU Grigory Arshinov et.al. 2603.01824 translate read null
2026-03-02 Emerging Human-like Strategies for Semantic Memory Foraging in Large Language Models Eric Lacosse et.al. 2603.01822 translate read null
2026-03-02 Voices, Faces, and Feelings: Multi-modal Emotion-Cognition Captioning for Mental Health Understanding Zhiyuan Zhou et.al. 2603.01816 translate read null
2026-03-02 Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition Mingwei Liu et.al. 2603.01814 translate read null
2026-03-02 ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs Xunlei Chen et.al. 2603.01792 translate read null
2026-03-02 nchellwig at SemEval-2026 Task 3: Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis using Large Language Models Nils Constantin Hellwig et.al. 2603.01788 translate read null
2026-03-02 Co-Evolutionary Multi-Modal Alignment via Structured Adversarial Evolution Guoxin Shi et.al. 2603.01784 translate read null
2026-03-02 GAM-RAG: Gain-Adaptive Memory for Evolving Retrieval in Retrieval-Augmented Generation Yifan Wang et.al. 2603.01783 translate read null
2026-03-02 LLM-as-an-Annotator: Training Lightweight Models with LLM-Annotated Examples for Aspect Sentiment Tuple Prediction Nils Constantin Hellwig et.al. 2603.01778 translate read null
2026-03-02 FreeAct: Freeing Activations for LLM Quantization Xiaohao Liu et.al. 2603.01776 translate read null
2026-03-02 Beyond the Resumé: A Rubric-Aware Automatic Interview System for Information Elicitation Harry Stuart et.al. 2603.01775 translate read null
2026-03-02 AnnoABSA: A Web-Based Annotation Tool for Aspect-Based Sentiment Analysis with Retrieval-Augmented Suggestions Nils Constantin Hellwig et.al. 2603.01773 translate read null
2026-03-02 Bootstrapping Embeddings for Low Resource Languages Merve Basoz et.al. 2603.01732 translate read null
2026-03-02 Learning Domain-Aware Task Prompt Representations for Multi-Domain All-in-One Image Restoration Guanglu Dong et.al. 2603.01725 translate read null
2026-03-02 GMP: A Benchmark for Content Moderation under Co-occurring Violations and Dynamic Rules Houde Dong et.al. 2603.01724 translate read null
2026-03-02 Changes in Manuscript Length, Research Team Size, and International Collaboration in the Post-2022 Period: Evidence from PLOS ONE Yossi Ben-Zion et.al. 2603.01718 translate read null
2026-03-02 FT-Dojo: Towards Autonomous LLM Fine-Tuning with Language Agents Qizheng Li et.al. 2603.01712 translate read null
2026-03-02 Cross-modal Identity Mapping: Minimizing Information Loss in Modality Conversion via Reinforcement Learning Haonan Jia et.al. 2603.01696 translate read null
2026-03-02 Building a Strong Instruction Language Model for a Less-Resourced Language Domen Vreš et.al. 2603.01691 translate read null
2026-03-02 Surgical Post-Training: Cutting Errors, Keeping Knowledge Wenye Lin et.al. 2603.01683 translate read null
2026-03-02 CeProAgents: A Hierarchical Agents System for Automated Chemical Process Development Yuhang Yang et.al. 2603.01654 translate read null
2026-03-02 LexChronos: An Agentic Framework for Structured Event Timeline Extraction in Indian Jurisprudence Anka Chandrahas Tummepalli et.al. 2603.01651 translate read null
2026-03-02 Learning Structured Reasoning via Tractable Trajectory Control Po-Nien Kung et.al. 2603.01641 translate read null
2026-03-02 Learning to Draft: Adaptive Speculative Decoding with Reinforcement Learning Jiebin Zhang et.al. 2603.01639 translate read null
2026-03-02 Who Explains Privacy Policies to Me? Embodied and Textual LLM-Powered Privacy Assistants in Virtual Reality Vincent Freiberger et.al. 2603.01638 translate read null
2026-03-02 DriveCombo: Benchmarking Compositional Traffic Rule Reasoning in Autonomous Driving Enhui Ma et.al. 2603.01637 translate read null
2026-03-02 Assessing Crime Disclosure Patterns in a Large-Scale Cybercrime Forum Raphael Hoheisel et.al. 2603.01624 translate read null
2026-03-02 IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs Yubin Zhang et.al. 2603.01590 translate read null
2026-03-02 SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond Xiangyang Zhu et.al. 2603.01589 translate read null
2026-03-02 DualSentinel: A Lightweight Framework for Detecting Targeted Attacks in Black-box LLM via Dual Entropy Lull Pattern Xiaoyi Pang et.al. 2603.01574 translate read null
2026-03-02 Investigating Group Relative Policy Optimization for Diffusion Transformer based Text-to-Audio Generation Yi Gu et.al. 2603.01565 translate read null
2026-03-02 From Secure Agentic AI to Secure Agentic Web: Challenges, Threats, and Future Directions Zhihang Deng et.al. 2603.01564 translate read null
2026-03-02 LFPO: Likelihood-Free Policy Optimization for Masked Diffusion Models Chenxing Wei et.al. 2603.01563 translate read null
2026-03-02 RubricBench: Aligning Model-Generated Rubrics with Human Standards Qiyuan Zhang et.al. 2603.01562 translate read link
2026-03-02 Benchmarking LLM Summaries of Multimodal Clinical Time Series for Remote Monitoring Aditya Shukla et.al. 2603.01557 translate read null
2026-03-02 S5-HES Agent: Society 5.0-driven Agentic Framework to Democratize Smart Home Environment Simulation Akila Siriweera et.al. 2603.01554 translate read null
2026-03-02 Extracting Training Dialogue Data from Large Language Model based Task Bots Shuo Zhang et.al. 2603.01550 translate read null
2026-03-02 Training-Free Spatio-temporal Decoupled Reasoning Video Segmentation with Adaptive Object Memory Zhengtong Zhu et.al. 2603.01545 translate read null
2026-03-02 FATE: Closed-Loop Feasibility-Aware Task Generation with Active Repair for Physically Grounded Robotic Curricula Bingchuan Wei et.al. 2603.01505 translate read null
2026-03-02 GAC: Stabilizing Asynchronous RL Training for LLMs via Gradient Alignment Control Haofeng Xu et.al. 2603.01501 translate read null
2026-03-02 Towards Privacy-Preserving LLM Inference via Collaborative Obfuscation (Technical Report) Yu Lin et.al. 2603.01499 translate read null
2026-03-02 Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision Manisha Mukherjee et.al. 2603.01494 translate read null
2026-03-02 LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning Chang Yao et.al. 2603.01488 translate read null
2026-03-02 Harmonizing Dense and Sparse Signals in Multi-turn RL: Dual-Horizon Credit Assignment for Industrial Sales Agents Haojin Yang et.al. 2603.01481 translate read null
2026-03-02 SFCo-Nav: Efficient Zero-Shot Visual Language Navigation via Collaboration of Slow LLM and Fast Attributed Graph Alignment Chaoran Xiong et.al. 2603.01477 translate read null
2026-03-02 Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality Jiahan Chen et.al. 2603.01471 translate read null
2026-03-02 ProtRLSearch: A Multi-Round Multimodal Protein Search Agent with Large Language Models Trained via Reinforcement Learning Congying Liu et.al. 2603.01464 translate read null
2026-03-02 Production-Grade AI Coding System for Client-Side Development Ruihan Wang et.al. 2603.01460 translate read null
2026-03-02 From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents Niu Lian et.al. 2603.01455 translate read null
2026-03-02 VidDoS: Universal Denial-of-Service Attack on Video-based Large Language Models Duoxun Tang et.al. 2603.01454 translate read null
2026-03-02 Enhancing Persona Following at Decoding Time via Dynamic Importance Estimation for Role-Playing Agents Yuxin Liu et.al. 2603.01438 translate read null
2026-03-02 Decoding Answers Before Chain-of-Thought: Evidence from Pre-CoT Probes and Activation Steering Kyle Cox et.al. 2603.01437 translate read null
2026-03-02 LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval Jiajie Jin et.al. 2603.01425 translate read null
2026-03-02 Quantifying Conversational Reliability of Large Language Models under Multi-Turn Interaction Jiyoon Myung et.al. 2603.01423 translate read null
2026-03-02 SciDER: Scientific Data-centric End-to-end Researcher Ke Lin et.al. 2603.01421 translate read null
2026-03-02 ReFeed: Retrieval Feedback-Guided Dataset Construction for Style-Aware Query Rewriting Jiyoon Myung et.al. 2603.01417 translate read null
2026-03-02 Jailbreaking Embodied LLMs via Action-level Manipulation Xinyu Huang et.al. 2603.01414 translate read null
2026-03-02 When Humans Don’t Feel Like an Option: Contextual Factors That Shape When Older Adults Turn to Conversational AI for Emotional Support Mengqi Shi et.al. 2603.01413 translate read null
2026-03-02 GraphScout: Empowering Large Language Models with Intrinsic Exploration Ability for Agentic Graph Reasoning Yuchen Ying et.al. 2603.01410 translate read null
2026-03-02 MIST-RL: Mutation-based Incremental Suite Testing via Reinforcement Learning Sicheng Zhu et.al. 2603.01409 translate read null
2026-03-02 Token Reduction via Local and Global Contexts Optimization for Efficient Video Large Language Models Jinlong Li et.al. 2603.01400 translate read null
2026-03-02 Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification Guang Huang et.al. 2603.01399 translate read null
2026-03-02 Toward Graph-Tokenizing Large Language Models with Reconstructive Graph Instruction Tuning Zhongjian Zhang et.al. 2603.01385 translate read null
2026-03-02 3BASiL: An Algorithmic Framework for Sparse plus Low-Rank Compression of LLMs Mehdi Makni et.al. 2603.01376 translate read null
2026-03-02 Words & Weights: Streamlining Multi-Turn Interactions via Co-Adaptation Chenxing Wei et.al. 2603.01375 translate read null
2026-03-02 PanCanBench: A Comprehensive Benchmark for Evaluating Large Language Models in Pancreatic Oncology Yimin Zhao et.al. 2603.01343 translate read null
2026-03-02 Structural Hallucination in Large Language Models: A Network-Based Evaluation of Knowledge Organization and Citation Integrity Moses Boudourides et.al. 2603.01341 translate read null
2026-03-01 SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resolution Kang He et.al. 2603.01327 translate read null
2026-03-01 Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning Hamed Damirchi et.al. 2603.01326 translate read null
2026-03-01 Caught in a Mafia Romance: How Users Explore Intimate Roleplay and Narrative Exploration with Chatbots Julia Kieserman et.al. 2603.01319 translate read null
2026-03-01 Actor’s Note: Examining the Role of AI-Generated Questions in Character Journaling for Actor Training Sora Kang et.al. 2603.01314 translate read null
2026-03-01 Theoretical Perspectives on Data Quality and Synergistic Effects in Pre- and Post-Training Reasoning Models Adel Javanmard et.al. 2603.01293 translate read null
2026-03-01 JailNewsBench: Multi-Lingual and Regional Benchmark for Fake News Generation under Jailbreak Attacks Masahiro Kaneko et.al. 2603.01291 translate read null
2026-03-01 Individual Turing Test: A Case Study of LLM-based Simulation Using Longitudinal Personal Data Minghao Guo et.al. 2603.01289 translate read null
2026-03-01 Attention Smoothing Is All You Need For Unlearning Saleh Zare Zade et.al. 2603.01285 translate read null
2026-03-01 GlassMol: Interpretable Molecular Property Prediction with Concept Bottleneck Models Oscar Rivera et.al. 2603.01274 translate read null
2026-03-01 NeuroSCA: Neuro-Symbolic Constraint Abstraction for Smart Contract Hybrid Fuzzing Haochen Liang et.al. 2603.01272 translate read null
2026-03-01 MOSAIC: A Unified Platform for Cross-Paradigm Comparison and Evaluation of Homogeneous and Heterogeneous Multi-Agent RL, LLM, VLM, and Human Decision-Makers Abdulhamid M. Mousa et.al. 2603.01260 translate read null
2026-03-01 A Systematic Study of LLM-Based Architectures for Automated Patching Qingxiao Xu et.al. 2603.01257 translate read null
2026-03-01 Linking Knowledge to Care: Knowledge Graph-Augmented Medical Follow-Up Question Generation Liwen Sun et.al. 2603.01252 translate read null
2026-03-01 Defensive Refusal Bias: How Safety Alignment Fails Cyber Defenders David Campbell et.al. 2603.01246 translate read null
2026-03-01 Suffix-Constrained Greedy Search Algorithms for Causal Language Models Ayoub Hammal et.al. 2603.01243 translate read null
2026-03-01 Self-Anchoring Calibration Drift in Large Language Models: How Multi-Turn Conversations Reshape Model Confidence Harshavardhan et.al. 2603.01239 translate read null
2026-03-01 The Lattice Representation Hypothesis of Large Language Models Bo Xiong et.al. 2603.01227 translate read null
2026-03-01 Can Thinking Models Think to Detect Hateful Memes? Mohamed Bayan Kmainasi et.al. 2603.01225 translate read null
2026-03-01 Generative AI & Fictionality: How Novels Power Large Language Models Edwin Roland et.al. 2603.01220 translate read null
2026-03-01 Reasoning Boosts Opinion Alignment in LLMs Frédéric Berdoz et.al. 2603.01214 translate read null
2026-03-01 Can AI Agents Agree? Frédéric Berdoz et.al. 2603.01213 translate read null
2026-03-01 Token-level Data Selection for Safe LLM Fine-tuning Yanping Li et.al. 2603.01185 translate read null
2026-03-01 HAVEN: High-Bandwidth Flash Augmented Vector Engine for Large-Scale Approximate Nearest-Neighbor Search Acceleration Po-Kai Hsu et.al. 2603.01175 translate read null
2026-03-01 DEP: A Decentralized Large Language Model Evaluation Protocol Jianxiang Peng et.al. 2603.01167 translate read null
2026-03-01 Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic Hongyi Zhou et.al. 2603.01162 translate read null
2026-03-01 Semantic XPath: Structured Agentic Memory Access for Conversational AI Yifan Simon Liu et.al. 2603.01160 translate read null
2026-03-01 vEcho: A Paradigm Shift from Vulnerability Verification to Proactive Discovery with Large Language Models Mingcheng Jiang et.al. 2603.01154 translate read null
2026-03-01 ArtLLM: Generating Articulated Assets via 3D LLM Penghao Wang et.al. 2603.01142 translate read null
2026-03-01 FCN-LLM: Empower LLM for Brain Functional Connectivity Network Understanding via Graph-level Multi-task Instruction Tuning Xingcan Hu et.al. 2603.01135 translate read null
2026-03-01 MedCollab: Causal-Driven Multi-Agent Collaboration for Full-Cycle Clinical Diagnosis via IBIS-Structured Argumentation Yuqi Zhan et.al. 2603.01131 translate read null
2026-03-01 From Dialogue to Execution: Mixture-of-Agents Assisted Interactive Planning for Behavior Tree-Based Long-Horizon Robot Execution Kanata Suzuki et.al. 2603.01113 translate read null
2026-03-01 DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage Haowen Gao et.al. 2603.01106 translate read null
2026-03-01 Egocentric Co-Pilot: Web-Native Smart-Glasses Agents for Assistive Egocentric AI Sicheng Yang et.al. 2603.01104 translate read null
2026-03-01 Understanding LoRA as Knowledge Memory: An Empirical Analysis Seungju Back et.al. 2603.01097 translate read null
2026-03-01 Alien Science: Sampling Coherent but Cognitively Unavailable Research Directions from Idea Atoms Alejandro H. Artiles et.al. 2603.01092 translate read null
2026-03-01 CARD: Towards Conditional Design of Multi-agent Topological Structures Tongtong Wu et.al. 2603.01089 translate read null
2026-03-01 Beyond Global Similarity: Towards Fine-Grained, Multi-Condition Multimodal Retrieval Xuan Lu et.al. 2603.01082 translate read null
2026-03-01 How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning Xiangxiang Zhang et.al. 2603.01070 translate read null
2026-03-01 GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant Zhuokang Shen et.al. 2603.01059 translate read null
2026-03-01 MMCOMET: A Large-Scale Multimodal Commonsense Knowledge Graph for Contextual Reasoning Eileen Wang et.al. 2603.01055 translate read null
2026-03-01 CelloAI Benchmarks: Toward Repeatable Evaluation of AI Assistants Mohammad Atif et.al. 2603.01051 translate read null
2026-03-01 Silo-Bench: A Scalable Environment for Evaluating Distributed Coordination in Multi-Agent LLM Systems Yuzhe Zhang et.al. 2603.01045 translate read null
2026-03-01 Thoth: Mid-Training Bridges LLMs to Time Series Understanding Jiafeng Lin et.al. 2603.01042 translate read link
2026-03-01 One-Token Verification for Reasoning Correctness Estimation Zhan Zhuang et.al. 2603.01025 translate read null
2026-03-01 GeoMCP: A Trustworthy Framework for AI-Assisted Analytical Geotechnical Engineering Yared W. Bekele et.al. 2603.01022 translate read null
2026-03-01 FastCode: Fast and Cost-Efficient Code Understanding and Reasoning Zhonghang Li et.al. 2603.01012 translate read null
2026-03-01 CollabEval: Enhancing LLM-as-a-Judge via Multi-Agent Collaboration Yiyue Qian et.al. 2603.00993 translate read null
2026-03-01 Sustainable Code Generation Using Large Language Models: A Systematic Literature Review Sabiya Banu Masthan Ali et.al. 2603.00989 translate read null
2026-03-01 HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents Hongbo Jin et.al. 2603.00977 translate read null
2026-03-01 Stabilizing Policy Optimization via Logits Convexity Hongzhan Chen et.al. 2603.00963 translate read null
2026-03-01 S-VoCAL: A Dataset and Evaluation Framework for Inferring Speaking Voice Character Attributes in Literature Abigail Berthe-Pardo et.al. 2603.00958 translate read null
2026-03-01 Seeing Beyond 8bits: Subjective and Objective Quality Assessment of HDR-UGC Videos Shreshth Saini et.al. 2603.00938 translate read null
2026-03-01 Learning to Weigh Waste: A Physics-Informed Multimodal Fusion Framework and Large-Scale Dataset for Commercial and Industrial Applications Md. Adnanul Islam et.al. 2603.00931 translate read null
2026-03-01 Conformal Prediction for Risk-Controlled Medical Entity Extraction Across Clinical Domains Manil Shrestha et.al. 2603.00924 translate read null
2026-03-01 Hybrid Neural-LLM Pipeline for Morphological Glossing in Endangered Language Documentation: A Case Study of Jungar Tuvan Siyu Liang et.al. 2603.00923 translate read null
2026-03-01 DriveCode: Domain Specific Numerical Encoding for LLM-Based Autonomous Driving Zhiye Wang et.al. 2603.00919 translate read null
2026-03-01 Prompt Sensitivity and Answer Consistency of Small Open-Source Large Language Models on Clinical Question Answering: Implications for Low-Resource Healthcare Deployment Shravani Hariprasad et.al. 2603.00917 translate read null
2026-03-01 Curvature-Weighted Capacity Allocation: A Minimum Description Length Framework for Layer-Adaptive Large Language Model Optimization Theophilus Amaefuna et.al. 2603.00910 translate read null
2026-03-01 KVSlimmer: Theoretical Insights and Practical Optimizations for Asymmetric KV Merging Lianjun Liu et.al. 2603.00907 translate read null
2026-03-01 pySpatial: Generating 3D Visual Programs for Zero-Shot Spatial Reasoning Zhanpeng Luo et.al. 2603.00905 translate read null
2026-03-01 Detect Repair Verify for Securing LLM Generated Code: A Multi-Language Empirical Study Cheng Cheng et.al. 2603.00897 translate read null
2026-03-01 Evaluating AI Grading on Real-World Handwritten College Mathematics: A Large-Scale Study Toward a Benchmark Zhiqi Yu et.al. 2603.00895 translate read null
2026-03-01 CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning Xinyu Zhu et.al. 2603.00889 translate read null
2026-03-01 BioProAgent: Neuro-Symbolic Grounding for Constrained Scientific Planning Yuyang Liu et.al. 2603.00876 translate read null
2026-03-01 MC-Search: Evaluating and Enhancing Multimodal Agentic Search with Structured Long Reasoning Chains Xuying Ning et.al. 2603.00873 translate read null
2026-03-01 PARCER as an Operational Contract to Reduce Variance, Cost, and Risk in LLM Systems Elzo Brito dos Santos Filho et.al. 2603.00856 translate read null
2026-03-01 Tiny-Critic RAG: Empowering Agentic Fallback with Parameter-Efficient Small Language Models Yichao Wu et.al. 2603.00846 translate read null
