LLM - 2025-12

Publish Date Title Authors PDF Translate Read Code
2025-12-31 Constructing a Neuro-Symbolic Mathematician from First Principles Keqin Xie et.al. 2601.00125 translate read null
2025-12-31 Ask, Clarify, Optimize: Human-LLM Agent Collaboration for Smarter Inventory Control Yaqi Duan et.al. 2601.00121 translate read null
2025-12-31 CTMap: LLM-Enabled Connectivity-Aware Path Planning in Millimeter-Wave Digital Twin Networks Md Salik Parwez et.al. 2601.00110 translate read null
2025-12-31 Mortar: Evolving Mechanics for Automatic Game Design Muhammad U. Nasir et.al. 2601.00105 translate read null
2025-12-31 The Agentic Leash: Extracting Causal Feedback Fuzzy Cognitive Maps with LLMs Akash Kumar Panda et.al. 2601.00097 translate read null
2025-12-31 Universal Adaptive Constraint Propagation: Scaling Structured Inference for Large Language Models via Meta-Reinforcement Learning Ibne Farabi Shihab et.al. 2601.00095 translate read null
2025-12-31 Spatial4D-Bench: A Versatile 4D Spatial Intelligence Benchmark Pan Wang et.al. 2601.00092 translate read null
2025-12-31 Dynamic Bayesian Optimization Framework for Instruction Tuning in Partial Differential Equation Discovery Junqi Qu et.al. 2601.00088 translate read null
2025-12-31 RIMRULE: Improving Tool-Using Language Agents via MDL-Guided Rule Learning Xiang Gao et.al. 2601.00086 translate read null
2025-12-31 Vulcan: Instance-Optimal Systems Heuristics Through LLM-Driven Search Rohit Dwivedula et.al. 2512.25065 translate read null
2025-12-31 Many Minds from One Model: Bayesian Transformers for Population Intelligence Diji Yang et.al. 2512.25063 translate read null
2025-12-31 Context-aware LLM-based AI Agents for Human-centered Energy Management Systems in Smart Buildings Tianzhi He et.al. 2512.25055 translate read null
2025-12-31 MAMA-Memeia! Multi-Aspect Multi-Agent Collaboration for Depressive Symptoms Identification in Memes Siddhant Agarwal et.al. 2512.25015 translate read null
2025-12-31 Efficiently Estimating Data Efficiency for Language Model Fine-tuning Gyung Hyun Je et.al. 2512.24991 translate read null
2025-12-31 PhysTalk: Language-driven Real-time Physics in 3D Gaussian Scenes Luca Collorone et.al. 2512.24986 translate read null
2025-12-31 Large language models and the entropy of English Colin Scheibner et.al. 2512.24969 translate read null
2025-12-31 The Impact of LLMs on Online News Consumption and Production Hangcheng Zhao et.al. 2512.24968 translate read null
2025-12-31 AMAP Agentic Planning Technical Report Yulan Hu et.al. 2512.24957 translate read null
2025-12-31 CPJ: Explainable Agricultural Pest Diagnosis via Caption-Prompt-Judge with LLM-Judged Refinement Wentao Zhang et.al. 2512.24947 translate read null
2025-12-31 RAIR: A Rule-Aware Benchmark Uniting Challenging Long-Tail and Visual Salience Subset for E-commerce Relevance Assessment Chenji Lu et.al. 2512.24943 translate read null
2025-12-31 Iterative Deployment Improves Planning Skills in LLMs Augusto B. Corrêa et.al. 2512.24940 translate read null
2025-12-31 Vibe Coding, Interface Flattening Hongrui Jin et.al. 2512.24939 translate read null
2025-12-31 Adaptive Dependency-aware Prompt Optimization Framework for Multi-Step LLM Pipeline Minjun Zhao et.al. 2512.24933 translate read null
2025-12-31 FinMMDocR: Benchmarking Financial Multimodal Reasoning with Scenario Awareness, Document Understanding, and Multi-Step Computation Zichen Tang et.al. 2512.24903 translate read null
2025-12-31 Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements Yiming Liang et.al. 2512.24867 translate read null
2025-12-31 VLN-MME: Diagnosing MLLMs as Language-guided Visual Navigation agents Xunyi Zhao et.al. 2512.24851 translate read null
2025-12-31 GenZ: Foundational models as latent variable generators within traditional statistical models Marko Jojic et.al. 2512.24834 translate read null
2025-12-31 Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback Shulun Chen et.al. 2512.24818 translate read null
2025-12-31 LeanCat: A Benchmark Suite for Formal Category Theory in Lean (Part I: 1-Categories) Rongge Xu et.al. 2512.24796 translate read null
2025-12-31 Compute-Accuracy Pareto Frontiers for Open-Source Reasoning Large Language Models Ákos Prucs et.al. 2512.24776 translate read null
2025-12-31 Analyzing Communication Predictability in LLM Training Wenxue Li et.al. 2512.24750 translate read null
2025-12-31 BIOME-Bench: A Benchmark for Biomolecular Interaction Inference and Multi-Omics Pathway Mechanism Elucidation from Scientific Literature Sibo Wei et.al. 2512.24733 translate read null
2025-12-31 FPGA Co-Design for Efficient N:M Sparse and Quantized Model Inference Fen-Yu Hsieh et.al. 2512.24713 translate read null
2025-12-31 MEIC-DT: Memory-Efficient Incremental Clustering for Long-Text Coreference Resolution with Dual-Threshold Constraints Kangyang Luo et.al. 2512.24711 translate read null
2025-12-31 MUSIC: MUlti-Step Instruction Contrast for Multi-Turn Reward Models Wenzhe Li et.al. 2512.24693 translate read null
2025-12-31 Quantum Visual Word Sense Disambiguation: Unraveling Ambiguities Through Quantum Inference Model Wenbo Qiao et.al. 2512.24687 translate read null
2025-12-31 BatteryAgent: Synergizing Physics-Informed Interpretation with LLM Reasoning for Intelligent Battery Fault Diagnosis Songqi Zhou et.al. 2512.24686 translate read null
2025-12-31 Do Large Language Models Know What They Are Capable Of? Casey O. Barkan et.al. 2512.24661 translate read null
2025-12-31 DynaFix: Iterative Automated Program Repair Driven by Execution-Level Dynamic Information Zhili Huang et.al. 2512.24635 translate read null
2025-12-31 How Do Agentic AI Systems Address Performance Optimizations? A BERTopic-Based Analysis of Pull Requests Md Nahidul Islam Opu et.al. 2512.24630 translate read null
2025-12-31 Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Junru Lu et.al. 2512.24618 translate read link
2025-12-31 Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Xingwei Qu et.al. 2512.24617 translate read null
2025-12-31 Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Yuchen Shi et.al. 2512.24615 translate read link
2025-12-31 Chat-Driven Optimal Management for Virtual Network Services Yuya Miyaoka et.al. 2512.24614 translate read null
2025-12-31 Group Deliberation Oriented Multi-Agent Conversational Model for Complex Reasoning Zheyu Shi et.al. 2512.24613 translate read null
2025-12-31 Reinforcement Learning-Augmented LLM Agents for Collaborative Decision Making and Performance Optimization Dong Qiu et.al. 2512.24609 translate read null
2025-12-31 Recursive Language Models Alex L. Zhang et.al. 2512.24601 translate read link
2025-12-31 A Tale of 1001 LoC: Potential Runtime Error-Guided Specification Synthesis for Verifying Large-Scale Programs Zhongyi Wang et.al. 2512.24594 translate read null
2025-12-31 Improving Few-Shot Change Detection Visual Question Answering via Decision-Ambiguity-guided Reinforcement Fine-Tuning Fuyu Dong et.al. 2512.24591 translate read null
2025-12-31 MultiRisk: Multiple Risk Control via Iterative Score Thresholding Sunay Joshi et.al. 2512.24587 translate read null
2025-12-31 Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time Zhenyu Zhang et.al. 2512.24574 translate read null
2025-12-31 SynRAG: A Large Language Model Framework for Executable Query Generation in Heterogeneous SIEM System Md Hasan Saju et.al. 2512.24571 translate read null
2025-12-31 On the Effectiveness of Training Data Optimization for LLM-based Code Generation: An Empirical Study Shiqi Kuang et.al. 2512.24570 translate read null
2025-12-31 MCPAgentBench: A Real-world Task Benchmark for Evaluating LLM Agent MCP Tool Use Wenrui Liu et.al. 2512.24565 translate read null
2025-12-31 HaluNet: Multi-Granular Uncertainty Modeling for Efficient Hallucination Detection in LLM Question Answering Chaodong Tong et.al. 2512.24562 translate read null
2025-12-31 Localized Calibrated Uncertainty in Code Language Models David Gros et.al. 2512.24560 translate read null
2025-12-31 Safe in the Future, Dangerous in the Past: Dissecting Temporal and Linguistic Vulnerabilities in LLMs Muhammad Abdullahi Said et.al. 2512.24556 translate read null
2025-12-31 More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization Yuma Ichikawa et.al. 2512.24545 translate read null
2025-12-31 From Building Blocks to Planning: Multi-Step Spatial Reasoning in LLMs with Reinforcement Learning Amir Tahmasbi et.al. 2512.24532 translate read null
2025-12-31 Generative AI-enhanced Sector-based Investment Portfolio Construction Alina Voronina et.al. 2512.24526 translate read null
2025-12-30 Using Large Language Models To Translate Machine Results To Human Results Trishna Niraula et.al. 2512.24518 translate read null
2025-12-30 Paragraph Segmentation Revisited: Towards a Standard Task for Structuring Speech Fabian Retkowski et.al. 2512.24517 translate read null
2025-12-30 Evaluating the Reasoning Abilities of LLMs on Underrepresented Mathematics Competition Problems Samuel Golladay et.al. 2512.24505 translate read null
2025-12-30 HOLOGRAPH: Active Causal Discovery via Sheaf-Theoretic Alignment of Large Language Model Priors Hyunjun Kim et.al. 2512.24478 translate read null
2025-12-30 PackKV: Reducing KV Cache Memory Footprint through LLM-Aware Lossy Compression Bo Jiang et.al. 2512.24449 translate read null
2025-12-30 Towards mechanistic understanding in a data-driven weather model: internal activations reveal interpretable physical features Theodore MacMillan et.al. 2512.24440 translate read null
2025-12-30 Comparing Approaches to Automatic Summarization in Less-Resourced Languages Chester Palen-Michel et.al. 2512.24410 translate read null
2025-12-30 World model inspired sarcasm reasoning with large language model agents Keito Inoshita et.al. 2512.24329 translate read null
2025-12-30 QianfanHuijin Technical Report: A Novel Multi-Stage Training Paradigm for Finance Industrial LLMs Shupeng Li et.al. 2512.24314 translate read null
2025-12-30 Automated Analysis of Sustainability Reports: Using Large Language Models for the Extraction and Prediction of EU Taxonomy-Compliant KPIs Jonathan Schmoll et.al. 2512.24289 translate read null
2025-12-30 Taming Hallucinations: Boosting MLLMs’ Video Understanding via Counterfactual Video Generation Zhe Huang et.al. 2512.24271 translate read link
2025-12-30 RAGPart & RAGMask: Retrieval-Stage Defenses Against Corpus Poisoning in Retrieval-Augmented Generation Pankayaraj Pathmanathan et.al. 2512.24268 translate read null
2025-12-30 Joint Selection for Large-Scale Pre-Training Data via Policy Gradient-based Mask Learning Ziqing Fan et.al. 2512.24265 translate read null
2025-12-30 GPT-like transformer model for silicon tracking detector simulation Tadej Novak et.al. 2512.24254 translate read null
2025-12-30 MedKGI: Iterative Differential Diagnosis with Medical Knowledge Graphs and Information-Guided Inquiring Qipeng Wang et.al. 2512.24181 translate read null
2025-12-30 DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models Zefeng He et.al. 2512.24165 translate read link
2025-12-30 Training Report of TeleChat3-MoE Xinzhang Liu et.al. 2512.24157 translate read null
2025-12-30 Large Emotional World Model Changhao Song et.al. 2512.24149 translate read null
2025-12-30 Activation Steering for Masked Diffusion Language Models Adi Shnaidman et.al. 2512.24143 translate read null
2025-12-30 Bridging Visual Intuition and Chemical Expertise: An Autonomous Analysis Framework for Nonadiabatic Dynamics Simulations via Mentor-Engineer-Student Collaboration Yifei Zhu et.al. 2512.24133 translate read null
2025-12-30 OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization Advait Gadhikar et.al. 2512.24124 translate read null
2025-12-30 Enhancing LLM-Based Neural Network Generation: Few-Shot Prompting and Efficient Validation for Automated Architecture Design Chandini Vysyaraju et.al. 2512.24120 translate read null
2025-12-30 CogRec: A Cognitive Recommender Agent Fusing Large Language Models and Soar for Explainable Recommendation Jiaxin Hu et.al. 2512.24113 translate read null
2025-12-30 Training a Huggingface Model on AWS Sagemaker (Without Tears) Liling Tan et.al. 2512.24098 translate read null
2025-12-30 LoongFlow: Directed Evolutionary Search via a Cognitive Plan-Execute-Summarize Paradigm Chunhui Wan et.al. 2512.24077 translate read link
2025-12-30 How and Why LLMs Generalize: A Fine-Grained Analysis of LLM Reasoning from Cognitive Behaviors to Low-Level Patterns Haoyue Bai et.al. 2512.24063 translate read null
2025-12-30 Beyond Hallucinations: A Composite Score for Measuring Reliability in Open-Source Large Language Models Rohit Kumar Salla et.al. 2512.24058 translate read null
2025-12-30 Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race? Yuan Xin et.al. 2512.24044 translate read null
2025-12-30 ROAD: Reflective Optimization via Automated Debugging for Zero-Shot Agent Alignment Natchaya Temyingyong et.al. 2512.24040 translate read null
2025-12-30 RSAgent: Learning to Reason and Act for Text-Guided Segmentation via Multi-Turn Tool Invocations Xingqi He et.al. 2512.24023 translate read null
2025-12-30 FUSE-RSVLM: Feature Fusion Vision-Language Model for Remote Sensing Yunkai Dang et.al. 2512.24022 translate read link
2025-12-30 iCLP: Large Language Model Reasoning with Implicit Cognition Latent Planning Sijia Chen et.al. 2512.24014 translate read null
2025-12-30 SPARK: Search Personalization via Agent-Driven Retrieval and Knowledge-sharing Gaurab Chhetri et.al. 2512.24008 translate read null
2025-12-30 RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress Ruixuan Huang et.al. 2512.23995 translate read null
2025-12-30 Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process Zhenyu Zhang et.al. 2512.23988 translate read null
2025-12-30 Coding With AI: From a Reflection on Industrial Practices to Future Computer Science and Software Engineering Education Hung-Fu Chang et.al. 2512.23982 translate read null
2025-12-30 Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling Chulun Zhou et.al. 2512.23959 translate read link
2025-12-30 A Proof-of-Concept for Explainable Disease Diagnosis Using Large Language Models and Answer Set Programming Ioanna Gemou et.al. 2512.23932 translate read null
2025-12-29 Scaling Remote Sensing Foundation Models: Data Domain Tradeoffs at the Peta-Scale Charith Wickrema et.al. 2512.23903 translate read null
2025-12-29 How Large Language Models Systematically Misrepresent American Climate Opinions Sola Kim et.al. 2512.23889 translate read null
2025-12-29 Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack Roee Ziv et.al. 2512.23881 translate read null
2025-12-29 CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution Xu Huang et.al. 2512.23880 translate read null
2025-12-29 Seeking Late Night Life Lines: Experiences of Conversational AI Use in Mental Health Crisis Leah Hope Ajmani et.al. 2512.23859 translate read null
2025-12-29 Integrating Domain Knowledge for Financial QA: A Multi-Retriever RAG Approach with LLMs Yukun Zhang et.al. 2512.23848 translate read null
2025-12-29 A Test of Lookahead Bias in LLM Forecasts Zhenyu Gao et.al. 2512.23847 translate read null
2025-12-29 From Correctness to Collaboration: Toward a Human-Centered Framework for Evaluating AI Agent Behavior in Software Engineering Tao Dong et.al. 2512.23844 translate read null
2025-12-29 Retrieval Augmented Question Answering: When Should LLMs Admit Ignorance? Dingmin Wang et.al. 2512.23836 translate read null
2025-12-29 Prompt-Induced Over-Generation as Denial-of-Service: A Black-Box Attack-Side Benchmark Manu et.al. 2512.23779 translate read null
2025-12-29 Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning Tiancheng Su et.al. 2512.23765 translate read null
2025-12-28 Audited Skill-Graph Self-Improvement for Agentic LLMs via Verifiable Rewards, Experience Synthesis, and Continual Memory Ken Huang et.al. 2512.23760 translate read null
2025-12-29 Eliciting Behaviors in Multi-Turn Conversations Jing Huang et.al. 2512.23701 translate read null
2025-12-29 Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing Panagiotis Theocharopoulos et.al. 2512.23684 translate read null
2025-12-29 Web World Models Jichen Feng et.al. 2512.23676 translate read link
2025-12-29 OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding Keda Tao et.al. 2512.23646 translate read null
2025-12-29 BOAD: Discovering Hierarchical Software Engineering Agents via Bandit Optimization Iris Xu et.al. 2512.23631 translate read link
2025-12-29 Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing Yuwen Li et.al. 2512.23611 translate read null
2025-12-29 The Big Three in Marriage Talk: LLM-Assisted Analysis of Moral Ethics and Sentiment on Weibo and Xiaohongshu Frank Tian-Fang Ye et.al. 2512.23609 translate read null
2025-12-29 Divergent-Convergent Thinking in Large Language Models for Creative Problem Generation Manh Hung Nguyen et.al. 2512.23601 translate read null
2025-12-29 Can AI Recognize Its Own Reflection? Self-Detection Performance of LLMs in Computing Education Christopher Burger et.al. 2512.23587 translate read null
2025-12-29 Instruction-Following Evaluation of Large Vision-Language Models Daiki Shiono et.al. 2512.23572 translate read null
2025-12-29 ThinkGen: Generalized Thinking for Visual Generation Siyu Jiao et.al. 2512.23568 translate read link
2025-12-29 RxnBench: A Multimodal Benchmark for Evaluating Large Language Models on Chemical Reaction Understanding from Scientific Literature Hanzheng Li et.al. 2512.23565 translate read null
2025-12-29 Toward Trustworthy Agentic AI: A Multimodal Framework for Preventing Prompt Injection Attacks Toqeer Ali Syed et.al. 2512.23557 translate read null
2025-12-29 Trustworthy Machine Learning under Distribution Shifts Zhuo Huang et.al. 2512.23524 translate read null
2025-12-29 Single LLM Debate, MoLaCE: Mixture of Latent Concept Experts Against Confirmation Bias Hazel Kim et.al. 2512.23518 translate read null
2025-12-29 Alpha-R1: Alpha Screening with LLM Reasoning via Reinforcement Learning Zuoyou Jiang et.al. 2512.23515 translate read link
2025-12-29 Beyond Correctness: Exposing LLM-generated Logical Flaws in Reasoning via Multi-step Automated Theorem Proving Xinyi Zheng et.al. 2512.23511 translate read null
2025-12-29 Hierarchical Decision Mamba Meets Agentic AI: A Novel Approach for RAN Slicing in 6G Md Arafat Habib et.al. 2512.23502 translate read null
2025-12-29 The Gaining Paths to Investment Success: Information-Driven LLM Graph Reasoning for Venture Capital Prediction Haoyu Pei et.al. 2512.23489 translate read null
2025-12-29 Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation Toqeer Ali Syed et.al. 2512.23480 translate read null
2025-12-29 Semantic Tree Inference on Text Corpa using a Nested Density Approach together with Large Language Model Embeddings Thomas Haschka et.al. 2512.23471 translate read null
2025-12-29 Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance Zhuo Li et.al. 2512.23461 translate read null
2025-12-29 Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following Kongcheng Zhang et.al. 2512.23457 translate read null
2025-12-29 ClinDEF: A Dynamic Evaluation Framework for Large Language Models in Clinical Reasoning Yuqi Tang et.al. 2512.23440 translate read null
2025-12-29 C2PO: Diagnosing and Disentangling Bias Shortcuts in LLMs Xuan Feng et.al. 2512.23430 translate read null
2025-12-29 Entropy-Guided Token Dropout: Training Autoregressive Language Models with Limited Domain Data Jiapeng Wang et.al. 2512.23422 translate read null
2025-12-29 MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning Jiawei Chen et.al. 2512.23412 translate read null
2025-12-29 Theoretical Foundations of Scaling Law in Familial Models Huan Song et.al. 2512.23407 translate read null
2025-12-29 Securing the AI Supply Chain: What Can We Learn From Developer-Reported Security Issues and Solutions of AI Projects? The Anh Nguyen et.al. 2512.23385 translate read null
2025-12-29 A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers Mohammad Nasirzadeh et.al. 2512.23380 translate read link
2025-12-29 Post-Training Quantization of OpenPangu Models for Efficient Deployment on Atlas A2 Yilun Luo et.al. 2512.23367 translate read null
2025-12-29 SpatialMosaic: A Multiview VLM Dataset for Partial Visibility Kanghee Lee et.al. 2512.23365 translate read null
2025-12-29 A Stepwise-Enhanced Reasoning Framework for Large Language Models Based on External Subgraph Generation Xin Zhang et.al. 2512.23356 translate read null
2025-12-29 The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models Dakuan Lu et.al. 2512.23340 translate read null
2025-12-29 CubeBench: Diagnosing Interactive, Long-Horizon Spatial Reasoning Under Partial Observations Huan-ang Gao et.al. 2512.23328 translate read null
2025-12-29 Flexible Keyword-Aware Top-$k$ Route Search Ziqiang Yu et.al. 2512.23319 translate read null
2025-12-29 Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL Abolfazl Younesi et.al. 2512.23310 translate read null
2025-12-29 MedGemma vs GPT-4: Open-Source and Proprietary Zero-shot Medical Disease Classification from Images Md. Sazzadul Islam Prottasha et.al. 2512.23304 translate read null
2025-12-29 AI4Reading: Chinese Audiobook Interpretation System Based on Multi-Agent Collaboration Minjiang Huang et.al. 2512.23300 translate read null
2025-12-29 Agentic AI-Enhanced Semantic Communications: Foundations, Architecture, and Applications Haixiao Gao et.al. 2512.23294 translate read null
2025-12-29 Chinese Morph Resolution in E-commerce Live Streaming Scenarios Jiahao Zhu et.al. 2512.23280 translate read null
2025-12-29 Interpretable Safety Alignment via SAE-Constructed Low-Rank Subspace Adaptation Dianyun Wang et.al. 2512.23260 translate read null
2025-12-29 Multimodal Interpretation of Remote Sensing Images: Dynamic Resolution Input Strategy and Multi-scale Vision-Language Alignment Mechanism Siyu Zhang et.al. 2512.23243 translate read null
2025-12-29 Anomaly Detection by Effectively Leveraging Synthetic Images Sungho Kang et.al. 2512.23227 translate read null
2025-12-29 Bridging Your Imagination with Audio-Video Generation via a Unified Director Jiaxu Zhang et.al. 2512.23222 translate read null
2025-12-29 MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios? Shiqi Dai et.al. 2512.23219 translate read null
2025-12-29 TCEval: Using Thermal Comfort to Assess Cognitive and Perceptual Abilities of AI Jingming Li et.al. 2512.23217 translate read null
2025-12-29 Anka: A Domain-Specific Language for Reliable LLM Code Generation Saif Khalfan Saif Al Mazrouei et.al. 2512.23214 translate read null
2025-12-29 Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process Zhijun Chen et.al. 2512.23213 translate read null
2025-12-29 Not too long do read: Evaluating LLM-generated extreme scientific summaries Zhuoqi Lyu et.al. 2512.23206 translate read null
2025-12-29 From Model Choice to Model Belief: Establishing a New Measure for LLM-Based Research Hongshen Sun et.al. 2512.23184 translate read null
2025-12-29 EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion Zhen Liang et.al. 2512.23173 translate read null
2025-12-29 REVEALER: Reinforcement-Guided Visual Reasoning for Element-Level Text-Image Alignment Evaluation Fulin Shi et.al. 2512.23169 translate read null
2025-12-29 SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search Yifan Zhang et.al. 2512.23167 translate read null
2025-12-29 Reservoir Computing inspired Matrix Multiplication-free Language Model Takumi Shiratsuchi et.al. 2512.23145 translate read null
2025-12-29 Understanding EFL Learners’ Code-Switching and Teachers’ Pedagogical Approaches in LLM-Supported Speaking Practice Junyeong Park et.al. 2512.23136 translate read null
2025-12-29 It’s a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents Karolina Korgul et.al. 2512.23128 translate read null
2025-12-29 InSPO: Unlocking Intrinsic Self-Reflection for LLM Preference Optimization Yu Li et.al. 2512.23126 translate read null
2025-12-28 A Note on Hybrid Online Reinforcement and Imitation Learning for LLMs: Formulations and Algorithms Yingru Li et.al. 2512.23097 translate read null
2025-12-28 Benchmark Success, Clinical Failure: When Reinforcement Learning Optimizes for Benchmarks, Not Patients Armin Berger et.al. 2512.23090 translate read null
2025-12-28 Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning Yingru Li et.al. 2512.23087 translate read null
2025-12-28 Trust Region Masking for Long-Horizon LLM Reinforcement Learning Yingru Li et.al. 2512.23075 translate read null
2025-12-28 Accelerating Language Model Workflows with Prompt Choreography TJ Bai et.al. 2512.23049 translate read null
2025-12-28 Problems With Large Language Models for Learner Modelling: Why LLMs Alone Fall Short for Responsible Tutoring in K–12 Education Danial Hooshyar et.al. 2512.23036 translate read null
2025-12-28 Viability and Performance of a Private LLM Server for SMBs: A Benchmark Analysis of Qwen3-30B on Consumer-Grade Hardware Alex Khalil et.al. 2512.23029 translate read null
2025-12-28 With Great Context Comes Great Prediction Power: Classifying Objects via Geo-Semantic Scene Graphs Ciprian Constantinescu et.al. 2512.23024 translate read null
2025-12-28 Merge before Forget: A Single LoRA Continual Learning via Continual Merging Fuli Qiao et.al. 2512.23017 translate read null
2025-12-28 Improving Generalization in LLM Structured Pruning via Function-Aware Neuron Grouping Tao Yu et.al. 2512.23014 translate read null
2025-12-28 Masgent: An AI-assisted Materials Simulation Agent Guanghen Liu et.al. 2512.23010 translate read null
2025-12-28 Prompt engineering does not universally improve Large Language Model performance across clinical decision-making tasks Mengdi Chai et.al. 2512.22966 translate read null
2025-12-28 Diversity or Precision? A Deep Dive into Next Token Prediction Haoyuan Wu et.al. 2512.22955 translate read null
2025-12-28 Multimodal Fact-Checking: An Agent-based Approach Danni Xu et.al. 2512.22933 translate read null
2025-12-28 Argus: Token Aware Distributed LLM Inference Optimization Panlong Wu et.al. 2512.22925 translate read null
2025-12-28 JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation Kai Liu et.al. 2512.22905 translate read link
2025-12-28 Debugging Tabular Log as Dynamic Graphs Chumeng Liang et.al. 2512.22903 translate read null
2025-12-28 HiSciBench: A Hierarchical Multi-disciplinary Benchmark for Scientific Intelligence from Reading to Discovery Yaping Zhang et.al. 2512.22899 translate read null
2025-12-28 Theory and Algorithms for Learning with Multi-Class Abstention and Multi-Expert Deferral Anqi Mao et.al. 2512.22886 translate read null
2025-12-28 Agentic AI for Cyber Resilience: A New Security Paradigm and Its System-Theoretic Foundations Tao Li et.al. 2512.22883 translate read null
2025-12-28 FasterPy: An LLM-based Code Execution Efficiency Optimization Framework Yue Wu et.al. 2512.22827 translate read null
2025-12-28 NepEMO: A Multi-Label Emotion and Sentiment Analysis on Nepali Reddit with Linguistic Insights and Temporal Trends Sameer Sitoula et.al. 2512.22823 translate read null
2025-12-28 VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM Jingchao Wang et.al. 2512.22799 translate read link
2025-12-28 CNSight: Evaluation of Clinical Note Segmentation Tools Risha Surana et.al. 2512.22795 translate read null
2025-12-28 ChatGraPhT: A Visual Conversation Interface for Multi-Path Reflection with Agentic LLM Support Geoff Kimm et.al. 2512.22790 translate read null
2025-12-28 Understanding the Mechanisms of Fast Hyperparameter Transfer Nikhil Ghosh et.al. 2512.22768 translate read null
2025-12-28 Bridging Global Intent with Local Details: A Hierarchical Representation Approach for Semantic Validation in Text-to-SQL Rihong Qiu et.al. 2512.22744 translate read null
2025-12-28 Robust LLM-based Column Type Annotation via Prompt Augmentation with LoRA Tuning Hanze Meng et.al. 2512.22742 translate read null
2025-12-28 Text-Routed Sparse Mixture-of-Experts Model with Explanation and Temporal Alignment for Multi-Modal Sentiment Analysis Dongning Rao et.al. 2512.22741 translate read null
2025-12-28 Harnessing Large Language Models for Biomedical Named Entity Recognition Jian Chen et.al. 2512.22738 translate read null
2025-12-28 WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference Aiwei Liu et.al. 2512.22737 translate read null
2025-12-28 FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents Jiaqi Shao et.al. 2512.22733 translate read link
2025-12-27 Mitigating Social Desirability Bias in Random Silicon Sampling Sashank Chapala et.al. 2512.22725 translate read null
2025-12-27 Cyber Resilience in Next-Generation Networks: Threat Landscape, Theoretical Foundations, and Design Paradigms Junaid Farooq et.al. 2512.22721 translate read null
2025-12-27 Memento-II: Learning by Stateful Reflective Memory Jun Wang et.al. 2512.22716 translate read null
2025-12-27 Beg to Differ: Understanding Reasoning-Answer Misalignment Across Languages Anaelia Ovalle et.al. 2512.22712 translate read null
2025-12-27 Modality Inflation: Energy Characterization and Optimization Opportunities for MLLM Inference Mona Moghadampanah et.al. 2512.22695 translate read null
2025-12-27 Conformal Prediction Sets for Next-Token Prediction in Large Language Models: Balancing Coverage Guarantees with Set Efficiency Yoshith Roy Kotla et.al. 2512.22682 translate read null
2025-12-27 CritiFusion: Semantic Critique and Spectral Alignment for Faithful Text-to-Image Generation ZhenQi Chen et.al. 2512.22681 translate read null
2025-12-27 From Electrochemical Energy Storage to Next-Generation Intelligent Battery Technologies for Electric Vehicles: A Survey Abderaouf Bahi et.al. 2512.22680 translate read null
2025-12-27 TravelBench: A Real-World Benchmark for Multi-Turn and Tool-Augmented Travel Planning Xiang Cheng et.al. 2512.22673 translate read null
2025-12-23 Making Large Language Models Efficient Dense Retrievers Yibin Lei et.al. 2512.20612 translate read null
2025-12-23 MoE-DiffuSeq: Enhancing Long-Document Diffusion Models with Sparse Attention and Mixture of Experts Alexandros Christoforos et.al. 2512.20604 translate read null
2025-12-23 Cube Bench: A Benchmark for Spatial Visual Reasoning in MLLMs Dhruv Anand et.al. 2512.20595 translate read null
2025-12-23 Automated stereotactic radiosurgery planning using a human-in-the-loop reasoning large language model agent Humza Nusrat et.al. 2512.20586 translate read null
2025-12-23 Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Amirhosein Ghasemabadi et.al. 2512.20578 translate read null
2025-12-23 Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs Rui Pan et.al. 2512.20573 translate read null
2025-12-23 LLM-Based Authoring of Agent-Based Narratives through Scene Descriptions Vinayak Regmi et.al. 2512.20550 translate read null
2025-12-23 Advancing Multimodal Teacher Sentiment Analysis: The Large-Scale T-MED Dataset & The Effective AAM-TSA Model Zhiyi Duan et.al. 2512.20548 translate read null
2025-12-23 Benchmarking LLMs for Predictive Applications in the Intensive Care Units Chehak Malhotra et.al. 2512.20520 translate read null
2025-12-23 Coherence in the brain unfolds across separable temporal regimes Davide Stauba et.al. 2512.20481 translate read null
2025-12-23 UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images Yiming Zhao et.al. 2512.20479 translate read null
2025-12-23 Laser: Governing Long-Horizon Agentic Search via Structured Protocol and Context Register Shuting Wang et.al. 2512.20458 translate read null
2025-12-23 Topic-informed dynamic mixture model for occupational heterogeneity in health risk behaviors Lorenzo Schiavon et.al. 2512.20408 translate read null
2025-12-23 ChatGPT: Excellent Paper! Accept It. Editor: Imposter Found! Review Rejected Kanchon Gharami et.al. 2512.20405 translate read null
2025-12-23 CRAFT: Continuous Reasoning and Agentic Feedback Tuning for Multimodal Text-to-Image Generation V. Kovalev et.al. 2512.20362 translate read null
2025-12-23 A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice Yaowei Bai et.al. 2512.20344 translate read null
2025-12-23 Comment Traps: How Defective Commented-out Code Augment Defects in AI-Assisted Code Generation Yuan Huang et.al. 2512.20334 translate read null
2025-12-23 SynCraft: Guiding Large Language Models to Predict Edit Sequences for Molecular Synthesizability Optimization Junren Li et.al. 2512.20333 translate read null
2025-12-23 Toward Explaining Large Language Models in Software Engineering Tasks Antonio Vitale et.al. 2512.20328 translate read null
2025-12-23 Can LLMs Solve My Grandma’s Riddle? Evaluating Multilingual Large Language Models on Reasoning Traditional Bangla Tricky Riddles Nurul Labib Sayeedi et.al. 2512.20324 translate read null
2025-12-23 TableGPT-R1: Advancing Tabular Reasoning Through Reinforcement Learning Saisai Yang et.al. 2512.20312 translate read null
2025-12-23 Structured Visualization Design Knowledge for Grounding Generative Reasoning and Situated Feedback Péter Ferenc Gyarmati et.al. 2512.20306 translate read null
2025-12-23 AprielGuard Jaykumar Kasundra et.al. 2512.20293 translate read null
2025-12-23 Synthesizing Procedural Memory: Challenges and Architectures in Automated Workflow Generation Nishant Gaurav et.al. 2512.20278 translate read null
2025-12-23 Graph-Symbolic Policy Enforcement and Control (G-SPEC): A Neuro-Symbolic Framework for Safe Agentic AI in 5G Autonomous Networks Divya Vijay et.al. 2512.20275 translate read null
2025-12-23 Memory as Resonance: A Biomimetic Architecture for Infinite Context Memory on Ergodic Phonetic Manifolds Tarik Houichime et.al. 2512.20245 translate read null
2025-12-23 MemR$^3$: Memory Retrieval via Reflective Reasoning for LLM Agents Xingbo Du et.al. 2512.20237 translate read null
2025-12-23 Quantitative Financial Modeling for Sri Lankan Markets: Approach Combining NLP, Clustering and Time-Series Forecasting Linuk Perera et.al. 2512.20216 translate read null
2025-12-23 Predictive-LoRA: A Proactive and Fragmentation-Aware Serverless Inference System for LLMs Yinan Ni et.al. 2512.20210 translate read null
2025-12-23 TongSIM: A General Platform for Simulating Intelligent Machines Zhe Sun et.al. 2512.20206 translate read null
2025-12-23 Corpus of Cross-lingual Dialogues with Minutes and Detection of Misunderstandings Marko Čechovič et.al. 2512.20204 translate read null
2025-12-23 Well Begun is Half Done: Location-Aware and Trace-Guided Iterative Automated Vulnerability Repair Zhenlei Ye et.al. 2512.20203 translate read null
2025-12-23 Designing Spatial Architectures for Sparse Attention: STAR Accelerator via Cross-Stage Tiling Huizheng Wang et.al. 2512.20198 translate read null
2025-12-23 FaithLens: Detecting and Explaining Faithfulness Hallucination Shuzheng Si et.al. 2512.20182 translate read null
2025-12-23 Optimistic TEE-Rollups: A Hybrid Architecture for Scalable and Verifiable Generative AI Inference on Blockchain Aaron Chan et.al. 2512.20176 translate read null
2025-12-23 Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark Hao Guo et.al. 2512.20174 translate read null
2025-12-23 Learning to Reason in LLMs by Expectation Maximization Junghyun Lee et.al. 2512.20169 translate read null
2025-12-23 Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography Songze Li et.al. 2512.20168 translate read null
2025-12-23 AI Security Beyond Core Domains: Resume Screening as a Case Study of Adversarial Vulnerabilities in Specialized LLM Applications Honglin Mu et.al. 2512.20164 translate read null
2025-12-23 Concept Generalization in Humans and Large Language Models: Insights from the Number Game Arghavan Bazigaran et.al. 2512.20162 translate read null
2025-12-23 AXIOM: Benchmarking LLM-as-a-Judge for Code via Rule-Based Perturbation and Multisource Quality Calibration Ruiqi Wang et.al. 2512.20159 translate read null
2025-12-23 Multi-hop Reasoning via Early Knowledge Alignment Yuxin Wang et.al. 2512.20144 translate read null
2025-12-23 Enhancing Zero-Shot Time Series Forecasting in Off-the-Shelf LLMs via Noise Injection Xingyou Yin et.al. 2512.20140 translate read null
2025-12-23 M$^3$KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation Hyeongcheol Park et.al. 2512.20136 translate read null
2025-12-23 A Novel Graph-Sequence Learning Model for Inductive Text Classification Zuo Wang et.al. 2512.20097 translate read null
2025-12-23 QE-Catalytic: A Graph-Language Multimodal Base Model for Relaxed-Energy Prediction in Catalytic Adsorption Yanjie Li et.al. 2512.20084 translate read null
2025-12-23 Adaptive Financial Sentiment Analysis for NIFTY 50 via Instruction-Tuned LLMs, RAG and Reinforcement Learning Approaches Chaithra et.al. 2512.20082 translate read null
2025-12-23 Reason2Decide: Rationale-Driven Multi-Task Learning H M Quamran Hasan et.al. 2512.20074 translate read null
2025-12-23 On the Effectiveness of Instruction-Tuning Local LLMs for Identifying Software Vulnerabilities Sangryu Park et.al. 2512.20062 translate read null
2025-12-23 Scaling Reinforcement Learning for Content Moderation with Large Language Models Hamed Firooz et.al. 2512.20061 translate read null
2025-12-23 Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval Nguyen Lam Phu Quy et.al. 2512.20042 translate read null
2025-12-23 VSA: Visual-Structural Alignment for UI-to-Code Xian Wu et.al. 2512.20034 translate read null
2025-12-23 VALLR-Pin: Dual-Decoding Visual Speech Recognition for Mandarin with Pinyin-Guided LLM Refinement Chang Sun et.al. 2512.20032 translate read null
2025-12-23 LLM-Assisted Abstract Screening with OLIVER: Evaluating Calibration and Single-Model vs. Actor-Critic Configurations in Literature Reviews Kian Godhwani et.al. 2512.20022 translate read null
2025-12-23 Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems Qiushuo Hou et.al. 2512.20012 translate read null
2025-12-23 LoFT-LLM: Low-Frequency Time-Series Forecasting with Large Language Models Jiacheng You et.al. 2512.20002 translate read null
2025-12-23 Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models Ming Li et.al. 2512.19995 translate read null
2025-12-23 S$^3$IT: A Benchmark for Spatially Situated Social Intelligence Test Zhe Sun et.al. 2512.19992 translate read null
2025-12-23 Bias Beneath the Tone: Empirical Characterisation of Tone Bias in LLM-Driven UX Systems Heet Bodara et.al. 2512.19950 translate read null
2025-12-23 Interpolative Decoding: Exploring the Spectrum of Personality Traits in LLMs Eric Yeh et.al. 2512.19937 translate read null
2025-12-22 Conditional Adversarial Fragility in Financial Machine Learning under Macroeconomic Stress Samruddhi Baviskar et.al. 2512.19935 translate read null
2025-12-22 PRISM: A Personality-Driven Multi-Agent Framework for Social Media Simulation Zhixiang Lu et.al. 2512.19933 translate read null
2025-12-22 Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs Houston H. Zhang et.al. 2512.19918 translate read null
2025-12-22 Demystifying LLM-as-a-Judge: Analytically Tractable Model for Inference-Time Scaling Indranil Halder et.al. 2512.19905 translate read null
2025-12-22 How well do Large Language Models Recognize Instructional Moves? Establishing Baselines for Foundation Models in Educational Discourse Kirk Vanacore et.al. 2512.19903 translate read null
2025-12-22 Larger Is Not Always Better: Leveraging Structured Code Diffs for Comment Inconsistency Detection Phong Nguyen et.al. 2512.19883 translate read null
2025-12-22 Fine-Tuned In-Context Learners for Efficient Adaptation Jorg Bornschein et.al. 2512.19879 translate read null
2025-12-22 CS-Guide: Leveraging LLMs and Student Reflections to Provide Frequent, Scalable Academic Monitoring Feedback to Computer Science Students Samuel Jacob Chacko et.al. 2512.19866 translate read null
2025-12-22 HARMON-E: Hierarchical Agentic Reasoning for Multimodal Oncology Notes to Extract Structured Data Shashi Kant Gupta et.al. 2512.19864 translate read null
2025-12-22 From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs Mingrui Wu et.al. 2512.19683 translate read null
2025-12-22 GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators Jiacheng Guo et.al. 2512.19682 translate read link
2025-12-22 Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918) Niclas Griesshaber et.al. 2512.19675 translate read null
2025-12-22 Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Yuqiao Tan et.al. 2512.19673 translate read link
2025-12-22 Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis Argha Kamal Samanta et.al. 2512.19663 translate read null
2025-12-22 Exploring Zero-Shot ACSA with Unified Meaning Representation in Chain-of-Thought Prompting Filippos Ventirozos et.al. 2512.19651 translate read null
2025-12-22 Exploring the features used for summary evaluation by Human and GPT Zahra Sadeghi et.al. 2512.19620 translate read null
2025-12-22 MapTrace: Scalable Data Generation for Route Tracing on Maps Artemis Panagopoulou et.al. 2512.19609 translate read null
2025-12-22 RAPID-LLM: Resilience-Aware Performance analysis of Infrastructure for Distributed LLM Training and Inference George Karfakis et.al. 2512.19606 translate read null
2025-12-22 Increasing the Thinking Budget is Not All You Need Ignacio Iacobacci et.al. 2512.19585 translate read null
2025-12-22 The Epistemological Consequences of Large Language Models: Rethinking collective intelligence and institutional knowledge Angjelin Hila et.al. 2512.19570 translate read null
2025-12-22 Algerian Dialect Zakaria Benmounah et.al. 2512.19543 translate read null
2025-12-22 Event Extraction in Large Language Model Bobo Li et.al. 2512.19537 translate read null
2025-12-22 Learning Continuous Solvent Effects from Transient Flow Data: A Graph Neural Network Benchmark on Catechol Rearrangement Hongsheng Xing et.al. 2512.19530 translate read null
2025-12-22 Anatomy-R1: Enhancing Anatomy Reasoning in Multimodal Large Language Models via Anatomical Similarity Curriculum and Group Diversity Augmentation Ziyang Song et.al. 2512.19512 translate read null
2025-12-22 Structured Event Representation and Stock Return Predictability Gang Li et.al. 2512.19484 translate read null
2025-12-22 A Dataset and Preliminary Study of Using GPT-5 for Code-change Impact Analysis Katharina Stengg et.al. 2512.19481 translate read null
2025-12-22 A Large-Language-Model Framework for Automated Humanitarian Situation Reporting Ivan Decostanzi et.al. 2512.19475 translate read null
2025-12-22 Epistemological Fault Lines Between Human and Artificial Intelligence Walter Quattrociocchi et.al. 2512.19466 translate read null
2025-12-22 An Agentic Framework for Autonomous Materials Computation Zeyu Xia et.al. 2512.19458 translate read null
2025-12-22 Activations as Features: Probing LLMs for Generalizable Essay Scoring Representations Jinwei Chi et.al. 2512.19456 translate read null
2025-12-22 SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation Thittipat Pairatsuppawat et.al. 2512.19455 translate read null
2025-12-22 D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning Evelyn Zhang et.al. 2512.19443 translate read null
2025-12-22 dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Yi Xin et.al. 2512.19433 translate read link
2025-12-22 CodeSimpleQA: Scaling Factuality in Code Large Language Models Jian Yang et.al. 2512.19424 translate read null
2025-12-22 From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions Jiaren Peng et.al. 2512.19414 translate read null
2025-12-22 Brain-Grounded Axes for Reading and Steering LLM States Sandro Andric et.al. 2512.19399 translate read link
2025-12-22 HATS: High-Accuracy Triple-Set Watermarking for Large Language Models Zhiqing Hu et.al. 2512.19378 translate read null
2025-12-22 Generative vector search to improve pathology foundation models across multimodal vision-language tasks Markus Ekvall et.al. 2512.19360 translate read null
2025-12-22 ReasonCD: A Multimodal Reasoning Large Model for Implicit Change-of-Interest Semantic Mining Zhenyang Huang et.al. 2512.19354 translate read null
2025-12-22 PENDULUM: A Benchmark for Assessing Sycophancy in Multimodal Large Language Models A. B. M. Ashikur Rahman et.al. 2512.19350 translate read null
2025-12-22 VIGOR+: Iterative Confounder Generation and Validation via LLM-CEVAE Feedback Loop JiaWei Zhu et.al. 2512.19349 translate read null
2025-12-22 SafeMed-R1: Adversarial Reinforcement Learning for Generalizable and Robust Medical Reasoning in Vision-Language Models A. A. Gde Yogi Pramana et.al. 2512.19317 translate read null
2025-12-22 CienaLLM: Generative Climate-Impact Extraction from News Articles with Autoregressive LLMs Javier Vela-Tambo et.al. 2512.19305 translate read null
2025-12-22 Helios: A Foundational Language Model for Smart Energy Knowledge Reasoning and Application Haoyu Jiang et.al. 2512.19299 translate read null
2025-12-22 Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models Linzhi Chen et.al. 2512.19297 translate read null
2025-12-22 Auto-Prompting with Retrieval Guidance for Frame Detection in Logistics Do Minh Duc et.al. 2512.19247 translate read null
2025-12-22 ChemATP: A Training-Free Chemical Reasoning Framework for Large Language Models Mingxu Zhang et.al. 2512.19240 translate read null
2025-12-22 Identifying Features Associated with Bias Against 93 Stigmatized Groups in Language Models and Guardrail Model Safety Mitigation Anna-Maria Gueorguieva et.al. 2512.19238 translate read null
2025-12-22 Generation of Programmatic Rules for Document Forgery Detection Using Large Language Models Valentin Schmidberger et.al. 2512.19228 translate read null
2025-12-22 Observer, Not Player: Simulating Theory of Mind in LLMs through Game Observation Jerry Wang et.al. 2512.19210 translate read null
2025-12-22 MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning Tao Zhang et.al. 2512.19206 translate read null
2025-12-22 Configuration Work: Four Consequences of LLMs-in-use Gabriel Alcaras et.al. 2512.19189 translate read null
2025-12-22 L4: Low-Latency and Load-Balanced LLM Serving via Length-Aware Scheduling Yitao Yuan et.al. 2512.19179 translate read null
2025-12-22 OmniMoGen: Unifying Human Motion Generation via Learning from Interleaved Text-Motion Instructions Wendong Bu et.al. 2512.19159 translate read null
2025-12-22 Understanding Chain-of-Thought in Large Language Models via Topological Data Analysis Chenghao Li et.al. 2512.19135 translate read null
2025-12-22 QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation Dehai Min et.al. 2512.19134 translate read link
2025-12-22 AWPO: Enhancing Tool-Use of Large Language Models through Explicit Integration of Reasoning Rewards Zihan Lin et.al. 2512.19126 translate read null
2025-12-22 Stop saying LLM: Large Discourse Models (LDM) and Artificial Discursive Agent (ADA)? Amar Lakel et.al. 2512.19117 translate read null
2025-12-22 Generative Giants, Retrieval Weaklings: Why do Multimodal Large Language Models Fail at Multimodal Retrieval? Hengyi Feng et.al. 2512.19115 translate read null
2025-12-22 HyperLoad: A Cross-Modality Enhanced Large Language Model-Based Framework for Green Data Center Cooling Load Prediction Haoyu Jiang et.al. 2512.19114 translate read null
2025-12-22 FC-MIR: A Mobile Screen Awareness Framework for Intent-Aware Recommendation based on Frame-Compressed Multimodal Trajectory Reasoning Zhe Yang et.al. 2512.19107 translate read null
2025-12-22 Tool-Augmented Hybrid Ensemble Reasoning with Distillation for Bilingual Mathematical Problem Solving Peiqing Lu et.al. 2512.19093 translate read null
2025-12-22 A Large Language Model Based Method for Complex Logical Reasoning over Knowledge Graphs Ziyan Zhang et.al. 2512.19092 translate read null
2025-12-22 Population-Evolve: a Parallel Sampling and Evolutionary Method for LLM Math Reasoning Yanzhi Zhang et.al. 2512.19081 translate read null
2025-12-22 Watch Closely: Mitigating Object Hallucinations in Large Vision-Language Models with Disentangled Decoding Ruiqi Ma et.al. 2512.19070 translate read null
2025-12-22 Can abstract concepts from LLM improve SLM performance? Siddharth Tandon et.al. 2512.19069 translate read null
2025-12-22 Finer-Personalization Rank: Fine-Grained Retrieval Examines Identity Preservation for Personalized Generation Connor Kilrain et.al. 2512.19026 translate read null
2025-12-22 The Erasure Illusion: Stress-Testing the Generalization of LLM Forgetting Evaluation Hengrui Jia et.al. 2512.19025 translate read null
2025-12-22 PEAK: A Performance Engineering AI-Assistant for GPU Kernels Powered by Natural Language Transformations Muhammad Usman Tariq et.al. 2512.19018 translate read null
2025-12-22 DREAM: Dynamic Red-teaming across Environments for AI Models Liming Lu et.al. 2512.19016 translate read null
2025-12-22 Efficient Jailbreak Mitigation Using Semantic Linear Classification in a Multi-Staged Pipeline Akshaj Prashanth Rao et.al. 2512.19011 translate read null
2025-12-22 Context-Aware Initialization for Reducing Generative Path Length in Diffusion Language Models Tongyuan Miao et.al. 2512.19004 translate read null
2025-12-22 Evaluating the Challenges of LLMs in Real-world Medical Follow-up: A Comparative Study and An Optimized Framework Jinyan Liu et.al. 2512.18999 translate read null
2025-12-22 R-GenIMA: Integrating Neuroimaging and Genetics with Interpretable Multimodal AI for Alzheimer’s Disease Progression Kun Zhao et.al. 2512.18986 translate read null
2025-12-22 Scrum Sprint Planning: LLM-based and algorithmic solutions Yuwon Yoon et.al. 2512.18966 translate read null
2025-12-22 Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement Saman Forouzandeh et.al. 2512.18950 translate read null
2025-12-22 FASTRIC: Prompt Specification Language for Verifiable LLM Interactions Wen-Long Jin et.al. 2512.18940 translate read null
2025-12-22 When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models Michael S. Zhang et.al. 2512.18934 translate read null
2025-12-21 An Empirical Study of Developer-Provided Context for AI Coding Assistants in Open-Source Projects Shaokang Jiang et.al. 2512.18925 translate read null
2025-12-21 Delta-LLaVA: Base-then-Specialize Alignment for Token-Efficient Vision-Language Models Mohamad Zamini et.al. 2512.18910 translate read null
2025-12-21 Gabliteration: Adaptive Multi-Directional Neural Weight Modification for Selective Behavioral Alteration in Large Language Models Gökdeniz Gülmez et.al. 2512.18901 translate read null
2025-12-21 Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction Ming Li et.al. 2512.18880 translate read null
2025-12-21 CrashChat: A Multimodal Large Language Model for Multitask Traffic Crash Video Analysis Kaidi Liang et.al. 2512.18878 translate read null
2025-12-21 CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning Zijun Gao et.al. 2512.18857 translate read null
2025-12-21 VizDefender: Unmasking Visualization Tampering through Proactive Localization and Intent Inference Sicheng Song et.al. 2512.18853 translate read null
2025-12-21 MDToC: Metacognitive Dynamic Tree of Concepts for Boosting Mathematical Problem-Solving of Large Language Models Tung Duong Ta et.al. 2512.18841 translate read null
2025-12-21 From Word to World: Can Large Language Models be Implicit Text-based World Models? Yixia Li et.al. 2512.18832 translate read null
2025-12-21 HARBOR: Holistic Adaptive Risk assessment model for BehaviORal healthcare Aditya Siddhant et.al. 2512.18829 translate read null
2025-12-21 “Even GPT Can Reject Me”: Conceptualizing Abrupt Refusal Secondary Harm (ARSH) and Reimagining Psychological AI Safety with Compassionate Completion Standard (CCS) Yang Ni et.al. 2512.18776 translate read null
2025-12-21 MEEA: Mere Exposure Effect-Driven Confrontational Optimization for LLM Jailbreaking Jianyi Zhang et.al. 2512.18755 translate read null
2025-12-21 Code2Doc: A Quality-First Curated Dataset for Code Documentation Recep Kaan Karaman et.al. 2512.18748 translate read null
2025-12-21 IPCV: Information-Preserving Compression for MLLM Visual Encoders Yuan Chen et.al. 2512.18747 translate read null
2025-12-21 MemEvolve: Meta-Evolution of Agent Memory Systems Guibin Zhang et.al. 2512.18746 translate read null
2025-12-21 Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection Junjun Pan et.al. 2512.18733 translate read null
2025-12-21 A Theoretical Lens for RL-Tuned Language Models via Energy-Based Models Zhiquan Tan et.al. 2512.18730 translate read null
2025-12-21 Solver-Independent Automated Problem Formulation via LLMs for High-Cost Simulation-Driven Design Yuchen Li et.al. 2512.18682 translate read null
2025-12-21 Remoe: Towards Efficient and Low-Cost MoE Inference in Serverless Computing Wentao Liu et.al. 2512.18674 translate read null
2025-12-21 SmartSight: Mitigating Hallucination in Video-LLMs Without Compromising Video Understanding via Temporal Attention Collapse Yiming Sun et.al. 2512.18671 translate read null
2025-12-21 Tackling dataset curation challenges towards reliable machine learning: a case study on thermoelectric materials Shoeb Athar et.al. 2512.18653 translate read null
2025-12-21 LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction Jensen Zhang et.al. 2512.18623 translate read null
2025-12-21 A Multi-agent Text2SQL Framework using Small Language Models and Execution Feedback Thanh Dat Hoang et.al. 2512.18622 translate read null
2025-12-21 A Comparative Study of Light-weight Language Models for PII Masking and their Deployment for Real Conversational Texts Prabigya Acharya et.al. 2512.18608 translate read null
2025-12-21 Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction Qinglin Zeng et.al. 2512.18605 translate read null
2025-12-21 SimpleCall: A Lightweight Image Restoration Agent in Label-Free Environments with MLLM Perceptual Feedback Jianglin Lu et.al. 2512.18599 translate read null
2025-12-21 Wireless Copilot: An AI-Powered Partner for Navigating Next-Generation Wireless Complexity Haoxiang Luo et.al. 2512.18582 translate read null
2025-12-21 ESearch-R1: Learning Cost-Aware MLLM Agents for Interactive Embodied Search via Reinforcement Learning Weijie Zhou et.al. 2512.18571 translate read null
2025-12-21 AI Code in the Wild: Measuring Security Risks and Ecosystem Shifts of AI-Generated Code in Modern Software Bin Wang et.al. 2512.18567 translate read null
2025-12-21 Vox Deorum: A Hybrid LLM Architecture for 4X / Grand Strategy Game AI – Lessons from Civilization V John Chen et.al. 2512.18564 translate read null
2025-12-21 OpenView: Empowering MLLMs with Out-of-view VQA Qixiang Chen et.al. 2512.18563 translate read null
2025-12-18 AdaTooler-V: Adaptive Tool-Use for Images and Videos Chaoyang Wang et.al. 2512.16918 translate read null
2025-12-18 Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning Qihao Liu et.al. 2512.16917 translate read null
2025-12-18 Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward Peter Chen et.al. 2512.16912 translate read null
2025-12-18 Impacts of Racial Bias in Historical Training Data for News AI Rahul Bhargava et.al. 2512.16901 translate read null
2025-12-18 Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image Yushi Hu et.al. 2512.16899 translate read null
2025-12-18 LinkedOut: Linking World Knowledge Representation Out of Video LLM for Next-Generation Video Recommendation Haichao Zhang et.al. 2512.16891 translate read null
2025-12-18 AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning Tzu-Han Lin et.al. 2512.16883 translate read null
2025-12-18 TOGGLE: Temporal Logic-Guided Large Language Model Compression for Edge Khurram Khalil et.al. 2512.16855 translate read null
2025-12-18 Meta-RL Induces Exploration in Language Agents Yulun Jiang et.al. 2512.16848 translate read null
2025-12-18 Toward Systematic Counterfactual Fairness Evaluation of Large Language Models: The CAFFE Framework Alessandra Parziale et.al. 2512.16816 translate read null
2025-12-18 From Facts to Conclusions: Integrating Deductive Reasoning in Retrieval-Augmented LLMs Shubham Mishra et.al. 2512.16795 translate read null
2025-12-18 Inside Out: Uncovering How Comment Internalization Steers LLMs for Better or Worse Aaron Imani et.al. 2512.16790 translate read null
2025-12-18 Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future Tianshuai Hu et.al. 2512.16760 translate read null
2025-12-18 Plausibility as Failure: How LLMs and Humans Co-Construct Epistemic Error Claudia Vale Oliveira et.al. 2512.16750 translate read null
2025-12-18 AI-Driven Prediction of Cancer Pain Episodes: A Hybrid Decision Support Approach Yipeng Zhuang et.al. 2512.16739 translate read null
2025-12-18 Cyber Humanism in Education: Reclaiming Agency through AI and Learning Sciences Giovanni Adorni et.al. 2512.16701 translate read null
2025-12-18 Do Multi-Agents Solve Better Than Single? Evaluating Agentic Frameworks for Diagram-Grounded Geometry Problem Solving and Reasoning Mahbub E Sobhani et.al. 2512.16698 translate read null
2025-12-18 DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI Hao Liang et.al. 2512.16676 translate read null
2025-12-18 Microsoft Academic Graph Information Retrieval for Research Recommendation and Assistance Jacob Reiss et.al. 2512.16661 translate read null
2025-12-18 Prefix Probing: Lightweight Harmful Content Detection for Large Language Models Jirui Yang et.al. 2512.16650 translate read null
2025-12-18 JustRL: Scaling a 1.5B LLM with a Simple RL Recipe Bingxiang He et.al. 2512.16649 translate read null
2025-12-18 Stackelberg Learning from Human Feedback: Preference Optimization as a Sequential Game Barna Pásztor et.al. 2512.16626 translate read null
2025-12-18 Refusal Steering: Fine-grained Control over LLM Refusal Behaviour for Sensitive Topics Iker García-Ferrero et.al. 2512.16602 translate read null
2025-12-18 Muon is Provably Faster with Momentum Variance Reduction Xun Qian et.al. 2512.16598 translate read null
2025-12-18 Sketch-in-Latents: Eliciting Unified Reasoning in MLLMs Jintao Tong et.al. 2512.16584 translate read null
2025-12-18 Non-Asymptotic Global Convergence of PPO-Clip Yin Liu et.al. 2512.16565 translate read null
2025-12-18 Needle in the Web: A Benchmark for Retrieving Targeted Web Pages in the Wild Yumeng Wang et.al. 2512.16553 translate read null
2025-12-18 A Systematic Study of Code Obfuscation Against LLM-based Vulnerability Detection Xiao Li et.al. 2512.16538 translate read null
2025-12-18 From Personalization to Prejudice: Bias and Discrimination in Memory-Enhanced AI Agents for Recruitment Himanshu Gharat et.al. 2512.16532 translate read null
2025-12-18 Scaling Laws for Energy Efficiency of Local LLMs Ander Alvarez et.al. 2512.16531 translate read null
2025-12-18 Plain language adaptations of biomedical text using LLMs: Comparison of evaluation metrics Primoz Kocbek et.al. 2512.16530 translate read null
2025-12-18 Efficient CPU-GPU Collaborative Inference for MoE-based LLMs on Memory-Limited Systems En-Ming Huang et.al. 2512.16473 translate read null
2025-12-18 cuPilot: A Strategy-Coordinated Multi-agent Framework for CUDA Kernel Evolution Jinwu Chen et.al. 2512.16465 translate read null
2025-12-18 TimeSeries2Report prompting enables adaptive large language model management of lithium-ion batteries Jiayang Yang et.al. 2512.16453 translate read null
2025-12-18 Towards AI-Supported Research: a Vision of the TIB AIssistant Sören Auer et.al. 2512.16447 translate read null
2025-12-18 Topic Modelling Black Box Optimization Roman Akramov et.al. 2512.16445 translate read null
2025-12-18 TIB AIssistant: a Platform for AI-Supported Research Across Research Life Cycles Allard Oelen et.al. 2512.16442 translate read null
2025-12-18 From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection Hao Li et.al. 2512.16439 translate read null
2025-12-18 Introducing ORKG ASK: an AI-driven Scholarly Literature Search and Exploration System Taking a Neuro-Symbolic Approach Allard Oelen et.al. 2512.16425 translate read null
2025-12-18 Synthelite: Chemist-aligned and feasibility-aware synthesis planning with LLMs Nguyen Xuan-Vu et.al. 2512.16424 translate read null
2025-12-18 Large Language Models as a (Bad) Security Norm in the Context of Regulation and Compliance Kaspar Rosager Ludvigsen et.al. 2512.16419 translate read null
2025-12-18 BrepLLM: Native Boundary Representation Understanding with Large Language Models Liyuan Deng et.al. 2512.16413 translate read null
2025-12-18 A Network Arena for Benchmarking AI Agents on Network Troubleshooting Zhihao Wang et.al. 2512.16381 translate read null
2025-12-18 Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs Sara Papi et.al. 2512.16378 translate read null
2025-12-18 Factorized Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models Mariam Hassan et.al. 2512.16371 translate read null
2025-12-18 AI Needs Physics More Than Physics Needs AI Peter Coveney et.al. 2512.16344 translate read null
2025-12-18 Design and Evaluation of Cost-Aware PoQ for Decentralized LLM Inference Arther Tian et.al. 2512.16317 translate read null
2025-12-18 Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation Yuxuan Qiao et.al. 2512.16310 translate read null
2025-12-18 PixelArena: A benchmark for Pixel-Precision Visual Intelligence Feng Liang et.al. 2512.16303 translate read null
2025-12-18 Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection Fanrui Zhang et.al. 2512.16300 translate read null
2025-12-18 Feature-Selective Representation Misdirection for Machine Unlearning Taozhao Chen et.al. 2512.16297 translate read null
2025-12-18 MACL: Multi-Label Adaptive Contrastive Learning Loss for Remote Sensing Image Retrieval Amna Amir et.al. 2512.16294 translate read null
2025-12-18 Ein Typenrad auf der Überholspur: Die Kult-Schreibmaschine “Erika” trifft KI Karola Köpferl et.al. 2512.16293 translate read null
2025-12-18 In-Context Probing for Membership Inference in Fine-Tuned Language Models Zhexi Lu et.al. 2512.16292 translate read null
2025-12-18 Evaluating OpenAI GPT Models for Translation of Endangered Uralic Languages: A Comparison of Reasoning and Non-Reasoning Architectures Yehor Tereshchenko et.al. 2512.16287 translate read null
2025-12-18 CKA-Guided Modular Quantization: Beyond Bit-Width to Algorithmic Diversity Jinhao Zhang et.al. 2512.16282 translate read null
2025-12-18 Love, Lies, and Language Models: Investigating AI’s Role in Romance-Baiting Scams Gilad Gressel et.al. 2512.16280 translate read null
2025-12-18 QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems Yiliu Yang et.al. 2512.16279 translate read null
2025-12-18 Fast Collaborative Inference via Distributed Speculative Decoding Ce Zheng et.al. 2512.16273 translate read null
2025-12-18 Beyond Blind Spots: Analytic Hints for Mitigating LLM-Based Evaluation Pitfalls Ora Nova Fandina et.al. 2512.16272 translate read null
2025-12-18 Learning to Wait: Synchronizing Agents with the Physical World Yifei She et.al. 2512.16262 translate read null
2025-12-18 AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding Sanjoy Chowdhury et.al. 2512.16250 translate read null
2025-12-18 AlignMerge - Alignment-Preserving Large Language Model Merging via Fisher-Guided Geometric Constraints Aniruddha Roy et.al. 2512.16245 translate read null
2025-12-18 Coarse-to-Fine Open-Set Graph Node Classification with Large Language Models Xueqi Ma et.al. 2512.16244 translate read null
2025-12-18 Trustworthy and Controllable Professional Knowledge Utilization in Large Language Models with TEE-GPU Execution Yifeng Cai et.al. 2512.16238 translate read null
2025-12-18 The Evolution of Reranking Models in Information Retrieval: From Heuristic Methods to Large Language Models Tejul Pandit et.al. 2512.16236 translate read null
2025-12-18 LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding Chenkai Xu et.al. 2512.16229 translate read link
2025-12-18 An Information-Theoretic Framework for Robust Large Language Model Editing Qizhou Chen et.al. 2512.16227 translate read null
2025-12-18 DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack Hao Li et.al. 2512.16182 translate read null
2025-12-18 Ev-Trust: A Strategy Equilibrium Trust Mechanism for Evolutionary Games in LLM-Based Multi-Agent Services Shiduo Yang et.al. 2512.16167 translate read null
2025-12-18 Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference Jian Tian et.al. 2512.16134 translate read null
2025-12-18 Scaling Text2SQL via LLM-efficient Schema Filtering with Functional Dependency Graph Rerankers Thanh Dat Hoang et.al. 2512.16083 translate read link
2025-12-18 Auto-Vocabulary 3D Object Detection Haomeng Zhang et.al. 2512.16077 translate read null
2025-12-18 LLM4Perf: Large Language Models Are Effective Samplers for Multi-Objective Performance Modeling (Copy) Xin Wang et.al. 2512.16070 translate read null
2025-12-18 A Multi-Agent Large Language Model Framework for Automated Qualitative Analysis Qidi Xu et.al. 2512.16063 translate read null
2025-12-18 ContextLeak: Auditing Leakage in Private In-Context Learning Methods Jacob Choi et.al. 2512.16059 translate read null
2025-12-18 MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services Lingfeng Tang et.al. 2512.16056 translate read null
2025-12-17 Topic Discovery and Classification for Responsible Generative AI Adaptation in Higher Education Diane Myung-kyung Woodbridge et.al. 2512.16036 translate read null
2025-12-17 Do Large Language Models Know What They Don’t Know? Kalshibench: A New Benchmark for Evaluating Epistemic Calibration via Prediction Markets Lukas Nel et.al. 2512.16030 translate read null
2025-12-17 Cross-Language Bias Examination in Large Language Models Yuxuan Liang et.al. 2512.16029 translate read null
2025-12-17 Conversational Time Series Foundation Models: Towards Explainable and Effective Forecasting Defu Cao et.al. 2512.16022 translate read null
2025-12-17 Few-Shot Inference of Human Perceptions of Robot Performance in Social Navigation Scenarios Qiping Zhang et.al. 2512.16019 translate read null
2025-12-17 OLAF: Towards Robust LLM-Based Annotation Framework in Empirical Software Engineering Mia Mohammad Imran et.al. 2512.15979 translate read null
2025-12-17 Dynamic Rank Reinforcement Learning for Adaptive Low-Rank Multi-Head Self Attention in Large Language Models Caner Erden et.al. 2512.15973 translate read null
2025-12-17 BRAID: Bounded Reasoning for Autonomous Inference and Decisions Armağan Amcalar et.al. 2512.15959 translate read null
2025-12-17 The Perceptual Observatory Characterizing Robustness and Grounding in MLLMs Tejas Anvekar et.al. 2512.15949 translate read null
2025-12-17 Privacy Discourse and Emotional Dynamics in Mental Health Information Interaction on Reddit Jai Kruthunz Naveen Kumar et.al. 2512.15945 translate read null
2025-12-17 Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning Polaris Jhandi et.al. 2512.15943 translate read null
2025-12-17 City Navigation in the Wild: Exploring Emergent Navigation from Web-Scale Knowledge in MLLMs Dwip Dalal et.al. 2512.15933 translate read null
2025-12-17 DSO: Direct Steering Optimization for Bias Mitigation Lucas Monteiro Paes et.al. 2512.15926 translate read null
2025-12-17 Leveraging Spreading Activation for Improved Document Retrieval in Knowledge-Graph-Based RAG Systems Jovan Pavlović et.al. 2512.15922 translate read null
2025-12-17 TabReX : Tabular Referenceless eXplainable Evaluation Tejas Anvekar et.al. 2512.15907 translate read link
2025-12-17 Darth Vecdor: An Open-Source System for Generating Knowledge Graphs Through Large Language Model Queries Jonathan A. Handler et.al. 2512.15906 translate read null
2025-12-17 PediatricAnxietyBench: Evaluating Large Language Model Safety Under Parental Anxiety and Pressure in Pediatric Consultations Vahideh Zolfaghari et.al. 2512.15894 translate read null
2025-12-17 VET Your Agent: Towards Host-Independent Autonomy via Verifiable Execution Traces Artem Grigor et.al. 2512.15892 translate read null
2025-12-17 Seeing Beyond Words: Self-Supervised Visual Learning for Multimodal Large Language Models Davide Caffagni et.al. 2512.15885 translate read null
2025-12-17 HEPTAPOD: Orchestrating High Energy Physics Workflows Towards Autonomous Agency Tony Menzo et.al. 2512.15867 translate read null
2025-12-17 Dynamic Rebatching for Efficient Early-Exit Inference with DREX Xuting Liu et.al. 2512.15705 translate read null
2025-12-17 Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning Yifei Li et.al. 2512.15693 translate read null
2025-12-17 Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning Zhenwen Liang et.al. 2512.15687 translate read null
2025-12-17 Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers Adam Karvonen et.al. 2512.15674 translate read null
2025-12-17 Explaining the Reasoning of Large Language Models Using Attribution Graphs Chase Walker et.al. 2512.15663 translate read null
2025-12-17 Stepwise Think-Critique: A Unified Framework for Robust and Interpretable LLM Reasoning Jiaqi Xu et.al. 2512.15662 translate read null
2025-12-17 How Much is Too Much? Exploring LoRA Rank Trade-offs for Retaining Knowledge and Domain Robustness Darshita Rathore et.al. 2512.15634 translate read null
2025-12-17 Evaluating Metrics for Safety with LLM-as-Judges Kester Clegg et.al. 2512.15617 translate read null
2025-12-17 Behavior Tokens Speak Louder: Disentangled Explainable Recommendation with Behavior Vocabulary Xinshun Feng et.al. 2512.15614 translate read null
2025-12-17 Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction Mathieu Blondel et.al. 2512.15605 translate read null
2025-12-17 Evaluating Large Language Models in Scientific Discovery Zhangde Song et.al. 2512.15567 translate read null
2025-12-17 GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models Bozhou Li et.al. 2512.15560 translate read null
2025-12-17 CTkvr: KV Cache Retrieval for Long-Context LLMs via Centroid then Token Indexing Kuan Lu et.al. 2512.15550 translate read null
2025-12-17 When a Nation Speaks: Machine Learning and NLP in People’s Sentiment Analysis During Bangladesh’s 2024 Mass Uprising Md. Samiul Alim et.al. 2512.15547 translate read null
2025-12-17 An Efficient and Effective Encoder Model for Vision and Language Tasks in the Remote Sensing Domain João Daniel Silva et.al. 2512.15531 translate read null
2025-12-17 EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration Daiqing Wu et.al. 2512.15528 translate read null
2025-12-17 How Do Semantically Equivalent Code Transformations Impact Membership Inference on LLMs for Code? Hua Yang et.al. 2512.15468 translate read null
2025-12-17 On Assessing the Relevance of Code Reviews Authored by Generative Models Robert Heumüller et.al. 2512.15466 translate read null
2025-12-17 Toward expert-level motivational interviewing for health behavior improvement with LLMs Run-ze Hu et.al. 2512.15446 translate read null
2025-12-17 Step-GUI Technical Report Haolong Yan et.al. 2512.15431 translate read null
2025-12-17 Can AI Generate more Comprehensive Test Scenarios? Review on Automated Driving Systems Test Scenario Generation Methods Ji Zhou et.al. 2512.15422 translate read null
2025-12-17 Bilateral Spatial Reasoning about Street Networks: Graph-based RAG with Qualitative Spatial Representations Reinhard Moratz et.al. 2512.15388 translate read null
2025-12-17 MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents Gregor Donabauer et.al. 2512.15384 translate read null
2025-12-17 SCOPE: Prompt Evolution for Enhancing Agent Effectiveness Zehua Pei et.al. 2512.15374 translate read null
2025-12-17 ArcBERT: An LLM-based Search Engine for Exploring Integrated Multi-Omics Metadata Gajendra Doniparthi et.al. 2512.15365 translate read null
2025-12-17 Revisiting Task-Oriented Dataset Search in the Era of Large Language Models: Challenges, Benchmark, and Solution Zixin Wei et.al. 2512.15363 translate read null
2025-12-17 Dual-Density Inference for Efficient Language Model Reasoning Zhengyi Zhao et.al. 2512.15358 translate read null
2025-12-17 Adversarial versification in portuguese as a jailbreak operator in LLMs Joao Queiroz et.al. 2512.15353 translate read null
2025-12-17 Exploring User Acceptance and Concerns toward LLM-powered Conversational Agents in Immersive Extended Reality Efe Bozkir et.al. 2512.15343 translate read null
2025-12-17 Evaluating LLMs for Zeolite Synthesis Event Extraction (ZSEE): A Systematic Analysis of Prompting Strategies Charan Prakash Rathore et.al. 2512.15312 translate read null
2025-12-17 SynthSeg-Agents: Multi-Agent Synthetic Data Generation for Zero-Shot Weakly Supervised Semantic Segmentation Wangyu Wu et.al. 2512.15310 translate read null
2025-12-17 Towards Proactive Personalization through Profile Customization for Individual Users in Dialogues Xiaotian Zhang et.al. 2512.15302 translate read null
2025-12-17 ChatGPT and Gemini participated in the Korean College Scholastic Ability Test – Earth Science I Seok-Hyun Ga et.al. 2512.15298 translate read null
2025-12-17 Heterogeneous Model Alignment in Digital Twin Faima Abbasi et.al. 2512.15281 translate read null
2025-12-17 Bounty Hunter: Autonomous, Comprehensive Emulation of Multi-Faceted Adversaries Louis Hackländer-Jansen et.al. 2512.15275 translate read null
2025-12-17 Well Begun, Half Done: Reinforcement Learning with Prefix Optimization for LLM Reasoning Yiliu Sun et.al. 2512.15274 translate read null
2025-12-17 Gaming the Arena: AI Model Evaluation and the Viral Capture of Attention Sam Hind et.al. 2512.15252 translate read null
2025-12-17 The Moralization Corpus: Frame-Based Annotation and Analysis of Moralizing Speech Acts across Diverse Text Genres Maria Becker et.al. 2512.15248 translate read null
2025-12-17 Null-LoRA: Low-Rank Adaptation on Null Space Yi Zhang et.al. 2512.15233 translate read null
2025-12-17 CangLing-KnowFlow: A Unified Knowledge-and-Flow-fused Agent for Comprehensive Remote Sensing Applications Zhengchao Chen et.al. 2512.15231 translate read null
2025-12-17 Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024 Yash Bhaskar et.al. 2512.15226 translate read null
2025-12-17 RFKG-CoT: Relation-Driven Adaptive Hop-count Selection and Few-Shot Path Guidance for Knowledge-Aware QA Chao Zhang et.al. 2512.15219 translate read null
2025-12-17 DEER: Draft with Diffusion, Verify with Autoregressive Models Zicong Cheng et.al. 2512.15176 translate read null
2025-12-17 MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers Xuanjun Zong et.al. 2512.15163 translate read null
2025-12-17 Offline Multi-Task Multi-Objective Data-Driven Evolutionary Algorithm with Language Surrogate Model and Implicit Q-Learning Xian-Rong Zhang et.al. 2512.15149 translate read null
2025-12-17 Aligning Academia with Industry: An Empirical Study of Industrial Needs and Academic Capabilities in AI-Driven Software Engineering Hang Yu et.al. 2512.15148 translate read null
2025-12-17 Beyond Majority Voting: Towards Fine-grained and More Reliable Reward Signal for Test-Time Reinforcement Learning Weiqin Wang et.al. 2512.15146 translate read null
2025-12-17 “I am here for you”: How relational conversational AI appeals to adolescents, especially those who are socially and emotionally vulnerable Pilyoung Kim et.al. 2512.15117 translate read null
2025-12-17 Uni-Parser Technical Report Xi Fang et.al. 2512.15098 translate read null
2025-12-17 Beyond Fast and Slow: Cognitive-Inspired Elastic Reasoning for Large Language Models Jinwu Hu et.al. 2512.15089 translate read null
2025-12-17 The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks Wanfu Gao et.al. 2512.15082 translate read null
2025-12-17 Quantifying Return on Security Controls in LLM Systems Richard Helder Moulton et.al. 2512.15081 translate read null
2025-12-17 An Exploratory Study of Bayesian Prompt Optimization for Test-Driven Code Generation with Large Language Models Shlok Tomar et.al. 2512.15076 translate read null
2025-12-17 The Meta-Prompting Protocol: Orchestrating LLMs via Adversarial Feedback Loops Fanzhe Fu et.al. 2512.15053 translate read null
2025-12-17 SGM: Safety Glasses for Multimodal Large Language Models via Neuron-Level Detoxification Hongbo Wang et.al. 2512.15052 translate read null
2025-12-17 Beyond Accuracy: A Geometric Stability Analysis of Large Language Models in Chess Evaluation Xidan Song et.al. 2512.15033 translate read null
2025-12-17 Toxicity Ahead: Forecasting Conversational Derailment on GitHub Mia Mohammad Imran et.al. 2512.15031 translate read null
2025-12-17 SeBERTis: A Framework for Producing Classifiers of Security-Related Issue Reports Sogol Masoumzadeh et.al. 2512.15003 translate read null
2025-12-17 DreamPRM-Code: Function-as-Step Process Reward Model with Label Correction for LLM Coding Ruiyi Zhang et.al. 2512.15000 translate read null
2025-12-17 Evaluating Large Language Models on Multimodal Chemistry Olympiad Exams Yiming Cui et.al. 2512.14989 translate read null
2025-12-16 EVICPRESS: Joint KV-Cache Compression and Eviction for Efficient LLM Serving Shaoting Feng et.al. 2512.14946 translate read null
2025-12-16 Parameter Efficient Multimodal Instruction Tuning for Romanian Vision Language Models George-Andrei Dima et.al. 2512.14926 translate read null
2025-12-16 Multiscale Aggregated Hierarchical Attention (MAHA): A Game Theoretic and Optimization Driven Approach to Efficient Contextual Modeling in Large Language Models Caner Erden et.al. 2512.14925 translate read null
2025-12-16 Evaluating Code Reasoning Abilities of Large Language Models Under Real-World Settings Changshu Liu et.al. 2512.14917 translate read null
2025-12-16 DrugRAG: Enhancing Pharmacy LLM Performance Through A Novel Retrieval-Augmented Generation Pipeline Houman Kazemzadeh et.al. 2512.14896 translate read null
2025-12-16 Integrating Large Language Models and Knowledge Graphs to Capture Political Viewpoints in News Media Massimiliano Fadda et.al. 2512.14887 translate read null
2025-12-16 Entropy-Reservoir Bregman Projection: An Information-Geometric Unification of Model Collapse Jingwei Chen et.al. 2512.14879 translate read null
2025-12-16 Isolated Sign Language Recognition with Segmentation and Pose Estimation Daniel Perkins et.al. 2512.14876 translate read null
2025-12-16 HERBench: A Benchmark for Multi-Evidence Integration in Video Question Answering Dan Ben-Ami et.al. 2512.14870 translate read null
2025-12-16 MALCDF: A Distributed Multi-Agent LLM Framework for Real-Time Cyber Arth Bhardwaj et.al. 2512.14846 translate read null
2025-12-16 Sharing State Between Prompts and Programs Ellie Y. Cheng et.al. 2512.14805 translate read null
2025-12-16 Incentives or Ontology? A Structural Rebuttal to OpenAI’s Hallucination Thesis Richard Ackermann et.al. 2512.14801 translate read null
2025-12-16 IaC Generation with LLMs: An Error Taxonomy and A Study on Configuration Knowledge Injection Roman Nekrasov et.al. 2512.14792 translate read null
2025-12-16 TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs Jun Zhang et.al. 2512.14698 translate read null
2025-12-16 Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Lanxiang Hu et.al. 2512.14681 translate read null
2025-12-16 EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models Zechen Bai et.al. 2512.14666 translate read null
2025-12-16 Enhancing Visual Sentiment Analysis via Semiotic Isotopy-Guided Dataset Construction Marco Blanchini et.al. 2512.14665 translate read null
2025-12-16 Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models Chiyue Wei et.al. 2512.14661 translate read null
2025-12-16 Beyond Text-to-SQL: Autonomous Research-Driven Database Exploration with DAR Ostap Vykhopen et.al. 2512.14622 translate read null
2025-12-16 PerProb: Indirectly Evaluating Memorization in Large Language Models Yihan Liao et.al. 2512.14600 translate read null
2025-12-16 LLM-driven Knowledge Enhancement for Multimodal Cancer Survival Prediction Chenyu Zhao et.al. 2512.14594 translate read null
2025-12-16 Towards Nepali-language LLMs: Efficient GPT training with a Nepali BPE tokenizer Adarsha Shrestha et.al. 2512.14585 translate read null
2025-12-16 Pairwise Comparison for Bias Identification and Quantification Fabian Haak et.al. 2512.14565 translate read null
2025-12-16 Polypersona: Persona-Grounded LLM for Synthetic Survey Responses Tejaswani Dash et.al. 2512.14562 translate read null
2025-12-16 Agreement Between Large Language Models and Human Raters in Essay Scoring: A Research Synthesis Hongli Li et.al. 2512.14561 translate read null
2025-12-16 CLNet: Cross-View Correspondence Makes a Stronger Geo-Localizationer Xianwei Cao et.al. 2512.14560 translate read null
2025-12-16 VLegal-Bench: Cognitively Grounded Benchmark for Vietnamese Legal Reasoning of Large Language Models Nguyen Tien Dong et.al. 2512.14554 translate read null
2025-12-16 VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse Ying Nie et.al. 2512.14531 translate read null
2025-12-16 RecGPT-V2 Technical Report Chao Yi et.al. 2512.14503 translate read null
2025-12-16 C-ing Clearly: Enhanced Binary Code Explanations using C code Teodor Poncu et.al. 2512.14500 translate read null
2025-12-16 SASQ: Static Activation Scaling for Quantization-Aware Training in Large Language Models Shizhuo Mao et.al. 2512.14481 translate read null
2025-12-16 Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling Annu Rana et.al. 2512.14474 translate read null
2025-12-16 Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space Xingfu Zhou et.al. 2512.14448 translate read null
2025-12-16 Seismology modeling agent: A smart assistant for geophysical researchers Yukun Ren et.al. 2512.14429 translate read null
2025-12-16 Effect of Document Packing on the Latent Multi-Hop Reasoning Capabilities of Large Language Models Gabriele Prato et.al. 2512.14427 translate read null
2025-12-16 DISCODE: Distribution-Aware Score Decoder for Robust Automatic Evaluation of Image Captioning Nakamasa Inoue et.al. 2512.14420 translate read null
2025-12-16 PortAgent: LLM-driven Vehicle Dispatching Agent for Port Terminals Jia Hu et.al. 2512.14417 translate read null
2025-12-16 Massive Editing for Large Language Models Based on Dynamic Weight Generation Wentao Wan et.al. 2512.14395 translate read null
2025-12-16 RePo: Language Models with Context Re-Positioning Huayang Li et.al. 2512.14391 translate read null
2025-12-16 Multi-Agent Medical Decision Consensus Matrix System: An Intelligent Collaborative Framework for Oncology MDT Consultations Xudong Han et.al. 2512.14321 translate read null
2025-12-16 Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity Shuai Dong et.al. 2512.14320 translate read null
2025-12-16 Inflation Attitudes of Large Language Models Nikoleta Anesti et.al. 2512.14306 translate read null
2025-12-16 Leveraging LLMs for Collaborative Ontology Engineering in Parkinson Disease Monitoring and Alerting Georgios Bouchouras et.al. 2512.14288 translate read null
2025-12-16 The Trust in AI-Generated Health Advice (TAIGHA) Scale and Short Version (TAIGHA-S): Development and Validation Study Marvin Kopka et.al. 2512.14278 translate read null
2025-12-16 SPARQL-LLM: Real-Time SPARQL Query Generation from Natural Language Questions Panayiotis Smeros et.al. 2512.14277 translate read null
2025-12-16 Enhancing Visual Programming for Visual Reasoning via Probabilistic Graphs Wentao Wan et.al. 2512.14257 translate read null
2025-12-16 TEMP: A Memory Efficient Physical-aware Tensor Partition-Mapping Framework on Wafer-scale Chips Huizheng Wang et.al. 2512.14256 translate read null
2025-12-16 From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition Yiqing Zhou et.al. 2512.14244 translate read null
2025-12-16 Two CFG Nahuatl for automatic corpora expansion Juan-José Guzmán-Landa et.al. 2512.14239 translate read null
2025-12-16 Ladder Up, Memory Down: Low-Cost Fine-Tuning With Side Nets Estelle Zheng et.al. 2512.14237 translate read null
2025-12-16 PentestEval: Benchmarking LLM-based Penetration Testing with Modular and Stage-Level Design Ruozhao Yang et.al. 2512.14233 translate read null
2025-12-16 Georeferencing complex relative locality descriptions with large language models Aneesha Fernando et.al. 2512.14228 translate read null
2025-12-16 Estimating problem difficulty without ground truth using Large Language Model comparisons Marthe Ballon et.al. 2512.14220 translate read null
2025-12-16 IntentMiner: Intent Inversion Attack via Tool Call Analysis in the Model Context Protocol Yunhao Yao et.al. 2512.14166 translate read null
2025-12-16 Adaptive Cache Pollution Control for Large Language Model Inference Workloads Using Temporal CNN-Based Prediction and Priority-Aware Replacement Songze Liu et.al. 2512.14151 translate read null
2025-12-16 Astraea: A State-Aware Scheduling Engine for LLM-Powered Agents Hongqiu Ni et.al. 2512.14142 translate read null
2025-12-16 TorchTraceAP: A New Benchmark Dataset for Detecting Performance Anti-Patterns in Computer Vision Models Hanning Chen et.al. 2512.14141 translate read null
2025-12-16 LAPPI: Interactive Optimization with LLM-Assisted Preference-Based Problem Instantiation So Kuroki et.al. 2512.14138 translate read null
2025-12-16 SportsGPT: An LLM-driven Framework for Interpretable Sports Motion Assessment and Training Guidance Wenbo Tian et.al. 2512.14121 translate read null
2025-12-16 CogMem: A Cognitive Memory Architecture for Sustained Multi-Turn Reasoning in Large Language Models Yiran Zhang et.al. 2512.14118 translate read null
2025-12-16 Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries Emanuele Mezzi et.al. 2512.14102 translate read null
2025-12-16 A First-Order Logic-Based Alternative to Reward Models in RLHF Chunjin Jian et.al. 2512.14100 translate read null
2025-12-16 Cornserve: Efficiently Serving Any-to-Any Multimodal Models Jeff J. Ma et.al. 2512.14098 translate read null
2025-12-16 A Unified Sparse Attention via Multi-Granularity Compression Siran Liu et.al. 2512.14082 translate read null
2025-12-16 From Obfuscated to Obvious: A Comprehensive JavaScript Deobfuscation Tool for Security Analysis Dongchao Zhou et.al. 2512.14070 translate read null
2025-12-16 RADAR: Accelerating Large Language Model Inference With RL-Based Dynamic Draft Trees Junjie Ma et.al. 2512.14069 translate read null
2025-12-16 What Affects the Effective Depth of Large Language Models? Yi Hu et.al. 2512.14064 translate read null
2025-12-16 HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices HyperAI Team et.al. 2512.14052 translate read null
2025-12-16 OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value Mengzhang Cai et.al. 2512.14051 translate read null
2025-12-16 Intention Chain-of-Thought Prompting with Dynamic Routing for Code Generation Shen Li et.al. 2512.14048 translate read null
2025-12-16 Evaluating Small Language Models for Agentic On-Farm Decision Support Systems Enhong Liu et.al. 2512.14043 translate read null
2025-12-16 ChartAgent: A Chart Understanding Framework with Tool Integrated Reasoning Boran Wang et.al. 2512.14040 translate read null
2025-12-16 PerfCoder: Large Language Models for Interpretable Code Performance Optimization Jiuding Yang et.al. 2512.14018 translate read null
2025-12-16 KFS-Bench: Comprehensive Evaluation of Key Frame Sampling in Long Video Understanding Zongyao Li et.al. 2512.14017 translate read null
2025-12-16 Sparsity-Controllable Dynamic Top-p MoE for Large Foundation Model Pre-training Can Jin et.al. 2512.13996 translate read null
2025-12-16 Structure-Aware Decoding Mechanisms for Complex Entity Extraction with Large-Scale Language Models Zhimin Qiu et.al. 2512.13980 translate read null
2025-12-16 ReflCtrl: Controlling LLM Reflection via Representation Engineering Ge Yan et.al. 2512.13979 translate read null
2025-12-16 Evaluating Frontier LLMs on PhD-Level Mathematical Reasoning: A Benchmark on a Textbook in Theoretical Computer Science about Randomized Algorithms Yang Cao et.al. 2512.13978 translate read null
2025-12-16 Autonomous Construction-Site Safety Inspection Using Mobile Robots: A Multilayer VLM-LLM Pipeline Hossein Naderi et.al. 2512.13974 translate read null
2025-12-15 Informing Acquisition Functions via Foundation Models for Molecular Discovery Qi Chen et.al. 2512.13935 translate read null
2025-12-15 Hierarchical Multi-agent Large Language Model Reasoning for Autonomous Functional Materials Discovery Samuel Rothfarb et.al. 2512.13930 translate read null
2025-12-15 Context Branching for LLM Conversations: A Version Control Approach to Exploratory Programming Bhargav Chickmagalur Nanjundappa et.al. 2512.13914 translate read null
2025-12-15 FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition Jonas Golde et.al. 2512.13884 translate read null
2025-12-15 Verification-Guided Context Optimization for Tool Calling via Hierarchical LLMs-as-Editors Henger Li et.al. 2512.13860 translate read null
2025-12-15 EvoLattice: Persistent Internal-Population Evolution through Multi-Alternative Quality-Diversity Graph Representations for LLM-Guided Program Discovery Kamer Ali Yuksel et.al. 2512.13857 translate read null
2025-12-15 Practitioner Insights on Fairness Requirements in the AI Development Life Cycle: An Interview Study Chaima Boufaied et.al. 2512.13830 translate read null
2025-12-15 The Double Life of Code World Models: Provably Unmasking Malicious Behavior Through Execution Traces Subramanyam Sahoo et.al. 2512.13821 translate read null
2025-12-15 State-Dependent Refusal and Learned Incapacity in RLHF-Aligned Language Models TK Lee et.al. 2512.13762 translate read null
2025-12-15 A Scientific Reasoning Model for Organic Synthesis Procedure Generation Guoqing Liu et.al. 2512.13668 translate read null
2025-12-15 Embedding-Based Rankings of Educational Resources based on Learning Outcome Alignment: Benchmarking, Expert Validation, and Learner Performance Mohammadreza Molavi et.al. 2512.13658 translate read null
2025-12-15 Comparative Analysis of LLM Abliteration Methods: A Cross-Architecture Evaluation Richard J. Young et.al. 2512.13655 translate read null
2025-12-15 Large-Language Memorization During the Classification of United States Supreme Court Cases John E. Ortega et.al. 2512.13654 translate read null
2025-12-15 MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning Haoyu Fu et.al. 2512.13636 translate read null
2025-12-15 Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models Zefang Liu et.al. 2512.13618 translate read null
2025-12-15 Textual Gradients are a Flawed Metaphor for Automatic Prompt Optimization Daniel Melcer et.al. 2512.13598 translate read null
2025-12-15 ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding Jia-Nan Li et.al. 2512.13586 translate read null
2025-12-15 MMhops-R1: Multimodal Multi-hop Reasoning Tao Zhang et.al. 2512.13573 translate read null
2025-12-15 PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation Hour Kaing et.al. 2512.13552 translate read null
2025-12-15 Fine-tuned LLM-based Code Migration Framework Oleg Grynets et.al. 2512.13515 translate read null
2025-12-15 MedCEG: Reinforcing Verifiable Medical Reasoning with Critical Evidence Graph Linjie Mu et.al. 2512.13510 translate read null
2025-12-15 SkipCat: Rank-Maximized Low-Rank Compression of Large Language Models via Shared Projection and Block Skipping Yu-Chen Lu et.al. 2512.13494 translate read null
2025-12-15 From Zipf’s Law to Neural Scaling through Heaps’ Law and Hilberg’s Hypothesis Łukasz Dębowski et.al. 2512.13491 translate read null
2025-12-15 neuralFOMO: Can LLMs Handle Being Second Best? Measuring Envy-Like Preferences in Multi-Agent Settings Ojas Pungalia et.al. 2512.13481 translate read null
2025-12-15 Non-Resolution Reasoning (NRR): A Computational Framework for Contextual Identity and Ambiguity Preservation Kei Saito et.al. 2512.13478 translate read null
2025-12-15 Scaling Laws for Code: Every Programming Language Matters Jian Yang et.al. 2512.13472 translate read null
2025-12-15 Large language models are not about natural language Johan J. Bolhuis et.al. 2512.13441 translate read null
2025-12-15 From User Interface to Agent Interface: Efficiency Optimization of UI Representations for LLM Agents Dezhi Ran et.al. 2512.13438 translate read null
2025-12-15 Behavior and Representation in Large Language Models for Combinatorial Optimization: From Feature Extraction to Algorithm Selection Francesca Da Ros et.al. 2512.13374 translate read null
2025-12-15 Detecting Emotion Drift in Mental Health Text Using Pre-Trained Transformers Shibani Sankpal et.al. 2512.13363 translate read null
2025-12-15 UCRBench: Benchmarking LLMs on Use Case Recovery Shuyuan Xiao et.al. 2512.13360 translate read null
2025-12-15 On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models Ali Al Sahili et.al. 2512.13352 translate read null
2025-12-15 FROC: A Unified Framework with Risk-Optimized Control for Machine Unlearning in LLMs Si Qi Goh et.al. 2512.13337 translate read null
2025-12-15 FIN-bench-v2: A Unified and Robust Benchmark Suite for Evaluating Finnish Large Language Models Joona Kytöniemi et.al. 2512.13330 translate read null
2025-12-15 Security and Detectability Analysis of Unicode Text Watermarking Methods Against Large Language Models Malte Hellmeier et.al. 2512.13325 translate read null
2025-12-15 KlingAvatar 2.0 Technical Report Kling Team et.al. 2512.13313 translate read null
2025-12-15 MiniLingua: A Small Open-Source LLM for European Languages Anna Aksenova et.al. 2512.13298 translate read null
2025-12-15 AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning Jiaru Zou et.al. 2512.13278 translate read null
2025-12-15 CogniEdit: Dense Gradient Flow Optimization for Fine-Grained Image Editing Yan Li et.al. 2512.13276 translate read null
2025-12-15 Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection Juil Koo et.al. 2512.13250 translate read null
2025-12-15 Ego-EXTRA: video-language Egocentric Dataset for EXpert-TRAinee assistance Francesco Ragusa et.al. 2512.13238 translate read null
2025-12-15 Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models Chendong Sun et.al. 2512.13194 translate read null
2025-12-15 Integrated Semantic and Temporal Alignment for Interactive Video Retrieval Thanh-Danh Luu et.al. 2512.13169 translate read null
2025-12-15 A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis Xianchao Guan et.al. 2512.13164 translate read null
2025-12-15 Can AI Understand What We Cannot Say? Measuring Multilevel Alignment Through Abortion Stigma Across Cognitive, Interpersonal, and Structural Levels Anika Sharma et.al. 2512.13142 translate read null
2025-12-15 Uncovering the Role of Initial Saliency in U-Shaped Attention Bias: Scaling Initial Token Weight for Enhanced Long-Text Processing Zewen Qiang et.al. 2512.13109 translate read null
2025-12-15 Socratic Students: Teaching Language Models to Learn by Asking Questions Rajeev Bhatt Ambati et.al. 2512.13102 translate read null
2025-12-15 A Simple and Effective Framework for Symmetric Consistent Indexing in Large-Scale Dense Retrieval Huimu Wang et.al. 2512.13074 translate read null
2025-12-15 M-GRPO: Stabilizing Self-Supervised Reinforcement Learning for Large Language Models with Momentum-Anchored Policy Optimization Bizhe Bai et.al. 2512.13070 translate read null
2025-12-15 LLM Rationalis? Measuring Bargaining Capabilities of AI Negotiators Cheril Shah et.al. 2512.13063 translate read null
2025-12-15 An Open and Reproducible Deep Research Agent for Long-Form Question Answering Ikuya Yamada et.al. 2512.13059 translate read null
2025-12-15 Sharpen the Spec, Cut the Code: A Case for Generative File System with SYSSPEC Qingyuan Liu et.al. 2512.13047 translate read null
2025-12-15 Understanding Structured Financial Data with LLMs: A Case Study on Fraud Detection Xuwei Tan et.al. 2512.13040 translate read null
2025-12-15 Large Language Models for Power System Applications: A Comprehensive Literature Survey Muhammad Sarwar et.al. 2512.13004 translate read null
2025-12-15 Are Large Language Models Really Effective for Training-Free Cold-Start Recommendation? Genki Kusano et.al. 2512.13001 translate read null
2025-12-15 Reveal Hidden Pitfalls and Navigate Next Generation of Vector Similarity Search from Task-Centric Views Tingyang Chen et.al. 2512.12980 translate read null
2025-12-15 Do Reviews Matter for Recommendations in the Era of Large Language Models? Chee Heng Tan et.al. 2512.12978 translate read null
2025-12-15 Authors Should Annotate Marcus Ma et.al. 2512.12976 translate read null
2025-12-15 Database Research needs an Abstract Relational Query Language Wolfgang Gatterbauer et.al. 2512.12957 translate read null
2025-12-15 Building from Scratch: A Multi-Agent Framework with Human-in-the-Loop for Multilingual Legal Terminology Mapping Lingyi Meng et.al. 2512.12950 translate read null
2025-12-15 SPAR: Session-based Pipeline for Adaptive Retrieval on Legacy File Systems Duy A. Nguyen et.al. 2512.12938 translate read null
2025-12-15 PROSERVE: Unified Multi-Priority Request Scheduling for LLM Serving Weizhe Huang et.al. 2512.12928 translate read null
2025-12-15 Interpretable Hypothesis-Driven Trading: A Rigorous Walk-Forward Validation Framework for Market Microstructure Signals Gagan Deep et.al. 2512.12924 translate read null
2025-12-15 LLM-based Personalized Portfolio Recommender: Integrating Large Language Models and Reinforcement Learning for Intelligent Investment Strategy Optimization Bangyu Li et.al. 2512.12922 translate read null
2025-12-15 Cisco Integrated AI Security and Safety Framework Report Amy Chang et.al. 2512.12921 translate read null
2025-12-15 CTIGuardian: A Few-Shot Framework for Mitigating Privacy Leakage in Fine-Tuned LLMs Shashie Dilhara Batan Arachchige et.al. 2512.12914 translate read null
2025-12-14 SignRAG: A Retrieval-Augmented System for Scalable Zero-Shot Road Sign Recognition Minghao Zhu et.al. 2512.12885 translate read null
2025-12-14 ERA-IT: Aligning Semantic Models with Revealed Economic Preference for Real-Time and Explainable Patent Valuation Yoo Yongmin et.al. 2512.12869 translate read null
2025-12-14 Counting Clues: A Lightweight Probabilistic Baseline Can Match an LLM Furong Jia et.al. 2512.12868 translate read null
2025-12-14 Information-Consistent Language Model Recommendations through Group Relative Policy Optimization Sonal Prabhune et.al. 2512.12858 translate read null
2025-12-14 Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, LLaMA Hanyu Cai et.al. 2512.12812 translate read null
2025-12-14 Fault-Tolerant Sandboxing for AI Coding Agents: A Transactional Approach to Safe Autonomous Execution Boyang Yan et.al. 2512.12806 translate read null
2025-12-14 A Disproof of Large Language Model Consciousness: The Necessity of Continual Learning for Consciousness Erik Hoel et.al. 2512.12802 translate read null
2025-12-14 Fine-Grained Energy Prediction For Parallellized LLM Inference With PIE-P Anurag Dutt et.al. 2512.12801 translate read null
2025-12-14 DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning Zhe Liu et.al. 2512.12799 translate read null
2025-12-14 A Rule-Aware Prompt Framework for Structured Numeric Reasoning in Cyber-Physical Systems Yichen Liu et.al. 2512.12794 translate read null
2025-12-14 Beyond Task Completion: An Assessment Framework for Evaluating Agentic AI Systems Sreemaee Akshathala et.al. 2512.12791 translate read null
2025-12-14 State over Tokens: Characterizing the Role of Reasoning Tokens Mosh Levy et.al. 2512.12777 translate read null
2025-12-14 Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions Pedro Henrique Luz de Araujo et.al. 2512.12775 translate read null
2025-12-14 JointAVBench: A Benchmark for Joint Audio-Visual Reasoning Evaluation Jianghan Chao et.al. 2512.12772 translate read null
2025-12-14 Adaptive Edge-Cloud Inference for Speech-to-Action Systems Using ASR and Large Language Models (ASTA) Mohammad Jalili Torkamani et.al. 2512.12769 translate read null
2025-12-14 Intelligent Scientific Literature Explorer using Machine Learning (ISLE) Sina Jani et.al. 2512.12760 translate read null
2025-12-14 FysicsWorld: A Unified Full-Modality Benchmark for Any-to-Any Understanding, Generation, and Reasoning Yue Jiang et.al. 2512.12756 translate read null
2025-12-14 Resting Neurons, Active Insights: Improving Input Sparsification for Large Language Models Haotian Xu et.al. 2512.12744 translate read null
2025-12-14 CoDA: A Context-Decoupled Hierarchical Agent with Reinforcement Learning Xuanzhang Liu et.al. 2512.12716 translate read null
2025-12-14 Synergizing Code Coverage and Gameplay Intent: Coverage-Aware Game Playtesting with LLM-Guided Reinforcement Learning Enhong Mu et.al. 2512.12706 translate read null
2025-12-14 Hybrid Retrieval-Augmented Generation for Robust Multilingual Document Question Answering Anthony Mudet et.al. 2512.12694 translate read null
2025-12-14 Memoria: A Scalable Agentic Memory Framework for Personalized Conversational AI Samarth Sarin et.al. 2512.12686 translate read null
2025-12-14 Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches Amirhossein Yousefiramandi et.al. 2512.12677 translate read null
2025-12-14 LexRel: Benchmarking Legal Relation Extraction for Chinese Civil Cases Yida Cai et.al. 2512.12643 translate read null
2025-12-14 DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model Zhou Tao et.al. 2512.12633 translate read null
2025-12-14 ORIBA: Exploring LLM-Driven Role-Play Chatbot as a Creativity Support Tool for Original Character Artists Yuqian Sun et.al. 2512.12630 translate read null
2025-12-14 Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space Chengzhi Liu et.al. 2512.12623 translate read null
2025-12-14 Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives Aheli Poddar et.al. 2512.12620 translate read null
2025-12-14 Patch-wise Retrieval: A Bag of Practical Techniques for Instance-level Matching Wonseok Choi et.al. 2512.12610 translate read null
2025-12-14 Human-Inspired Learning for Large Language Models via Obvious Record and Maximum-Entropy Method Discovery Hong Su et.al. 2512.12608 translate read null
2025-12-14 Vision-Enhanced Large Language Models for High-Resolution Image Synthesis and Multimodal Data Interpretation Karthikeya KV et.al. 2512.12595 translate read null
2025-12-14 Beyond Static Scoring: Enhancing Assessment Validity via AI-Generated Interactive Verification Tom Lee et.al. 2512.12592 translate read null
2025-12-14 StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding Xinqi Jin et.al. 2512.12560 translate read null
2025-12-14 Large Language Newsvendor: Decision Biases and Cognitive Mechanisms Jifei Liu et.al. 2512.12552 translate read null
2025-12-14 HyperEdit: Unlocking Instruction-based Text Editing in LLMs via Hypernetworks Yiming Zeng et.al. 2512.12544 translate read null
2025-12-14 NagaNLP: Bootstrapping NLP for Low-Resource Nagamese Creole with Human-in-the-Loop Synthetic Data Agniva Maiti et.al. 2512.12537 translate read null
2025-12-14 Diverse LLMs vs. Vulnerabilities: Who Detects and Fixes Them Better? Arastoo Zibaeirad et.al. 2512.12536 translate read null
2025-12-14 ATLAS: Automated Tree-based Language Analysis System for C and C++ source programs Jaid Monwar Chowdhury et.al. 2512.12507 translate read null
2025-12-14 KidsArtBench: Multi-Dimensional Children’s Art Evaluation with Attribute-Aware MLLMs Mingrui Ye et.al. 2512.12503 translate read null
2025-12-14 Explainable AI as a Double-Edged Sword in Dermatology: The Impact on Clinicians versus The Public Xuhai Xu et.al. 2512.12500 translate read null
2025-12-13 The American Ghost in the Machine: How language models align culturally and the effects of cultural prompting James Luther et.al. 2512.12488 translate read null
2025-12-13 HetRL: Efficient Reinforcement Learning for LLMs in Heterogeneous Environments Yongjun He et.al. 2512.12476 translate read null
2025-12-13 Large language models have learned to use language Gary Lupyan et.al. 2512.12447 translate read null
2025-12-13 Can GPT replace human raters? Validity and reliability of machine-generated norms for metaphors Veronica Mangiaterra et.al. 2512.12444 translate read null
2025-12-11 Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving Jiawei Yang et.al. 2512.10947 translate read null
2025-12-11 FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos Yulu Gan et.al. 2512.10927 translate read null
2025-12-11 SparseSwaps: Tractable LLM Pruning Mask Refinement at Scale Max Zimmer et.al. 2512.10922 translate read null
2025-12-11 CompanionCast: A Multi-Agent Conversational AI Framework with Spatial Audio for Social Co-Viewing Experiences Yiyang Wang et.al. 2512.10918 translate read null
2025-12-11 Multi-Granular Node Pruning for Circuit Discovery Muhammad Umair Haider et.al. 2512.10903 translate read null
2025-12-11 LLMs Can Assist with Proposal Selection at Large User Facilities Lijie Ding et.al. 2512.10895 translate read null
2025-12-11 Computational emotion analysis with multimodal LLMs: Current evidence on an emerging methodological opportunity Hauke Licht et.al. 2512.10882 translate read null
2025-12-11 Quantifying Emotional Tone in Tolkien’s The Hobbit: Dialogue Sentiment Analysis with RegEx, NRC-VAD, and Python Lilin Qiu et.al. 2512.10865 translate read null
2025-12-11 Large Language Models for Superconductor Discovery Suman Itani et.al. 2512.10847 translate read null
2025-12-11 LabelFusion: Learning to Fuse LLMs and Transformer Classifiers for Robust Text Classification Michael Schlee et.al. 2512.10793 translate read null
2025-12-11 The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality Aileen Cheng et.al. 2512.10791 translate read null
2025-12-11 Natural Language Interface for Firewall Configuration F. Taghiyev et.al. 2512.10789 translate read null
2025-12-11 Developing and Evaluating a Large Language Model-Based Automated Feedback System Grounded in Evidence-Centered Design for Supporting Physics Problem Solving Holger Maus et.al. 2512.10785 translate read null
2025-12-11 Script Gap: Evaluating LLM Triage on Indian Languages in Native vs Roman Scripts in a Real World Setting Manurag Khullar et.al. 2512.10780 translate read null
2025-12-11 OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification Zijian Wu et.al. 2512.10756 translate read null
2025-12-11 LDP: Parameter-Efficient Fine-Tuning of Multimodal LLM for Medical Report Generation Tianyu Zhou et.al. 2512.10750 translate read null
2025-12-11 Echoes of Automation: How Bots Shaped Political Discourse in Brazil Merve Ipek Bal et.al. 2512.10749 translate read null
2025-12-11 TRIDENT: A Redundant Architecture for Caribbean-Accented Emergency Speech Triage Elroy Galbraith et.al. 2512.10741 translate read null
2025-12-11 Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving Songyang Gao et.al. 2512.10739 translate read null
2025-12-11 Textual Data Bias Detection and Mitigation - An Extensible Pipeline with Experimental Evaluation Rebekka Görge et.al. 2512.10734 translate read null
2025-12-11 IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Generation Yuan-Ming Li et.al. 2512.10730 translate read link
2025-12-11 Beyond the Black Box: Identifiable Interpretation and Control in Generative Models via Causal Minimality Lingjing Kong et.al. 2512.10720 translate read null
2025-12-11 PACIFIC: a framework for generating benchmarks to check Precise Automatically Checked Instruction Following In Code Itay Dreyfuss et.al. 2512.10713 translate read null
2025-12-11 COMPARE: Clinical Optimization with Modular Planning and Assessment via RAG-Enhanced AI-OCT: Superior Decision Support for Percutaneous Coronary Intervention Compared to ChatGPT-5 and Junior Operators Wei Fang et.al. 2512.10702 translate read null
2025-12-11 Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution Zouying Cao et.al. 2512.10696 translate read null
2025-12-11 Challenges of Evaluating LLM Safety for User Welfare Manon Kempermann et.al. 2512.10687 translate read null
2025-12-11 On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity Muhua Huang et.al. 2512.10665 translate read null
2025-12-11 Token Sample Complexity of Attention Léa Bohbot et.al. 2512.10656 translate read null
2025-12-11 TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection Jian-Yu Jiang-Lin et.al. 2512.10652 translate read null
2025-12-11 From Data Scarcity to Data Care: Reimagining Language Technologies for Serbian and other Low-Resource Languages Smiljana Antonijevic Ubois et.al. 2512.10630 translate read null
2025-12-11 AgriGPT-Omni: A Unified Speech-Vision-Text Framework for Multilingual Agricultural Intelligence Bo Yang et.al. 2512.10624 translate read null
2025-12-11 Phythesis: Physics-Guided Evolutionary Scene Synthesis for Energy-Efficient Data Center Design via LLMs Minghao LI et.al. 2512.10611 translate read null
2025-12-11 Multi-Objective Reward and Preference Optimization: Theory and Algorithms Akhil Agnihotri et.al. 2512.10601 translate read null
2025-12-11 Beyond Pixels: A Training-Free, Text-to-Text Framework for Remote Sensing Image Retrieval J. Xiao et.al. 2512.10596 translate read null
2025-12-11 RoleRMBench & RoleRM: Towards Reward Modeling for Profile-Based Role Play in Dialogue Systems Hang Ding et.al. 2512.10575 translate read null
2025-12-11 NormCode: A Semi-Formal Language for Context-Isolated AI Planning Xin Guan et.al. 2512.10563 translate read null
2025-12-11 Causal Reasoning Favors Encoders: On The Limits of Decoder-Only Models Amartya Roy et.al. 2512.10561 translate read null
2025-12-11 Grounding Everything in Tokens for Multimodal Large Language Models Xiangxuan Ren et.al. 2512.10554 translate read null
2025-12-11 LLM-Auction: Generative Auction towards LLM-Native Advertising Chujie Zhao et.al. 2512.10551 translate read null
2025-12-11 Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding Yuchen Feng et.al. 2512.10548 translate read null
2025-12-11 Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders Qingsen Ma et.al. 2512.10547 translate read null
2025-12-11 XDoGE: Multilingual Data Reweighting to Enhance Language Inclusivity in LLMs Iñaki Lacunza et.al. 2512.10545 translate read null
2025-12-11 Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning Haiteng Zhao et.al. 2512.10534 translate read null
2025-12-11 Zero-shot 3D Map Generation with LLM Agents: A Dual-Agent Architecture for Procedural Content Generation Lim Chien Her et.al. 2512.10501 translate read null
2025-12-11 Decoding Human-LLM Collaboration in Coding: An Empirical Study of Multi-Turn Conversations in the Wild Binquan Zhang et.al. 2512.10493 translate read null
2025-12-11 LLM-Assisted AHP for Explainable Cyber Range Evaluation Vyron Kampourakis et.al. 2512.10487 translate read null
2025-12-11 From Lab to Reality: A Practical Evaluation of Deep Learning Models and LLMs for Vulnerability Detection Chaomeng Lu et.al. 2512.10485 translate read null
2025-12-11 Grammaticality Judgments in Humans and Language Models: Revisiting Generative Grammar with LLMs Lars G. B. Johnsen et.al. 2512.10453 translate read null
2025-12-11 When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection Devanshu Sahoo et.al. 2512.10449 translate read null
2025-12-11 Decoding Student Minds: Leveraging Conversational Agents for Psychological and Learning Analysis Nour El Houda Ben Chaabene et.al. 2512.10441 translate read null
2025-12-11 Enhancing Next-Generation Language Models with Knowledge Graphs: Extending Claude, Mistral IA, and GPT-4 via KG-BERT Nour El Houda Ben Chaabene et.al. 2512.10440 translate read null
2025-12-11 Semantic Reconstruction of Adversarial Plagiarism: A Context-Aware Framework for Detecting and Restoring “Tortured Phrases” in Scientific Literature Agniva Maiti et.al. 2512.10435 translate read null
2025-12-11 Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers Youmin Ko et.al. 2512.10422 translate read null
2025-12-11 How to Trick Your AI TA: A Systematic Study of Academic Jailbreaking in LLM Code Evaluation Devanshu Sahoo et.al. 2512.10415 translate read null
2025-12-11 Sliding Window Attention Adaptation Yijiong Yu et.al. 2512.10411 translate read null
2025-12-11 RoboNeuron: A Modular Framework Linking Foundation Models and ROS for Embodied AI Weifan Guan et.al. 2512.10394 translate read null
2025-12-11 GPG: Generalized Policy Gradient Theorem for Transformer-based Policies Hangyu Mao et.al. 2512.10365 translate read null
2025-12-11 Visual Funnel: Resolving Contextual Blindness in Multimodal Large Language Models Woojun Jung et.al. 2512.10362 translate read null
2025-12-11 Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task Sunqi Fan et.al. 2512.10359 translate read null
2025-12-11 Dynamics of Agentic Loops in Large Language Models: A Geometric Theory of Trajectories Nicolas Tacheny et.al. 2512.10350 translate read null
2025-12-11 EchoingPixels: Cross-Modal Adaptive Token Reduction for Efficient Audio-Visual LLMs Chao Gong et.al. 2512.10324 translate read null
2025-12-11 EpiPlanAgent: Agentic Automated Epidemic Response Planning Kangkun Mao et.al. 2512.10313 translate read null
2025-12-11 Efficient-VLN: A Training-Efficient Vision-Language Navigation Model Duo Zheng et.al. 2512.10310 translate read null
2025-12-11 Reverse Thinking Enhances Missing Information Detection in Large Language Models Yuxin Liu et.al. 2512.10273 translate read null
2025-12-11 VLM-NCD: Novel Class Discovery with Vision-Based Large Language Models Yuetong Su et.al. 2512.10262 translate read null
2025-12-11 Reject or Not?: A Benchmark for Voice Assistant Query Rejection in Smart Home Scenario and an Improved Method Based on LLMs Huichao Men et.al. 2512.10257 translate read null
2025-12-11 InFerActive: Towards Scalable Human Evaluation of Large Language Models through Interactive Inference Junhyeong Hwangbo et.al. 2512.10234 translate read null
2025-12-11 Adaptive Information Routing for Multimodal Time Series Forecasting Jun Seo et.al. 2512.10229 translate read null
2025-12-11 Does SWE-Bench-Verified Test Agent Ability or Model Memory? Thanosan Prathifkumar et.al. 2512.10218 translate read null
2025-12-11 CP-Env: Evaluating Large Language Models on Clinical Pathways in a Controllable Hospital Environment Yakun Zhu et.al. 2512.10206 translate read null
2025-12-11 AutoMedic: An Automated Evaluation Framework for Clinical Conversational Agents with Medical Dataset Grounding Gyutaek Oh et.al. 2512.10195 translate read null
2025-12-11 CIEGAD: Cluster-Conditioned Interpolative and Extrapolative Framework for Geometry-Aware and Domain-Aligned Data Augmentation Keito Inoshita et.al. 2512.10178 translate read null
2025-12-11 ATLAS: Automated Toolkit for Large-Scale Verified Code Synthesis Mantas Baksys et.al. 2512.10173 translate read null
2025-12-11 Offscript: Automated Auditing of Instruction Adherence in LLMs Nicholas Clark et.al. 2512.10172 translate read null
2025-12-10 Enhancing Large Language Models for End-to-End Circuit Analysis Problem Solving Liangliang Chen et.al. 2512.10159 translate read null
2025-12-10 Unforgotten Safety: Preserving Safety Alignment of Large Language Models with Continual Learning Lama Alssum et.al. 2512.10150 translate read null
2025-12-10 PARAN: Persona-Augmented Review ANswering system on Food Delivery Review Dataset Moonsoo Park et.al. 2512.10148 translate read null
2025-12-10 Workflow is All You Need: Escaping the “Statistical Smoothing Trap” via High-Entropy Information Foraging and Adversarial Pacing Zhongjie Jiang et.al. 2512.10121 translate read null
2025-12-10 AgriRegion: Region-Aware Retrieval for High-Fidelity Agricultural Advice Mesafint Fanuel et.al. 2512.10114 translate read null
2025-12-10 Generate-Then-Validate: A Novel Question Generation Approach Using Small Language Models Yumou Wei et.al. 2512.10110 translate read null
2025-12-10 LLM-PEA: Leveraging Large Language Models Against Phishing Email Attacks Najmul Hassan et.al. 2512.10104 translate read null
2025-12-10 What Kind of Reasoning (if any) is an LLM actually doing? On the Stochastic Nature and Abductive Appearance of Large Language Models Luciano Floridi et.al. 2512.10080 translate read null
2025-12-10 Independent Density Estimation Jiahao Liu et.al. 2512.10067 translate read null
2025-12-10 Linear socio-demographic representations emerge in Large Language Models from indirect cues Paul Bouchaud et.al. 2512.10065 translate read null
2025-12-10 Text2Graph: Combining Lightweight LLMs and GNNs for Efficient Text Classification in Label-Scarce Scenarios João Lucas Luz Lima Sarcinelli et.al. 2512.10061 translate read null
2025-12-10 Parallel Decoder Transformer: Model-Internal Parallel Decoding with Speculative Invariance via Note Conditioning Logan Robbins et.al. 2512.10054 translate read null
2025-12-10 Detailed balance in large language model-driven agents Zhuo-Yang Song et.al. 2512.10047 translate read null
2025-12-10 Local LLM Ensembles for Zero-shot Portuguese Named Entity Recognition João Lucas Luz Lima Sarcinelli et.al. 2512.10043 translate read null
2025-12-10 Intelligently Weighting Multiple Reference Models for Direct Preference Optimization of LLMs Skyler Wu et.al. 2512.10040 translate read null
2025-12-10 Exploring LLMs for Scientific Information Extraction Using The SciEx Framework Sha Li et.al. 2512.10004 translate read null
2025-12-10 SCOPE: Language Models as One-Time Teacher for Hierarchical Planning in Text Environments Haoye Lu et.al. 2512.09897 translate read null
2025-12-10 Benchmarking Document Parsers on Mathematical Formula Extraction from PDFs Pius Horn et.al. 2512.09874 translate read link
2025-12-10 FlipLLM: Efficient Bit-Flip Attacks on Multimodal LLMs using Reinforcement Learning Khurram Khalil et.al. 2512.09872 translate read null
2025-12-10 MedForget: Hierarchy-Aware Multimodal Unlearning Testbed for Medical AI Fengli Wu et.al. 2512.09867 translate read null
2025-12-10 UniUGP: Unifying Understanding, Generation, and Planing For End-to-end Autonomous Driving Hao Lu et.al. 2512.09864 translate read null
2025-12-10 Mitigating Social Bias in English and Urdu Language Models Using PRM-Guided Candidate Selection and Sequential Refinement Muneeb Ur Raheem Khan et.al. 2512.09854 translate read null
2025-12-10 ChronusOmni: Improving Time Awareness of Omni Large Language Models Yijing Chen et.al. 2512.09841 translate read null
2025-12-10 LLMs in Interpreting Legal Documents Simone Corbo et.al. 2512.09830 translate read null
2025-12-10 RIFT: A Scalable Methodology for LLM Accelerator Fault Assessment using Reinforcement Learning Khurram Khalil et.al. 2512.09829 translate read null
2025-12-10 DeepSeek’s WEIRD Behavior: The cultural alignment of Large Language Models and the effects of prompt language and cultural prompting James Luther et.al. 2512.09772 translate read null
2025-12-10 Defining Cost Function of Steganography with Large Language Models Hanzhou Wu et.al. 2512.09769 translate read null
2025-12-10 Towards Language Model Guided TLA+ Proof Automation Yuhao Zhou et.al. 2512.09758 translate read null
2025-12-10 Knowledge Graph Enrichment and Reasoning for Nobel Laureates Thanh-Lam T. Nguyen et.al. 2512.09707 translate read null
2025-12-10 Exqutor: Extended Query Optimizer for Vector-augmented Analytical Queries Hyunjoon Kim et.al. 2512.09695 translate read null
2025-12-10 Understanding Chain-of-Thought Effectiveness in Code Generation: An Empirical and Information-Theoretic Analysis Naizhu Jin et.al. 2512.09679 translate read null
2025-12-10 The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization Alexey Kravatskiy et.al. 2512.09678 translate read null
2025-12-10 d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models Leyi Pan et.al. 2512.09675 translate read null
2025-12-10 IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting Tao Zhang et.al. 2512.09663 translate read link
2025-12-10 Can LLMs Evaluate What They Cannot Annotate? Revisiting LLM Reliability in Hate Speech Detection Paloma Piot et.al. 2512.09662 translate read null
2025-12-10 Measuring Corruption from Text Data Arieda Muço et.al. 2512.09652 translate read null
2025-12-10 MentraSuite: Post-Training Large Language Models for Mental Health Reasoning and Assessment Mengxi Xiao et.al. 2512.09636 translate read null
2025-12-10 Creation of the Estonian Subjectivity Dataset: Assessing the Degree of Subjectivity on a Scale Karl Gustav Gailit et.al. 2512.09634 translate read null
2025-12-10 An End-to-end Planning Framework with Agentic LLMs and PDDL Emanuele La Malfa et.al. 2512.09629 translate read null
2025-12-10 LogICL: Distilling LLM Reasoning to Bridge the Semantic Gap in Cross-Domain Log Anomaly Detection Jingwei Ye et.al. 2512.09627 translate read null
2025-12-10 Rethinking Chain-of-Thought Reasoning for Videos Yiwu Zhong et.al. 2512.09616 translate read link
2025-12-10 ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation Boyin Yang et.al. 2512.09610 translate read null
2025-12-10 Investigate the Low-level Visual Perception in Vision-Language based Image Quality Assessment Yuan Li et.al. 2512.09573 translate read null
2025-12-10 System Report for CCL25-Eval Task 10: Prompt-Driven Large Language Model Merge for Fine-Grained Chinese Hate Speech Detection Binglin Wu et.al. 2512.09563 translate read null
2025-12-10 Systematic Framework of Application Methods for Large Language Models in Language Sciences Kun Sun et.al. 2512.09552 translate read null
2025-12-10 Chasing Shadows: Pitfalls in LLM Security Research Jonathan Evertz et.al. 2512.09549 translate read null
2025-12-10 Supporting Dynamic Agentic Workloads: How Data and Agents Interact Ioana Giurgiu et.al. 2512.09548 translate read null
2025-12-10 Don’t Throw Away Your Beams: Improving Consistency-based Uncertainties in LLMs via Beam Search Ekaterina Fadeeva et.al. 2512.09538 translate read null
2025-12-10 CNFinBench: A Benchmark for Safety and Compliance of Large Language Models in Finance Jinru Ding et.al. 2512.09506 translate read null
2025-12-10 RouteRAG: Efficient Retrieval-Augmented Generation from Text and Graph via Reinforcement Learning Yucan Guo et.al. 2512.09487 translate read null
2025-12-10 Advancing LLM-Based Security Automation with Customized Group Relative Policy Optimization for Zero-Touch Networks Xinye Cao et.al. 2512.09485 translate read null
2025-12-10 An Efficient Interaction Human-AI Synergy System Bridging Visual Awareness and Large Language Model for Intensive Care Units Yibowen Zhao et.al. 2512.09473 translate read null
2025-12-10 WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving Chiheng Lou et.al. 2512.09472 translate read null
2025-12-10 Advancing Text Classification with Large Language Models and Neural Attention Mechanisms Ning Lyu et.al. 2512.09444 translate read null
2025-12-10 Advancing Research via Human-AI Interactive Theorem Proving Chenyi Li et.al. 2512.09443 translate read null
2025-12-10 Knowledge-Augmented Large Language Model Agents for Explainable Financial Decision-Making Qingyuan Zhang et.al. 2512.09440 translate read null
2025-12-10 ODMA: On-Demand Memory Allocation Framework for LLM Serving on LPDDR-Class Accelerators Guoqiang Zou et.al. 2512.09427 translate read null
2025-12-10 Black-Box Behavioral Distillation Breaks Safety Alignment in Medical LLMs Sohely Jahan et.al. 2512.09403 translate read null
2025-12-10 Optimizing Data Extraction from Materials Science Literature: A Study of Tools Using Large Language Models Wenkai Ning et.al. 2512.09370 translate read null
2025-12-10 Are Hypervectors Enough? Single-Call LLM Reasoning over Knowledge Graphs Yezi Liu et.al. 2512.09369 translate read null
2025-12-10 Video-QTR: Query-Driven Temporal Reasoning Framework for Lightweight Video Understanding Xinkui Zhao et.al. 2512.09354 translate read null
2025-12-10 Self Distillation Fine-Tuning of Protein Language Models Improves Versatility in Protein Design Amin Tavakoli et.al. 2512.09329 translate read null
2025-12-10 RACAM: Enhancing DRAM with Reuse-Aware Computation and Automated Mapping for ML Inference Siyuan Ma et.al. 2512.09304 translate read null
2025-12-10 Identifying Bias in Machine-generated Text Detection Kevin Stowe et.al. 2512.09292 translate read null
2025-12-10 LongT2IBench: A Benchmark for Evaluating Long Text-to-Image Generation with Graph-structured Annotations Zhichao Yang et.al. 2512.09271 translate read null
2025-12-10 From Forecast to Action: Uncertainty-Aware UAV Deployment for Ocean Drifter Recovery Jingeun Kim et.al. 2512.09260 translate read null
2025-12-10 The Illusion of Rationality: Tacit Bias and Strategic Dominance in Frontier LLM Negotiation Games Manuel S. Ríos et.al. 2512.09254 translate read null
2025-12-10 GLACIA: Instance-Aware Positional Reasoning for Glacial Lake Segmentation via Multimodal Large Language Model Lalit Maurya et.al. 2512.09251 translate read null
2025-12-10 Training-free Context-adaptive Attention for Efficient Long Context Modeling Zeng You et.al. 2512.09238 translate read null
2025-12-10 CORE: A Conceptual Reasoning Layer for Large Language Models Vishwas Hegde et.al. 2512.09222 translate read null
2025-12-10 Targeting Misalignment: A Conflict-Aware Framework for Reward-Model-based LLM Alignment Zixuan Liu et.al. 2512.09212 translate read null
2025-12-09 LLMs for Analog Circuit Design Continuum (ACDC) Yasaman Esfandiari et.al. 2512.09199 translate read null
2025-12-09 TritonForge: Profiling-Guided Framework for Automated Triton Kernel Optimization Haonan Li et.al. 2512.09196 translate read null
2025-12-09 WOLF: Werewolf-based Observations for LLM Deception and Falsehoods Mrinal Agarwal et.al. 2512.09187 translate read null
2025-12-09 MindShift: Analyzing Language Models’ Reactions to Psychological Prompts Anton Vasiliuk et.al. 2512.09149 translate read null
2025-12-09 Detecting Hallucinations in Graph Retrieval-Augmented Generation via Attention Patterns and Semantic Alignment Shanghao Li et.al. 2512.09148 translate read null
2025-12-09 Knowledge-Guided Large Language Model for Automatic Pediatric Dental Record Understanding and Safe Antibiotic Recommendation Zihan Han et.al. 2512.09127 translate read null
2025-12-09 A Categorical Analysis of Large Language Models and Why LLMs Circumvent the Symbol Grounding Problem Luciano Floridi et.al. 2512.09117 translate read null
2025-12-09 Evolving Excellence: Automated Optimization of LLM-based Agents Paul Brookes et.al. 2512.09108 translate read null
2025-12-09 Learning Unmasking Policies for Diffusion Language Models Metod Jazbec et.al. 2512.09106 translate read null
2025-12-09 Explaining the Unseen: Multimodal Vision-Language Reasoning for Situational Awareness in Underground Mining Disasters Mizanur Rahman Jewel et.al. 2512.09092 translate read null
2025-12-09 Calibrated Trust in Dealing with LLM Hallucinations: A Qualitative Study Adrian Ryser et.al. 2512.09088 translate read null
2025-12-09 AgentComp: From Agentic Reasoning to Compositional Mastery in Text-to-Image Models Arman Zarei et.al. 2512.09081 translate read null
2025-12-09 Llama-based source code vulnerability detection: Prompt engineering vs Fine tuning Dyna Soumhane Ouchebara et.al. 2512.09006 translate read null
2025-12-09 Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs Angela van Sprang et.al. 2512.08923 translate read null
2025-12-09 Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training Jakub Krajewski et.al. 2512.08894 translate read null
2025-12-09 Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders Guangzhi Xiong et.al. 2512.08892 translate read null
2025-12-09 AI Didn’t Start the Fire: Examining the Stack Exchange Moderator and Contributor Strike Yiwei Wu et.al. 2512.08884 translate read null
2025-12-09 When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation Joshua Ward et.al. 2512.08875 translate read null
2025-12-09 Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning Jing Jie Tan et.al. 2512.08873 translate read null
2025-12-09 SimpleDevQA: Benchmarking Large Language Models on Development Knowledge QA Jing Zhang et.al. 2512.08867 translate read null
2025-12-09 Ask, Answer, and Detect: Role-Playing LLMs for Personality Detection with Question-Conditioned Mixture-of-Experts Yifan Lyu et.al. 2512.08814 translate read null
2025-12-09 PrivTune: Efficient and Privacy-Preserving Fine-Tuning of Large Language Models via Device-Cloud Collaboration Yi Liu et.al. 2512.08809 translate read null
2025-12-09 A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs Mahmoud Srewa et.al. 2512.08786 translate read null
2025-12-09 A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows Eranga Bandara et.al. 2512.08769 translate read null
2025-12-09 Financial News Summarization: Can extractive methods still offer a true alternative to LLMs? Nicolas Reche et.al. 2512.08764 translate read null
2025-12-09 Towards Foundation Models with Native Multi-Agent Intelligence Shuyue Hu et.al. 2512.08743 translate read null
2025-12-09 LaMoSys3.5D: Enabling 3.5D-IC-Based Large Language Model Inference Serving Systems via Hardware/Software Co-Design Qipan Wang et.al. 2512.08731 translate read null
2025-12-09 Exposing Hidden Biases in Text-to-Image Models via Automated Prompt Search Manos Plitsis et.al. 2512.08724 translate read null
2025-12-09 Multi-Agent Intelligence for Multidisciplinary Decision-Making in Gastrointestinal Oncology Rongzhao Zhang et.al. 2512.08674 translate read null
2025-12-09 An Agentic AI System for Multi-Framework Communication Coding Bohao Yang et.al. 2512.08659 translate read null
2025-12-09 QSTN: A Modular Framework for Robust Questionnaire Inference with Large Language Models Maximilian Kreutner et.al. 2512.08646 translate read null
2025-12-09 Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation Young Kyung Kim et.al. 2512.08645 translate read null
2025-12-09 See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm Haoyu Zhao et.al. 2512.08629 translate read null
2025-12-09 HealthcareNLP: where are we and what is next? Lifeng Han et.al. 2512.08617 translate read null
2025-12-09 CogMCTS: A Novel Cognitive-Guided Monte Carlo Tree Search Framework for Iterative Heuristic Evolution with Large Language Models Hui Wang et.al. 2512.08609 translate read null
2025-12-09 Bridging Scale Discrepancies in Robotic Control via Language-Based Action Representations Yuchi Zhang et.al. 2512.08548 translate read null
2025-12-09 Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks Indrajit Kar et.al. 2512.08545 translate read null
2025-12-09 Principles2Plan: LLM-Guided System for Operationalising Ethical Principles into Plans Tammy Zhong et.al. 2512.08536 translate read null
2025-12-09 Autonomous Issue Resolver: Towards Zero-Touch Code Maintenance Aliaksei Kaliutau et.al. 2512.08492 translate read null
2025-12-09 Soft Inductive Bias Approach via Explicit Reasoning Perspectives in Inappropriate Utterance Detection Using Large Language Models Ju-Young Kim et.al. 2512.08480 translate read null
2025-12-09 A Multi-Agent LLM Framework for Design Space Exploration in Autonomous Driving Systems Po-An Shih et.al. 2512.08476 translate read null
2025-12-09 Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models III: Implementing the Bacterial Biothreat Benchmark (B3) Dataset Gary Ackerman et.al. 2512.08459 translate read null
2025-12-09 Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models II: Benchmark Generation Process Gary Ackerman et.al. 2512.08451 translate read null
2025-12-09 What Triggers my Model? Contrastive Explanations Inform Gender Choices by Translation Models Janiça Hackenbuchner et.al. 2512.08440 translate read null
2025-12-09 Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs Yinan Zhong et.al. 2512.08417 translate read null
2025-12-09 Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval Tao Chen et.al. 2512.08410 translate read null
2025-12-09 DFALLM: Achieving Generalizable Multitask Deepfake Detection by Optimizing Audio LLM Components Yupei Li et.al. 2512.08403 translate read null
2025-12-09 The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss Bozhou Li et.al. 2512.08374 translate read null
2025-12-09 Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making Wentao Zhang et.al. 2512.08366 translate read null
2025-12-09 The High Cost of Incivility: Quantifying Interaction Inefficiency via Multi-Agent Monte Carlo Simulations Benedikt Mangold et.al. 2512.08345 translate read null
2025-12-09 Argus: A Multi-Agent Sensitive Information Leakage Detection Framework Based on Hierarchical Reference Relationships Bin Wang et.al. 2512.08326 translate read null
2025-12-09 rSIM: Incentivizing Reasoning Capabilities of LLMs via Reinforced Strategy Injection Sijia Chen et.al. 2512.08300 translate read null
2025-12-09 Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem Shiva Gaire et.al. 2512.08290 translate read null
2025-12-09 Empowering smart app development with SolidGPT: an edge-cloud hybrid AI agent framework Liao Hu et.al. 2512.08286 translate read null
2025-12-09 AgentEval: Generative Agents as Reliable Proxies for Human Evaluation of AI-Generated Content Thanh Vu et.al. 2512.08273 translate read null
2025-12-09 Reasoning Models Ace the CFA Exams Jaisal Patel et.al. 2512.08270 translate read null
2025-12-09 Token Sugar: Making Source Code Sweeter for LLMs through Token-Efficient Shorthand Zhensu Sun et.al. 2512.08266 translate read null
2025-12-09 Beyond Traditional Diagnostics: Transforming Patient-Side Information into Predictive Insights with Knowledge Graphs and Prototypes Yibowen Zhao et.al. 2512.08261 translate read null
2025-12-09 Chopper: A Multi-Level GPU Characterization Tool & Derived Insights Into LLM Training Inefficiency Marco Kurzynski et.al. 2512.08242 translate read null
2025-12-09 SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection Ching-Hung Cheng et.al. 2512.08223 translate read null
2025-12-09 Secure or Suspect? Investigating Package Hallucinations of Shell Command in Original and Quantized LLMs Md Nazmul Haque et.al. 2512.08213 translate read null
2025-12-09 MobileFineTuner: A Unified End-to-End Framework for Fine-Tuning LLMs on Mobile Phones Jiaxiang Geng et.al. 2512.08211 translate read null
2025-12-09 ClinicalTrialsHub: Bridging Registries and Literature for Comprehensive Clinical Trial Access Jiwoo Park et.al. 2512.08193 translate read null
2025-12-09 A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties Jinghao Wang et.al. 2512.08185 translate read null
2025-12-09 Framing Climate Change on YouTube: North-South Divides in Narratives and Public Engagement Sanika Damle et.al. 2512.08183 translate read null
2025-12-09 Chat with UAV – Human-UAV Interaction Based on Large Language Models Haoran Wang et.al. 2512.08145 translate read null
2025-12-09 PolyLingua: Margin-based Inter-class Transformer for Robust Cross-domain Language Detection Ali Lotfi Rezaabad et.al. 2512.08143 translate read null
2025-12-09 Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models I: The Task-Query Architecture Gary Ackerman et.al. 2512.08130 translate read null
2025-12-09 Universal Adversarial Suffixes Using Calibrated Gumbel-Softmax Relaxation Sampriti Soor et.al. 2512.08123 translate read null
2025-12-08 Evolutionary perspective of large language models on shaping research insights into healthcare disparities David An et.al. 2512.08122 translate read null
2025-12-08 Balanced Accuracy: The Right Metric for Evaluating LLM Judges – Explained through Youden’s J statistic Stephane Collot et.al. 2512.08121 translate read null
2025-12-08 Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies Stephan Carney et.al. 2512.08107 translate read null
2025-12-08 AgentCrypt: Advancing Privacy and (Secure) Computation in AI Agent Collaboration Harish Karthikeyan et.al. 2512.08104 translate read null
2025-12-08 Training LLMs for Honesty via Confessions Manas Joglekar et.al. 2512.08093 translate read null
2025-12-08 Adaptation of Embedding Models to Financial Filings via LLM Distillation Eliot Brenner et.al. 2512.08088 translate read null
2025-12-08 Exploiting the Randomness of Large Language Models (LLM) in Text Classification Tasks: Locating Privileged Documents in Legal Matters Keith Huffman et.al. 2512.08083 translate read null
2025-12-08 Short-Context Dominance: How Much Local Context Natural Language Actually Needs? Vala Vakilian et.al. 2512.08082 translate read null
2025-12-08 Leveraging Machine Learning and Large Language Models for Automated Image Clustering and Description in Legal Discovery Qiang Mao et.al. 2512.08079 translate read null
2025-12-08 A Comparative Study of Retrieval Methods in Azure AI Search Qiang Mao et.al. 2512.08078 translate read null
2025-12-08 Unveiling Latent Knowledge in Chemistry Language Models through Sparse Autoencoders Jaron Cohen et.al. 2512.08077 translate read null
2025-12-08 Large Language Models for Education and Research: An Empirical and User Survey-based Analysis Md Mostafizer Rahman et.al. 2512.08057 translate read null
2025-12-08 CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space Tianxingjian Ding et.al. 2512.08029 translate read null
2025-12-08 Toward an AI Reasoning-Enabled System for Patient-Clinical Trial Matching Caroline N. Leach et.al. 2512.08026 translate read null
2025-12-08 FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models Jiyoon Pyo et.al. 2512.08016 translate read null
2025-12-08 Bridging the Clinical Expertise Gap: Development of a Web-Based Platform for Accessible Time Series Forecasting and Analysis Aaron D. Mullen et.al. 2512.07992 translate read null
2025-12-08 DeepCode: Open Agentic Coding Zongwei Li et.al. 2512.07921 translate read link
2025-12-08 Relational Visual Similarity Thao Nguyen et.al. 2512.07833 translate read null
2025-12-08 Do Generalisation Results Generalise? Matteo Boglioni et.al. 2512.07832 translate read null
2025-12-08 Understanding Privacy Risks in Code Models Through Training Dynamics: A Causal Approach Hua Yang et.al. 2512.07814 translate read null
2025-12-08 LLM Use for Mental Health: Crowdsourcing Users’ Sentiment-based Perspectives and Values from Social Discussions Lingyao Li et.al. 2512.07797 translate read null
2025-12-08 Large Causal Models from Large Language Models Sridhar Mahadevan et.al. 2512.07796 translate read null
2025-12-08 ReasonBENCH: Benchmarking the (In)Stability of LLM Reasoning Nearchos Potamitis et.al. 2512.07795 translate read null
2025-12-08 Automating High Energy Physics Data Analysis with LLM-Powered Agents Eli Gendreau-Distler et.al. 2512.07785 translate read null
2025-12-08 Mary, the Cheeseburger-Eating Vegetarian: Do LLMs Recognize Incoherence in Narratives? Karin de Langis et.al. 2512.07777 translate read null
2025-12-08 RL-MTJail: Reinforcement Learning for Automated Black-Box Multi-Turn Jailbreaking of Large Language Models Xiqiao Xiong et.al. 2512.07761 translate read null
2025-12-08 SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery Meng Cao et.al. 2512.07733 translate read null
2025-12-08 SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination Sangha Park et.al. 2512.07730 translate read null
2025-12-08 Privacy Practices of Browser Agents Alisha Ukani et.al. 2512.07725 translate read null
2025-12-08 In-Context and Few-Shots Learning for Forecasting Time Series Data based on Large Language Models Saroj Gopali et.al. 2512.07705 translate read null
2025-12-08 HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs Sujoy Nath et.al. 2512.07687 translate read null
2025-12-08 When Large Language Models Do Not Work: Online Incivility Prediction through Graph Neural Networks Zihan Chen et.al. 2512.07684 translate read null
2025-12-08 Depth-Wise Activation Steering for Honest Language Models Gracjan Góral et.al. 2512.07667 translate read null
2025-12-08 Bridging Code Graphs and Large Language Models for Better Code Understanding Zeqi Chen et.al. 2512.07666 translate read null
2025-12-08 Reliable agent engineering should integrate machine-compatible organizational principles R. Patrick Xian et.al. 2512.07665 translate read null
2025-12-08 An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research Hamad Almazrouei et.al. 2512.07652 translate read null
2025-12-08 PCMind-2.1-Kaiyuan-2B Technical Report Kairong Luo et.al. 2512.07612 translate read null
2025-12-08 Comparative Analysis and Parametric Tuning of PPO, GRPO, and DAPO for LLM Reasoning Enhancement Yongsheng Lian et.al. 2512.07611 translate read null
2025-12-08 Metric-Fair Prompting: Treating Similar Samples Similarly Jing Wang et.al. 2512.07608 translate read null
2025-12-08 Complementary Learning Approach for Text Classification using Large Language Models Navid Asgari et.al. 2512.07583 translate read null
2025-12-08 All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs Yahong Wang et.al. 2512.07580 translate read null
2025-12-08 A Simple Method to Enhance Pre-trained Language Models with Speech Tokens for Classification Nicolas Calbucura et.al. 2512.07571 translate read null
2025-12-08 MoCoRP: Modeling Consistent Relations between Persona and Response for Persona-based Dialogue Kyungro Lee et.al. 2512.07544 translate read null
2025-12-08 SwissGov-RSD: A Human-annotated, Cross-lingual Benchmark for Token-level Recognition of Semantic Differences Between Related Documents Michelle Wastl et.al. 2512.07538 translate read null
2025-12-08 Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs Xiaoran Liu et.al. 2512.07525 translate read link
2025-12-08 AutoICE: Automatically Synthesizing Verifiable C Code via LLM-driven Evolution Weilin Luo et.al. 2512.07501 translate read null
2025-12-08 How Do LLMs Fail In Agentic Scenarios? A Qualitative Analysis of Success and Failure Scenarios of Various LLMs in Agentic Simulations JV Roig et.al. 2512.07497 translate read null
2025-12-08 Enhancing Agentic RL with Progressive Reward Shaping and Value-based Sampling Policy Optimization Zhuoran Zhuang et.al. 2512.07478 translate read null
2025-12-08 Understanding LLM Agent Behaviours via Game Theory: Strategy Recognition, Biases and Multi-Agent Dynamics Trung-Kiet Huynh et.al. 2512.07462 translate read null
2025-12-08 Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Tong Wu et.al. 2512.07461 translate read link
2025-12-08 Persian-Phi: Efficient Cross-Lingual Adaptation of Compact LLMs via Curriculum Learning Amir Mohammad Akhlaghi et.al. 2512.07454 translate read null
2025-12-08 From Show Programmes to Data: Designing a Workflow to Make Performing Arts Ephemera Accessible Through Language Models Clarisse Bardiot et.al. 2512.07452 translate read null
2025-12-08 MIDG: Mixture of Invariant Experts with knowledge injection for Domain Generalization in Multimodal Sentiment Analysis Yangle Li et.al. 2512.07430 translate read null
2025-12-08 Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models Haidong Kang et.al. 2512.07419 translate read null
2025-12-08 Do LLMs Trust the Code They Write? Francisco Ribeiro et.al. 2512.07404 translate read null
2025-12-08 LUNE: Efficient LLM Unlearning via LoRA Fine-Tuning with Negative Examples Yezi Liu et.al. 2512.07375 translate read null
2025-12-08 Communication-Efficient Serving for Video Diffusion Models with Latent Parallelism Zhiyuan Wu et.al. 2512.07350 translate read null
2025-12-08 Generalized Referring Expression Segmentation on Aerial Photos Luís Marnoto et.al. 2512.07338 translate read link
2025-12-08 DCO: Dynamic Cache Orchestration for LLM Accelerators through Predictive Management Zhongchun Zhou et.al. 2512.07312 translate read null
2025-12-08 Exact Synthetic Populations for Scalable Societal and Market Modeling Thierry Petit et.al. 2512.07306 translate read null
2025-12-08 Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts Mingning Guo et.al. 2512.07302 translate read null
2025-12-08 Investigating Training and Generalization in Faithful Self-Explanations of Large Language Models Tomoki Doi et.al. 2512.07288 translate read null
2025-12-08 Automatic Syntax Error Repair for Discrete Controller Synthesis using Large Language Model Yusei Ishimizu et.al. 2512.07261 translate read null
2025-12-08 Ensembling LLM-Induced Decision Trees for Explainable and Robust Error Detection Mengqi Wang et.al. 2512.07246 translate read null
2025-12-08 NeSTR: A Neuro-Symbolic Abductive Framework for Temporal Reasoning in Large Language Models Feng Liang et.al. 2512.07218 translate read null
2025-12-08 MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning Xuhui Zheng et.al. 2512.07203 translate read null
2025-12-08 Generating Storytelling Images with Rich Chains-of-Reasoning Xiujie Song et.al. 2512.07198 translate read null
2025-12-08 START: Spatial and Textual Learning for Chart Understanding Zhuoming Liu et.al. 2512.07186 translate read link
2025-12-08 ContextualSHAP: Enhancing SHAP Explanations Through Contextual Language Generation Latifa Dwiyanti et.al. 2512.07178 translate read null
2025-12-08 SPACE: Noise Contrastive Estimation Stabilizes Self-Play Fine-Tuning for Large Language Models Yibo Wang et.al. 2512.07175 translate read null
2025-12-08 Improving the Throughput of Diffusion-based Large Language Models via a Training-Free Confidence-Aware Calibration Jucheng Shen et.al. 2512.07173 translate read null
2025-12-08 When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing Siyuan Xu et.al. 2512.07166 translate read null
2025-12-08 A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning Siyang Jiang et.al. 2512.07136 translate read null
2025-12-08 DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning Nithin Sivakumaran et.al. 2512.07132 translate read null
2025-12-08 RisConFix: LLM-based Automated Repair of Risk-Prone Drone Configurations Liping Han et.al. 2512.07122 translate read null
2025-12-08 FOAM: Blocked State Folding for Memory-Efficient LLM Training Ziqing Wen et.al. 2512.07112 translate read null
2025-12-08 The Geometry of Persona: Disentangling Personality from Reasoning in Large Language Models Zhixiang Wang et.al. 2512.07092 translate read null
2025-12-08 Leveraging KV Similarity for Online Structured Pruning in LLMs Jungmin Lee et.al. 2512.07090 translate read null
2025-12-08 ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking Yunzhe Li et.al. 2512.07086 translate read null
2025-12-08 Do Large Language Models Truly Understand Cross-cultural Differences? Shiwei Guo et.al. 2512.07075 translate read null
2025-12-08 Replicating TEMPEST at Scale: Multi-Turn Adversarial Attacks Against Trillion-Parameter Frontier Models Richard Young et.al. 2512.07059 translate read null
2025-12-07 Reformulate, Retrieve, Localize: Agents for Repository-Level Bug Localization Genevieve Caumartin et.al. 2512.07022 translate read null
2025-12-07 Latency-Response Theory Model: Evaluating Large Language Models via Response Accuracy and Chain-of-Thought Length Zhiyu Xu et.al. 2512.07019 translate read null
2025-12-07 FVA-RAG: Falsification-Verification Alignment for Mitigating Sycophantic Hallucinations Mayank Ravishankara et.al. 2512.07015 translate read null
2025-12-07 Block Sparse Flash Attention Daniel Ohayon et.al. 2512.07011 translate read null
2025-12-07 Singing Timbre Popularity Assessment Based on Multimodal Large Foundation Model Zihao Wang et.al. 2512.06999 translate read null
2025-12-07 Prompting-in-a-Series: Psychology-Informed Contents and Embeddings for Personality Recognition With Decoder-Only Models Jing Jie Tan et.al. 2512.06991 translate read null
2025-12-07 Progress Ratio Embeddings: An Impatience Signal for Robust Length Control in Neural Text Generation Ivanhoé Botcazou et.al. 2512.06938 translate read null
2025-12-07 Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI George Mikros et.al. 2512.06922 translate read null
2025-12-07 NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification Ziyang Song et.al. 2512.06921 translate read null
2025-12-07 SoK: Trust-Authorization Mismatch in LLM Agent Interactions Guanquan Shi et.al. 2512.06914 translate read null
2025-12-07 Robots with Attitudes: Influence of LLM-Driven Robot Personalities on Motivation and Performance Dennis Becker et.al. 2512.06910 translate read null
2025-12-07 BabelCoder: Agentic Code Translation with Specification Alignment Fazle Rabbi et.al. 2512.06902 translate read null
2025-12-07 An Analysis of Large Language Models for Simulating User Responses in Surveys Ziyun Yu et.al. 2512.06874 translate read null
2025-12-07 Rhea: Role-aware Heuristic Episodic Attention for Conversational LLMs Wanyang Hong et.al. 2512.06869 translate read null
2025-12-07 Do Persona-Infused LLMs Affect Performance in a Strategic Reasoning Game? John Licato et.al. 2512.06867 translate read null
2025-12-07 Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior Yulin Li et.al. 2512.06866 translate read null
2025-12-07 Spatial Retrieval Augmented Autonomous Driving Xiaosong Jia et.al. 2512.06865 translate read null
2025-12-07 JT-DA: Enhancing Data Analysis with Tool-Integrated Table Reasoning Large Language Models Ce Chi et.al. 2512.06859 translate read null
2025-12-07 Formal that “Floats” High: Formal Verification of Floating Point Arithmetic Hansa Mohanty et.al. 2512.06850 translate read null
2025-12-07 CKG-LLM: LLM-Assisted Detection of Smart Contract Access Control Vulnerabilities Based on Knowledge Graphs Xiaoqi Li et.al. 2512.06846 translate read null
2025-12-07 Leveraging LLMs to support co-evolution between definitions and instances of textual DSLs Weixing Zhang et.al. 2512.06836 translate read null
2025-12-07 Large Language Model-Based Generation of Discharge Summaries Tiago Rodrigues et.al. 2512.06812 translate read null
2025-12-07 MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning Yueqian Wang et.al. 2512.06810 translate read null
2025-12-07 Optimal and Diffusion Transports in Machine Learning Gabriel Peyré et.al. 2512.06797 translate read null
2025-12-07 LLM4SFC: Sequential Function Chart Generation via Large Language Models Ofek Glick et.al. 2512.06787 translate read null
2025-12-07 From Description to Score: Can LLMs Quantify Vulnerabilities? Sima Jafarikhah et.al. 2512.06781 translate read null
2025-12-07 From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs Yuchuan Tian et.al. 2512.06776 translate read link
2025-12-07 Becoming Experienced Judges: Selective Test-Time Learning for Evaluators Seungyeon Jwa et.al. 2512.06751 translate read null
2025-12-07 DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems Ming Ma et.al. 2512.06749 translate read null
2025-12-07 PrivLLMSwarm: Privacy-Preserving LLM-Driven UAV Swarms for Secure IoT Surveillance Jifar Wakuma Ayana et.al. 2512.06747 translate read null
2025-12-07 A Patient-Doctor-NLP-System to contest inequality for less privileged Subrit Dikshit et.al. 2512.06734 translate read null
2025-12-07 “The Dentist is an involved parent, the bartender is not”: Revealing Implicit Biases in QA with Implicit BBQ Aarushi Wagh et.al. 2512.06732 translate read null
2025-12-07 KV-CAR: KV Cache Compression using Autoencoders and KV Reuse in Large Language Models Sourjya Roy et.al. 2512.06727 translate read null
2025-12-07 The Role of Entropy in Visual Grounding: Analysis and Optimization Shuo Li et.al. 2512.06726 translate read null
2025-12-07 ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems Bufang Yang et.al. 2512.06721 translate read null
2025-12-07 Cognitive Control Architecture (CCA): A Lifecycle Supervision Framework for Robustly Aligned AI Agents Zhibo Liang et.al. 2512.06716 translate read null
