LLM - 2025-12
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-12-31 | Constructing a Neuro-Symbolic Mathematician from First Principles | Keqin Xie et.al. | 2601.00125 | translate | read | null |
| 2025-12-31 | Ask, Clarify, Optimize: Human-LLM Agent Collaboration for Smarter Inventory Control | Yaqi Duan et.al. | 2601.00121 | translate | read | null |
| 2025-12-31 | CTMap: LLM-Enabled Connectivity-Aware Path Planning in Millimeter-Wave Digital Twin Networks | Md Salik Parwez et.al. | 2601.00110 | translate | read | null |
| 2025-12-31 | Mortar: Evolving Mechanics for Automatic Game Design | Muhammad U. Nasir et.al. | 2601.00105 | translate | read | null |
| 2025-12-31 | The Agentic Leash: Extracting Causal Feedback Fuzzy Cognitive Maps with LLMs | Akash Kumar Panda et.al. | 2601.00097 | translate | read | null |
| 2025-12-31 | Universal Adaptive Constraint Propagation: Scaling Structured Inference for Large Language Models via Meta-Reinforcement Learning | Ibne Farabi Shihab et.al. | 2601.00095 | translate | read | null |
| 2025-12-31 | Spatial4D-Bench: A Versatile 4D Spatial Intelligence Benchmark | Pan Wang et.al. | 2601.00092 | translate | read | null |
| 2025-12-31 | Dynamic Bayesian Optimization Framework for Instruction Tuning in Partial Differential Equation Discovery | Junqi Qu et.al. | 2601.00088 | translate | read | null |
| 2025-12-31 | RIMRULE: Improving Tool-Using Language Agents via MDL-Guided Rule Learning | Xiang Gao et.al. | 2601.00086 | translate | read | null |
| 2025-12-31 | Vulcan: Instance-Optimal Systems Heuristics Through LLM-Driven Search | Rohit Dwivedula et.al. | 2512.25065 | translate | read | null |
| 2025-12-31 | Many Minds from One Model: Bayesian Transformers for Population Intelligence | Diji Yang et.al. | 2512.25063 | translate | read | null |
| 2025-12-31 | Context-aware LLM-based AI Agents for Human-centered Energy Management Systems in Smart Buildings | Tianzhi He et.al. | 2512.25055 | translate | read | null |
| 2025-12-31 | MAMA-Memeia! Multi-Aspect Multi-Agent Collaboration for Depressive Symptoms Identification in Memes | Siddhant Agarwal et.al. | 2512.25015 | translate | read | null |
| 2025-12-31 | Efficiently Estimating Data Efficiency for Language Model Fine-tuning | Gyung Hyun Je et.al. | 2512.24991 | translate | read | null |
| 2025-12-31 | PhysTalk: Language-driven Real-time Physics in 3D Gaussian Scenes | Luca Collorone et.al. | 2512.24986 | translate | read | null |
| 2025-12-31 | Large language models and the entropy of English | Colin Scheibner et.al. | 2512.24969 | translate | read | null |
| 2025-12-31 | The Impact of LLMs on Online News Consumption and Production | Hangcheng Zhao et.al. | 2512.24968 | translate | read | null |
| 2025-12-31 | AMAP Agentic Planning Technical Report | Yulan Hu et.al. | 2512.24957 | translate | read | null |
| 2025-12-31 | CPJ: Explainable Agricultural Pest Diagnosis via Caption-Prompt-Judge with LLM-Judged Refinement | Wentao Zhang et.al. | 2512.24947 | translate | read | null |
| 2025-12-31 | RAIR: A Rule-Aware Benchmark Uniting Challenging Long-Tail and Visual Salience Subset for E-commerce Relevance Assessment | Chenji Lu et.al. | 2512.24943 | translate | read | null |
| 2025-12-31 | Iterative Deployment Improves Planning Skills in LLMs | Augusto B. Corrêa et.al. | 2512.24940 | translate | read | null |
| 2025-12-31 | Vibe Coding, Interface Flattening | Hongrui Jin et.al. | 2512.24939 | translate | read | null |
| 2025-12-31 | Adaptive Dependency-aware Prompt Optimization Framework for Multi-Step LLM Pipeline | Minjun Zhao et.al. | 2512.24933 | translate | read | null |
| 2025-12-31 | FinMMDocR: Benchmarking Financial Multimodal Reasoning with Scenario Awareness, Document Understanding, and Multi-Step Computation | Zichen Tang et.al. | 2512.24903 | translate | read | null |
| 2025-12-31 | Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements | Yiming Liang et.al. | 2512.24867 | translate | read | null |
| 2025-12-31 | VLN-MME: Diagnosing MLLMs as Language-guided Visual Navigation agents | Xunyi Zhao et.al. | 2512.24851 | translate | read | null |
| 2025-12-31 | GenZ: Foundational models as latent variable generators within traditional statistical models | Marko Jojic et.al. | 2512.24834 | translate | read | null |
| 2025-12-31 | Unregularized Linear Convergence in Zero-Sum Game from Preference Feedback | Shulun Chen et.al. | 2512.24818 | translate | read | null |
| 2025-12-31 | LeanCat: A Benchmark Suite for Formal Category Theory in Lean (Part I: 1-Categories) | Rongge Xu et.al. | 2512.24796 | translate | read | null |
| 2025-12-31 | Compute-Accuracy Pareto Frontiers for Open-Source Reasoning Large Language Models | Ákos Prucs et.al. | 2512.24776 | translate | read | null |
| 2025-12-31 | Analyzing Communication Predictability in LLM Training | Wenxue Li et.al. | 2512.24750 | translate | read | null |
| 2025-12-31 | BIOME-Bench: A Benchmark for Biomolecular Interaction Inference and Multi-Omics Pathway Mechanism Elucidation from Scientific Literature | Sibo Wei et.al. | 2512.24733 | translate | read | null |
| 2025-12-31 | FPGA Co-Design for Efficient N:M Sparse and Quantized Model Inference | Fen-Yu Hsieh et.al. | 2512.24713 | translate | read | null |
| 2025-12-31 | MEIC-DT: Memory-Efficient Incremental Clustering for Long-Text Coreference Resolution with Dual-Threshold Constraints | Kangyang Luo et.al. | 2512.24711 | translate | read | null |
| 2025-12-31 | MUSIC: MUlti-Step Instruction Contrast for Multi-Turn Reward Models | Wenzhe Li et.al. | 2512.24693 | translate | read | null |
| 2025-12-31 | Quantum Visual Word Sense Disambiguation: Unraveling Ambiguities Through Quantum Inference Model | Wenbo Qiao et.al. | 2512.24687 | translate | read | null |
| 2025-12-31 | BatteryAgent: Synergizing Physics-Informed Interpretation with LLM Reasoning for Intelligent Battery Fault Diagnosis | Songqi Zhou et.al. | 2512.24686 | translate | read | null |
| 2025-12-31 | Do Large Language Models Know What They Are Capable Of? | Casey O. Barkan et.al. | 2512.24661 | translate | read | null |
| 2025-12-31 | DynaFix: Iterative Automated Program Repair Driven by Execution-Level Dynamic Information | Zhili Huang et.al. | 2512.24635 | translate | read | null |
| 2025-12-31 | How Do Agentic AI Systems Address Performance Optimizations? A BERTopic-Based Analysis of Pull Requests | Md Nahidul Islam Opu et.al. | 2512.24630 | translate | read | null |
| 2025-12-31 | Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models | Junru Lu et.al. | 2512.24618 | translate | read | link |
| 2025-12-31 | Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space | Xingwei Qu et.al. | 2512.24617 | translate | read | null |
| 2025-12-31 | Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization | Yuchen Shi et.al. | 2512.24615 | translate | read | link |
| 2025-12-31 | Chat-Driven Optimal Management for Virtual Network Services | Yuya Miyaoka et.al. | 2512.24614 | translate | read | null |
| 2025-12-31 | Group Deliberation Oriented Multi-Agent Conversational Model for Complex Reasoning | Zheyu Shi et.al. | 2512.24613 | translate | read | null |
| 2025-12-31 | Reinforcement Learning-Augmented LLM Agents for Collaborative Decision Making and Performance Optimization | Dong Qiu et.al. | 2512.24609 | translate | read | null |
| 2025-12-31 | Recursive Language Models | Alex L. Zhang et.al. | 2512.24601 | translate | read | link |
| 2025-12-31 | A Tale of 1001 LoC: Potential Runtime Error-Guided Specification Synthesis for Verifying Large-Scale Programs | Zhongyi Wang et.al. | 2512.24594 | translate | read | null |
| 2025-12-31 | Improving Few-Shot Change Detection Visual Question Answering via Decision-Ambiguity-guided Reinforcement Fine-Tuning | Fuyu Dong et.al. | 2512.24591 | translate | read | null |
| 2025-12-31 | MultiRisk: Multiple Risk Control via Iterative Score Thresholding | Sunay Joshi et.al. | 2512.24587 | translate | read | null |
| 2025-12-31 | Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time | Zhenyu Zhang et.al. | 2512.24574 | translate | read | null |
| 2025-12-31 | SynRAG: A Large Language Model Framework for Executable Query Generation in Heterogeneous SIEM System | Md Hasan Saju et.al. | 2512.24571 | translate | read | null |
| 2025-12-31 | On the Effectiveness of Training Data Optimization for LLM-based Code Generation: An Empirical Study | Shiqi Kuang et.al. | 2512.24570 | translate | read | null |
| 2025-12-31 | MCPAgentBench: A Real-world Task Benchmark for Evaluating LLM Agent MCP Tool Use | Wenrui Liu et.al. | 2512.24565 | translate | read | null |
| 2025-12-31 | HaluNet: Multi-Granular Uncertainty Modeling for Efficient Hallucination Detection in LLM Question Answering | Chaodong Tong et.al. | 2512.24562 | translate | read | null |
| 2025-12-31 | Localized Calibrated Uncertainty in Code Language Models | David Gros et.al. | 2512.24560 | translate | read | null |
| 2025-12-31 | Safe in the Future, Dangerous in the Past: Dissecting Temporal and Linguistic Vulnerabilities in LLMs | Muhammad Abdullahi Said et.al. | 2512.24556 | translate | read | null |
| 2025-12-31 | More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization | Yuma Ichikawa et.al. | 2512.24545 | translate | read | null |
| 2025-12-31 | From Building Blocks to Planning: Multi-Step Spatial Reasoning in LLMs with Reinforcement Learning | Amir Tahmasbi et.al. | 2512.24532 | translate | read | null |
| 2025-12-31 | Generative AI-enhanced Sector-based Investment Portfolio Construction | Alina Voronina et.al. | 2512.24526 | translate | read | null |
| 2025-12-30 | Using Large Language Models To Translate Machine Results To Human Results | Trishna Niraula et.al. | 2512.24518 | translate | read | null |
| 2025-12-30 | Paragraph Segmentation Revisited: Towards a Standard Task for Structuring Speech | Fabian Retkowski et.al. | 2512.24517 | translate | read | null |
| 2025-12-30 | Evaluating the Reasoning Abilities of LLMs on Underrepresented Mathematics Competition Problems | Samuel Golladay et.al. | 2512.24505 | translate | read | null |
| 2025-12-30 | HOLOGRAPH: Active Causal Discovery via Sheaf-Theoretic Alignment of Large Language Model Priors | Hyunjun Kim et.al. | 2512.24478 | translate | read | null |
| 2025-12-30 | PackKV: Reducing KV Cache Memory Footprint through LLM-Aware Lossy Compression | Bo Jiang et.al. | 2512.24449 | translate | read | null |
| 2025-12-30 | Towards mechanistic understanding in a data-driven weather model: internal activations reveal interpretable physical features | Theodore MacMillan et.al. | 2512.24440 | translate | read | null |
| 2025-12-30 | Comparing Approaches to Automatic Summarization in Less-Resourced Languages | Chester Palen-Michel et.al. | 2512.24410 | translate | read | null |
| 2025-12-30 | World model inspired sarcasm reasoning with large language model agents | Keito Inoshita et.al. | 2512.24329 | translate | read | null |
| 2025-12-30 | QianfanHuijin Technical Report: A Novel Multi-Stage Training Paradigm for Finance Industrial LLMs | Shupeng Li et.al. | 2512.24314 | translate | read | null |
| 2025-12-30 | Automated Analysis of Sustainability Reports: Using Large Language Models for the Extraction and Prediction of EU Taxonomy-Compliant KPIs | Jonathan Schmoll et.al. | 2512.24289 | translate | read | null |
| 2025-12-30 | Taming Hallucinations: Boosting MLLMs’ Video Understanding via Counterfactual Video Generation | Zhe Huang et.al. | 2512.24271 | translate | read | link |
| 2025-12-30 | RAGPart & RAGMask: Retrieval-Stage Defenses Against Corpus Poisoning in Retrieval-Augmented Generation | Pankayaraj Pathmanathan et.al. | 2512.24268 | translate | read | null |
| 2025-12-30 | Joint Selection for Large-Scale Pre-Training Data via Policy Gradient-based Mask Learning | Ziqing Fan et.al. | 2512.24265 | translate | read | null |
| 2025-12-30 | GPT-like transformer model for silicon tracking detector simulation | Tadej Novak et.al. | 2512.24254 | translate | read | null |
| 2025-12-30 | MedKGI: Iterative Differential Diagnosis with Medical Knowledge Graphs and Information-Guided Inquiring | Qipeng Wang et.al. | 2512.24181 | translate | read | null |
| 2025-12-30 | DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models | Zefeng He et.al. | 2512.24165 | translate | read | link |
| 2025-12-30 | Training Report of TeleChat3-MoE | Xinzhang Liu et.al. | 2512.24157 | translate | read | null |
| 2025-12-30 | Large Emotional World Model | Changhao Song et.al. | 2512.24149 | translate | read | null |
| 2025-12-30 | Activation Steering for Masked Diffusion Language Models | Adi Shnaidman et.al. | 2512.24143 | translate | read | null |
| 2025-12-30 | Bridging Visual Intuition and Chemical Expertise: An Autonomous Analysis Framework for Nonadiabatic Dynamics Simulations via Mentor-Engineer-Student Collaboration | Yifei Zhu et.al. | 2512.24133 | translate | read | null |
| 2025-12-30 | OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization | Advait Gadhikar et.al. | 2512.24124 | translate | read | null |
| 2025-12-30 | Enhancing LLM-Based Neural Network Generation: Few-Shot Prompting and Efficient Validation for Automated Architecture Design | Chandini Vysyaraju et.al. | 2512.24120 | translate | read | null |
| 2025-12-30 | CogRec: A Cognitive Recommender Agent Fusing Large Language Models and Soar for Explainable Recommendation | Jiaxin Hu et.al. | 2512.24113 | translate | read | null |
| 2025-12-30 | Training a Huggingface Model on AWS Sagemaker (Without Tears) | Liling Tan et.al. | 2512.24098 | translate | read | null |
| 2025-12-30 | LoongFlow: Directed Evolutionary Search via a Cognitive Plan-Execute-Summarize Paradigm | Chunhui Wan et.al. | 2512.24077 | translate | read | link |
| 2025-12-30 | How and Why LLMs Generalize: A Fine-Grained Analysis of LLM Reasoning from Cognitive Behaviors to Low-Level Patterns | Haoyue Bai et.al. | 2512.24063 | translate | read | null |
| 2025-12-30 | Beyond Hallucinations: A Composite Score for Measuring Reliability in Open-Source Large Language Models | Rohit Kumar Salla et.al. | 2512.24058 | translate | read | null |
| 2025-12-30 | Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race? | Yuan Xin et.al. | 2512.24044 | translate | read | null |
| 2025-12-30 | ROAD: Reflective Optimization via Automated Debugging for Zero-Shot Agent Alignment | Natchaya Temyingyong et.al. | 2512.24040 | translate | read | null |
| 2025-12-30 | RSAgent: Learning to Reason and Act for Text-Guided Segmentation via Multi-Turn Tool Invocations | Xingqi He et.al. | 2512.24023 | translate | read | null |
| 2025-12-30 | FUSE-RSVLM: Feature Fusion Vision-Language Model for Remote Sensing | Yunkai Dang et.al. | 2512.24022 | translate | read | link |
| 2025-12-30 | iCLP: Large Language Model Reasoning with Implicit Cognition Latent Planning | Sijia Chen et.al. | 2512.24014 | translate | read | null |
| 2025-12-30 | SPARK: Search Personalization via Agent-Driven Retrieval and Knowledge-sharing | Gaurab Chhetri et.al. | 2512.24008 | translate | read | null |
| 2025-12-30 | RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress | Ruixuan Huang et.al. | 2512.23995 | translate | read | null |
| 2025-12-30 | Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process | Zhenyu Zhang et.al. | 2512.23988 | translate | read | null |
| 2025-12-30 | Coding With AI: From a Reflection on Industrial Practices to Future Computer Science and Software Engineering Education | Hung-Fu Chang et.al. | 2512.23982 | translate | read | null |
| 2025-12-30 | Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling | Chulun Zhou et.al. | 2512.23959 | translate | read | link |
| 2025-12-30 | A Proof-of-Concept for Explainable Disease Diagnosis Using Large Language Models and Answer Set Programming | Ioanna Gemou et.al. | 2512.23932 | translate | read | null |
| 2025-12-29 | Scaling Remote Sensing Foundation Models: Data Domain Tradeoffs at the Peta-Scale | Charith Wickrema et.al. | 2512.23903 | translate | read | null |
| 2025-12-29 | How Large Language Models Systematically Misrepresent American Climate Opinions | Sola Kim et.al. | 2512.23889 | translate | read | null |
| 2025-12-29 | Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack | Roee Ziv et.al. | 2512.23881 | translate | read | null |
| 2025-12-29 | CASCADE: Cumulative Agentic Skill Creation through Autonomous Development and Evolution | Xu Huang et.al. | 2512.23880 | translate | read | null |
| 2025-12-29 | Seeking Late Night Life Lines: Experiences of Conversational AI Use in Mental Health Crisis | Leah Hope Ajmani et.al. | 2512.23859 | translate | read | null |
| 2025-12-29 | Integrating Domain Knowledge for Financial QA: A Multi-Retriever RAG Approach with LLMs | Yukun Zhang et.al. | 2512.23848 | translate | read | null |
| 2025-12-29 | A Test of Lookahead Bias in LLM Forecasts | Zhenyu Gao et.al. | 2512.23847 | translate | read | null |
| 2025-12-29 | From Correctness to Collaboration: Toward a Human-Centered Framework for Evaluating AI Agent Behavior in Software Engineering | Tao Dong et.al. | 2512.23844 | translate | read | null |
| 2025-12-29 | Retrieval Augmented Question Answering: When Should LLMs Admit Ignorance? | Dingmin Wang et.al. | 2512.23836 | translate | read | null |
| 2025-12-29 | Prompt-Induced Over-Generation as Denial-of-Service: A Black-Box Attack-Side Benchmark | Manu et.al. | 2512.23779 | translate | read | null |
| 2025-12-29 | Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning | Tiancheng Su et.al. | 2512.23765 | translate | read | null |
| 2025-12-28 | Audited Skill-Graph Self-Improvement for Agentic LLMs via Verifiable Rewards, Experience Synthesis, and Continual Memory | Ken Huang et.al. | 2512.23760 | translate | read | null |
| 2025-12-29 | Eliciting Behaviors in Multi-Turn Conversations | Jing Huang et.al. | 2512.23701 | translate | read | null |
| 2025-12-29 | Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing | Panagiotis Theocharopoulos et.al. | 2512.23684 | translate | read | null |
| 2025-12-29 | Web World Models | Jichen Feng et.al. | 2512.23676 | translate | read | link |
| 2025-12-29 | OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding | Keda Tao et.al. | 2512.23646 | translate | read | null |
| 2025-12-29 | BOAD: Discovering Hierarchical Software Engineering Agents via Bandit Optimization | Iris Xu et.al. | 2512.23631 | translate | read | link |
| 2025-12-29 | Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing | Yuwen Li et.al. | 2512.23611 | translate | read | null |
| 2025-12-29 | The Big Three in Marriage Talk: LLM-Assisted Analysis of Moral Ethics and Sentiment on Weibo and Xiaohongshu | Frank Tian-Fang Ye et.al. | 2512.23609 | translate | read | null |
| 2025-12-29 | Divergent-Convergent Thinking in Large Language Models for Creative Problem Generation | Manh Hung Nguyen et.al. | 2512.23601 | translate | read | null |
| 2025-12-29 | Can AI Recognize Its Own Reflection? Self-Detection Performance of LLMs in Computing Education | Christopher Burger et.al. | 2512.23587 | translate | read | null |
| 2025-12-29 | Instruction-Following Evaluation of Large Vision-Language Models | Daiki Shiono et.al. | 2512.23572 | translate | read | null |
| 2025-12-29 | ThinkGen: Generalized Thinking for Visual Generation | Siyu Jiao et.al. | 2512.23568 | translate | read | link |
| 2025-12-29 | RxnBench: A Multimodal Benchmark for Evaluating Large Language Models on Chemical Reaction Understanding from Scientific Literature | Hanzheng Li et.al. | 2512.23565 | translate | read | null |
| 2025-12-29 | Toward Trustworthy Agentic AI: A Multimodal Framework for Preventing Prompt Injection Attacks | Toqeer Ali Syed et.al. | 2512.23557 | translate | read | null |
| 2025-12-29 | Trustworthy Machine Learning under Distribution Shifts | Zhuo Huang et.al. | 2512.23524 | translate | read | null |
| 2025-12-29 | Single LLM Debate, MoLaCE: Mixture of Latent Concept Experts Against Confirmation Bias | Hazel Kim et.al. | 2512.23518 | translate | read | null |
| 2025-12-29 | Alpha-R1: Alpha Screening with LLM Reasoning via Reinforcement Learning | Zuoyou Jiang et.al. | 2512.23515 | translate | read | link |
| 2025-12-29 | Beyond Correctness: Exposing LLM-generated Logical Flaws in Reasoning via Multi-step Automated Theorem Proving | Xinyi Zheng et.al. | 2512.23511 | translate | read | null |
| 2025-12-29 | Hierarchical Decision Mamba Meets Agentic AI: A Novel Approach for RAN Slicing in 6G | Md Arafat Habib et.al. | 2512.23502 | translate | read | null |
| 2025-12-29 | The Gaining Paths to Investment Success: Information-Driven LLM Graph Reasoning for Venture Capital Prediction | Haoyu Pei et.al. | 2512.23489 | translate | read | null |
| 2025-12-29 | Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation | Toqeer Ali Syed et.al. | 2512.23480 | translate | read | null |
| 2025-12-29 | Semantic Tree Inference on Text Corpa using a Nested Density Approach together with Large Language Model Embeddings | Thomas Haschka et.al. | 2512.23471 | translate | read | null |
| 2025-12-29 | Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance | Zhuo Li et.al. | 2512.23461 | translate | read | null |
| 2025-12-29 | Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following | Kongcheng Zhang et.al. | 2512.23457 | translate | read | null |
| 2025-12-29 | ClinDEF: A Dynamic Evaluation Framework for Large Language Models in Clinical Reasoning | Yuqi Tang et.al. | 2512.23440 | translate | read | null |
| 2025-12-29 | C2PO: Diagnosing and Disentangling Bias Shortcuts in LLMs | Xuan Feng et.al. | 2512.23430 | translate | read | null |
| 2025-12-29 | Entropy-Guided Token Dropout: Training Autoregressive Language Models with Limited Domain Data | Jiapeng Wang et.al. | 2512.23422 | translate | read | null |
| 2025-12-29 | MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning | Jiawei Chen et.al. | 2512.23412 | translate | read | null |
| 2025-12-29 | Theoretical Foundations of Scaling Law in Familial Models | Huan Song et.al. | 2512.23407 | translate | read | null |
| 2025-12-29 | Securing the AI Supply Chain: What Can We Learn From Developer-Reported Security Issues and Solutions of AI Projects? | The Anh Nguyen et.al. | 2512.23385 | translate | read | null |
| 2025-12-29 | A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers | Mohammad Nasirzadeh et.al. | 2512.23380 | translate | read | link |
| 2025-12-29 | Post-Training Quantization of OpenPangu Models for Efficient Deployment on Atlas A2 | Yilun Luo et.al. | 2512.23367 | translate | read | null |
| 2025-12-29 | SpatialMosaic: A Multiview VLM Dataset for Partial Visibility | Kanghee Lee et.al. | 2512.23365 | translate | read | null |
| 2025-12-29 | A Stepwise-Enhanced Reasoning Framework for Large Language Models Based on External Subgraph Generation | Xin Zhang et.al. | 2512.23356 | translate | read | null |
| 2025-12-29 | The Law of Multi-Model Collaboration: Scaling Limits of Model Ensembling for Large Language Models | Dakuan Lu et.al. | 2512.23340 | translate | read | null |
| 2025-12-29 | CubeBench: Diagnosing Interactive, Long-Horizon Spatial Reasoning Under Partial Observations | Huan-ang Gao et.al. | 2512.23328 | translate | read | null |
| 2025-12-29 | Flexible Keyword-Aware Top-$k$ Route Search | Ziqiang Yu et.al. | 2512.23319 | translate | read | null |
| 2025-12-29 | Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL | Abolfazl Younesi et.al. | 2512.23310 | translate | read | null |
| 2025-12-29 | MedGemma vs GPT-4: Open-Source and Proprietary Zero-shot Medical Disease Classification from Images | Md. Sazzadul Islam Prottasha et.al. | 2512.23304 | translate | read | null |
| 2025-12-29 | AI4Reading: Chinese Audiobook Interpretation System Based on Multi-Agent Collaboration | Minjiang Huang et.al. | 2512.23300 | translate | read | null |
| 2025-12-29 | Agentic AI-Enhanced Semantic Communications: Foundations, Architecture, and Applications | Haixiao Gao et.al. | 2512.23294 | translate | read | null |
| 2025-12-29 | Chinese Morph Resolution in E-commerce Live Streaming Scenarios | Jiahao Zhu et.al. | 2512.23280 | translate | read | null |
| 2025-12-29 | Interpretable Safety Alignment via SAE-Constructed Low-Rank Subspace Adaptation | Dianyun Wang et.al. | 2512.23260 | translate | read | null |
| 2025-12-29 | Multimodal Interpretation of Remote Sensing Images: Dynamic Resolution Input Strategy and Multi-scale Vision-Language Alignment Mechanism | Siyu Zhang et.al. | 2512.23243 | translate | read | null |
| 2025-12-29 | Anomaly Detection by Effectively Leveraging Synthetic Images | Sungho Kang et.al. | 2512.23227 | translate | read | null |
| 2025-12-29 | Bridging Your Imagination with Audio-Video Generation via a Unified Director | Jiaxu Zhang et.al. | 2512.23222 | translate | read | null |
| 2025-12-29 | MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios? | Shiqi Dai et.al. | 2512.23219 | translate | read | null |
| 2025-12-29 | TCEval: Using Thermal Comfort to Assess Cognitive and Perceptual Abilities of AI | Jingming Li et.al. | 2512.23217 | translate | read | null |
| 2025-12-29 | Anka: A Domain-Specific Language for Reliable LLM Code Generation | Saif Khalfan Saif Al Mazrouei et.al. | 2512.23214 | translate | read | null |
| 2025-12-29 | Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process | Zhijun Chen et.al. | 2512.23213 | translate | read | null |
| 2025-12-29 | Not too long do read: Evaluating LLM-generated extreme scientific summaries | Zhuoqi Lyu et.al. | 2512.23206 | translate | read | null |
| 2025-12-29 | From Model Choice to Model Belief: Establishing a New Measure for LLM-Based Research | Hongshen Sun et.al. | 2512.23184 | translate | read | null |
| 2025-12-29 | EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion | Zhen Liang et.al. | 2512.23173 | translate | read | null |
| 2025-12-29 | REVEALER: Reinforcement-Guided Visual Reasoning for Element-Level Text-Image Alignment Evaluation | Fulin Shi et.al. | 2512.23169 | translate | read | null |
| 2025-12-29 | SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search | Yifan Zhang et.al. | 2512.23167 | translate | read | null |
| 2025-12-29 | Reservoir Computing inspired Matrix Multiplication-free Language Model | Takumi Shiratsuchi et.al. | 2512.23145 | translate | read | null |
| 2025-12-29 | Understanding EFL Learners’ Code-Switching and Teachers’ Pedagogical Approaches in LLM-Supported Speaking Practice | Junyeong Park et.al. | 2512.23136 | translate | read | null |
| 2025-12-29 | It’s a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents | Karolina Korgul et.al. | 2512.23128 | translate | read | null |
| 2025-12-29 | InSPO: Unlocking Intrinsic Self-Reflection for LLM Preference Optimization | Yu Li et.al. | 2512.23126 | translate | read | null |
| 2025-12-28 | A Note on Hybrid Online Reinforcement and Imitation Learning for LLMs: Formulations and Algorithms | Yingru Li et.al. | 2512.23097 | translate | read | null |
| 2025-12-28 | Benchmark Success, Clinical Failure: When Reinforcement Learning Optimizes for Benchmarks, Not Patients | Armin Berger et.al. | 2512.23090 | translate | read | null |
| 2025-12-28 | Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning | Yingru Li et.al. | 2512.23087 | translate | read | null |
| 2025-12-28 | Trust Region Masking for Long-Horizon LLM Reinforcement Learning | Yingru Li et.al. | 2512.23075 | translate | read | null |
| 2025-12-28 | Accelerating Language Model Workflows with Prompt Choreography | TJ Bai et.al. | 2512.23049 | translate | read | null |
| 2025-12-28 | Problems With Large Language Models for Learner Modelling: Why LLMs Alone Fall Short for Responsible Tutoring in K–12 Education | Danial Hooshyar et.al. | 2512.23036 | translate | read | null |
| 2025-12-28 | Viability and Performance of a Private LLM Server for SMBs: A Benchmark Analysis of Qwen3-30B on Consumer-Grade Hardware | Alex Khalil et.al. | 2512.23029 | translate | read | null |
| 2025-12-28 | With Great Context Comes Great Prediction Power: Classifying Objects via Geo-Semantic Scene Graphs | Ciprian Constantinescu et.al. | 2512.23024 | translate | read | null |
| 2025-12-28 | Merge before Forget: A Single LoRA Continual Learning via Continual Merging | Fuli Qiao et.al. | 2512.23017 | translate | read | null |
| 2025-12-28 | Improving Generalization in LLM Structured Pruning via Function-Aware Neuron Grouping | Tao Yu et.al. | 2512.23014 | translate | read | null |
| 2025-12-28 | Masgent: An AI-assisted Materials Simulation Agent | Guanghen Liu et.al. | 2512.23010 | translate | read | null |
| 2025-12-28 | Prompt engineering does not universally improve Large Language Model performance across clinical decision-making tasks | Mengdi Chai et.al. | 2512.22966 | translate | read | null |
| 2025-12-28 | Diversity or Precision? A Deep Dive into Next Token Prediction | Haoyuan Wu et.al. | 2512.22955 | translate | read | null |
| 2025-12-28 | Multimodal Fact-Checking: An Agent-based Approach | Danni Xu et.al. | 2512.22933 | translate | read | null |
| 2025-12-28 | Argus: Token Aware Distributed LLM Inference Optimization | Panlong Wu et.al. | 2512.22925 | translate | read | null |
| 2025-12-28 | JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation | Kai Liu et.al. | 2512.22905 | translate | read | link |
| 2025-12-28 | Debugging Tabular Log as Dynamic Graphs | Chumeng Liang et.al. | 2512.22903 | translate | read | null |
| 2025-12-28 | HiSciBench: A Hierarchical Multi-disciplinary Benchmark for Scientific Intelligence from Reading to Discovery | Yaping Zhang et.al. | 2512.22899 | translate | read | null |
| 2025-12-28 | Theory and Algorithms for Learning with Multi-Class Abstention and Multi-Expert Deferral | Anqi Mao et.al. | 2512.22886 | translate | read | null |
| 2025-12-28 | Agentic AI for Cyber Resilience: A New Security Paradigm and Its System-Theoretic Foundations | Tao Li et.al. | 2512.22883 | translate | read | null |
| 2025-12-28 | FasterPy: An LLM-based Code Execution Efficiency Optimization Framework | Yue Wu et.al. | 2512.22827 | translate | read | null |
| 2025-12-28 | NepEMO: A Multi-Label Emotion and Sentiment Analysis on Nepali Reddit with Linguistic Insights and Temporal Trends | Sameer Sitoula et.al. | 2512.22823 | translate | read | null |
| 2025-12-28 | VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM | Jingchao Wang et.al. | 2512.22799 | translate | read | link |
| 2025-12-28 | CNSight: Evaluation of Clinical Note Segmentation Tools | Risha Surana et.al. | 2512.22795 | translate | read | null |
| 2025-12-28 | ChatGraPhT: A Visual Conversation Interface for Multi-Path Reflection with Agentic LLM Support | Geoff Kimm et.al. | 2512.22790 | translate | read | null |
| 2025-12-28 | Understanding the Mechanisms of Fast Hyperparameter Transfer | Nikhil Ghosh et.al. | 2512.22768 | translate | read | null |
| 2025-12-28 | Bridging Global Intent with Local Details: A Hierarchical Representation Approach for Semantic Validation in Text-to-SQL | Rihong Qiu et.al. | 2512.22744 | translate | read | null |
| 2025-12-28 | Robust LLM-based Column Type Annotation via Prompt Augmentation with LoRA Tuning | Hanze Meng et.al. | 2512.22742 | translate | read | null |
| 2025-12-28 | Text-Routed Sparse Mixture-of-Experts Model with Explanation and Temporal Alignment for Multi-Modal Sentiment Analysis | Dongning Rao et.al. | 2512.22741 | translate | read | null |
| 2025-12-28 | Harnessing Large Language Models for Biomedical Named Entity Recognition | Jian Chen et.al. | 2512.22738 | translate | read | null |
| 2025-12-28 | WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference | Aiwei Liu et.al. | 2512.22737 | translate | read | null |
| 2025-12-28 | FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents | Jiaqi Shao et.al. | 2512.22733 | translate | read | link |
| 2025-12-27 | Mitigating Social Desirability Bias in Random Silicon Sampling | Sashank Chapala et.al. | 2512.22725 | translate | read | null |
| 2025-12-27 | Cyber Resilience in Next-Generation Networks: Threat Landscape, Theoretical Foundations, and Design Paradigms | Junaid Farooq et.al. | 2512.22721 | translate | read | null |
| 2025-12-27 | Memento-II: Learning by Stateful Reflective Memory | Jun Wang et.al. | 2512.22716 | translate | read | null |
| 2025-12-27 | Beg to Differ: Understanding Reasoning-Answer Misalignment Across Languages | Anaelia Ovalle et.al. | 2512.22712 | translate | read | null |
| 2025-12-27 | Modality Inflation: Energy Characterization and Optimization Opportunities for MLLM Inference | Mona Moghadampanah et.al. | 2512.22695 | translate | read | null |
| 2025-12-27 | Conformal Prediction Sets for Next-Token Prediction in Large Language Models: Balancing Coverage Guarantees with Set Efficiency | Yoshith Roy Kotla et.al. | 2512.22682 | translate | read | null |
| 2025-12-27 | CritiFusion: Semantic Critique and Spectral Alignment for Faithful Text-to-Image Generation | ZhenQi Chen et.al. | 2512.22681 | translate | read | null |
| 2025-12-27 | From Electrochemical Energy Storage to Next-Generation Intelligent Battery Technologies for Electric Vehicles: A Survey | Abderaouf Bahi et.al. | 2512.22680 | translate | read | null |
| 2025-12-27 | TravelBench: A Real-World Benchmark for Multi-Turn and Tool-Augmented Travel Planning | Xiang Cheng et.al. | 2512.22673 | translate | read | null |
| 2025-12-23 | Making Large Language Models Efficient Dense Retrievers | Yibin Lei et.al. | 2512.20612 | translate | read | null |
| 2025-12-23 | MoE-DiffuSeq: Enhancing Long-Document Diffusion Models with Sparse Attention and Mixture of Experts | Alexandros Christoforos et.al. | 2512.20604 | translate | read | null |
| 2025-12-23 | Cube Bench: A Benchmark for Spatial Visual Reasoning in MLLMs | Dhruv Anand et.al. | 2512.20595 | translate | read | null |
| 2025-12-23 | Automated stereotactic radiosurgery planning using a human-in-the-loop reasoning large language model agent | Humza Nusrat et.al. | 2512.20586 | translate | read | null |
| 2025-12-23 | Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits | Amirhosein Ghasemabadi et.al. | 2512.20578 | translate | read | null |
| 2025-12-23 | Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs | Rui Pan et.al. | 2512.20573 | translate | read | null |
| 2025-12-23 | LLM-Based Authoring of Agent-Based Narratives through Scene Descriptions | Vinayak Regmi et.al. | 2512.20550 | translate | read | null |
| 2025-12-23 | Advancing Multimodal Teacher Sentiment Analysis: The Large-Scale T-MED Dataset & The Effective AAM-TSA Model | Zhiyi Duan et.al. | 2512.20548 | translate | read | null |
| 2025-12-23 | Benchmarking LLMs for Predictive Applications in the Intensive Care Units | Chehak Malhotra et.al. | 2512.20520 | translate | read | null |
| 2025-12-23 | Coherence in the brain unfolds across separable temporal regimes | Davide Stauba et.al. | 2512.20481 | translate | read | null |
| 2025-12-23 | UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images | Yiming Zhao et.al. | 2512.20479 | translate | read | null |
| 2025-12-23 | Laser: Governing Long-Horizon Agentic Search via Structured Protocol and Context Register | Shuting Wang et.al. | 2512.20458 | translate | read | null |
| 2025-12-23 | Topic-informed dynamic mixture model for occupational heterogeneity in health risk behaviors | Lorenzo Schiavon et.al. | 2512.20408 | translate | read | null |
| 2025-12-23 | ChatGPT: Excellent Paper! Accept It. Editor: Imposter Found! Review Rejected | Kanchon Gharami et.al. | 2512.20405 | translate | read | null |
| 2025-12-23 | CRAFT: Continuous Reasoning and Agentic Feedback Tuning for Multimodal Text-to-Image Generation | V. Kovalev et.al. | 2512.20362 | translate | read | null |
| 2025-12-23 | A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice | Yaowei Bai et.al. | 2512.20344 | translate | read | null |
| 2025-12-23 | Comment Traps: How Defective Commented-out Code Augment Defects in AI-Assisted Code Generation | Yuan Huang et.al. | 2512.20334 | translate | read | null |
| 2025-12-23 | SynCraft: Guiding Large Language Models to Predict Edit Sequences for Molecular Synthesizability Optimization | Junren Li et.al. | 2512.20333 | translate | read | null |
| 2025-12-23 | Toward Explaining Large Language Models in Software Engineering Tasks | Antonio Vitale et.al. | 2512.20328 | translate | read | null |
| 2025-12-23 | Can LLMs Solve My Grandma’s Riddle? Evaluating Multilingual Large Language Models on Reasoning Traditional Bangla Tricky Riddles | Nurul Labib Sayeedi et.al. | 2512.20324 | translate | read | null |
| 2025-12-23 | TableGPT-R1: Advancing Tabular Reasoning Through Reinforcement Learning | Saisai Yang et.al. | 2512.20312 | translate | read | null |
| 2025-12-23 | Structured Visualization Design Knowledge for Grounding Generative Reasoning and Situated Feedback | Péter Ferenc Gyarmati et.al. | 2512.20306 | translate | read | null |
| 2025-12-23 | AprielGuard | Jaykumar Kasundra et.al. | 2512.20293 | translate | read | null |
| 2025-12-23 | Synthesizing Procedural Memory: Challenges and Architectures in Automated Workflow Generation | Nishant Gaurav et.al. | 2512.20278 | translate | read | null |
| 2025-12-23 | Graph-Symbolic Policy Enforcement and Control (G-SPEC): A Neuro-Symbolic Framework for Safe Agentic AI in 5G Autonomous Networks | Divya Vijay et.al. | 2512.20275 | translate | read | null |
| 2025-12-23 | Memory as Resonance: A Biomimetic Architecture for Infinite Context Memory on Ergodic Phonetic Manifolds | Tarik Houichime et.al. | 2512.20245 | translate | read | null |
| 2025-12-23 | MemR$^3$: Memory Retrieval via Reflective Reasoning for LLM Agents | Xingbo Du et.al. | 2512.20237 | translate | read | null |
| 2025-12-23 | Quantitative Financial Modeling for Sri Lankan Markets: Approach Combining NLP, Clustering and Time-Series Forecasting | Linuk Perera et.al. | 2512.20216 | translate | read | null |
| 2025-12-23 | Predictive-LoRA: A Proactive and Fragmentation-Aware Serverless Inference System for LLMs | Yinan Ni et.al. | 2512.20210 | translate | read | null |
| 2025-12-23 | TongSIM: A General Platform for Simulating Intelligent Machines | Zhe Sun et.al. | 2512.20206 | translate | read | null |
| 2025-12-23 | Corpus of Cross-lingual Dialogues with Minutes and Detection of Misunderstandings | Marko Čechovič et.al. | 2512.20204 | translate | read | null |
| 2025-12-23 | Well Begun is Half Done: Location-Aware and Trace-Guided Iterative Automated Vulnerability Repair | Zhenlei Ye et.al. | 2512.20203 | translate | read | null |
| 2025-12-23 | Designing Spatial Architectures for Sparse Attention: STAR Accelerator via Cross-Stage Tiling | Huizheng Wang et.al. | 2512.20198 | translate | read | null |
| 2025-12-23 | FaithLens: Detecting and Explaining Faithfulness Hallucination | Shuzheng Si et.al. | 2512.20182 | translate | read | null |
| 2025-12-23 | Optimistic TEE-Rollups: A Hybrid Architecture for Scalable and Verifiable Generative AI Inference on Blockchain | Aaron Chan et.al. | 2512.20176 | translate | read | null |
| 2025-12-23 | Towards Natural Language-Based Document Image Retrieval: New Dataset and Benchmark | Hao Guo et.al. | 2512.20174 | translate | read | null |
| 2025-12-23 | Learning to Reason in LLMs by Expectation Maximization | Junghyun Lee et.al. | 2512.20169 | translate | read | null |
| 2025-12-23 | Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography | Songze Li et.al. | 2512.20168 | translate | read | null |
| 2025-12-23 | AI Security Beyond Core Domains: Resume Screening as a Case Study of Adversarial Vulnerabilities in Specialized LLM Applications | Honglin Mu et.al. | 2512.20164 | translate | read | null |
| 2025-12-23 | Concept Generalization in Humans and Large Language Models: Insights from the Number Game | Arghavan Bazigaran et.al. | 2512.20162 | translate | read | null |
| 2025-12-23 | AXIOM: Benchmarking LLM-as-a-Judge for Code via Rule-Based Perturbation and Multisource Quality Calibration | Ruiqi Wang et.al. | 2512.20159 | translate | read | null |
| 2025-12-23 | Multi-hop Reasoning via Early Knowledge Alignment | Yuxin Wang et.al. | 2512.20144 | translate | read | null |
| 2025-12-23 | Enhancing Zero-Shot Time Series Forecasting in Off-the-Shelf LLMs via Noise Injection | Xingyou Yin et.al. | 2512.20140 | translate | read | null |
| 2025-12-23 | M$^3$KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation | Hyeongcheol Park et.al. | 2512.20136 | translate | read | null |
| 2025-12-23 | A Novel Graph-Sequence Learning Model for Inductive Text Classification | Zuo Wang et.al. | 2512.20097 | translate | read | null |
| 2025-12-23 | QE-Catalytic: A Graph-Language Multimodal Base Model for Relaxed-Energy Prediction in Catalytic Adsorption | Yanjie Li et.al. | 2512.20084 | translate | read | null |
| 2025-12-23 | Adaptive Financial Sentiment Analysis for NIFTY 50 via Instruction-Tuned LLMs, RAG and Reinforcement Learning Approaches | Chaithra et.al. | 2512.20082 | translate | read | null |
| 2025-12-23 | Reason2Decide: Rationale-Driven Multi-Task Learning | H M Quamran Hasan et.al. | 2512.20074 | translate | read | null |
| 2025-12-23 | On the Effectiveness of Instruction-Tuning Local LLMs for Identifying Software Vulnerabilities | Sangryu Park et.al. | 2512.20062 | translate | read | null |
| 2025-12-23 | Scaling Reinforcement Learning for Content Moderation with Large Language Models | Hamed Firooz et.al. | 2512.20061 | translate | read | null |
| 2025-12-23 | Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval | Nguyen Lam Phu Quy et.al. | 2512.20042 | translate | read | null |
| 2025-12-23 | VSA: Visual-Structural Alignment for UI-to-Code | Xian Wu et.al. | 2512.20034 | translate | read | null |
| 2025-12-23 | VALLR-Pin: Dual-Decoding Visual Speech Recognition for Mandarin with Pinyin-Guided LLM Refinement | Chang Sun et.al. | 2512.20032 | translate | read | null |
| 2025-12-23 | LLM-Assisted Abstract Screening with OLIVER: Evaluating Calibration and Single-Model vs. Actor-Critic Configurations in Literature Reviews | Kian Godhwani et.al. | 2512.20022 | translate | read | null |
| 2025-12-23 | Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems | Qiushuo Hou et.al. | 2512.20012 | translate | read | null |
| 2025-12-23 | LoFT-LLM: Low-Frequency Time-Series Forecasting with Large Language Models | Jiacheng You et.al. | 2512.20002 | translate | read | null |
| 2025-12-23 | Schoenfeld’s Anatomy of Mathematical Reasoning by Language Models | Ming Li et.al. | 2512.19995 | translate | read | null |
| 2025-12-23 | S$^3$IT: A Benchmark for Spatially Situated Social Intelligence Test | Zhe Sun et.al. | 2512.19992 | translate | read | null |
| 2025-12-23 | Bias Beneath the Tone: Empirical Characterisation of Tone Bias in LLM-Driven UX Systems | Heet Bodara et.al. | 2512.19950 | translate | read | null |
| 2025-12-23 | Interpolative Decoding: Exploring the Spectrum of Personality Traits in LLMs | Eric Yeh et.al. | 2512.19937 | translate | read | null |
| 2025-12-22 | Conditional Adversarial Fragility in Financial Machine Learning under Macroeconomic Stress | Samruddhi Baviskar et.al. | 2512.19935 | translate | read | null |
| 2025-12-22 | PRISM: A Personality-Driven Multi-Agent Framework for Social Media Simulation | Zhixiang Lu et.al. | 2512.19933 | translate | read | null |
| 2025-12-22 | Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs | Houston H. Zhang et.al. | 2512.19918 | translate | read | null |
| 2025-12-22 | Demystifying LLM-as-a-Judge: Analytically Tractable Model for Inference-Time Scaling | Indranil Halder et.al. | 2512.19905 | translate | read | null |
| 2025-12-22 | How well do Large Language Models Recognize Instructional Moves? Establishing Baselines for Foundation Models in Educational Discourse | Kirk Vanacore et.al. | 2512.19903 | translate | read | null |
| 2025-12-22 | Larger Is Not Always Better: Leveraging Structured Code Diffs for Comment Inconsistency Detection | Phong Nguyen et.al. | 2512.19883 | translate | read | null |
| 2025-12-22 | Fine-Tuned In-Context Learners for Efficient Adaptation | Jorg Bornschein et.al. | 2512.19879 | translate | read | null |
| 2025-12-22 | CS-Guide: Leveraging LLMs and Student Reflections to Provide Frequent, Scalable Academic Monitoring Feedback to Computer Science Students | Samuel Jacob Chacko et.al. | 2512.19866 | translate | read | null |
| 2025-12-22 | HARMON-E: Hierarchical Agentic Reasoning for Multimodal Oncology Notes to Extract Structured Data | Shashi Kant Gupta et.al. | 2512.19864 | translate | read | null |
| 2025-12-22 | From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs | Mingrui Wu et.al. | 2512.19683 | translate | read | null |
| 2025-12-22 | GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators | Jiacheng Guo et.al. | 2512.19682 | translate | read | link |
| 2025-12-22 | Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918) | Niclas Griesshaber et.al. | 2512.19675 | translate | read | null |
| 2025-12-22 | Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies | Yuqiao Tan et.al. | 2512.19673 | translate | read | link |
| 2025-12-22 | Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis | Argha Kamal Samanta et.al. | 2512.19663 | translate | read | null |
| 2025-12-22 | Exploring Zero-Shot ACSA with Unified Meaning Representation in Chain-of-Thought Prompting | Filippos Ventirozos et.al. | 2512.19651 | translate | read | null |
| 2025-12-22 | Exploring the features used for summary evaluation by Human and GPT | Zahra Sadeghi et.al. | 2512.19620 | translate | read | null |
| 2025-12-22 | MapTrace: Scalable Data Generation for Route Tracing on Maps | Artemis Panagopoulou et.al. | 2512.19609 | translate | read | null |
| 2025-12-22 | RAPID-LLM: Resilience-Aware Performance analysis of Infrastructure for Distributed LLM Training and Inference | George Karfakis et.al. | 2512.19606 | translate | read | null |
| 2025-12-22 | Increasing the Thinking Budget is Not All You Need | Ignacio Iacobacci et.al. | 2512.19585 | translate | read | null |
| 2025-12-22 | The Epistemological Consequences of Large Language Models: Rethinking collective intelligence and institutional knowledge | Angjelin Hila et.al. | 2512.19570 | translate | read | null |
| 2025-12-22 | Algerian Dialect | Zakaria Benmounah et.al. | 2512.19543 | translate | read | null |
| 2025-12-22 | Event Extraction in Large Language Model | Bobo Li et.al. | 2512.19537 | translate | read | null |
| 2025-12-22 | Learning Continuous Solvent Effects from Transient Flow Data: A Graph Neural Network Benchmark on Catechol Rearrangement | Hongsheng Xing et.al. | 2512.19530 | translate | read | null |
| 2025-12-22 | Anatomy-R1: Enhancing Anatomy Reasoning in Multimodal Large Language Models via Anatomical Similarity Curriculum and Group Diversity Augmentation | Ziyang Song et.al. | 2512.19512 | translate | read | null |
| 2025-12-22 | Structured Event Representation and Stock Return Predictability | Gang Li et.al. | 2512.19484 | translate | read | null |
| 2025-12-22 | A Dataset and Preliminary Study of Using GPT-5 for Code-change Impact Analysis | Katharina Stengg et.al. | 2512.19481 | translate | read | null |
| 2025-12-22 | A Large-Language-Model Framework for Automated Humanitarian Situation Reporting | Ivan Decostanzi et.al. | 2512.19475 | translate | read | null |
| 2025-12-22 | Epistemological Fault Lines Between Human and Artificial Intelligence | Walter Quattrociocchi et.al. | 2512.19466 | translate | read | null |
| 2025-12-22 | An Agentic Framework for Autonomous Materials Computation | Zeyu Xia et.al. | 2512.19458 | translate | read | null |
| 2025-12-22 | Activations as Features: Probing LLMs for Generalizable Essay Scoring Representations | Jinwei Chi et.al. | 2512.19456 | translate | read | null |
| 2025-12-22 | SiamGPT: Quality-First Fine-Tuning for Stable Thai Text Generation | Thittipat Pairatsuppawat et.al. | 2512.19455 | translate | read | null |
| 2025-12-22 | D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning | Evelyn Zhang et.al. | 2512.19443 | translate | read | null |
| 2025-12-22 | dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models | Yi Xin et.al. | 2512.19433 | translate | read | link |
| 2025-12-22 | CodeSimpleQA: Scaling Factuality in Code Large Language Models | Jian Yang et.al. | 2512.19424 | translate | read | null |
| 2025-12-22 | From Retrieval to Reasoning: A Framework for Cyber Threat Intelligence NER with Explicit and Adaptive Instructions | Jiaren Peng et.al. | 2512.19414 | translate | read | null |
| 2025-12-22 | Brain-Grounded Axes for Reading and Steering LLM States | Sandro Andric et.al. | 2512.19399 | translate | read | link |
| 2025-12-22 | HATS: High-Accuracy Triple-Set Watermarking for Large Language Models | Zhiqing Hu et.al. | 2512.19378 | translate | read | null |
| 2025-12-22 | Generative vector search to improve pathology foundation models across multimodal vision-language tasks | Markus Ekvall et.al. | 2512.19360 | translate | read | null |
| 2025-12-22 | ReasonCD: A Multimodal Reasoning Large Model for Implicit Change-of-Interest Semantic Mining | Zhenyang Huang et.al. | 2512.19354 | translate | read | null |
| 2025-12-22 | PENDULUM: A Benchmark for Assessing Sycophancy in Multimodal Large Language Models | A. B. M. Ashikur Rahman et.al. | 2512.19350 | translate | read | null |
| 2025-12-22 | VIGOR+: Iterative Confounder Generation and Validation via LLM-CEVAE Feedback Loop | JiaWei Zhu et.al. | 2512.19349 | translate | read | null |
| 2025-12-22 | SafeMed-R1: Adversarial Reinforcement Learning for Generalizable and Robust Medical Reasoning in Vision-Language Models | A. A. Gde Yogi Pramana et.al. | 2512.19317 | translate | read | null |
| 2025-12-22 | CienaLLM: Generative Climate-Impact Extraction from News Articles with Autoregressive LLMs | Javier Vela-Tambo et.al. | 2512.19305 | translate | read | null |
| 2025-12-22 | Helios: A Foundational Language Model for Smart Energy Knowledge Reasoning and Application | Haoyu Jiang et.al. | 2512.19299 | translate | read | null |
| 2025-12-22 | Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models | Linzhi Chen et.al. | 2512.19297 | translate | read | null |
| 2025-12-22 | Auto-Prompting with Retrieval Guidance for Frame Detection in Logistics | Do Minh Duc et.al. | 2512.19247 | translate | read | null |
| 2025-12-22 | ChemATP: A Training-Free Chemical Reasoning Framework for Large Language Models | Mingxu Zhang et.al. | 2512.19240 | translate | read | null |
| 2025-12-22 | Identifying Features Associated with Bias Against 93 Stigmatized Groups in Language Models and Guardrail Model Safety Mitigation | Anna-Maria Gueorguieva et.al. | 2512.19238 | translate | read | null |
| 2025-12-22 | Generation of Programmatic Rules for Document Forgery Detection Using Large Language Models | Valentin Schmidberger et.al. | 2512.19228 | translate | read | null |
| 2025-12-22 | Observer, Not Player: Simulating Theory of Mind in LLMs through Game Observation | Jerry Wang et.al. | 2512.19210 | translate | read | null |
| 2025-12-22 | MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning | Tao Zhang et.al. | 2512.19206 | translate | read | null |
| 2025-12-22 | Configuration Work: Four Consequences of LLMs-in-use | Gabriel Alcaras et.al. | 2512.19189 | translate | read | null |
| 2025-12-22 | L4: Low-Latency and Load-Balanced LLM Serving via Length-Aware Scheduling | Yitao Yuan et.al. | 2512.19179 | translate | read | null |
| 2025-12-22 | OmniMoGen: Unifying Human Motion Generation via Learning from Interleaved Text-Motion Instructions | Wendong Bu et.al. | 2512.19159 | translate | read | null |
| 2025-12-22 | Understanding Chain-of-Thought in Large Language Models via Topological Data Analysis | Chenghao Li et.al. | 2512.19135 | translate | read | null |
| 2025-12-22 | QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation | Dehai Min et.al. | 2512.19134 | translate | read | link |
| 2025-12-22 | AWPO: Enhancing Tool-Use of Large Language Models through Explicit Integration of Reasoning Rewards | Zihan Lin et.al. | 2512.19126 | translate | read | null |
| 2025-12-22 | Stop saying LLM: Large Discourse Models (LDM) and Artificial Discursive Agent (ADA)? | Amar Lakel et.al. | 2512.19117 | translate | read | null |
| 2025-12-22 | Generative Giants, Retrieval Weaklings: Why do Multimodal Large Language Models Fail at Multimodal Retrieval? | Hengyi Feng et.al. | 2512.19115 | translate | read | null |
| 2025-12-22 | HyperLoad: A Cross-Modality Enhanced Large Language Model-Based Framework for Green Data Center Cooling Load Prediction | Haoyu Jiang et.al. | 2512.19114 | translate | read | null |
| 2025-12-22 | FC-MIR: A Mobile Screen Awareness Framework for Intent-Aware Recommendation based on Frame-Compressed Multimodal Trajectory Reasoning | Zhe Yang et.al. | 2512.19107 | translate | read | null |
| 2025-12-22 | Tool-Augmented Hybrid Ensemble Reasoning with Distillation for Bilingual Mathematical Problem Solving | Peiqing Lu et.al. | 2512.19093 | translate | read | null |
| 2025-12-22 | A Large Language Model Based Method for Complex Logical Reasoning over Knowledge Graphs | Ziyan Zhang et.al. | 2512.19092 | translate | read | null |
| 2025-12-22 | Population-Evolve: a Parallel Sampling and Evolutionary Method for LLM Math Reasoning | Yanzhi Zhang et.al. | 2512.19081 | translate | read | null |
| 2025-12-22 | Watch Closely: Mitigating Object Hallucinations in Large Vision-Language Models with Disentangled Decoding | Ruiqi Ma et.al. | 2512.19070 | translate | read | null |
| 2025-12-22 | Can abstract concepts from LLM improve SLM performance? | Siddharth Tandon et.al. | 2512.19069 | translate | read | null |
| 2025-12-22 | Finer-Personalization Rank: Fine-Grained Retrieval Examines Identity Preservation for Personalized Generation | Connor Kilrain et.al. | 2512.19026 | translate | read | null |
| 2025-12-22 | The Erasure Illusion: Stress-Testing the Generalization of LLM Forgetting Evaluation | Hengrui Jia et.al. | 2512.19025 | translate | read | null |
| 2025-12-22 | PEAK: A Performance Engineering AI-Assistant for GPU Kernels Powered by Natural Language Transformations | Muhammad Usman Tariq et.al. | 2512.19018 | translate | read | null |
| 2025-12-22 | DREAM: Dynamic Red-teaming across Environments for AI Models | Liming Lu et.al. | 2512.19016 | translate | read | null |
| 2025-12-22 | Efficient Jailbreak Mitigation Using Semantic Linear Classification in a Multi-Staged Pipeline | Akshaj Prashanth Rao et.al. | 2512.19011 | translate | read | null |
| 2025-12-22 | Context-Aware Initialization for Reducing Generative Path Length in Diffusion Language Models | Tongyuan Miao et.al. | 2512.19004 | translate | read | null |
| 2025-12-22 | Evaluating the Challenges of LLMs in Real-world Medical Follow-up: A Comparative Study and An Optimized Framework | Jinyan Liu et.al. | 2512.18999 | translate | read | null |
| 2025-12-22 | R-GenIMA: Integrating Neuroimaging and Genetics with Interpretable Multimodal AI for Alzheimer’s Disease Progression | Kun Zhao et.al. | 2512.18986 | translate | read | null |
| 2025-12-22 | Scrum Sprint Planning: LLM-based and algorithmic solutions | Yuwon Yoon et.al. | 2512.18966 | translate | read | null |
| 2025-12-22 | Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement | Saman Forouzandeh et.al. | 2512.18950 | translate | read | null |
| 2025-12-22 | FASTRIC: Prompt Specification Language for Verifiable LLM Interactions | Wen-Long Jin et.al. | 2512.18940 | translate | read | null |
| 2025-12-22 | When Less is More: 8-bit Quantization Improves Continual Learning in Large Language Models | Michael S. Zhang et.al. | 2512.18934 | translate | read | null |
| 2025-12-21 | An Empirical Study of Developer-Provided Context for AI Coding Assistants in Open-Source Projects | Shaokang Jiang et.al. | 2512.18925 | translate | read | null |
| 2025-12-21 | Delta-LLaVA: Base-then-Specialize Alignment for Token-Efficient Vision-Language Models | Mohamad Zamini et.al. | 2512.18910 | translate | read | null |
| 2025-12-21 | Gabliteration: Adaptive Multi-Directional Neural Weight Modification for Selective Behavioral Alteration in Large Language Models | Gökdeniz Gülmez et.al. | 2512.18901 | translate | read | null |
| 2025-12-21 | Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction | Ming Li et.al. | 2512.18880 | translate | read | null |
| 2025-12-21 | CrashChat: A Multimodal Large Language Model for Multitask Traffic Crash Video Analysis | Kaidi Liang et.al. | 2512.18878 | translate | read | null |
| 2025-12-21 | CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning | Zijun Gao et.al. | 2512.18857 | translate | read | null |
| 2025-12-21 | VizDefender: Unmasking Visualization Tampering through Proactive Localization and Intent Inference | Sicheng Song et.al. | 2512.18853 | translate | read | null |
| 2025-12-21 | MDToC: Metacognitive Dynamic Tree of Concepts for Boosting Mathematical Problem-Solving of Large Language Models | Tung Duong Ta et.al. | 2512.18841 | translate | read | null |
| 2025-12-21 | From Word to World: Can Large Language Models be Implicit Text-based World Models? | Yixia Li et.al. | 2512.18832 | translate | read | null |
| 2025-12-21 | HARBOR: Holistic Adaptive Risk assessment model for BehaviORal healthcare | Aditya Siddhant et.al. | 2512.18829 | translate | read | null |
| 2025-12-21 | “Even GPT Can Reject Me”: Conceptualizing Abrupt Refusal Secondary Harm (ARSH) and Reimagining Psychological AI Safety with Compassionate Completion Standard (CCS) | Yang Ni et.al. | 2512.18776 | translate | read | null |
| 2025-12-21 | MEEA: Mere Exposure Effect-Driven Confrontational Optimization for LLM Jailbreaking | Jianyi Zhang et.al. | 2512.18755 | translate | read | null |
| 2025-12-21 | Code2Doc: A Quality-First Curated Dataset for Code Documentation | Recep Kaan Karaman et.al. | 2512.18748 | translate | read | null |
| 2025-12-21 | IPCV: Information-Preserving Compression for MLLM Visual Encoders | Yuan Chen et.al. | 2512.18747 | translate | read | null |
| 2025-12-21 | MemEvolve: Meta-Evolution of Agent Memory Systems | Guibin Zhang et.al. | 2512.18746 | translate | read | null |
| 2025-12-21 | Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection | Junjun Pan et.al. | 2512.18733 | translate | read | null |
| 2025-12-21 | A Theoretical Lens for RL-Tuned Language Models via Energy-Based Models | Zhiquan Tan et.al. | 2512.18730 | translate | read | null |
| 2025-12-21 | Solver-Independent Automated Problem Formulation via LLMs for High-Cost Simulation-Driven Design | Yuchen Li et.al. | 2512.18682 | translate | read | null |
| 2025-12-21 | Remoe: Towards Efficient and Low-Cost MoE Inference in Serverless Computing | Wentao Liu et.al. | 2512.18674 | translate | read | null |
| 2025-12-21 | SmartSight: Mitigating Hallucination in Video-LLMs Without Compromising Video Understanding via Temporal Attention Collapse | Yiming Sun et.al. | 2512.18671 | translate | read | null |
| 2025-12-21 | Tackling dataset curation challenges towards reliable machine learning: a case study on thermoelectric materials | Shoeb Athar et.al. | 2512.18653 | translate | read | null |
| 2025-12-21 | LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction | Jensen Zhang et.al. | 2512.18623 | translate | read | null |
| 2025-12-21 | A Multi-agent Text2SQL Framework using Small Language Models and Execution Feedback | Thanh Dat Hoang et.al. | 2512.18622 | translate | read | null |
| 2025-12-21 | A Comparative Study of Light-weight Language Models for PII Masking and their Deployment for Real Conversational Texts | Prabigya Acharya et.al. | 2512.18608 | translate | read | null |
| 2025-12-21 | Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction | Qinglin Zeng et.al. | 2512.18605 | translate | read | null |
| 2025-12-21 | SimpleCall: A Lightweight Image Restoration Agent in Label-Free Environments with MLLM Perceptual Feedback | Jianglin Lu et.al. | 2512.18599 | translate | read | null |
| 2025-12-21 | Wireless Copilot: An AI-Powered Partner for Navigating Next-Generation Wireless Complexity | Haoxiang Luo et.al. | 2512.18582 | translate | read | null |
| 2025-12-21 | ESearch-R1: Learning Cost-Aware MLLM Agents for Interactive Embodied Search via Reinforcement Learning | Weijie Zhou et.al. | 2512.18571 | translate | read | null |
| 2025-12-21 | AI Code in the Wild: Measuring Security Risks and Ecosystem Shifts of AI-Generated Code in Modern Software | Bin Wang et.al. | 2512.18567 | translate | read | null |
| 2025-12-21 | Vox Deorum: A Hybrid LLM Architecture for 4X / Grand Strategy Game AI – Lessons from Civilization V | John Chen et.al. | 2512.18564 | translate | read | null |
| 2025-12-21 | OpenView: Empowering MLLMs with Out-of-view VQA | Qixiang Chen et.al. | 2512.18563 | translate | read | null |
| 2025-12-18 | AdaTooler-V: Adaptive Tool-Use for Images and Videos | Chaoyang Wang et.al. | 2512.16918 | translate | read | null |
| 2025-12-18 | Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning | Qihao Liu et.al. | 2512.16917 | translate | read | null |
| 2025-12-18 | Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward | Peter Chen et.al. | 2512.16912 | translate | read | null |
| 2025-12-18 | Impacts of Racial Bias in Historical Training Data for News AI | Rahul Bhargava et.al. | 2512.16901 | translate | read | null |
| 2025-12-18 | Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image | Yushi Hu et.al. | 2512.16899 | translate | read | null |
| 2025-12-18 | LinkedOut: Linking World Knowledge Representation Out of Video LLM for Next-Generation Video Recommendation | Haichao Zhang et.al. | 2512.16891 | translate | read | null |
| 2025-12-18 | AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning | Tzu-Han Lin et.al. | 2512.16883 | translate | read | null |
| 2025-12-18 | TOGGLE: Temporal Logic-Guided Large Language Model Compression for Edge | Khurram Khalil et.al. | 2512.16855 | translate | read | null |
| 2025-12-18 | Meta-RL Induces Exploration in Language Agents | Yulun Jiang et.al. | 2512.16848 | translate | read | null |
| 2025-12-18 | Toward Systematic Counterfactual Fairness Evaluation of Large Language Models: The CAFFE Framework | Alessandra Parziale et.al. | 2512.16816 | translate | read | null |
| 2025-12-18 | From Facts to Conclusions : Integrating Deductive Reasoning in Retrieval-Augmented LLMs | Shubham Mishra et.al. | 2512.16795 | translate | read | null |
| 2025-12-18 | Inside Out: Uncovering How Comment Internalization Steers LLMs for Better or Worse | Aaron Imani et.al. | 2512.16790 | translate | read | null |
| 2025-12-18 | Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future | Tianshuai Hu et.al. | 2512.16760 | translate | read | null |
| 2025-12-18 | Plausibility as Failure: How LLMs and Humans Co-Construct Epistemic Error | Claudia Vale Oliveira et.al. | 2512.16750 | translate | read | null |
| 2025-12-18 | AI-Driven Prediction of Cancer Pain Episodes: A Hybrid Decision Support Approach | Yipeng Zhuang et.al. | 2512.16739 | translate | read | null |
| 2025-12-18 | Cyber Humanism in Education: Reclaiming Agency through AI and Learning Sciences | Giovanni Adorni et.al. | 2512.16701 | translate | read | null |
| 2025-12-18 | Do Multi-Agents Solve Better Than Single? Evaluating Agentic Frameworks for Diagram-Grounded Geometry Problem Solving and Reasoning | Mahbub E Sobhani et.al. | 2512.16698 | translate | read | null |
| 2025-12-18 | DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI | Hao Liang et.al. | 2512.16676 | translate | read | null |
| 2025-12-18 | Microsoft Academic Graph Information Retrieval for Research Recommendation and Assistance | Jacob Reiss et.al. | 2512.16661 | translate | read | null |
| 2025-12-18 | Prefix Probing: Lightweight Harmful Content Detection for Large Language Models | Jirui Yang et.al. | 2512.16650 | translate | read | null |
| 2025-12-18 | JustRL: Scaling a 1.5B LLM with a Simple RL Recipe | Bingxiang He et.al. | 2512.16649 | translate | read | null |
| 2025-12-18 | Stackelberg Learning from Human Feedback: Preference Optimization as a Sequential Game | Barna Pásztor et.al. | 2512.16626 | translate | read | null |
| 2025-12-18 | Refusal Steering: Fine-grained Control over LLM Refusal Behaviour for Sensitive Topics | Iker García-Ferrero et.al. | 2512.16602 | translate | read | null |
| 2025-12-18 | Muon is Provably Faster with Momentum Variance Reduction | Xun Qian et.al. | 2512.16598 | translate | read | null |
| 2025-12-18 | Sketch-in-Latents: Eliciting Unified Reasoning in MLLMs | Jintao Tong et.al. | 2512.16584 | translate | read | null |
| 2025-12-18 | Non-Asymptotic Global Convergence of PPO-Clip | Yin Liu et.al. | 2512.16565 | translate | read | null |
| 2025-12-18 | Needle in the Web: A Benchmark for Retrieving Targeted Web Pages in the Wild | Yumeng Wang et.al. | 2512.16553 | translate | read | null |
| 2025-12-18 | A Systematic Study of Code Obfuscation Against LLM-based Vulnerability Detection | Xiao Li et.al. | 2512.16538 | translate | read | null |
| 2025-12-18 | From Personalization to Prejudice: Bias and Discrimination in Memory-Enhanced AI Agents for Recruitment | Himanshu Gharat et.al. | 2512.16532 | translate | read | null |
| 2025-12-18 | Scaling Laws for Energy Efficiency of Local LLMs | Ander Alvarez et.al. | 2512.16531 | translate | read | null |
| 2025-12-18 | Plain language adaptations of biomedical text using LLMs: Comparision of evaluation metrics | Primoz Kocbek et.al. | 2512.16530 | translate | read | null |
| 2025-12-18 | Efficient CPU-GPU Collaborative Inference for MoE-based LLMs on Memory-Limited Systems | En-Ming Huang et.al. | 2512.16473 | translate | read | null |
| 2025-12-18 | cuPilot: A Strategy-Coordinated Multi-agent Framework for CUDA Kernel Evolution | Jinwu Chen et.al. | 2512.16465 | translate | read | null |
| 2025-12-18 | TimeSeries2Report prompting enables adaptive large language model management of lithium-ion batteries | Jiayang Yang et.al. | 2512.16453 | translate | read | null |
| 2025-12-18 | Towards AI-Supported Research: a Vision of the TIB AIssistant | Sören Auer et.al. | 2512.16447 | translate | read | null |
| 2025-12-18 | Topic Modelling Black Box Optimization | Roman Akramov et.al. | 2512.16445 | translate | read | null |
| 2025-12-18 | TIB AIssistant: a Platform for AI-Supported Research Across Research Life Cycles | Allard Oelen et.al. | 2512.16442 | translate | read | null |
| 2025-12-18 | From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection | Hao Li et.al. | 2512.16439 | translate | read | null |
| 2025-12-18 | Introducing ORKG ASK: an AI-driven Scholarly Literature Search and Exploration System Taking a Neuro-Symbolic Approach | Allard Oelen et.al. | 2512.16425 | translate | read | null |
| 2025-12-18 | Synthelite: Chemist-aligned and feasibility-aware synthesis planning with LLMs | Nguyen Xuan-Vu et.al. | 2512.16424 | translate | read | null |
| 2025-12-18 | Large Language Models as a (Bad) Security Norm in the Context of Regulation and Compliance | Kaspar Rosager Ludvigsen et.al. | 2512.16419 | translate | read | null |
| 2025-12-18 | BrepLLM: Native Boundary Representation Understanding with Large Language Models | Liyuan Deng et.al. | 2512.16413 | translate | read | null |
| 2025-12-18 | A Network Arena for Benchmarking AI Agents on Network Troubleshooting | Zhihao Wang et.al. | 2512.16381 | translate | read | null |
| 2025-12-18 | Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs | Sara Papi et.al. | 2512.16378 | translate | read | null |
| 2025-12-18 | Factorized Video Generation: Decoupling Scene Construction and Temporal Synthesis in Text-to-Video Diffusion Models | Mariam Hassan et.al. | 2512.16371 | translate | read | null |
| 2025-12-18 | AI Needs Physics More Than Physics Needs AI | Peter Coveney et.al. | 2512.16344 | translate | read | null |
| 2025-12-18 | Design and Evaluation of Cost-Aware PoQ for Decentralized LLM Inference | Arther Tian et.al. | 2512.16317 | translate | read | null |
| 2025-12-18 | Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation | Yuxuan Qiao et.al. | 2512.16310 | translate | read | null |
| 2025-12-18 | PixelArena: A benchmark for Pixel-Precision Visual Intelligence | Feng Liang et.al. | 2512.16303 | translate | read | null |
| 2025-12-18 | Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection | Fanrui Zhang et.al. | 2512.16300 | translate | read | null |
| 2025-12-18 | Feature-Selective Representation Misdirection for Machine Unlearning | Taozhao Chen et.al. | 2512.16297 | translate | read | null |
| 2025-12-18 | MACL: Multi-Label Adaptive Contrastive Learning Loss for Remote Sensing Image Retrieval | Amna Amir et.al. | 2512.16294 | translate | read | null |
| 2025-12-18 | Ein Typenrad auf der Überholspur: Die Kult-Schreibmaschine “Erika” trifft KI | Karola Köpferl et.al. | 2512.16293 | translate | read | null |
| 2025-12-18 | In-Context Probing for Membership Inference in Fine-Tuned Language Models | Zhexi Lu et.al. | 2512.16292 | translate | read | null |
| 2025-12-18 | Evaluating OpenAI GPT Models for Translation of Endangered Uralic Languages: A Comparison of Reasoning and Non-Reasoning Architectures | Yehor Tereshchenko et.al. | 2512.16287 | translate | read | null |
| 2025-12-18 | CKA-Guided Modular Quantization: Beyond Bit-Width to Algorithmic Diversity | Jinhao Zhang et.al. | 2512.16282 | translate | read | null |
| 2025-12-18 | Love, Lies, and Language Models: Investigating AI’s Role in Romance-Baiting Scams | Gilad Gressel et.al. | 2512.16280 | translate | read | null |
| 2025-12-18 | QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems | Yiliu Yang et.al. | 2512.16279 | translate | read | null |
| 2025-12-18 | Fast Collaborative Inference via Distributed Speculative Decoding | Ce Zheng et.al. | 2512.16273 | translate | read | null |
| 2025-12-18 | Beyond Blind Spots: Analytic Hints for Mitigating LLM-Based Evaluation Pitfalls | Ora Nova Fandina et.al. | 2512.16272 | translate | read | null |
| 2025-12-18 | Learning to Wait: Synchronizing Agents with the Physical World | Yifei She et.al. | 2512.16262 | translate | read | null |
| 2025-12-18 | AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding | Sanjoy Chowdhury et.al. | 2512.16250 | translate | read | null |
| 2025-12-18 | AlignMerge - Alignment-Preserving Large Language Model Merging via Fisher-Guided Geometric Constraints | Aniruddha Roy et.al. | 2512.16245 | translate | read | null |
| 2025-12-18 | Coarse-to-Fine Open-Set Graph Node Classification with Large Language Models | Xueqi Ma et.al. | 2512.16244 | translate | read | null |
| 2025-12-18 | Trustworthy and Controllable Professional Knowledge Utilization in Large Language Models with TEE-GPU Execution | Yifeng Cai et.al. | 2512.16238 | translate | read | null |
| 2025-12-18 | The Evolution of Reranking Models in Information Retrieval: From Heuristic Methods to Large Language Models | Tejul Pandit et.al. | 2512.16236 | translate | read | null |
| 2025-12-18 | LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding | Chenkai Xu et.al. | 2512.16229 | translate | read | link |
| 2025-12-18 | An Information-Theoretic Framework for Robust Large Language Model Editing | Qizhou Chen et.al. | 2512.16227 | translate | read | null |
| 2025-12-18 | DualGuard: Dual-stream Large Language Model Watermarking Defense against Paraphrase and Spoofing Attack | Hao Li et.al. | 2512.16182 | translate | read | null |
| 2025-12-18 | Ev-Trust: A Strategy Equilibrium Trust Mechanism for Evolutionary Games in LLM-Based Multi-Agent Services | Shiduo Yang et.al. | 2512.16167 | translate | read | null |
| 2025-12-18 | Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference | Jian Tian et.al. | 2512.16134 | translate | read | null |
| 2025-12-18 | Scaling Text2SQL via LLM-efficient Schema Filtering with Functional Dependency Graph Rerankers | Thanh Dat Hoang et.al. | 2512.16083 | translate | read | link |
| 2025-12-18 | Auto-Vocabulary 3D Object Detection | Haomeng Zhang et.al. | 2512.16077 | translate | read | null |
| 2025-12-18 | LLM4Perf: Large Language Models Are Effective Samplers for Multi-Objective Performance Modeling (Copy) | Xin Wang et.al. | 2512.16070 | translate | read | null |
| 2025-12-18 | A Multi-Agent Large Language Model Framework for Automated Qualitative Analysis | Qidi Xu et.al. | 2512.16063 | translate | read | null |
| 2025-12-18 | ContextLeak: Auditing Leakage in Private In-Context Learning Methods | Jacob Choi et.al. | 2512.16059 | translate | read | null |
| 2025-12-18 | MultiPath Transfer Engine: Breaking GPU and Host-Memory Bandwidth Bottlenecks in LLM Services | Lingfeng Tang et.al. | 2512.16056 | translate | read | null |
| 2025-12-17 | Topic Discovery and Classification for Responsible Generative AI Adaptation in Higher Education | Diane Myung-kyung Woodbridge et.al. | 2512.16036 | translate | read | null |
| 2025-12-17 | Do Large Language Models Know What They Don’t Know? Kalshibench: A New Benchmark for Evaluating Epistemic Calibration via Prediction Markets | Lukas Nel et.al. | 2512.16030 | translate | read | null |
| 2025-12-17 | Cross-Language Bias Examination in Large Language Models | Yuxuan Liang et.al. | 2512.16029 | translate | read | null |
| 2025-12-17 | Conversational Time Series Foundation Models: Towards Explainable and Effective Forecasting | Defu Cao et.al. | 2512.16022 | translate | read | null |
| 2025-12-17 | Few-Shot Inference of Human Perceptions of Robot Performance in Social Navigation Scenarios | Qiping Zhang et.al. | 2512.16019 | translate | read | null |
| 2025-12-17 | OLAF: Towards Robust LLM-Based Annotation Framework in Empirical Software Engineering | Mia Mohammad Imran et.al. | 2512.15979 | translate | read | null |
| 2025-12-17 | Dynamic Rank Reinforcement Learning for Adaptive Low-Rank Multi-Head Self Attention in Large Language Models | Caner Erden et.al. | 2512.15973 | translate | read | null |
| 2025-12-17 | BRAID: Bounded Reasoning for Autonomous Inference and Decisions | Armağan Amcalar et.al. | 2512.15959 | translate | read | null |
| 2025-12-17 | The Perceptual Observatory Characterizing Robustness and Grounding in MLLMs | Tejas Anvekar et.al. | 2512.15949 | translate | read | null |
| 2025-12-17 | Privacy Discourse and Emotional Dynamics in Mental Health Information Interaction on Reddit | Jai Kruthunz Naveen Kumar et.al. | 2512.15945 | translate | read | null |
| 2025-12-17 | Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning | Polaris Jhandi et.al. | 2512.15943 | translate | read | null |
| 2025-12-17 | City Navigation in the Wild: Exploring Emergent Navigation from Web-Scale Knowledge in MLLMs | Dwip Dalal et.al. | 2512.15933 | translate | read | null |
| 2025-12-17 | DSO: Direct Steering Optimization for Bias Mitigation | Lucas Monteiro Paes et.al. | 2512.15926 | translate | read | null |
| 2025-12-17 | Leveraging Spreading Activation for Improved Document Retrieval in Knowledge-Graph-Based RAG Systems | Jovan Pavlović et.al. | 2512.15922 | translate | read | null |
| 2025-12-17 | TabReX : Tabular Referenceless eXplainable Evaluation | Tejas Anvekar et.al. | 2512.15907 | translate | read | link |
| 2025-12-17 | Darth Vecdor: An Open-Source System for Generating Knowledge Graphs Through Large Language Model Queries | Jonathan A. Handler et.al. | 2512.15906 | translate | read | null |
| 2025-12-17 | PediatricAnxietyBench: Evaluating Large Language Model Safety Under Parental Anxiety and Pressure in Pediatric Consultations | Vahideh Zolfaghari et.al. | 2512.15894 | translate | read | null |
| 2025-12-17 | VET Your Agent: Towards Host-Independent Autonomy via Verifiable Execution Traces | Artem Grigor et.al. | 2512.15892 | translate | read | null |
| 2025-12-17 | Seeing Beyond Words: Self-Supervised Visual Learning for Multimodal Large Language Models | Davide Caffagni et.al. | 2512.15885 | translate | read | null |
| 2025-12-17 | HEPTAPOD: Orchestrating High Energy Physics Workflows Towards Autonomous Agency | Tony Menzo et.al. | 2512.15867 | translate | read | null |
| 2025-12-17 | Dynamic Rebatching for Efficient Early-Exit Inference with DREX | Xuting Liu et.al. | 2512.15705 | translate | read | null |
| 2025-12-17 | Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning | Yifei Li et.al. | 2512.15693 | translate | read | null |
| 2025-12-17 | Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning | Zhenwen Liang et.al. | 2512.15687 | translate | read | null |
| 2025-12-17 | Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers | Adam Karvonen et.al. | 2512.15674 | translate | read | null |
| 2025-12-17 | Explaining the Reasoning of Large Language Models Using Attribution Graphs | Chase Walker et.al. | 2512.15663 | translate | read | null |
| 2025-12-17 | Stepwise Think-Critique: A Unified Framework for Robust and Interpretable LLM Reasoning | Jiaqi Xu et.al. | 2512.15662 | translate | read | null |
| 2025-12-17 | How Much is Too Much? Exploring LoRA Rank Trade-offs for Retaining Knowledge and Domain Robustness | Darshita Rathore et.al. | 2512.15634 | translate | read | null |
| 2025-12-17 | Evaluating Metrics for Safety with LLM-as-Judges | Kester Clegg et.al. | 2512.15617 | translate | read | null |
| 2025-12-17 | Behavior Tokens Speak Louder: Disentangled Explainable Recommendation with Behavior Vocabulary | Xinshun Feng et.al. | 2512.15614 | translate | read | null |
| 2025-12-17 | Autoregressive Language Models are Secretly Energy-Based Models: Insights into the Lookahead Capabilities of Next-Token Prediction | Mathieu Blondel et.al. | 2512.15605 | translate | read | null |
| 2025-12-17 | Evaluating Large Language Models in Scientific Discovery | Zhangde Song et.al. | 2512.15567 | translate | read | null |
| 2025-12-17 | GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models | Bozhou Li et.al. | 2512.15560 | translate | read | null |
| 2025-12-17 | CTkvr: KV Cache Retrieval for Long-Context LLMs via Centroid then Token Indexing | Kuan Lu et.al. | 2512.15550 | translate | read | null |
| 2025-12-17 | When a Nation Speaks: Machine Learning and NLP in People’s Sentiment Analysis During Bangladesh’s 2024 Mass Uprising | Md. Samiul Alim et.al. | 2512.15547 | translate | read | null |
| 2025-12-17 | An Efficient and Effective Encoder Model for Vision and Language Tasks in the Remote Sensing Domain | João Daniel Silva et.al. | 2512.15531 | translate | read | null |
| 2025-12-17 | EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration | Daiqing Wu et.al. | 2512.15528 | translate | read | null |
| 2025-12-17 | How Do Semantically Equivalent Code Transformations Impact Membership Inference on LLMs for Code? | Hua Yang et.al. | 2512.15468 | translate | read | null |
| 2025-12-17 | On Assessing the Relevance of Code Reviews Authored by Generative Models | Robert Heumüller et.al. | 2512.15466 | translate | read | null |
| 2025-12-17 | Toward expert-level motivational interviewing for health behavior improvement with LLMs | Run-ze Hu et.al. | 2512.15446 | translate | read | null |
| 2025-12-17 | Step-GUI Technical Report | Haolong Yan et.al. | 2512.15431 | translate | read | null |
| 2025-12-17 | Can AI Generate more Comprehensive Test Scenarios? Review on Automated Driving Systems Test Scenario Generation Methods | Ji Zhou et.al. | 2512.15422 | translate | read | null |
| 2025-12-17 | Bilateral Spatial Reasoning about Street Networks: Graph-based RAG with Qualitative Spatial Representations | Reinhard Moratz et.al. | 2512.15388 | translate | read | null |
| 2025-12-17 | MedNuggetizer: Confidence-Based Information Nugget Extraction from Medical Documents | Gregor Donabauer et.al. | 2512.15384 | translate | read | null |
| 2025-12-17 | SCOPE: Prompt Evolution for Enhancing Agent Effectiveness | Zehua Pei et.al. | 2512.15374 | translate | read | null |
| 2025-12-17 | ArcBERT: An LLM-based Search Engine for Exploring Integrated Multi-Omics Metadata | Gajendra Doniparthi et.al. | 2512.15365 | translate | read | null |
| 2025-12-17 | Revisiting Task-Oriented Dataset Search in the Era of Large Language Models: Challenges, Benchmark, and Solution | Zixin Wei et.al. | 2512.15363 | translate | read | null |
| 2025-12-17 | Dual-Density Inference for Efficient Language Model Reasoning | Zhengyi Zhao et.al. | 2512.15358 | translate | read | null |
| 2025-12-17 | Adversarial versification in portuguese as a jailbreak operator in LLMs | Joao Queiroz et.al. | 2512.15353 | translate | read | null |
| 2025-12-17 | Exploring User Acceptance and Concerns toward LLM-powered Conversational Agents in Immersive Extended Reality | Efe Bozkir et.al. | 2512.15343 | translate | read | null |
| 2025-12-17 | Evaluating LLMs for Zeolite Synthesis Event Extraction (ZSEE): A Systematic Analysis of Prompting Strategies | Charan Prakash Rathore et.al. | 2512.15312 | translate | read | null |
| 2025-12-17 | SynthSeg-Agents: Multi-Agent Synthetic Data Generation for Zero-Shot Weakly Supervised Semantic Segmentation | Wangyu Wu et.al. | 2512.15310 | translate | read | null |
| 2025-12-17 | Towards Proactive Personalization through Profile Customization for Individual Users in Dialogues | Xiaotian Zhang et.al. | 2512.15302 | translate | read | null |
| 2025-12-17 | ChatGPT and Gemini participated in the Korean College Scholastic Ability Test – Earth Science I | Seok-Hyun Ga et.al. | 2512.15298 | translate | read | null |
| 2025-12-17 | Heterogeneous Model Alignment in Digital Twin | Faima Abbasi et.al. | 2512.15281 | translate | read | null |
| 2025-12-17 | Bounty Hunter: Autonomous, Comprehensive Emulation of Multi-Faceted Adversaries | Louis Hackländer-Jansen et.al. | 2512.15275 | translate | read | null |
| 2025-12-17 | Well Begun, Half Done: Reinforcement Learning with Prefix Optimization for LLM Reasoning | Yiliu Sun et.al. | 2512.15274 | translate | read | null |
| 2025-12-17 | Gaming the Arena: AI Model Evaluation and the Viral Capture of Attention | Sam Hind et.al. | 2512.15252 | translate | read | null |
| 2025-12-17 | The Moralization Corpus: Frame-Based Annotation and Analysis of Moralizing Speech Acts across Diverse Text Genres | Maria Becker et.al. | 2512.15248 | translate | read | null |
| 2025-12-17 | Null-LoRA: Low-Rank Adaptation on Null Space | Yi Zhang et.al. | 2512.15233 | translate | read | null |
| 2025-12-17 | CangLing-KnowFlow: A Unified Knowledge-and-Flow-fused Agent for Comprehensive Remote Sensing Applications | Zhengchao Chen et.al. | 2512.15231 | translate | read | null |
| 2025-12-17 | Yes-MT’s Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024 | Yash Bhaskar et.al. | 2512.15226 | translate | read | null |
| 2025-12-17 | RFKG-CoT: Relation-Driven Adaptive Hop-count Selection and Few-Shot Path Guidance for Knowledge-Aware QA | Chao Zhang et.al. | 2512.15219 | translate | read | null |
| 2025-12-17 | DEER: Draft with Diffusion, Verify with Autoregressive Models | Zicong Cheng et.al. | 2512.15176 | translate | read | null |
| 2025-12-17 | MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers | Xuanjun Zong et.al. | 2512.15163 | translate | read | null |
| 2025-12-17 | Offline Multi-Task Multi-Objective Data-Driven Evolutionary Algorithm with Language Surrogate Model and Implicit Q-Learning | Xian-Rong Zhang et.al. | 2512.15149 | translate | read | null |
| 2025-12-17 | Aligning Academia with Industry: An Empirical Study of Industrial Needs and Academic Capabilities in AI-Driven Software Engineering | Hang Yu et.al. | 2512.15148 | translate | read | null |
| 2025-12-17 | Beyond Majority Voting: Towards Fine-grained and More Reliable Reward Signal for Test-Time Reinforcement Learning | Weiqin Wang et.al. | 2512.15146 | translate | read | null |
| 2025-12-17 | “I am here for you”: How relational conversational AI appeals to adolescents, especially those who are socially and emotionally vulnerable | Pilyoung Kim et.al. | 2512.15117 | translate | read | null |
| 2025-12-17 | Uni-Parser Technical Report | Xi Fang et.al. | 2512.15098 | translate | read | null |
| 2025-12-17 | Beyond Fast and Slow: Cognitive-Inspired Elastic Reasoning for Large Language Models | Jinwu Hu et.al. | 2512.15089 | translate | read | null |
| 2025-12-17 | The Semantic Architect: How FEAML Bridges Structured Data and LLMs for Multi-Label Tasks | Wanfu Gao et.al. | 2512.15082 | translate | read | null |
| 2025-12-17 | Quantifying Return on Security Controls in LLM Systems | Richard Helder Moulton et.al. | 2512.15081 | translate | read | null |
| 2025-12-17 | An Exploratory Study of Bayesian Prompt Optimization for Test-Driven Code Generation with Large Language Models | Shlok Tomar et.al. | 2512.15076 | translate | read | null |
| 2025-12-17 | The Meta-Prompting Protocol: Orchestrating LLMs via Adversarial Feedback Loops | Fanzhe Fu et.al. | 2512.15053 | translate | read | null |
| 2025-12-17 | SGM: Safety Glasses for Multimodal Large Language Models via Neuron-Level Detoxification | Hongbo Wang et.al. | 2512.15052 | translate | read | null |
| 2025-12-17 | Beyond Accuracy: A Geometric Stability Analysis of Large Language Models in Chess Evaluation | Xidan Song et.al. | 2512.15033 | translate | read | null |
| 2025-12-17 | Toxicity Ahead: Forecasting Conversational Derailment on GitHub | Mia Mohammad Imran et.al. | 2512.15031 | translate | read | null |
| 2025-12-17 | SeBERTis: A Framework for Producing Classifiers of Security-Related Issue Reports | Sogol Masoumzadeh et.al. | 2512.15003 | translate | read | null |
| 2025-12-17 | DreamPRM-Code: Function-as-Step Process Reward Model with Label Correction for LLM Coding | Ruiyi Zhang et.al. | 2512.15000 | translate | read | null |
| 2025-12-17 | Evaluating Large Language Models on Multimodal Chemistry Olympiad Exams | Yiming Cui et.al. | 2512.14989 | translate | read | null |
| 2025-12-16 | EVICPRESS: Joint KV-Cache Compression and Eviction for Efficient LLM Serving | Shaoting Feng et.al. | 2512.14946 | translate | read | null |
| 2025-12-16 | Parameter Efficient Multimodal Instruction Tuning for Romanian Vision Language Models | George-Andrei Dima et.al. | 2512.14926 | translate | read | null |
| 2025-12-16 | Multiscale Aggregated Hierarchical Attention (MAHA): A Game Theoretic and Optimization Driven Approach to Efficient Contextual Modeling in Large Language Models | Caner Erden et.al. | 2512.14925 | translate | read | null |
| 2025-12-16 | Evaluating Code Reasoning Abilities of Large Language Models Under Real-World Settings | Changshu Liu et.al. | 2512.14917 | translate | read | null |
| 2025-12-16 | DrugRAG: Enhancing Pharmacy LLM Performance Through A Novel Retrieval-Augmented Generation Pipeline | Houman Kazemzadeh et.al. | 2512.14896 | translate | read | null |
| 2025-12-16 | Integrating Large Language Models and Knowledge Graphs to Capture Political Viewpoints in News Media | Massimiliano Fadda et.al. | 2512.14887 | translate | read | null |
| 2025-12-16 | Entropy-Reservoir Bregman Projection: An Information-Geometric Unification of Model Collapse | Jingwei Chen et.al. | 2512.14879 | translate | read | null |
| 2025-12-16 | Isolated Sign Language Recognition with Segmentation and Pose Estimation | Daniel Perkins et.al. | 2512.14876 | translate | read | null |
| 2025-12-16 | HERBench: A Benchmark for Multi-Evidence Integration in Video Question Answering | Dan Ben-Ami et.al. | 2512.14870 | translate | read | null |
| 2025-12-16 | MALCDF: A Distributed Multi-Agent LLM Framework for Real-Time Cyber | Arth Bhardwaj et.al. | 2512.14846 | translate | read | null |
| 2025-12-16 | Sharing State Between Prompts and Programs | Ellie Y. Cheng et.al. | 2512.14805 | translate | read | null |
| 2025-12-16 | Incentives or Ontology? A Structural Rebuttal to OpenAI’s Hallucination Thesis | Richard Ackermann et.al. | 2512.14801 | translate | read | null |
| 2025-12-16 | IaC Generation with LLMs: An Error Taxonomy and A Study on Configuration Knowledge Injection | Roman Nekrasov et.al. | 2512.14792 | translate | read | null |
| 2025-12-16 | TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs | Jun Zhang et.al. | 2512.14698 | translate | read | null |
| 2025-12-16 | Fast and Accurate Causal Parallel Decoding using Jacobi Forcing | Lanxiang Hu et.al. | 2512.14681 | translate | read | null |
| 2025-12-16 | EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models | Zechen Bai et.al. | 2512.14666 | translate | read | null |
| 2025-12-16 | Enhancing Visual Sentiment Analysis via Semiotic Isotopy-Guided Dataset Construction | Marco Blanchini et.al. | 2512.14665 | translate | read | null |
| 2025-12-16 | Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models | Chiyue Wei et.al. | 2512.14661 | translate | read | null |
| 2025-12-16 | Beyond Text-to-SQL: Autonomous Research-Driven Database Exploration with DAR | Ostap Vykhopen et.al. | 2512.14622 | translate | read | null |
| 2025-12-16 | PerProb: Indirectly Evaluating Memorization in Large Language Models | Yihan Liao et.al. | 2512.14600 | translate | read | null |
| 2025-12-16 | LLM-driven Knowledge Enhancement for Multimodal Cancer Survival Prediction | Chenyu Zhao et.al. | 2512.14594 | translate | read | null |
| 2025-12-16 | Towards Nepali-language LLMs: Efficient GPT training with a Nepali BPE tokenizer | Adarsha Shrestha et.al. | 2512.14585 | translate | read | null |
| 2025-12-16 | Pairwise Comparison for Bias Identification and Quantification | Fabian Haak et.al. | 2512.14565 | translate | read | null |
| 2025-12-16 | Polypersona: Persona-Grounded LLM for Synthetic Survey Responses | Tejaswani Dash et.al. | 2512.14562 | translate | read | null |
| 2025-12-16 | Agreement Between Large Language Models and Human Raters in Essay Scoring: A Research Synthesis | Hongli Li et.al. | 2512.14561 | translate | read | null |
| 2025-12-16 | CLNet: Cross-View Correspondence Makes a Stronger Geo-Localizationer | Xianwei Cao et.al. | 2512.14560 | translate | read | null |
| 2025-12-16 | VLegal-Bench: Cognitively Grounded Benchmark for Vietnamese Legal Reasoning of Large Language Models | Nguyen Tien Dong et.al. | 2512.14554 | translate | read | null |
| 2025-12-16 | VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse | Ying Nie et.al. | 2512.14531 | translate | read | null |
| 2025-12-16 | RecGPT-V2 Technical Report | Chao Yi et.al. | 2512.14503 | translate | read | null |
| 2025-12-16 | C-ing Clearly: Enhanced Binary Code Explanations using C code | Teodor Poncu et.al. | 2512.14500 | translate | read | null |
| 2025-12-16 | SASQ: Static Activation Scaling for Quantization-Aware Training in Large Language Models | Shizhuo Mao et.al. | 2512.14481 | translate | read | null |
| 2025-12-16 | Model-First Reasoning LLM Agents: Reducing Hallucinations through Explicit Problem Modeling | Annu Rana et.al. | 2512.14474 | translate | read | null |
| 2025-12-16 | Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space | Xingfu Zhou et.al. | 2512.14448 | translate | read | null |
| 2025-12-16 | Seismology modeling agent: A smart assistant for geophysical researchers | Yukun Ren et.al. | 2512.14429 | translate | read | null |
| 2025-12-16 | Effect of Document Packing on the Latent Multi-Hop Reasoning Capabilities of Large Language Models | Gabriele Prato et.al. | 2512.14427 | translate | read | null |
| 2025-12-16 | DISCODE: Distribution-Aware Score Decoder for Robust Automatic Evaluation of Image Captioning | Nakamasa Inoue et.al. | 2512.14420 | translate | read | null |
| 2025-12-16 | PortAgent: LLM-driven Vehicle Dispatching Agent for Port Terminals | Jia Hu et.al. | 2512.14417 | translate | read | null |
| 2025-12-16 | Massive Editing for Large Language Models Based on Dynamic Weight Generation | Wentao Wan et.al. | 2512.14395 | translate | read | null |
| 2025-12-16 | RePo: Language Models with Context Re-Positioning | Huayang Li et.al. | 2512.14391 | translate | read | null |
| 2025-12-16 | Multi-Agent Medical Decision Consensus Matrix System: An Intelligent Collaborative Framework for Oncology MDT Consultations | Xudong Han et.al. | 2512.14321 | translate | read | null |
| 2025-12-16 | Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity | Shuai Dong et.al. | 2512.14320 | translate | read | null |
| 2025-12-16 | Inflation Attitudes of Large Language Models | Nikoleta Anesti et.al. | 2512.14306 | translate | read | null |
| 2025-12-16 | Leveraging LLMs for Collaborative Ontology Engineering in Parkinson Disease Monitoring and Alerting | Georgios Bouchouras et.al. | 2512.14288 | translate | read | null |
| 2025-12-16 | The Trust in AI-Generated Health Advice (TAIGHA) Scale and Short Version (TAIGHA-S): Development and Validation Study | Marvin Kopka et.al. | 2512.14278 | translate | read | null |
| 2025-12-16 | SPARQL-LLM: Real-Time SPARQL Query Generation from Natural Language Questions | Panayiotis Smeros et.al. | 2512.14277 | translate | read | null |
| 2025-12-16 | Enhancing Visual Programming for Visual Reasoning via Probabilistic Graphs | Wentao Wan et.al. | 2512.14257 | translate | read | null |
| 2025-12-16 | TEMP: A Memory Efficient Physical-aware Tensor Partition-Mapping Framework on Wafer-scale Chips | Huizheng Wang et.al. | 2512.14256 | translate | read | null |
| 2025-12-16 | From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition | Yiqing Zhou et.al. | 2512.14244 | translate | read | null |
| 2025-12-16 | Two CFG Nahuatl for automatic corpora expansion | Juan-José Guzmán-Landa et.al. | 2512.14239 | translate | read | null |
| 2025-12-16 | Ladder Up, Memory Down: Low-Cost Fine-Tuning With Side Nets | Estelle Zheng et.al. | 2512.14237 | translate | read | null |
| 2025-12-16 | PentestEval: Benchmarking LLM-based Penetration Testing with Modular and Stage-Level Design | Ruozhao Yang et.al. | 2512.14233 | translate | read | null |
| 2025-12-16 | Georeferencing complex relative locality descriptions with large language models | Aneesha Fernando et.al. | 2512.14228 | translate | read | null |
| 2025-12-16 | Estimating problem difficulty without ground truth using Large Language Model comparisons | Marthe Ballon et.al. | 2512.14220 | translate | read | null |
| 2025-12-16 | IntentMiner: Intent Inversion Attack via Tool Call Analysis in the Model Context Protocol | Yunhao Yao et.al. | 2512.14166 | translate | read | null |
| 2025-12-16 | Adaptive Cache Pollution Control for Large Language Model Inference Workloads Using Temporal CNN-Based Prediction and Priority-Aware Replacement | Songze Liu et.al. | 2512.14151 | translate | read | null |
| 2025-12-16 | Astraea: A State-Aware Scheduling Engine for LLM-Powered Agents | Hongqiu Ni et.al. | 2512.14142 | translate | read | null |
| 2025-12-16 | TorchTraceAP: A New Benchmark Dataset for Detecting Performance Anti-Patterns in Computer Vision Models | Hanning Chen et.al. | 2512.14141 | translate | read | null |
| 2025-12-16 | LAPPI: Interactive Optimization with LLM-Assisted Preference-Based Problem Instantiation | So Kuroki et.al. | 2512.14138 | translate | read | null |
| 2025-12-16 | SportsGPT: An LLM-driven Framework for Interpretable Sports Motion Assessment and Training Guidance | Wenbo Tian et.al. | 2512.14121 | translate | read | null |
| 2025-12-16 | CogMem: A Cognitive Memory Architecture for Sustained Multi-Turn Reasoning in Large Language Models | Yiran Zhang et.al. | 2512.14118 | translate | read | null |
| 2025-12-16 | Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries | Emanuele Mezzi et.al. | 2512.14102 | translate | read | null |
| 2025-12-16 | A First-Order Logic-Based Alternative to Reward Models in RLHF | Chunjin Jian et.al. | 2512.14100 | translate | read | null |
| 2025-12-16 | Cornserve: Efficiently Serving Any-to-Any Multimodal Models | Jeff J. Ma et.al. | 2512.14098 | translate | read | null |
| 2025-12-16 | A Unified Sparse Attention via Multi-Granularity Compression | Siran Liu et.al. | 2512.14082 | translate | read | null |
| 2025-12-16 | From Obfuscated to Obvious: A Comprehensive JavaScript Deobfuscation Tool for Security Analysis | Dongchao Zhou et.al. | 2512.14070 | translate | read | null |
| 2025-12-16 | RADAR: Accelerating Large Language Model Inference With RL-Based Dynamic Draft Trees | Junjie Ma et.al. | 2512.14069 | translate | read | null |
| 2025-12-16 | What Affects the Effective Depth of Large Language Models? | Yi Hu et.al. | 2512.14064 | translate | read | null |
| 2025-12-16 | HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices | HyperAI Team et.al. | 2512.14052 | translate | read | null |
| 2025-12-16 | OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value | Mengzhang Cai et.al. | 2512.14051 | translate | read | null |
| 2025-12-16 | Intention Chain-of-Thought Prompting with Dynamic Routing for Code Generation | Shen Li et.al. | 2512.14048 | translate | read | null |
| 2025-12-16 | Evaluating Small Language Models for Agentic On-Farm Decision Support Systems | Enhong Liu et.al. | 2512.14043 | translate | read | null |
| 2025-12-16 | ChartAgent: A Chart Understanding Framework with Tool Integrated Reasoning | Boran Wang et.al. | 2512.14040 | translate | read | null |
| 2025-12-16 | PerfCoder: Large Language Models for Interpretable Code Performance Optimization | Jiuding Yang et.al. | 2512.14018 | translate | read | null |
| 2025-12-16 | KFS-Bench: Comprehensive Evaluation of Key Frame Sampling in Long Video Understanding | Zongyao Li et.al. | 2512.14017 | translate | read | null |
| 2025-12-16 | Sparsity-Controllable Dynamic Top-p MoE for Large Foundation Model Pre-training | Can Jin et.al. | 2512.13996 | translate | read | null |
| 2025-12-16 | Structure-Aware Decoding Mechanisms for Complex Entity Extraction with Large-Scale Language Models | Zhimin Qiu et.al. | 2512.13980 | translate | read | null |
| 2025-12-16 | ReflCtrl: Controlling LLM Reflection via Representation Engineering | Ge Yan et.al. | 2512.13979 | translate | read | null |
| 2025-12-16 | Evaluating Frontier LLMs on PhD-Level Mathematical Reasoning: A Benchmark on a Textbook in Theoretical Computer Science about Randomized Algorithms | Yang Cao et.al. | 2512.13978 | translate | read | null |
| 2025-12-16 | Autonomous Construction-Site Safety Inspection Using Mobile Robots: A Multilayer VLM-LLM Pipeline | Hossein Naderi et.al. | 2512.13974 | translate | read | null |
| 2025-12-15 | Informing Acquisition Functions via Foundation Models for Molecular Discovery | Qi Chen et.al. | 2512.13935 | translate | read | null |
| 2025-12-15 | Hierarchical Multi-agent Large Language Model Reasoning for Autonomous Functional Materials Discovery | Samuel Rothfarb et.al. | 2512.13930 | translate | read | null |
| 2025-12-15 | Context Branching for LLM Conversations: A Version Control Approach to Exploratory Programming | Bhargav Chickmagalur Nanjundappa et.al. | 2512.13914 | translate | read | null |
| 2025-12-15 | FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition | Jonas Golde et.al. | 2512.13884 | translate | read | null |
| 2025-12-15 | Verification-Guided Context Optimization for Tool Calling via Hierarchical LLMs-as-Editors | Henger Li et.al. | 2512.13860 | translate | read | null |
| 2025-12-15 | EvoLattice: Persistent Internal-Population Evolution through Multi-Alternative Quality-Diversity Graph Representations for LLM-Guided Program Discovery | Kamer Ali Yuksel et.al. | 2512.13857 | translate | read | null |
| 2025-12-15 | Practitioner Insights on Fairness Requirements in the AI Development Life Cycle: An Interview Study | Chaima Boufaied et.al. | 2512.13830 | translate | read | null |
| 2025-12-15 | The Double Life of Code World Models: Provably Unmasking Malicious Behavior Through Execution Traces | Subramanyam Sahoo et.al. | 2512.13821 | translate | read | null |
| 2025-12-15 | State-Dependent Refusal and Learned Incapacity in RLHF-Aligned Language Models | TK Lee et.al. | 2512.13762 | translate | read | null |
| 2025-12-15 | A Scientific Reasoning Model for Organic Synthesis Procedure Generation | Guoqing Liu et.al. | 2512.13668 | translate | read | null |
| 2025-12-15 | Embedding-Based Rankings of Educational Resources based on Learning Outcome Alignment: Benchmarking, Expert Validation, and Learner Performance | Mohammadreza Molavi et.al. | 2512.13658 | translate | read | null |
| 2025-12-15 | Comparative Analysis of LLM Abliteration Methods: A Cross-Architecture Evaluation | Richard J. Young et.al. | 2512.13655 | translate | read | null |
| 2025-12-15 | Large-Language Memorization During the Classification of United States Supreme Court Cases | John E. Ortega et.al. | 2512.13654 | translate | read | null |
| 2025-12-15 | MindDrive: A Vision-Language-Action Model for Autonomous Driving via Online Reinforcement Learning | Haoyu Fu et.al. | 2512.13636 | translate | read | null |
| 2025-12-15 | Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models | Zefang Liu et.al. | 2512.13618 | translate | read | null |
| 2025-12-15 | Textual Gradients are a Flawed Metaphor for Automatic Prompt Optimization | Daniel Melcer et.al. | 2512.13598 | translate | read | null |
| 2025-12-15 | ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding | Jia-Nan Li et.al. | 2512.13586 | translate | read | null |
| 2025-12-15 | MMhops-R1: Multimodal Multi-hop Reasoning | Tao Zhang et.al. | 2512.13573 | translate | read | null |
| 2025-12-15 | PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation | Hour Kaing et.al. | 2512.13552 | translate | read | null |
| 2025-12-15 | Fine-tuned LLM-based Code Migration Framework | Oleg Grynets et.al. | 2512.13515 | translate | read | null |
| 2025-12-15 | MedCEG: Reinforcing Verifiable Medical Reasoning with Critical Evidence Graph | Linjie Mu et.al. | 2512.13510 | translate | read | null |
| 2025-12-15 | SkipCat: Rank-Maximized Low-Rank Compression of Large Language Models via Shared Projection and Block Skipping | Yu-Chen Lu et.al. | 2512.13494 | translate | read | null |
| 2025-12-15 | From Zipf’s Law to Neural Scaling through Heaps’ Law and Hilberg’s Hypothesis | Łukasz Dębowski et.al. | 2512.13491 | translate | read | null |
| 2025-12-15 | neuralFOMO: Can LLMs Handle Being Second Best? Measuring Envy-Like Preferences in Multi-Agent Settings | Ojas Pungalia et.al. | 2512.13481 | translate | read | null |
| 2025-12-15 | Non-Resolution Reasoning (NRR): A Computational Framework for Contextual Identity and Ambiguity Preservation | Kei Saito et.al. | 2512.13478 | translate | read | null |
| 2025-12-15 | Scaling Laws for Code: Every Programming Language Matters | Jian Yang et.al. | 2512.13472 | translate | read | null |
| 2025-12-15 | Large language models are not about natural language | Johan J. Bolhuis et.al. | 2512.13441 | translate | read | null |
| 2025-12-15 | From User Interface to Agent Interface: Efficiency Optimization of UI Representations for LLM Agents | Dezhi Ran et.al. | 2512.13438 | translate | read | null |
| 2025-12-15 | Behavior and Representation in Large Language Models for Combinatorial Optimization: From Feature Extraction to Algorithm Selection | Francesca Da Ros et.al. | 2512.13374 | translate | read | null |
| 2025-12-15 | Detecting Emotion Drift in Mental Health Text Using Pre-Trained Transformers | Shibani Sankpal et.al. | 2512.13363 | translate | read | null |
| 2025-12-15 | UCRBench: Benchmarking LLMs on Use Case Recovery | Shuyuan Xiao et.al. | 2512.13360 | translate | read | null |
| 2025-12-15 | On the Effectiveness of Membership Inference in Targeted Data Extraction from Large Language Models | Ali Al Sahili et.al. | 2512.13352 | translate | read | null |
| 2025-12-15 | FROC: A Unified Framework with Risk-Optimized Control for Machine Unlearning in LLMs | Si Qi Goh et.al. | 2512.13337 | translate | read | null |
| 2025-12-15 | FIN-bench-v2: A Unified and Robust Benchmark Suite for Evaluating Finnish Large Language Models | Joona Kytöniemi et.al. | 2512.13330 | translate | read | null |
| 2025-12-15 | Security and Detectability Analysis of Unicode Text Watermarking Methods Against Large Language Models | Malte Hellmeier et.al. | 2512.13325 | translate | read | null |
| 2025-12-15 | KlingAvatar 2.0 Technical Report | Kling Team et.al. | 2512.13313 | translate | read | null |
| 2025-12-15 | MiniLingua: A Small Open-Source LLM for European Languages | Anna Aksenova et.al. | 2512.13298 | translate | read | null |
| 2025-12-15 | AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning | Jiaru Zou et.al. | 2512.13278 | translate | read | null |
| 2025-12-15 | CogniEdit: Dense Gradient Flow Optimization for Fine-Grained Image Editing | Yan Li et.al. | 2512.13276 | translate | read | null |
| 2025-12-15 | Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection | Juil Koo et.al. | 2512.13250 | translate | read | null |
| 2025-12-15 | Ego-EXTRA: video-language Egocentric Dataset for EXpert-TRAinee assistance | Francesco Ragusa et.al. | 2512.13238 | translate | read | null |
| 2025-12-15 | Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models | Chendong Sun et.al. | 2512.13194 | translate | read | null |
| 2025-12-15 | Integrated Semantic and Temporal Alignment for Interactive Video Retrieval | Thanh-Danh Luu et.al. | 2512.13169 | translate | read | null |
| 2025-12-15 | A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis | Xianchao Guan et.al. | 2512.13164 | translate | read | null |
| 2025-12-15 | Can AI Understand What We Cannot Say? Measuring Multilevel Alignment Through Abortion Stigma Across Cognitive, Interpersonal, and Structural Levels | Anika Sharma et.al. | 2512.13142 | translate | read | null |
| 2025-12-15 | Uncovering the Role of Initial Saliency in U-Shaped Attention Bias: Scaling Initial Token Weight for Enhanced Long-Text Processing | Zewen Qiang et.al. | 2512.13109 | translate | read | null |
| 2025-12-15 | Socratic Students: Teaching Language Models to Learn by Asking Questions | Rajeev Bhatt Ambati et.al. | 2512.13102 | translate | read | null |
| 2025-12-15 | A Simple and Effective Framework for Symmetric Consistent Indexing in Large-Scale Dense Retrieval | Huimu Wang et.al. | 2512.13074 | translate | read | null |
| 2025-12-15 | M-GRPO: Stabilizing Self-Supervised Reinforcement Learning for Large Language Models with Momentum-Anchored Policy Optimization | Bizhe Bai et.al. | 2512.13070 | translate | read | null |
| 2025-12-15 | LLM Rationalis? Measuring Bargaining Capabilities of AI Negotiators | Cheril Shah et.al. | 2512.13063 | translate | read | null |
| 2025-12-15 | An Open and Reproducible Deep Research Agent for Long-Form Question Answering | Ikuya Yamada et.al. | 2512.13059 | translate | read | null |
| 2025-12-15 | Sharpen the Spec, Cut the Code: A Case for Generative File System with SYSSPEC | Qingyuan Liu et.al. | 2512.13047 | translate | read | null |
| 2025-12-15 | Understanding Structured Financial Data with LLMs: A Case Study on Fraud Detection | Xuwei Tan et.al. | 2512.13040 | translate | read | null |
| 2025-12-15 | Large Language Models for Power System Applications: A Comprehensive Literature Survey | Muhammad Sarwar et.al. | 2512.13004 | translate | read | null |
| 2025-12-15 | Are Large Language Models Really Effective for Training-Free Cold-Start Recommendation? | Genki Kusano et.al. | 2512.13001 | translate | read | null |
| 2025-12-15 | Reveal Hidden Pitfalls and Navigate Next Generation of Vector Similarity Search from Task-Centric Views | Tingyang Chen et.al. | 2512.12980 | translate | read | null |
| 2025-12-15 | Do Reviews Matter for Recommendations in the Era of Large Language Models? | Chee Heng Tan et.al. | 2512.12978 | translate | read | null |
| 2025-12-15 | Authors Should Annotate | Marcus Ma et.al. | 2512.12976 | translate | read | null |
| 2025-12-15 | Database Research needs an Abstract Relational Query Language | Wolfgang Gatterbauer et.al. | 2512.12957 | translate | read | null |
| 2025-12-15 | Building from Scratch: A Multi-Agent Framework with Human-in-the-Loop for Multilingual Legal Terminology Mapping | Lingyi Meng et.al. | 2512.12950 | translate | read | null |
| 2025-12-15 | SPAR: Session-based Pipeline for Adaptive Retrieval on Legacy File Systems | Duy A. Nguyen et.al. | 2512.12938 | translate | read | null |
| 2025-12-15 | PROSERVE: Unified Multi-Priority Request Scheduling for LLM Serving | Weizhe Huang et.al. | 2512.12928 | translate | read | null |
| 2025-12-15 | Interpretable Hypothesis-Driven Trading: A Rigorous Walk-Forward Validation Framework for Market Microstructure Signals | Gagan Deep et.al. | 2512.12924 | translate | read | null |
| 2025-12-15 | LLM-based Personalized Portfolio Recommender: Integrating Large Language Models and Reinforcement Learning for Intelligent Investment Strategy Optimization | Bangyu Li et.al. | 2512.12922 | translate | read | null |
| 2025-12-15 | Cisco Integrated AI Security and Safety Framework Report | Amy Chang et.al. | 2512.12921 | translate | read | null |
| 2025-12-15 | CTIGuardian: A Few-Shot Framework for Mitigating Privacy Leakage in Fine-Tuned LLMs | Shashie Dilhara Batan Arachchige et.al. | 2512.12914 | translate | read | null |
| 2025-12-14 | SignRAG: A Retrieval-Augmented System for Scalable Zero-Shot Road Sign Recognition | Minghao Zhu et.al. | 2512.12885 | translate | read | null |
| 2025-12-14 | ERA-IT: Aligning Semantic Models with Revealed Economic Preference for Real-Time and Explainable Patent Valuation | Yoo Yongmin et.al. | 2512.12869 | translate | read | null |
| 2025-12-14 | Counting Clues: A Lightweight Probabilistic Baseline Can Match an LLM | Furong Jia et.al. | 2512.12868 | translate | read | null |
| 2025-12-14 | Information-Consistent Language Model Recommendations through Group Relative Policy Optimization | Sonal Prabhune et.al. | 2512.12858 | translate | read | null |
| 2025-12-14 | Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, LLaMA | Hanyu Cai et.al. | 2512.12812 | translate | read | null |
| 2025-12-14 | Fault-Tolerant Sandboxing for AI Coding Agents: A Transactional Approach to Safe Autonomous Execution | Boyang Yan et.al. | 2512.12806 | translate | read | null |
| 2025-12-14 | A Disproof of Large Language Model Consciousness: The Necessity of Continual Learning for Consciousness | Erik Hoel et.al. | 2512.12802 | translate | read | null |
| 2025-12-14 | Fine-Grained Energy Prediction For Parallellized LLM Inference With PIE-P | Anurag Dutt et.al. | 2512.12801 | translate | read | null |
| 2025-12-14 | DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning | Zhe Liu et.al. | 2512.12799 | translate | read | null |
| 2025-12-14 | A Rule-Aware Prompt Framework for Structured Numeric Reasoning in Cyber-Physical Systems | Yichen Liu et.al. | 2512.12794 | translate | read | null |
| 2025-12-14 | Beyond Task Completion: An Assessment Framework for Evaluating Agentic AI Systems | Sreemaee Akshathala et.al. | 2512.12791 | translate | read | null |
| 2025-12-14 | State over Tokens: Characterizing the Role of Reasoning Tokens | Mosh Levy et.al. | 2512.12777 | translate | read | null |
| 2025-12-14 | Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions | Pedro Henrique Luz de Araujo et.al. | 2512.12775 | translate | read | null |
| 2025-12-14 | JointAVBench: A Benchmark for Joint Audio-Visual Reasoning Evaluation | Jianghan Chao et.al. | 2512.12772 | translate | read | null |
| 2025-12-14 | Adaptive Edge-Cloud Inference for Speech-to-Action Systems Using ASR and Large Language Models (ASTA) | Mohammad Jalili Torkamani et.al. | 2512.12769 | translate | read | null |
| 2025-12-14 | Intelligent Scientific Literature Explorer using Machine Learning (ISLE) | Sina Jani et.al. | 2512.12760 | translate | read | null |
| 2025-12-14 | FysicsWorld: A Unified Full-Modality Benchmark for Any-to-Any Understanding, Generation, and Reasoning | Yue Jiang et.al. | 2512.12756 | translate | read | null |
| 2025-12-14 | Resting Neurons, Active Insights: Improving Input Sparsification for Large Language Models | Haotian Xu et.al. | 2512.12744 | translate | read | null |
| 2025-12-14 | CoDA: A Context-Decoupled Hierarchical Agent with Reinforcement Learning | Xuanzhang Liu et.al. | 2512.12716 | translate | read | null |
| 2025-12-14 | Synergizing Code Coverage and Gameplay Intent: Coverage-Aware Game Playtesting with LLM-Guided Reinforcement Learning | Enhong Mu et.al. | 2512.12706 | translate | read | null |
| 2025-12-14 | Hybrid Retrieval-Augmented Generation for Robust Multilingual Document Question Answering | Anthony Mudet et.al. | 2512.12694 | translate | read | null |
| 2025-12-14 | Memoria: A Scalable Agentic Memory Framework for Personalized Conversational AI | Samarth Sarin et.al. | 2512.12686 | translate | read | null |
| 2025-12-14 | Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches | Amirhossein Yousefiramandi et.al. | 2512.12677 | translate | read | null |
| 2025-12-14 | LexRel: Benchmarking Legal Relation Extraction for Chinese Civil Cases | Yida Cai et.al. | 2512.12643 | translate | read | null |
| 2025-12-14 | DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model | Zhou Tao et.al. | 2512.12633 | translate | read | null |
| 2025-12-14 | ORIBA: Exploring LLM-Driven Role-Play Chatbot as a Creativity Support Tool for Original Character Artists | Yuqian Sun et.al. | 2512.12630 | translate | read | null |
| 2025-12-14 | Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space | Chengzhi Liu et.al. | 2512.12623 | translate | read | null |
| 2025-12-14 | Understanding Syllogistic Reasoning in LLMs from Formal and Natural Language Perspectives | Aheli Poddar et.al. | 2512.12620 | translate | read | null |
| 2025-12-14 | Patch-wise Retrieval: A Bag of Practical Techniques for Instance-level Matching | Wonseok Choi et.al. | 2512.12610 | translate | read | null |
| 2025-12-14 | Human-Inspired Learning for Large Language Models via Obvious Record and Maximum-Entropy Method Discovery | Hong Su et.al. | 2512.12608 | translate | read | null |
| 2025-12-14 | Vision-Enhanced Large Language Models for High-Resolution Image Synthesis and Multimodal Data Interpretation | Karthikeya KV et.al. | 2512.12595 | translate | read | null |
| 2025-12-14 | Beyond Static Scoring: Enhancing Assessment Validity via AI-Generated Interactive Verification | Tom Lee et.al. | 2512.12592 | translate | read | null |
| 2025-12-14 | StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding | Xinqi Jin et.al. | 2512.12560 | translate | read | null |
| 2025-12-14 | Large Language Newsvendor: Decision Biases and Cognitive Mechanisms | Jifei Liu et.al. | 2512.12552 | translate | read | null |
| 2025-12-14 | HyperEdit: Unlocking Instruction-based Text Editing in LLMs via Hypernetworks | Yiming Zeng et.al. | 2512.12544 | translate | read | null |
| 2025-12-14 | NagaNLP: Bootstrapping NLP for Low-Resource Nagamese Creole with Human-in-the-Loop Synthetic Data | Agniva Maiti et.al. | 2512.12537 | translate | read | null |
| 2025-12-14 | Diverse LLMs vs. Vulnerabilities: Who Detects and Fixes Them Better? | Arastoo Zibaeirad et.al. | 2512.12536 | translate | read | null |
| 2025-12-14 | ATLAS: Automated Tree-based Language Analysis System for C and C++ source programs | Jaid Monwar Chowdhury et.al. | 2512.12507 | translate | read | null |
| 2025-12-14 | KidsArtBench: Multi-Dimensional Children’s Art Evaluation with Attribute-Aware MLLMs | Mingrui Ye et.al. | 2512.12503 | translate | read | null |
| 2025-12-14 | Explainable AI as a Double-Edged Sword in Dermatology: The Impact on Clinicians versus The Public | Xuhai Xu et.al. | 2512.12500 | translate | read | null |
| 2025-12-13 | The American Ghost in the Machine: How language models align culturally and the effects of cultural prompting | James Luther et.al. | 2512.12488 | translate | read | null |
| 2025-12-13 | HetRL: Efficient Reinforcement Learning for LLMs in Heterogeneous Environments | Yongjun He et.al. | 2512.12476 | translate | read | null |
| 2025-12-13 | Large language models have learned to use language | Gary Lupyan et.al. | 2512.12447 | translate | read | null |
| 2025-12-13 | Can GPT replace human raters? Validity and reliability of machine-generated norms for metaphors | Veronica Mangiaterra et.al. | 2512.12444 | translate | read | null |
| 2025-12-11 | Towards Efficient and Effective Multi-Camera Encoding for End-to-End Driving | Jiawei Yang et.al. | 2512.10947 | translate | read | null |
| 2025-12-11 | FoundationMotion: Auto-Labeling and Reasoning about Spatial Movement in Videos | Yulu Gan et.al. | 2512.10927 | translate | read | null |
| 2025-12-11 | SparseSwaps: Tractable LLM Pruning Mask Refinement at Scale | Max Zimmer et.al. | 2512.10922 | translate | read | null |
| 2025-12-11 | CompanionCast: A Multi-Agent Conversational AI Framework with Spatial Audio for Social Co-Viewing Experiences | Yiyang Wang et.al. | 2512.10918 | translate | read | null |
| 2025-12-11 | Multi-Granular Node Pruning for Circuit Discovery | Muhammad Umair Haider et.al. | 2512.10903 | translate | read | null |
| 2025-12-11 | LLMs Can Assist with Proposal Selection at Large User Facilities | Lijie Ding et.al. | 2512.10895 | translate | read | null |
| 2025-12-11 | Computational emotion analysis with multimodal LLMs: Current evidence on an emerging methodological opportunity | Hauke Licht et.al. | 2512.10882 | translate | read | null |
| 2025-12-11 | Quantifying Emotional Tone in Tolkien’s The Hobbit: Dialogue Sentiment Analysis with RegEx, NRC-VAD, and Python | Lilin Qiu et.al. | 2512.10865 | translate | read | null |
| 2025-12-11 | Large Language Models for Superconductor Discovery | Suman Itani et.al. | 2512.10847 | translate | read | null |
| 2025-12-11 | LabelFusion: Learning to Fuse LLMs and Transformer Classifiers for Robust Text Classification | Michael Schlee et.al. | 2512.10793 | translate | read | null |
| 2025-12-11 | The FACTS Leaderboard: A Comprehensive Benchmark for Large Language Model Factuality | Aileen Cheng et.al. | 2512.10791 | translate | read | null |
| 2025-12-11 | Natural Language Interface for Firewall Configuration | F. Taghiyev et.al. | 2512.10789 | translate | read | null |
| 2025-12-11 | Developing and Evaluating a Large Language Model-Based Automated Feedback System Grounded in Evidence-Centered Design for Supporting Physics Problem Solving | Holger Maus et.al. | 2512.10785 | translate | read | null |
| 2025-12-11 | Script Gap: Evaluating LLM Triage on Indian Languages in Native vs Roman Scripts in a Real World Setting | Manurag Khullar et.al. | 2512.10780 | translate | read | null |
| 2025-12-11 | OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification | Zijian Wu et.al. | 2512.10756 | translate | read | null |
| 2025-12-11 | LDP: Parameter-Efficient Fine-Tuning of Multimodal LLM for Medical Report Generation | Tianyu Zhou et.al. | 2512.10750 | translate | read | null |
| 2025-12-11 | Echoes of Automation: How Bots Shaped Political Discourse in Brazil | Merve Ipek Bal et.al. | 2512.10749 | translate | read | null |
| 2025-12-11 | TRIDENT: A Redundant Architecture for Caribbean-Accented Emergency Speech Triage | Elroy Galbraith et.al. | 2512.10741 | translate | read | null |
| 2025-12-11 | Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving | Songyang Gao et.al. | 2512.10739 | translate | read | null |
| 2025-12-11 | Textual Data Bias Detection and Mitigation - An Extensible Pipeline with Experimental Evaluation | Rebekka Görge et.al. | 2512.10734 | translate | read | null |
| 2025-12-11 | IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Generation | Yuan-Ming Li et.al. | 2512.10730 | translate | read | link |
| 2025-12-11 | Beyond the Black Box: Identifiable Interpretation and Control in Generative Models via Causal Minimality | Lingjing Kong et.al. | 2512.10720 | translate | read | null |
| 2025-12-11 | PACIFIC: a framework for generating benchmarks to check Precise Automatically Checked Instruction Following In Code | Itay Dreyfuss et.al. | 2512.10713 | translate | read | null |
| 2025-12-11 | COMPARE: Clinical Optimization with Modular Planning and Assessment via RAG-Enhanced AI-OCT: Superior Decision Support for Percutaneous Coronary Intervention Compared to ChatGPT-5 and Junior Operators | Wei Fang et.al. | 2512.10702 | translate | read | null |
| 2025-12-11 | Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution | Zouying Cao et.al. | 2512.10696 | translate | read | null |
| 2025-12-11 | Challenges of Evaluating LLM Safety for User Welfare | Manon Kempermann et.al. | 2512.10687 | translate | read | null |
| 2025-12-11 | On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity | Muhua Huang et.al. | 2512.10665 | translate | read | null |
| 2025-12-11 | Token Sample Complexity of Attention | Léa Bohbot et.al. | 2512.10656 | translate | read | null |
| 2025-12-11 | TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection | Jian-Yu Jiang-Lin et.al. | 2512.10652 | translate | read | null |
| 2025-12-11 | From Data Scarcity to Data Care: Reimagining Language Technologies for Serbian and other Low-Resource Languages | Smiljana Antonijevic Ubois et.al. | 2512.10630 | translate | read | null |
| 2025-12-11 | AgriGPT-Omni: A Unified Speech-Vision-Text Framework for Multilingual Agricultural Intelligence | Bo Yang et.al. | 2512.10624 | translate | read | null |
| 2025-12-11 | Phythesis: Physics-Guided Evolutionary Scene Synthesis for Energy-Efficient Data Center Design via LLMs | Minghao LI et.al. | 2512.10611 | translate | read | null |
| 2025-12-11 | Multi-Objective Reward and Preference Optimization: Theory and Algorithms | Akhil Agnihotri et.al. | 2512.10601 | translate | read | null |
| 2025-12-11 | Beyond Pixels: A Training-Free, Text-to-Text Framework for Remote Sensing Image Retrieval | J. Xiao et.al. | 2512.10596 | translate | read | null |
| 2025-12-11 | RoleRMBench & RoleRM: Towards Reward Modeling for Profile-Based Role Play in Dialogue Systems | Hang Ding et.al. | 2512.10575 | translate | read | null |
| 2025-12-11 | NormCode: A Semi-Formal Language for Context-Isolated AI Planning | Xin Guan et.al. | 2512.10563 | translate | read | null |
| 2025-12-11 | Causal Reasoning Favors Encoders: On The Limits of Decoder-Only Models | Amartya Roy et.al. | 2512.10561 | translate | read | null |
| 2025-12-11 | Grounding Everything in Tokens for Multimodal Large Language Models | Xiangxuan Ren et.al. | 2512.10554 | translate | read | null |
| 2025-12-11 | LLM-Auction: Generative Auction towards LLM-Native Advertising | Chujie Zhao et.al. | 2512.10551 | translate | read | null |
| 2025-12-11 | Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding | Yuchen Feng et.al. | 2512.10548 | translate | read | null |
| 2025-12-11 | Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders | Qingsen Ma et.al. | 2512.10547 | translate | read | null |
| 2025-12-11 | XDoGE: Multilingual Data Reweighting to Enhance Language Inclusivity in LLMs | Iñaki Lacunza et.al. | 2512.10545 | translate | read | null |
| 2025-12-11 | Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning | Haiteng Zhao et.al. | 2512.10534 | translate | read | null |
| 2025-12-11 | Zero-shot 3D Map Generation with LLM Agents: A Dual-Agent Architecture for Procedural Content Generation | Lim Chien Her et.al. | 2512.10501 | translate | read | null |
| 2025-12-11 | Decoding Human-LLM Collaboration in Coding: An Empirical Study of Multi-Turn Conversations in the Wild | Binquan Zhang et.al. | 2512.10493 | translate | read | null |
| 2025-12-11 | LLM-Assisted AHP for Explainable Cyber Range Evaluation | Vyron Kampourakis et.al. | 2512.10487 | translate | read | null |
| 2025-12-11 | From Lab to Reality: A Practical Evaluation of Deep Learning Models and LLMs for Vulnerability Detection | Chaomeng Lu et.al. | 2512.10485 | translate | read | null |
| 2025-12-11 | Grammaticality Judgments in Humans and Language Models: Revisiting Generative Grammar with LLMs | Lars G. B. Johnsen et.al. | 2512.10453 | translate | read | null |
| 2025-12-11 | When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection | Devanshu Sahoo et.al. | 2512.10449 | translate | read | null |
| 2025-12-11 | Decoding Student Minds: Leveraging Conversational Agents for Psychological and Learning Analysis | Nour El Houda Ben Chaabene et.al. | 2512.10441 | translate | read | null |
| 2025-12-11 | Enhancing Next-Generation Language Models with Knowledge Graphs: Extending Claude, Mistral IA, and GPT-4 via KG-BERT | Nour El Houda Ben Chaabene et.al. | 2512.10440 | translate | read | null |
| 2025-12-11 | Semantic Reconstruction of Adversarial Plagiarism: A Context-Aware Framework for Detecting and Restoring “Tortured Phrases” in Scientific Literature | Agniva Maiti et.al. | 2512.10435 | translate | read | null |
| 2025-12-11 | Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers | Youmin Ko et.al. | 2512.10422 | translate | read | null |
| 2025-12-11 | How to Trick Your AI TA: A Systematic Study of Academic Jailbreaking in LLM Code Evaluation | Devanshu Sahoo et.al. | 2512.10415 | translate | read | null |
| 2025-12-11 | Sliding Window Attention Adaptation | Yijiong Yu et.al. | 2512.10411 | translate | read | null |
| 2025-12-11 | RoboNeuron: A Modular Framework Linking Foundation Models and ROS for Embodied AI | Weifan Guan et.al. | 2512.10394 | translate | read | null |
| 2025-12-11 | GPG: Generalized Policy Gradient Theorem for Transformer-based Policies | Hangyu Mao et.al. | 2512.10365 | translate | read | null |
| 2025-12-11 | Visual Funnel: Resolving Contextual Blindness in Multimodal Large Language Models | Woojun Jung et.al. | 2512.10362 | translate | read | null |
| 2025-12-11 | Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task | Sunqi Fan et.al. | 2512.10359 | translate | read | null |
| 2025-12-11 | Dynamics of Agentic Loops in Large Language Models: A Geometric Theory of Trajectories | Nicolas Tacheny et.al. | 2512.10350 | translate | read | null |
| 2025-12-11 | EchoingPixels: Cross-Modal Adaptive Token Reduction for Efficient Audio-Visual LLMs | Chao Gong et.al. | 2512.10324 | translate | read | null |
| 2025-12-11 | EpiPlanAgent: Agentic Automated Epidemic Response Planning | Kangkun Mao et.al. | 2512.10313 | translate | read | null |
| 2025-12-11 | Efficient-VLN: A Training-Efficient Vision-Language Navigation Model | Duo Zheng et.al. | 2512.10310 | translate | read | null |
| 2025-12-11 | Reverse Thinking Enhances Missing Information Detection in Large Language Models | Yuxin Liu et.al. | 2512.10273 | translate | read | null |
| 2025-12-11 | VLM-NCD: Novel Class Discovery with Vision-Based Large Language Models | Yuetong Su et.al. | 2512.10262 | translate | read | null |
| 2025-12-11 | Reject or Not?: A Benchmark for Voice Assistant Query Rejection in Smart Home Scenario and an Improved Method Based on LLMs | Huichao Men et.al. | 2512.10257 | translate | read | null |
| 2025-12-11 | InFerActive: Towards Scalable Human Evaluation of Large Language Models through Interactive Inference | Junhyeong Hwangbo et.al. | 2512.10234 | translate | read | null |
| 2025-12-11 | Adaptive Information Routing for Multimodal Time Series Forecasting | Jun Seo et.al. | 2512.10229 | translate | read | null |
| 2025-12-11 | Does SWE-Bench-Verified Test Agent Ability or Model Memory? | Thanosan Prathifkumar et.al. | 2512.10218 | translate | read | null |
| 2025-12-11 | CP-Env: Evaluating Large Language Models on Clinical Pathways in a Controllable Hospital Environment | Yakun Zhu et.al. | 2512.10206 | translate | read | null |
| 2025-12-11 | AutoMedic: An Automated Evaluation Framework for Clinical Conversational Agents with Medical Dataset Grounding | Gyutaek Oh et.al. | 2512.10195 | translate | read | null |
| 2025-12-11 | CIEGAD: Cluster-Conditioned Interpolative and Extrapolative Framework for Geometry-Aware and Domain-Aligned Data Augmentation | Keito Inoshita et.al. | 2512.10178 | translate | read | null |
| 2025-12-11 | ATLAS: Automated Toolkit for Large-Scale Verified Code Synthesis | Mantas Baksys et.al. | 2512.10173 | translate | read | null |
| 2025-12-11 | Offscript: Automated Auditing of Instruction Adherence in LLMs | Nicholas Clark et.al. | 2512.10172 | translate | read | null |
| 2025-12-10 | Enhancing Large Language Models for End-to-End Circuit Analysis Problem Solving | Liangliang Chen et.al. | 2512.10159 | translate | read | null |
| 2025-12-10 | Unforgotten Safety: Preserving Safety Alignment of Large Language Models with Continual Learning | Lama Alssum et.al. | 2512.10150 | translate | read | null |
| 2025-12-10 | PARAN: Persona-Augmented Review ANswering system on Food Delivery Review Dataset | Moonsoo Park et.al. | 2512.10148 | translate | read | null |
| 2025-12-10 | Workflow is All You Need: Escaping the “Statistical Smoothing Trap” via High-Entropy Information Foraging and Adversarial Pacing | Zhongjie Jiang et.al. | 2512.10121 | translate | read | null |
| 2025-12-10 | AgriRegion: Region-Aware Retrieval for High-Fidelity Agricultural Advice | Mesafint Fanuel et.al. | 2512.10114 | translate | read | null |
| 2025-12-10 | Generate-Then-Validate: A Novel Question Generation Approach Using Small Language Models | Yumou Wei et.al. | 2512.10110 | translate | read | null |
| 2025-12-10 | LLM-PEA: Leveraging Large Language Models Against Phishing Email Attacks | Najmul Hassan et.al. | 2512.10104 | translate | read | null |
| 2025-12-10 | What Kind of Reasoning (if any) is an LLM actually doing? On the Stochastic Nature and Abductive Appearance of Large Language Models | Luciano Floridi et.al. | 2512.10080 | translate | read | null |
| 2025-12-10 | Independent Density Estimation | Jiahao Liu et.al. | 2512.10067 | translate | read | null |
| 2025-12-10 | Linear socio-demographic representations emerge in Large Language Models from indirect cues | Paul Bouchaud et.al. | 2512.10065 | translate | read | null |
| 2025-12-10 | Text2Graph: Combining Lightweight LLMs and GNNs for Efficient Text Classification in Label-Scarce Scenarios | João Lucas Luz Lima Sarcinelli et.al. | 2512.10061 | translate | read | null |
| 2025-12-10 | Parallel Decoder Transformer: Model-Internal Parallel Decoding with Speculative Invariance via Note Conditioning | Logan Robbins et.al. | 2512.10054 | translate | read | null |
| 2025-12-10 | Detailed balance in large language model-driven agents | Zhuo-Yang Song et.al. | 2512.10047 | translate | read | null |
| 2025-12-10 | Local LLM Ensembles for Zero-shot Portuguese Named Entity Recognition | João Lucas Luz Lima Sarcinelli et.al. | 2512.10043 | translate | read | null |
| 2025-12-10 | Intelligently Weighting Multiple Reference Models for Direct Preference Optimization of LLMs | Skyler Wu et.al. | 2512.10040 | translate | read | null |
| 2025-12-10 | Exploring LLMs for Scientific Information Extraction Using The SciEx Framework | Sha Li et.al. | 2512.10004 | translate | read | null |
| 2025-12-10 | SCOPE: Language Models as One-Time Teacher for Hierarchical Planning in Text Environments | Haoye Lu et.al. | 2512.09897 | translate | read | null |
| 2025-12-10 | Benchmarking Document Parsers on Mathematical Formula Extraction from PDFs | Pius Horn et.al. | 2512.09874 | translate | read | link |
| 2025-12-10 | FlipLLM: Efficient Bit-Flip Attacks on Multimodal LLMs using Reinforcement Learning | Khurram Khalil et.al. | 2512.09872 | translate | read | null |
| 2025-12-10 | MedForget: Hierarchy-Aware Multimodal Unlearning Testbed for Medical AI | Fengli Wu et.al. | 2512.09867 | translate | read | null |
| 2025-12-10 | UniUGP: Unifying Understanding, Generation, and Planing For End-to-end Autonomous Driving | Hao Lu et.al. | 2512.09864 | translate | read | null |
| 2025-12-10 | Mitigating Social Bias in English and Urdu Language Models Using PRM-Guided Candidate Selection and Sequential Refinement | Muneeb Ur Raheem Khan et.al. | 2512.09854 | translate | read | null |
| 2025-12-10 | ChronusOmni: Improving Time Awareness of Omni Large Language Models | Yijing Chen et.al. | 2512.09841 | translate | read | null |
| 2025-12-10 | LLMs in Interpreting Legal Documents | Simone Corbo et.al. | 2512.09830 | translate | read | null |
| 2025-12-10 | RIFT: A Scalable Methodology for LLM Accelerator Fault Assessment using Reinforcement Learning | Khurram Khalil et.al. | 2512.09829 | translate | read | null |
| 2025-12-10 | DeepSeek’s WEIRD Behavior: The cultural alignment of Large Language Models and the effects of prompt language and cultural prompting | James Luther et.al. | 2512.09772 | translate | read | null |
| 2025-12-10 | Defining Cost Function of Steganography with Large Language Models | Hanzhou Wu et.al. | 2512.09769 | translate | read | null |
| 2025-12-10 | Towards Language Model Guided TLA+ Proof Automation | Yuhao Zhou et.al. | 2512.09758 | translate | read | null |
| 2025-12-10 | Knowledge Graph Enrichment and Reasoning for Nobel Laureates | Thanh-Lam T. Nguyen et.al. | 2512.09707 | translate | read | null |
| 2025-12-10 | Exqutor: Extended Query Optimizer for Vector-augmented Analytical Queries | Hyunjoon Kim et.al. | 2512.09695 | translate | read | null |
| 2025-12-10 | Understanding Chain-of-Thought Effectiveness in Code Generation: An Empirical and Information-Theoretic Analysis | Naizhu Jin et.al. | 2512.09679 | translate | read | null |
| 2025-12-10 | The Ky Fan Norms and Beyond: Dual Norms and Combinations for Matrix Optimization | Alexey Kravatskiy et.al. | 2512.09678 | translate | read | null |
| 2025-12-10 | d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models | Leyi Pan et.al. | 2512.09675 | translate | read | null |
| 2025-12-10 | IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting | Tao Zhang et.al. | 2512.09663 | translate | read | link |
| 2025-12-10 | Can LLMs Evaluate What They Cannot Annotate? Revisiting LLM Reliability in Hate Speech Detection | Paloma Piot et.al. | 2512.09662 | translate | read | null |
| 2025-12-10 | Measuring Corruption from Text Data | Arieda Muço et.al. | 2512.09652 | translate | read | null |
| 2025-12-10 | MentraSuite: Post-Training Large Language Models for Mental Health Reasoning and Assessment | Mengxi Xiao et.al. | 2512.09636 | translate | read | null |
| 2025-12-10 | Creation of the Estonian Subjectivity Dataset: Assessing the Degree of Subjectivity on a Scale | Karl Gustav Gailit et.al. | 2512.09634 | translate | read | null |
| 2025-12-10 | An End-to-end Planning Framework with Agentic LLMs and PDDL | Emanuele La Malfa et.al. | 2512.09629 | translate | read | null |
| 2025-12-10 | LogICL: Distilling LLM Reasoning to Bridge the Semantic Gap in Cross-Domain Log Anomaly Detection | Jingwei Ye et.al. | 2512.09627 | translate | read | null |
| 2025-12-10 | Rethinking Chain-of-Thought Reasoning for Videos | Yiwu Zhong et.al. | 2512.09616 | translate | read | link |
| 2025-12-10 | ImageTalk: Designing a Multimodal AAC Text Generation System Driven by Image Recognition and Natural Language Generation | Boyin Yang et.al. | 2512.09610 | translate | read | null |
| 2025-12-10 | Investigate the Low-level Visual Perception in Vision-Language based Image Quality Assessment | Yuan Li et.al. | 2512.09573 | translate | read | null |
| 2025-12-10 | System Report for CCL25-Eval Task 10: Prompt-Driven Large Language Model Merge for Fine-Grained Chinese Hate Speech Detection | Binglin Wu et.al. | 2512.09563 | translate | read | null |
| 2025-12-10 | Systematic Framework of Application Methods for Large Language Models in Language Sciences | Kun Sun et.al. | 2512.09552 | translate | read | null |
| 2025-12-10 | Chasing Shadows: Pitfalls in LLM Security Research | Jonathan Evertz et.al. | 2512.09549 | translate | read | null |
| 2025-12-10 | Supporting Dynamic Agentic Workloads: How Data and Agents Interact | Ioana Giurgiu et.al. | 2512.09548 | translate | read | null |
| 2025-12-10 | Don’t Throw Away Your Beams: Improving Consistency-based Uncertainties in LLMs via Beam Search | Ekaterina Fadeeva et.al. | 2512.09538 | translate | read | null |
| 2025-12-10 | CNFinBench: A Benchmark for Safety and Compliance of Large Language Models in Finance | Jinru Ding et.al. | 2512.09506 | translate | read | null |
| 2025-12-10 | RouteRAG: Efficient Retrieval-Augmented Generation from Text and Graph via Reinforcement Learning | Yucan Guo et.al. | 2512.09487 | translate | read | null |
| 2025-12-10 | Advancing LLM-Based Security Automation with Customized Group Relative Policy Optimization for Zero-Touch Networks | Xinye Cao et.al. | 2512.09485 | translate | read | null |
| 2025-12-10 | An Efficient Interaction Human-AI Synergy System Bridging Visual Awareness and Large Language Model for Intensive Care Units | Yibowen Zhao et.al. | 2512.09473 | translate | read | null |
| 2025-12-10 | WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving | Chiheng Lou et.al. | 2512.09472 | translate | read | null |
| 2025-12-10 | Advancing Text Classification with Large Language Models and Neural Attention Mechanisms | Ning Lyu et.al. | 2512.09444 | translate | read | null |
| 2025-12-10 | Advancing Research via Human-AI Interactive Theorem Proving | Chenyi Li et.al. | 2512.09443 | translate | read | null |
| 2025-12-10 | Knowledge-Augmented Large Language Model Agents for Explainable Financial Decision-Making | Qingyuan Zhang et.al. | 2512.09440 | translate | read | null |
| 2025-12-10 | ODMA: On-Demand Memory Allocation Framework for LLM Serving on LPDDR-Class Accelerators | Guoqiang Zou et.al. | 2512.09427 | translate | read | null |
| 2025-12-10 | Black-Box Behavioral Distillation Breaks Safety Alignment in Medical LLMs | Sohely Jahan et.al. | 2512.09403 | translate | read | null |
| 2025-12-10 | Optimizing Data Extraction from Materials Science Literature: A Study of Tools Using Large Language Models | Wenkai Ning et.al. | 2512.09370 | translate | read | null |
| 2025-12-10 | Are Hypervectors Enough? Single-Call LLM Reasoning over Knowledge Graphs | Yezi Liu et.al. | 2512.09369 | translate | read | null |
| 2025-12-10 | Video-QTR: Query-Driven Temporal Reasoning Framework for Lightweight Video Understanding | Xinkui Zhao et.al. | 2512.09354 | translate | read | null |
| 2025-12-10 | Self Distillation Fine-Tuning of Protein Language Models Improves Versatility in Protein Design | Amin Tavakoli et.al. | 2512.09329 | translate | read | null |
| 2025-12-10 | RACAM: Enhancing DRAM with Reuse-Aware Computation and Automated Mapping for ML Inference | Siyuan Ma et.al. | 2512.09304 | translate | read | null |
| 2025-12-10 | Identifying Bias in Machine-generated Text Detection | Kevin Stowe et.al. | 2512.09292 | translate | read | null |
| 2025-12-10 | LongT2IBench: A Benchmark for Evaluating Long Text-to-Image Generation with Graph-structured Annotations | Zhichao Yang et.al. | 2512.09271 | translate | read | null |
| 2025-12-10 | From Forecast to Action: Uncertainty-Aware UAV Deployment for Ocean Drifter Recovery | Jingeun Kim et.al. | 2512.09260 | translate | read | null |
| 2025-12-10 | The Illusion of Rationality: Tacit Bias and Strategic Dominance in Frontier LLM Negotiation Games | Manuel S. Ríos et.al. | 2512.09254 | translate | read | null |
| 2025-12-10 | GLACIA: Instance-Aware Positional Reasoning for Glacial Lake Segmentation via Multimodal Large Language Model | Lalit Maurya et.al. | 2512.09251 | translate | read | null |
| 2025-12-10 | Training-free Context-adaptive Attention for Efficient Long Context Modeling | Zeng You et.al. | 2512.09238 | translate | read | null |
| 2025-12-10 | CORE: A Conceptual Reasoning Layer for Large Language Models | Vishwas Hegde et.al. | 2512.09222 | translate | read | null |
| 2025-12-10 | Targeting Misalignment: A Conflict-Aware Framework for Reward-Model-based LLM Alignment | Zixuan Liu et.al. | 2512.09212 | translate | read | null |
| 2025-12-09 | LLMs for Analog Circuit Design Continuum (ACDC) | Yasaman Esfandiari et.al. | 2512.09199 | translate | read | null |
| 2025-12-09 | TritonForge: Profiling-Guided Framework for Automated Triton Kernel Optimization | Haonan Li et.al. | 2512.09196 | translate | read | null |
| 2025-12-09 | WOLF: Werewolf-based Observations for LLM Deception and Falsehoods | Mrinal Agarwal et.al. | 2512.09187 | translate | read | null |
| 2025-12-09 | MindShift: Analyzing Language Models’ Reactions to Psychological Prompts | Anton Vasiliuk et.al. | 2512.09149 | translate | read | null |
| 2025-12-09 | Detecting Hallucinations in Graph Retrieval-Augmented Generation via Attention Patterns and Semantic Alignment | Shanghao Li et.al. | 2512.09148 | translate | read | null |
| 2025-12-09 | Knowledge-Guided Large Language Model for Automatic Pediatric Dental Record Understanding and Safe Antibiotic Recommendation | Zihan Han et.al. | 2512.09127 | translate | read | null |
| 2025-12-09 | A Categorical Analysis of Large Language Models and Why LLMs Circumvent the Symbol Grounding Problem | Luciano Floridi et.al. | 2512.09117 | translate | read | null |
| 2025-12-09 | Evolving Excellence: Automated Optimization of LLM-based Agents | Paul Brookes et.al. | 2512.09108 | translate | read | null |
| 2025-12-09 | Learning Unmasking Policies for Diffusion Language Models | Metod Jazbec et.al. | 2512.09106 | translate | read | null |
| 2025-12-09 | Explaining the Unseen: Multimodal Vision-Language Reasoning for Situational Awareness in Underground Mining Disasters | Mizanur Rahman Jewel et.al. | 2512.09092 | translate | read | null |
| 2025-12-09 | Calibrated Trust in Dealing with LLM Hallucinations: A Qualitative Study | Adrian Ryser et.al. | 2512.09088 | translate | read | null |
| 2025-12-09 | AgentComp: From Agentic Reasoning to Compositional Mastery in Text-to-Image Models | Arman Zarei et.al. | 2512.09081 | translate | read | null |
| 2025-12-09 | Llama-based source code vulnerability detection: Prompt engineering vs Fine tuning | Dyna Soumhane Ouchebara et.al. | 2512.09006 | translate | read | null |
| 2025-12-09 | Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs | Angela van Sprang et.al. | 2512.08923 | translate | read | null |
| 2025-12-09 | Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training | Jakub Krajewski et.al. | 2512.08894 | translate | read | null |
| 2025-12-09 | Toward Faithful Retrieval-Augmented Generation with Sparse Autoencoders | Guangzhi Xiong et.al. | 2512.08892 | translate | read | null |
| 2025-12-09 | AI Didn’t Start the Fire: Examining the Stack Exchange Moderator and Contributor Strike | Yiwei Wu et.al. | 2512.08884 | translate | read | null |
| 2025-12-09 | When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation | Joshua Ward et.al. | 2512.08875 | translate | read | null |
| 2025-12-09 | Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning | Jing Jie Tan et.al. | 2512.08873 | translate | read | null |
| 2025-12-09 | SimpleDevQA: Benchmarking Large Language Models on Development Knowledge QA | Jing Zhang et.al. | 2512.08867 | translate | read | null |
| 2025-12-09 | Ask, Answer, and Detect: Role-Playing LLMs for Personality Detection with Question-Conditioned Mixture-of-Experts | Yifan Lyu et.al. | 2512.08814 | translate | read | null |
| 2025-12-09 | PrivTune: Efficient and Privacy-Preserving Fine-Tuning of Large Language Models via Device-Cloud Collaboration | Yi Liu et.al. | 2512.08809 | translate | read | null |
| 2025-12-09 | A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs | Mahmoud Srewa et.al. | 2512.08786 | translate | read | null |
| 2025-12-09 | A Practical Guide for Designing, Developing, and Deploying Production-Grade Agentic AI Workflows | Eranga Bandara et.al. | 2512.08769 | translate | read | null |
| 2025-12-09 | Financial News Summarization: Can extractive methods still offer a true alternative to LLMs? | Nicolas Reche et.al. | 2512.08764 | translate | read | null |
| 2025-12-09 | Towards Foundation Models with Native Multi-Agent Intelligence | Shuyue Hu et.al. | 2512.08743 | translate | read | null |
| 2025-12-09 | LaMoSys3.5D: Enabling 3.5D-IC-Based Large Language Model Inference Serving Systems via Hardware/Software Co-Design | Qipan Wang et.al. | 2512.08731 | translate | read | null |
| 2025-12-09 | Exposing Hidden Biases in Text-to-Image Models via Automated Prompt Search | Manos Plitsis et.al. | 2512.08724 | translate | read | null |
| 2025-12-09 | Multi-Agent Intelligence for Multidisciplinary Decision-Making in Gastrointestinal Oncology | Rongzhao Zhang et.al. | 2512.08674 | translate | read | null |
| 2025-12-09 | An Agentic AI System for Multi-Framework Communication Coding | Bohao Yang et.al. | 2512.08659 | translate | read | null |
| 2025-12-09 | QSTN: A Modular Framework for Robust Questionnaire Inference with Large Language Models | Maximilian Kreutner et.al. | 2512.08646 | translate | read | null |
| 2025-12-09 | Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation | Young Kyung Kim et.al. | 2512.08645 | translate | read | null |
| 2025-12-09 | See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm | Haoyu Zhao et.al. | 2512.08629 | translate | read | null |
| 2025-12-09 | HealthcareNLP: where are we and what is next? | Lifeng Han et.al. | 2512.08617 | translate | read | null |
| 2025-12-09 | CogMCTS: A Novel Cognitive-Guided Monte Carlo Tree Search Framework for Iterative Heuristic Evolution with Large Language Models | Hui Wang et.al. | 2512.08609 | translate | read | null |
| 2025-12-09 | Bridging Scale Discrepancies in Robotic Control via Language-Based Action Representations | Yuchi Zhang et.al. | 2512.08548 | translate | read | null |
| 2025-12-09 | Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks | Indrajit Kar et.al. | 2512.08545 | translate | read | null |
| 2025-12-09 | Principles2Plan: LLM-Guided System for Operationalising Ethical Principles into Plans | Tammy Zhong et.al. | 2512.08536 | translate | read | null |
| 2025-12-09 | Autonomous Issue Resolver: Towards Zero-Touch Code Maintenance | Aliaksei Kaliutau et.al. | 2512.08492 | translate | read | null |
| 2025-12-09 | Soft Inductive Bias Approach via Explicit Reasoning Perspectives in Inappropriate Utterance Detection Using Large Language Models | Ju-Young Kim et.al. | 2512.08480 | translate | read | null |
| 2025-12-09 | A Multi-Agent LLM Framework for Design Space Exploration in Autonomous Driving Systems | Po-An Shih et.al. | 2512.08476 | translate | read | null |
| 2025-12-09 | Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models III: Implementing the Bacterial Biothreat Benchmark (B3) Dataset | Gary Ackerman et.al. | 2512.08459 | translate | read | null |
| 2025-12-09 | Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models II: Benchmark Generation Process | Gary Ackerman et.al. | 2512.08451 | translate | read | null |
| 2025-12-09 | What Triggers my Model? Contrastive Explanations Inform Gender Choices by Translation Models | Janiça Hackenbuchner et.al. | 2512.08440 | translate | read | null |
| 2025-12-09 | Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs | Yinan Zhong et.al. | 2512.08417 | translate | read | null |
| 2025-12-09 | Towards Effective and Efficient Long Video Understanding of Multimodal Large Language Models via One-shot Clip Retrieval | Tao Chen et.al. | 2512.08410 | translate | read | null |
| 2025-12-09 | DFALLM: Achieving Generalizable Multitask Deepfake Detection by Optimizing Audio LLM Components | Yupei Li et.al. | 2512.08403 | translate | read | null |
| 2025-12-09 | The Unseen Bias: How Norm Discrepancy in Pre-Norm MLLMs Leads to Visual Information Loss | Bozhou Li et.al. | 2512.08374 | translate | read | null |
| 2025-12-09 | Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making | Wentao Zhang et.al. | 2512.08366 | translate | read | null |
| 2025-12-09 | The High Cost of Incivility: Quantifying Interaction Inefficiency via Multi-Agent Monte Carlo Simulations | Benedikt Mangold et.al. | 2512.08345 | translate | read | null |
| 2025-12-09 | Argus: A Multi-Agent Sensitive Information Leakage Detection Framework Based on Hierarchical Reference Relationships | Bin Wang et.al. | 2512.08326 | translate | read | null |
| 2025-12-09 | rSIM: Incentivizing Reasoning Capabilities of LLMs via Reinforced Strategy Injection | Sijia Chen et.al. | 2512.08300 | translate | read | null |
| 2025-12-09 | Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem | Shiva Gaire et.al. | 2512.08290 | translate | read | null |
| 2025-12-09 | Empowering smart app development with SolidGPT: an edge-cloud hybrid AI agent framework | Liao Hu et.al. | 2512.08286 | translate | read | null |
| 2025-12-09 | AgentEval: Generative Agents as Reliable Proxies for Human Evaluation of AI-Generated Content | Thanh Vu et.al. | 2512.08273 | translate | read | null |
| 2025-12-09 | Reasoning Models Ace the CFA Exams | Jaisal Patel et.al. | 2512.08270 | translate | read | null |
| 2025-12-09 | Token Sugar: Making Source Code Sweeter for LLMs through Token-Efficient Shorthand | Zhensu Sun et.al. | 2512.08266 | translate | read | null |
| 2025-12-09 | Beyond Traditional Diagnostics: Transforming Patient-Side Information into Predictive Insights with Knowledge Graphs and Prototypes | Yibowen Zhao et.al. | 2512.08261 | translate | read | null |
| 2025-12-09 | Chopper: A Multi-Level GPU Characterization Tool & Derived Insights Into LLM Training Inefficiency | Marco Kurzynski et.al. | 2512.08242 | translate | read | null |
| 2025-12-09 | SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection | Ching-Hung Cheng et.al. | 2512.08223 | translate | read | null |
| 2025-12-09 | Secure or Suspect? Investigating Package Hallucinations of Shell Command in Original and Quantized LLMs | Md Nazmul Haque et.al. | 2512.08213 | translate | read | null |
| 2025-12-09 | MobileFineTuner: A Unified End-to-End Framework for Fine-Tuning LLMs on Mobile Phones | Jiaxiang Geng et.al. | 2512.08211 | translate | read | null |
| 2025-12-09 | ClinicalTrialsHub: Bridging Registries and Literature for Comprehensive Clinical Trial Access | Jiwoo Park et.al. | 2512.08193 | translate | read | null |
| 2025-12-09 | A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties | Jinghao Wang et.al. | 2512.08185 | translate | read | null |
| 2025-12-09 | Framing Climate Change on YouTube: North-South Divides in Narratives and Public Engagement | Sanika Damle et.al. | 2512.08183 | translate | read | null |
| 2025-12-09 | Chat with UAV – Human-UAV Interaction Based on Large Language Models | Haoran Wang et.al. | 2512.08145 | translate | read | null |
| 2025-12-09 | PolyLingua: Margin-based Inter-class Transformer for Robust Cross-domain Language Detection | Ali Lotfi Rezaabad et.al. | 2512.08143 | translate | read | null |
| 2025-12-09 | Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models I: The Task-Query Architecture | Gary Ackerman et.al. | 2512.08130 | translate | read | null |
| 2025-12-09 | Universal Adversarial Suffixes Using Calibrated Gumbel-Softmax Relaxation | Sampriti Soor et.al. | 2512.08123 | translate | read | null |
| 2025-12-08 | Evolutionary perspective of large language models on shaping research insights into healthcare disparities | David An et.al. | 2512.08122 | translate | read | null |
| 2025-12-08 | Balanced Accuracy: The Right Metric for Evaluating LLM Judges – Explained through Youden’s J statistic | Stephane Collot et.al. | 2512.08121 | translate | read | null |
| 2025-12-08 | Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies | Stephan Carney et.al. | 2512.08107 | translate | read | null |
| 2025-12-08 | AgentCrypt: Advancing Privacy and (Secure) Computation in AI Agent Collaboration | Harish Karthikeyan et.al. | 2512.08104 | translate | read | null |
| 2025-12-08 | Training LLMs for Honesty via Confessions | Manas Joglekar et.al. | 2512.08093 | translate | read | null |
| 2025-12-08 | Adaptation of Embedding Models to Financial Filings via LLM Distillation | Eliot Brenner et.al. | 2512.08088 | translate | read | null |
| 2025-12-08 | Exploiting the Randomness of Large Language Models (LLM) in Text Classification Tasks: Locating Privileged Documents in Legal Matters | Keith Huffman et.al. | 2512.08083 | translate | read | null |
| 2025-12-08 | Short-Context Dominance: How Much Local Context Natural Language Actually Needs? | Vala Vakilian et.al. | 2512.08082 | translate | read | null |
| 2025-12-08 | Leveraging Machine Learning and Large Language Models for Automated Image Clustering and Description in Legal Discovery | Qiang Mao et.al. | 2512.08079 | translate | read | null |
| 2025-12-08 | A Comparative Study of Retrieval Methods in Azure AI Search | Qiang Mao et.al. | 2512.08078 | translate | read | null |
| 2025-12-08 | Unveiling Latent Knowledge in Chemistry Language Models through Sparse Autoencoders | Jaron Cohen et.al. | 2512.08077 | translate | read | null |
| 2025-12-08 | Large Language Models for Education and Research: An Empirical and User Survey-based Analysis | Md Mostafizer Rahman et.al. | 2512.08057 | translate | read | null |
| 2025-12-08 | CLARITY: Medical World Model for Guiding Treatment Decisions by Modeling Context-Aware Disease Trajectories in Latent Space | Tianxingjian Ding et.al. | 2512.08029 | translate | read | null |
| 2025-12-08 | Toward an AI Reasoning-Enabled System for Patient-Clinical Trial Matching | Caroline N. Leach et.al. | 2512.08026 | translate | read | null |
| 2025-12-08 | FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models | Jiyoon Pyo et.al. | 2512.08016 | translate | read | null |
| 2025-12-08 | Bridging the Clinical Expertise Gap: Development of a Web-Based Platform for Accessible Time Series Forecasting and Analysis | Aaron D. Mullen et.al. | 2512.07992 | translate | read | null |
| 2025-12-08 | DeepCode: Open Agentic Coding | Zongwei Li et.al. | 2512.07921 | translate | read | link |
| 2025-12-08 | Relational Visual Similarity | Thao Nguyen et.al. | 2512.07833 | translate | read | null |
| 2025-12-08 | Do Generalisation Results Generalise? | Matteo Boglioni et.al. | 2512.07832 | translate | read | null |
| 2025-12-08 | Understanding Privacy Risks in Code Models Through Training Dynamics: A Causal Approach | Hua Yang et.al. | 2512.07814 | translate | read | null |
| 2025-12-08 | LLM Use for Mental Health: Crowdsourcing Users’ Sentiment-based Perspectives and Values from Social Discussions | Lingyao Li et.al. | 2512.07797 | translate | read | null |
| 2025-12-08 | Large Causal Models from Large Language Models | Sridhar Mahadevan et.al. | 2512.07796 | translate | read | null |
| 2025-12-08 | ReasonBENCH: Benchmarking the (In)Stability of LLM Reasoning | Nearchos Potamitis et.al. | 2512.07795 | translate | read | null |
| 2025-12-08 | Automating High Energy Physics Data Analysis with LLM-Powered Agents | Eli Gendreau-Distler et.al. | 2512.07785 | translate | read | null |
| 2025-12-08 | Mary, the Cheeseburger-Eating Vegetarian: Do LLMs Recognize Incoherence in Narratives? | Karin de Langis et.al. | 2512.07777 | translate | read | null |
| 2025-12-08 | RL-MTJail: Reinforcement Learning for Automated Black-Box Multi-Turn Jailbreaking of Large Language Models | Xiqiao Xiong et.al. | 2512.07761 | translate | read | null |
| 2025-12-08 | SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery | Meng Cao et.al. | 2512.07733 | translate | read | null |
| 2025-12-08 | SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination | Sangha Park et.al. | 2512.07730 | translate | read | null |
| 2025-12-08 | Privacy Practices of Browser Agents | Alisha Ukani et.al. | 2512.07725 | translate | read | null |
| 2025-12-08 | In-Context and Few-Shots Learning for Forecasting Time Series Data based on Large Language Models | Saroj Gopali et.al. | 2512.07705 | translate | read | null |
| 2025-12-08 | HalluShift++: Bridging Language and Vision through Internal Representation Shifts for Hierarchical Hallucinations in MLLMs | Sujoy Nath et.al. | 2512.07687 | translate | read | null |
| 2025-12-08 | When Large Language Models Do Not Work: Online Incivility Prediction through Graph Neural Networks | Zihan Chen et.al. | 2512.07684 | translate | read | null |
| 2025-12-08 | Depth-Wise Activation Steering for Honest Language Models | Gracjan Góral et.al. | 2512.07667 | translate | read | null |
| 2025-12-08 | Bridging Code Graphs and Large Language Models for Better Code Understanding | Zeqi Chen et.al. | 2512.07666 | translate | read | null |
| 2025-12-08 | Reliable agent engineering should integrate machine-compatible organizational principles | R. Patrick Xian et.al. | 2512.07665 | translate | read | null |
| 2025-12-08 | An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research | Hamad Almazrouei et.al. | 2512.07652 | translate | read | null |
| 2025-12-08 | PCMind-2.1-Kaiyuan-2B Technical Report | Kairong Luo et.al. | 2512.07612 | translate | read | null |
| 2025-12-08 | Comparative Analysis and Parametric Tuning of PPO, GRPO, and DAPO for LLM Reasoning Enhancement | Yongsheng Lian et.al. | 2512.07611 | translate | read | null |
| 2025-12-08 | Metric-Fair Prompting: Treating Similar Samples Similarly | Jing Wang et.al. | 2512.07608 | translate | read | null |
| 2025-12-08 | Complementary Learning Approach for Text Classification using Large Language Models | Navid Asgari et.al. | 2512.07583 | translate | read | null |
| 2025-12-08 | All You Need Are Random Visual Tokens? Demystifying Token Pruning in VLLMs | Yahong Wang et.al. | 2512.07580 | translate | read | null |
| 2025-12-08 | A Simple Method to Enhance Pre-trained Language Models with Speech Tokens for Classification | Nicolas Calbucura et.al. | 2512.07571 | translate | read | null |
| 2025-12-08 | MoCoRP: Modeling Consistent Relations between Persona and Response for Persona-based Dialogue | Kyungro Lee et.al. | 2512.07544 | translate | read | null |
| 2025-12-08 | SwissGov-RSD: A Human-annotated, Cross-lingual Benchmark for Token-level Recognition of Semantic Differences Between Related Documents | Michelle Wastl et.al. | 2512.07538 | translate | read | null |
| 2025-12-08 | Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs | Xiaoran Liu et.al. | 2512.07525 | translate | read | link |
| 2025-12-08 | AutoICE: Automatically Synthesizing Verifiable C Code via LLM-driven Evolution | Weilin Luo et.al. | 2512.07501 | translate | read | null |
| 2025-12-08 | How Do LLMs Fail In Agentic Scenarios? A Qualitative Analysis of Success and Failure Scenarios of Various LLMs in Agentic Simulations | JV Roig et.al. | 2512.07497 | translate | read | null |
| 2025-12-08 | Enhancing Agentic RL with Progressive Reward Shaping and Value-based Sampling Policy Optimization | Zhuoran Zhuang et.al. | 2512.07478 | translate | read | null |
| 2025-12-08 | Understanding LLM Agent Behaviours via Game Theory: Strategy Recognition, Biases and Multi-Agent Dynamics | Trung-Kiet Huynh et.al. | 2512.07462 | translate | read | null |
| 2025-12-08 | Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning | Tong Wu et.al. | 2512.07461 | translate | read | link |
| 2025-12-08 | Persian-Phi: Efficient Cross-Lingual Adaptation of Compact LLMs via Curriculum Learning | Amir Mohammad Akhlaghi et.al. | 2512.07454 | translate | read | null |
| 2025-12-08 | From Show Programmes to Data: Designing a Workflow to Make Performing Arts Ephemera Accessible Through Language Models | Clarisse Bardiot et.al. | 2512.07452 | translate | read | null |
| 2025-12-08 | MIDG: Mixture of Invariant Experts with knowledge injection for Domain Generalization in Multimodal Sentiment Analysis | Yangle Li et.al. | 2512.07430 | translate | read | null |
| 2025-12-08 | Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models | Haidong Kang et.al. | 2512.07419 | translate | read | null |
| 2025-12-08 | Do LLMs Trust the Code They Write? | Francisco Ribeiro et.al. | 2512.07404 | translate | read | null |
| 2025-12-08 | LUNE: Efficient LLM Unlearning via LoRA Fine-Tuning with Negative Examples | Yezi Liu et.al. | 2512.07375 | translate | read | null |
| 2025-12-08 | Communication-Efficient Serving for Video Diffusion Models with Latent Parallelism | Zhiyuan Wu et.al. | 2512.07350 | translate | read | null |
| 2025-12-08 | Generalized Referring Expression Segmentation on Aerial Photos | Luís Marnoto et.al. | 2512.07338 | translate | read | link |
| 2025-12-08 | DCO: Dynamic Cache Orchestration for LLM Accelerators through Predictive Management | Zhongchun Zhou et.al. | 2512.07312 | translate | read | null |
| 2025-12-08 | Exact Synthetic Populations for Scalable Societal and Market Modeling | Thierry Petit et.al. | 2512.07306 | translate | read | null |
| 2025-12-08 | Towards Accurate UAV Image Perception: Guiding Vision-Language Models with Stronger Task Prompts | Mingning Guo et.al. | 2512.07302 | translate | read | null |
| 2025-12-08 | Investigating Training and Generalization in Faithful Self-Explanations of Large Language Models | Tomoki Doi et.al. | 2512.07288 | translate | read | null |
| 2025-12-08 | Automatic Syntax Error Repair for Discrete Controller Synthesis using Large Language Model | Yusei Ishimizu et.al. | 2512.07261 | translate | read | null |
| 2025-12-08 | Ensembling LLM-Induced Decision Trees for Explainable and Robust Error Detection | Mengqi Wang et.al. | 2512.07246 | translate | read | null |
| 2025-12-08 | NeSTR: A Neuro-Symbolic Abductive Framework for Temporal Reasoning in Large Language Models | Feng Liang et.al. | 2512.07218 | translate | read | null |
| 2025-12-08 | MMRPT: MultiModal Reinforcement Pre-Training via Masked Vision-Dependent Reasoning | Xuhui Zheng et.al. | 2512.07203 | translate | read | null |
| 2025-12-08 | Generating Storytelling Images with Rich Chains-of-Reasoning | Xiujie Song et.al. | 2512.07198 | translate | read | null |
| 2025-12-08 | START: Spatial and Textual Learning for Chart Understanding | Zhuoming Liu et.al. | 2512.07186 | translate | read | link |
| 2025-12-08 | ContextualSHAP : Enhancing SHAP Explanations Through Contextual Language Generation | Latifa Dwiyanti et.al. | 2512.07178 | translate | read | null |
| 2025-12-08 | SPACE: Noise Contrastive Estimation Stabilizes Self-Play Fine-Tuning for Large Language Models | Yibo Wang et.al. | 2512.07175 | translate | read | null |
| 2025-12-08 | Improving the Throughput of Diffusion-based Large Language Models via a Training-Free Confidence-Aware Calibration | Jucheng Shen et.al. | 2512.07173 | translate | read | null |
| 2025-12-08 | When Privacy Meets Recovery: The Overlooked Half of Surrogate-Driven Privacy Preservation for MLLM Editing | Siyuan Xu et.al. | 2512.07166 | translate | read | null |
| 2025-12-08 | A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning | Siyang Jiang et.al. | 2512.07136 | translate | read | null |
| 2025-12-08 | DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning | Nithin Sivakumaran et.al. | 2512.07132 | translate | read | null |
| 2025-12-08 | RisConFix: LLM-based Automated Repair of Risk-Prone Drone Configurations | Liping Han et.al. | 2512.07122 | translate | read | null |
| 2025-12-08 | FOAM: Blocked State Folding for Memory-Efficient LLM Training | Ziqing Wen et.al. | 2512.07112 | translate | read | null |
| 2025-12-08 | The Geometry of Persona: Disentangling Personality from Reasoning in Large Language Models | Zhixiang Wang et.al. | 2512.07092 | translate | read | null |
| 2025-12-08 | Leveraging KV Similarity for Online Structured Pruning in LLMs | Jungmin Lee et.al. | 2512.07090 | translate | read | null |
| 2025-12-08 | ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking | Yunzhe Li et.al. | 2512.07086 | translate | read | null |
| 2025-12-08 | Do Large Language Models Truly Understand Cross-cultural Differences? | Shiwei Guo et.al. | 2512.07075 | translate | read | null |
| 2025-12-08 | Replicating TEMPEST at Scale: Multi-Turn Adversarial Attacks Against Trillion-Parameter Frontier Models | Richard Young et.al. | 2512.07059 | translate | read | null |
| 2025-12-07 | Reformulate, Retrieve, Localize: Agents for Repository-Level Bug Localization | Genevieve Caumartin et.al. | 2512.07022 | translate | read | null |
| 2025-12-07 | Latency-Response Theory Model: Evaluating Large Language Models via Response Accuracy and Chain-of-Thought Length | Zhiyu Xu et.al. | 2512.07019 | translate | read | null |
| 2025-12-07 | FVA-RAG: Falsification-Verification Alignment for Mitigating Sycophantic Hallucinations | Mayank Ravishankara et.al. | 2512.07015 | translate | read | null |
| 2025-12-07 | Block Sparse Flash Attention | Daniel Ohayon et.al. | 2512.07011 | translate | read | null |
| 2025-12-07 | Singing Timbre Popularity Assessment Based on Multimodal Large Foundation Model | Zihao Wang et.al. | 2512.06999 | translate | read | null |
| 2025-12-07 | Prompting-in-a-Series: Psychology-Informed Contents and Embeddings for Personality Recognition With Decoder-Only Models | Jing Jie Tan et.al. | 2512.06991 | translate | read | null |
| 2025-12-07 | Progress Ratio Embeddings: An Impatience Signal for Robust Length Control in Neural Text Generation | Ivanhoé Botcazou et.al. | 2512.06938 | translate | read | null |
| 2025-12-07 | Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI | George Mikros et.al. | 2512.06922 | translate | read | null |
| 2025-12-07 | NeuroABench: A Multimodal Evaluation Benchmark for Neurosurgical Anatomy Identification | Ziyang Song et.al. | 2512.06921 | translate | read | null |
| 2025-12-07 | SoK: Trust-Authorization Mismatch in LLM Agent Interactions | Guanquan Shi et.al. | 2512.06914 | translate | read | null |
| 2025-12-07 | Robots with Attitudes: Influence of LLM-Driven Robot Personalities on Motivation and Performance | Dennis Becker et.al. | 2512.06910 | translate | read | null |
| 2025-12-07 | BabelCoder: Agentic Code Translation with Specification Alignment | Fazle Rabbi et.al. | 2512.06902 | translate | read | null |
| 2025-12-07 | An Analysis of Large Language Models for Simulating User Responses in Surveys | Ziyun Yu et.al. | 2512.06874 | translate | read | null |
| 2025-12-07 | Rhea: Role-aware Heuristic Episodic Attention for Conversational LLMs | Wanyang Hong et.al. | 2512.06869 | translate | read | null |
| 2025-12-07 | Do Persona-Infused LLMs Affect Performance in a Strategic Reasoning Game? | John Licato et.al. | 2512.06867 | translate | read | null |
| 2025-12-07 | Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior | Yulin Li et.al. | 2512.06866 | translate | read | null |
| 2025-12-07 | Spatial Retrieval Augmented Autonomous Driving | Xiaosong Jia et.al. | 2512.06865 | translate | read | null |
| 2025-12-07 | JT-DA: Enhancing Data Analysis with Tool-Integrated Table Reasoning Large Language Models | Ce Chi et.al. | 2512.06859 | translate | read | null |
| 2025-12-07 | Formal that “Floats” High: Formal Verification of Floating Point Arithmetic | Hansa Mohanty et.al. | 2512.06850 | translate | read | null |
| 2025-12-07 | CKG-LLM: LLM-Assisted Detection of Smart Contract Access Control Vulnerabilities Based on Knowledge Graphs | Xiaoqi Li et.al. | 2512.06846 | translate | read | null |
| 2025-12-07 | Leveraging LLMs to support co-evolution between definitions and instances of textual DSLs | Weixing Zhang et.al. | 2512.06836 | translate | read | null |
| 2025-12-07 | Large Language Model-Based Generation of Discharge Summaries | Tiago Rodrigues et.al. | 2512.06812 | translate | read | null |
| 2025-12-07 | MMDuet2: Enhancing Proactive Interaction of Video MLLMs with Multi-Turn Reinforcement Learning | Yueqian Wang et.al. | 2512.06810 | translate | read | null |
| 2025-12-07 | Optimal and Diffusion Transports in Machine Learning | Gabriel Peyré et.al. | 2512.06797 | translate | read | null |
| 2025-12-07 | LLM4SFC: Sequential Function Chart Generation via Large Language Models | Ofek Glick et.al. | 2512.06787 | translate | read | null |
| 2025-12-07 | From Description to Score: Can LLMs Quantify Vulnerabilities? | Sima Jafarikhah et.al. | 2512.06781 | translate | read | null |
| 2025-12-07 | From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs | Yuchuan Tian et.al. | 2512.06776 | translate | read | link |
| 2025-12-07 | Becoming Experienced Judges: Selective Test-Time Learning for Evaluators | Seungyeon Jwa et.al. | 2512.06751 | translate | read | null |
| 2025-12-07 | DoVer: Intervention-Driven Auto Debugging for LLM Multi-Agent Systems | Ming Ma et.al. | 2512.06749 | translate | read | null |
| 2025-12-07 | PrivLLMSwarm: Privacy-Preserving LLM-Driven UAV Swarms for Secure IoT Surveillance | Jifar Wakuma Ayana et.al. | 2512.06747 | translate | read | null |
| 2025-12-07 | A Patient-Doctor-NLP-System to contest inequality for less privileged | Subrit Dikshit et.al. | 2512.06734 | translate | read | null |
| 2025-12-07 | “The Dentist is an involved parent, the bartender is not”: Revealing Implicit Biases in QA with Implicit BBQ | Aarushi Wagh et.al. | 2512.06732 | translate | read | null |
| 2025-12-07 | KV-CAR: KV Cache Compression using Autoencoders and KV Reuse in Large Language Models | Sourjya Roy et.al. | 2512.06727 | translate | read | null |
| 2025-12-07 | The Role of Entropy in Visual Grounding: Analysis and Optimization | Shuo Li et.al. | 2512.06726 | translate | read | null |
| 2025-12-07 | ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems | Bufang Yang et.al. | 2512.06721 | translate | read | null |
| 2025-12-07 | Cognitive Control Architecture (CCA): A Lifecycle Supervision Framework for Robustly Aligned AI Agents | Zhibo Liang et.al. | 2512.06716 | translate | read | null |