LLM - 2026-02
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2026-02-28 | Learning Nested Named Entity Recognition from Flat Annotations | Igor Rozhkov et.al. | 2603.00840 | translate | read | null |
| 2026-02-28 | Constitutional Black-Box Monitoring for Scheming in LLM Agents | Simon Storf et.al. | 2603.00829 | translate | read | null |
| 2026-02-28 | A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations | Hossein Javidnia et.al. | 2603.00824 | translate | read | null |
| 2026-02-28 | A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction | Ruihao Pan et.al. | 2603.00823 | translate | read | null |
| 2026-02-28 | ContextCov: Deriving and Enforcing Executable Constraints from Agent Instruction Files | Reshabh K Sharma et.al. | 2603.00822 | translate | read | null |
| 2026-02-28 | From Dyads to Groups: Rethinking Emotional Support with Conversational AI | Yuqing Hu et.al. | 2603.00797 | translate | read | null |
| 2026-02-28 | Identifying the Geographic Foci of US Local News | Gangani Ariyarathne et.al. | 2603.00787 | translate | read | null |
| 2026-02-28 | Structure Matters: Evaluating Multi-Agents Orchestration in Generative Therapeutic Chatbots | Sina Elahimanesh et.al. | 2603.00774 | translate | read | null |
| 2026-02-28 | LLM-Powered Automatic Theorem Proving and Synthesis for Hybrid Systems and Game | Aditi Kabra et.al. | 2603.00737 | translate | read | null |
| 2026-02-28 | RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models | Andrew Zhuoer Feng et.al. | 2603.00724 | translate | read | null |
| 2026-02-28 | MARS: Harmonizing Multimodal Convergence via Adaptive Rank Search | Minkyoung Cho et.al. | 2603.00720 | translate | read | null |
| 2026-02-28 | DRIV-EX: Counterfactual Explanations for Driving LLMs | Amaia Cardiel et.al. | 2603.00696 | translate | read | null |
| 2026-02-28 | Wild-Drive: Off-Road Scene Captioning and Path Planning via Robust Multi-modal Routing and Efficient Large Language Model | Zihang Wang et.al. | 2603.00694 | translate | read | null |
| 2026-02-28 | RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis | Andrew Zhuoer Feng et.al. | 2603.00686 | translate | read | null |
| 2026-02-28 | Stateful Cross-layer Vision Modulation | Ying Liu et.al. | 2603.00655 | translate | read | null |
| 2026-02-28 | Historian: Reducing Manual Validation in APR Benchmarking via Evidence-Based Assessment | Sahand Moslemi et.al. | 2603.00649 | translate | read | null |
| 2026-02-28 | RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation | Jin Zeng et.al. | 2603.00638 | translate | read | null |
| 2026-02-28 | TraceSIR: A Multi-Agent Framework for Structured Analysis and Reporting of Agentic Execution Traces | Shu-Xun Yang et.al. | 2603.00623 | translate | read | null |
| 2026-02-28 | PlantWhisperer: Designing Conversational AI to Support Plant Care | Daniel Mejer Christensen et.al. | 2603.00598 | translate | read | null |
| 2026-02-28 | UNICBench: UNIfied Counting Benchmark for MLLM | Chenggang Rong et.al. | 2603.00595 | translate | read | null |
| 2026-02-28 | Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs | Yiran Zhao et.al. | 2603.00590 | translate | read | null |
| 2026-02-28 | Energy-Efficient Information Representation in MNIST Classification Using Biologically Inspired Learning | Patrick Stricker et.al. | 2603.00588 | translate | read | null |
| 2026-02-28 | Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research | Yubo Dong et.al. | 2603.00582 | translate | read | null |
| 2026-02-28 | CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging | Jie Cao et.al. | 2603.00573 | translate | read | null |
| 2026-02-28 | MIDAS: Multi-Image Dispersion and Semantic Reconstruction for Jailbreaking MLLMs | Yilian Liu et.al. | 2603.00565 | translate | read | null |
| 2026-02-28 | Advancing Multimodal Judge Models through a Capability-Oriented Benchmark and MCTS-Driven Data Generation | Zeyu Chen et.al. | 2603.00546 | translate | read | null |
| 2026-02-28 | LOGIGEN: Logic-Driven Generation of Verifiable Agentic Tasks | Yucheng Zeng et.al. | 2603.00540 | translate | read | null |
| 2026-02-28 | Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement | Haolin Jin et.al. | 2603.00539 | translate | read | null |
| 2026-02-28 | CaptionFool: Universal Image Captioning Model Attacks | Swapnil Parekh et.al. | 2603.00529 | translate | read | null |
| 2026-02-28 | ProtegoFed: Backdoor-Free Federated Instruction Tuning with Interspersed Poisoned Data | Haodong Zhao et.al. | 2603.00516 | translate | read | null |
| 2026-02-28 | MLLM-4D: Towards Visual-based Spatial-Temporal Intelligence | Xingyilang Yin et.al. | 2603.00515 | translate | read | null |
| 2026-02-28 | Multimodal Adaptive Retrieval Augmented Generation through Internal Representation Learning | Ruoshuang Du et.al. | 2603.00511 | translate | read | null |
| 2026-02-28 | What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models | Yingqi Fan et.al. | 2603.00510 | translate | read | null |
| 2026-02-28 | M$^2$: Dual-Memory Augmentation for Long-Horizon Web Agents via Trajectory Summarization and Insight Retrieval | Dawei Yan et.al. | 2603.00503 | translate | read | null |
| 2026-02-28 | WirelessAgent++: Automated Agentic Workflow Design and Benchmarking for Wireless Networks | Jingwen Tong et.al. | 2603.00501 | translate | read | null |
| 2026-02-28 | Zero-Shot Robotic Manipulation via 3D Gaussian Splatting-Enhanced Multimodal Retrieval-Augmented Generation | Zilong Xie et.al. | 2603.00500 | translate | read | null |
| 2026-02-28 | Antibody: Strengthening Defense Against Harmful Fine-Tuning for Large Language Models via Attenuating Harmful Gradient Influence | Quoc Minh Nguyen et.al. | 2603.00498 | translate | read | null |
| 2026-02-28 | LifeEval: A Multimodal Benchmark for Assistive AI in Egocentric Daily Life Tasks | Hengjian Gao et.al. | 2603.00490 | translate | read | null |
| 2026-02-28 | Does My README File Need To Be Updated? Exploring LLM-Based README Maintenance | Haoyu Gao et.al. | 2603.00489 | translate | read | null |
| 2026-02-28 | Wireless Power Control Based on Large Language Models | Jiacheng Wang et.al. | 2603.00474 | translate | read | null |
| 2026-02-28 | Optimizing In-Context Demonstrations for LLM-based Automated Grading | Yucheng Chu et.al. | 2603.00465 | translate | read | null |
| 2026-02-28 | MED-COPILOT: A Medical Assistant Powered by GraphRAG and Similar Patient Case Retrieval | Shuheng Chen et.al. | 2603.00460 | translate | read | null |
| 2026-02-28 | Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training | Xi Wang et.al. | 2603.00454 | translate | read | null |
| 2026-02-28 | Confusion-Aware Rubric Optimization for LLM-based Automated Grading | Yucheng Chu et.al. | 2603.00451 | translate | read | null |
| 2026-02-28 | SesaHand: Enhancing 3D Hand Reconstruction via Controllable Generation with Semantic and Structural Alignment | Zhuoran Zhao et.al. | 2603.00443 | translate | read | null |
| 2026-02-28 | ROKA: Robust Knowledge Unlearning against Adversaries | Jinmyeong Shin et.al. | 2603.00436 | translate | read | null |
| 2026-02-28 | Personalities at Play: Probing Alignment in AI Teammates | Mohammad Amin Samadi et.al. | 2603.00429 | translate | read | null |
| 2026-02-28 | LLM-Bootstrapped Targeted Finding Guidance for Factual MLLM-based Medical Report Generation | Cunyuan Yang et.al. | 2603.00426 | translate | read | null |
| 2026-02-28 | SSR: Pushing the Limit of Spatial Intelligence with Structured Scene Reasoning | Yi Zhang et.al. | 2603.00409 | translate | read | null |
| 2026-02-28 | A Data-Driven Analysis for Engineering Conferences: The Institute of Industrial and Systems Engineering (IISE) Annual Conference Proceedings (2002-2005) | H. Sinan Bank et.al. | 2603.00399 | translate | read | null |
| 2026-02-26 | MediX-R1: Open Ended Medical Reinforcement Learning | Sahal Shaji Mullappilly et.al. | 2602.23363 | translate | read | null |
| 2026-02-26 | Utilizing LLMs for Industrial Process Automation | Salim Fares et.al. | 2602.23331 | translate | read | null |
| 2026-02-26 | Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks | Kunihiro Miyazaki et.al. | 2602.23330 | translate | read | null |
| 2026-02-26 | LLM Novice Uplift on Dual-Use, In Silico Biology Tasks | Chen Bo Calvin Zhang et.al. | 2602.23329 | translate | read | null |
| 2026-02-26 | Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction | Rafael R. Baptista et.al. | 2602.23312 | translate | read | null |
| 2026-02-26 | ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding | Yiran Guan et.al. | 2602.23306 | translate | read | null |
| 2026-02-26 | A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations | Soumya Dutta et.al. | 2602.23300 | translate | read | null |
| 2026-02-26 | CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays | Hyungyung Lee et.al. | 2602.23276 | translate | read | null |
| 2026-02-26 | Mitigating Legibility Tax with Decoupled Prover-Verifier Games | Yegon Kim et.al. | 2602.23248 | translate | read | null |
| 2026-02-26 | Agency and Architectural Limits: Why Optimization-Based Systems Cannot Be Norm-Responsive | Radha Sarma et.al. | 2602.23239 | translate | read | null |
| 2026-02-26 | MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction | Yizhi Li et.al. | 2602.23228 | translate | read | null |
| 2026-02-26 | STELLAR: Storage Tuning Engine Leveraging LLM Autonomous Reasoning for High Performance Parallel File Systems | Chris Egersdoerfer et.al. | 2602.23220 | translate | read | null |
| 2026-02-26 | InnerQ: Hardware-aware Tuning-free Quantization of KV Cache for Large Language Models | Sayed Mohammadreza Tayaranian Hosseini et.al. | 2602.23200 | translate | read | null |
| 2026-02-26 | SC-Arena: A Natural Language Benchmark for Single-Cell Reasoning with Knowledge-Augmented Evaluation | Jiahao Zhao et.al. | 2602.23199 | translate | read | null |
| 2026-02-26 | Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models | Chungpa Lee et.al. | 2602.23197 | translate | read | null |
| 2026-02-26 | ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering | Elzo Brito dos Santos Filho et.al. | 2602.23193 | translate | read | null |
| 2026-02-26 | MTRAG-UN: A Benchmark for Open Challenges in Multi-Turn RAG Conversations | Sara Rosenthal et.al. | 2602.23184 | translate | read | null |
| 2026-02-26 | A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring | Usman Anwar et.al. | 2602.23163 | translate | read | null |
| 2026-02-26 | Multi-Agent Large Language Model Based Emotional Detoxification Through Personalized Intensity Control for Consumer Protection | Keito Inoshita et.al. | 2602.23123 | translate | read | null |
| 2026-02-26 | Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design | Zhuoliang Xie et.al. | 2602.23092 | translate | read | null |
| 2026-02-26 | Cytoarchitecture in Words: Weakly Supervised Vision-Language Modeling for Human Brain Microscopy | Matthew Sutton et.al. | 2602.23088 | translate | read | null |
| 2026-02-26 | Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent | Boyang Zhang et.al. | 2602.23079 | translate | read | null |
| 2026-02-26 | CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery | Mengze Hong et.al. | 2602.23075 | translate | read | null |
| 2026-02-26 | TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment | Trung Dang et.al. | 2602.23068 | translate | read | null |
| 2026-02-26 | LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer | Kunpeng Zhang et.al. | 2602.23065 | translate | read | null |
| 2026-02-26 | Toward Automatic Filling of Case Report Forms: A Case Study on Data from an Italian Emergency Department | Gabriela Anna Kaczmarek et.al. | 2602.23062 | translate | read | null |
| 2026-02-26 | CL4SE: A Context Learning Benchmark For Software Engineering Tasks | Haichuan Hu et.al. | 2602.23047 | translate | read | null |
| 2026-02-26 | LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure | Jaehong Cho et.al. | 2602.23036 | translate | read | null |
| 2026-02-26 | WISER: Wider Search, Deeper Thinking, and Adaptive Fusion for Training-Free Zero-Shot Composed Image Retrieval | Tianyue Wang et.al. | 2602.23029 | translate | read | null |
| 2026-02-26 | Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization | Zeyuan Liu et.al. | 2602.23008 | translate | read | null |
| 2026-02-26 | Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search | Xun Huang et.al. | 2602.22983 | translate | read | null |
| 2026-02-26 | Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots | Dimitrios P. Panagoulias et.al. | 2602.22973 | translate | read | null |
| 2026-02-26 | SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy | Peiyao Xiao et.al. | 2602.22971 | translate | read | null |
| 2026-02-26 | Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression | Yifeng Guan et.al. | 2602.22967 | translate | read | null |
| 2026-02-26 | FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning | Zehao Li et.al. | 2602.22963 | translate | read | null |
| 2026-02-26 | Can Agents Distinguish Visually Hard-to-Separate Diseases in a Zero-Shot Setting? A Pilot Study | Zihao Zhao et.al. | 2602.22959 | translate | read | null |
| 2026-02-26 | ClawMobile: Rethinking Smartphone-Native Agentic Systems | Hongchao Du et.al. | 2602.22942 | translate | read | null |
| 2026-02-26 | MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding | Wenhui Tan et.al. | 2602.22932 | translate | read | null |
| 2026-02-26 | SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress | Yang Yu et.al. | 2602.22913 | translate | read | null |
| 2026-02-26 | PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA | Yunpeng Hong et.al. | 2602.22903 | translate | read | null |
| 2026-02-26 | Towards LLM-Empowered Knowledge Tracing via LLM-Student Hierarchical Behavior Alignment in Hyperbolic Space | Xingcheng Fu et.al. | 2602.22879 | translate | read | null |
| 2026-02-26 | Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching | Roy Miles et.al. | 2602.22871 | translate | read | null |
| 2026-02-26 | Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference | Yushi Ye et.al. | 2602.22868 | translate | read | null |
| 2026-02-26 | TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought | Jianmin Li et.al. | 2602.22828 | translate | read | null |
| 2026-02-26 | TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models | Reihaneh Iranmanesh et.al. | 2602.22827 | translate | read | null |
| 2026-02-26 | Hierarchy-of-Groups Policy Optimization for Long-Horizon Agentic Tasks | Shuo He et.al. | 2602.22817 | translate | read | null |
| 2026-02-26 | MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks | Shiqian Su et.al. | 2602.22808 | translate | read | null |
| 2026-02-26 | Natural Language Declarative Prompting (NLD-P): A Modular Governance Method for Prompt Design Under Model Drift | Hyunwoo Kim et.al. | 2602.22790 | translate | read | null |
| 2026-02-26 | Probing for Knowledge Attribution in Large Language Models | Ivo Brink et.al. | 2602.22787 | translate | read | null |
| 2026-02-26 | ClinDet-Bench: Beyond Abstention, Evaluating Judgment Determinability of LLMs in Clinical Decision-Making | Yusuke Watanabe et.al. | 2602.22771 | translate | read | null |
| 2026-02-26 | AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications | Yujie Zhao et.al. | 2602.22769 | translate | read | null |
| 2026-02-26 | Imagination Helps Visual Reasoning, But Not Yet in Latent Space | You Li et.al. | 2602.22766 | translate | read | null |
| 2026-02-26 | Towards Better RL Training Data Utilization via Second-Order Rollout | Zhe Yang et.al. | 2602.22765 | translate | read | null |
| 2026-02-26 | Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study | Philipp Wiesner et.al. | 2602.22760 | translate | read | null |
| 2026-02-26 | Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction | Nils Schwager et.al. | 2602.22752 | translate | read | null |
| 2026-02-26 | Generative Recommendation for Large-Scale Advertising | Ben Xue et.al. | 2602.22732 | translate | read | null |
| 2026-02-26 | Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks | Jakub Šmíd et.al. | 2602.22730 | translate | read | null |
| 2026-02-26 | AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification | Tian Zhang et.al. | 2602.22724 | translate | read | null |
| 2026-02-26 | Replacing Multi-Step Assembly of Data Preparation Pipelines with One-Step LLM Pipeline Generation for Table QA | Fengyu Li et.al. | 2602.22721 | translate | read | null |
| 2026-02-26 | RLHFless: Serverless Computing for Efficient RLHF | Rui Wei et.al. | 2602.22718 | translate | read | null |
| 2026-02-26 | SoPE: Spherical Coordinate-Based Positional Embedding for Enhancing Spatial Perception of 3D LVLMs | Guanting Ye et.al. | 2602.22716 | translate | read | null |
| 2026-02-26 | LLM-driven discovery for carbon allotropes with bond-network entropy | Yuzhou Hao et.al. | 2602.22706 | translate | read | null |
| 2026-02-26 | IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation | Yanpei Guo et.al. | 2602.22700 | translate | read | null |
| 2026-02-26 | Tokenization, Fusion and Decoupling: Bridging the Granularity Mismatch Between Large Language Models and Knowledge Graphs | Siyue Su et.al. | 2602.22698 | translate | read | null |
| 2026-02-26 | Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue | Ning Gao et.al. | 2602.22697 | translate | read | null |
| 2026-02-26 | SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses | Zhuohang Jiang et.al. | 2602.22683 | translate | read | null |
| 2026-02-26 | Accelerating LLM Pre-Training through Flat-Direction Dynamics Enhancement | Shuchen Zhu et.al. | 2602.22681 | translate | read | null |
| 2026-02-26 | Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions | Yue Xu et.al. | 2602.22680 | translate | read | null |
| 2026-02-26 | Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning | Qin-Wen Luo et.al. | 2602.22642 | translate | read | null |
| 2026-02-26 | MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios | Zhiheng Song et.al. | 2602.22638 | translate | read | null |
| 2026-02-26 | Fine-grained Semantics Integration for Large Language Model-based Recommendation | Jiawen Feng et.al. | 2602.22632 | translate | read | null |
| 2026-02-26 | Instruction-based Image Editing with Planning, Reasoning, and Generation | Liya Ji et.al. | 2602.22624 | translate | read | null |
| 2026-02-26 | Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA | Hai Huang et.al. | 2602.22617 | translate | read | null |
| 2026-02-26 | Transformers converge to invariant algorithmic cores | Joshua S. Schiffman et.al. | 2602.22600 | translate | read | null |
| 2026-02-26 | FLYING SERVING: On-the-Fly Parallelism Switching for Large Language Model Serving | Shouwei Gao et.al. | 2602.22593 | translate | read | null |
| 2026-02-26 | pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training | Wenzheng Zhang et.al. | 2602.22592 | translate | read | null |
| 2026-02-26 | Where Relevance Emerges: A Layer-Wise Study of Internal Attention for Zero-Shot Re-Ranking | Haodong Chen et.al. | 2602.22591 | translate | read | null |
| 2026-02-26 | Search-P1: Path-Centric Reward Shaping for Stable and Efficient Agentic RAG Training | Tianle Xia et.al. | 2602.22576 | translate | read | null |
| 2026-02-26 | Addressing Climate Action Misperceptions with Generative AI | Miriam Remshard et.al. | 2602.22564 | translate | read | null |
| 2026-02-26 | Layer-Targeted Multilingual Knowledge Erasure in Large Language Models | Taoran Li et.al. | 2602.22562 | translate | read | null |
| 2026-02-26 | CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety | Umid Suleymanov et.al. | 2602.22557 | translate | read | null |
| 2026-02-26 | Autoregressive Visual Decoding from EEG Signals | Sicheng Dai et.al. | 2602.22555 | translate | read | null |
| 2026-02-26 | Multilingual Safety Alignment Via Sparse Weight Editing | Jiaming Liang et.al. | 2602.22554 | translate | read | null |
| 2026-02-26 | Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention | Zhiming Wang et.al. | 2602.22546 | translate | read | null |
| 2026-02-26 | Ruyi2 Technical Report | Huan Song et.al. | 2602.22543 | translate | read | null |
| 2026-02-26 | Agentic AI for Intent-driven Optimization in Cell-free O-RAN | Mohammad Hossein Shokouhi et.al. | 2602.22539 | translate | read | null |
| 2026-02-26 | Generative Agents Navigating Digital Libraries | Saber Zerhoudi et.al. | 2602.22529 | translate | read | null |
| 2026-02-26 | Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o | Samay Bhojwani et.al. | 2602.22524 | translate | read | null |
| 2026-02-26 | Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents | Ryan Liu et.al. | 2602.22523 | translate | read | null |
| 2026-02-26 | Pix2Key: Controllable Open-Vocabulary Retrieval with Semantic Decomposition and Self-Supervised Visual Dictionary Learning | Guoyizhe Wei et.al. | 2602.22510 | translate | read | null |
| 2026-02-26 | Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models | Ik-hwan Kim et.al. | 2602.22508 | translate | read | null |
| 2026-02-26 | Mapping the Landscape of Artificial Intelligence in Life Cycle Assessment Using Large Language Models | Anastasija Mensikova et.al. | 2602.22500 | translate | read | null |
| 2026-02-26 | Reinforcement-aware Knowledge Distillation for LLM Reasoning | Zhaoyang Zhang et.al. | 2602.22495 | translate | read | null |
| 2026-02-25 | Importance of Prompt Optimisation for Error Detection in Medical Notes Using Language Models | Craig Myles et.al. | 2602.22483 | translate | read | null |
| 2026-02-25 | Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models | Binchi Zhang et.al. | 2602.22475 | translate | read | null |
| 2026-02-25 | ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization | Joseph Tso et.al. | 2602.22465 | translate | read | null |
| 2026-02-25 | CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling | Dong Xu et.al. | 2602.22457 | translate | read | null |
| 2026-02-25 | Automating the Detection of Requirement Dependencies Using Large Language Models | Ikram Darif et.al. | 2602.22456 | translate | read | null |
| 2026-02-25 | Exploring Multimodal LMMs for Online Episodic Memory Question Answering on the Edge | Giuseppe Lando et.al. | 2602.22455 | translate | read | null |
| 2026-02-25 | CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines | Chayan Banerjee et.al. | 2602.22452 | translate | read | null |
| 2026-02-25 | Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace | Qianlong Lan et.al. | 2602.22450 | translate | read | null |
| 2026-02-25 | A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines | Gaoyuan Du et.al. | 2602.22442 | translate | read | null |
| 2026-02-25 | HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems | Idan Habler et.al. | 2602.22427 | translate | read | null |
| 2026-02-25 | SimpleOCR: Rendering Visualized Questions to Teach MLLMs to Read | Yibo Peng et.al. | 2602.22426 | translate | read | null |
| 2026-02-25 | Causality $\neq$ Invariance: Function and Concept Vectors in LLMs | Gustaw Opiełka et.al. | 2602.22424 | translate | read | null |
| 2026-02-25 | Seeing Graphs Like Humans: Benchmarking Computational Measures and MLLMs for Similarity Assessment | Seokweon Jung et.al. | 2602.22416 | translate | read | null |
| 2026-02-25 | Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents | Cosmo Santoni et.al. | 2602.22402 | translate | read | null |
| 2026-02-25 | VoiceAlign: A Shimming Layer for Enhancing the Usability of Legacy Voice User Interface Systems | Md Ehtesham-Ul-Haque et.al. | 2602.22374 | translate | read | null |
| 2026-02-25 | EyeLayer: Integrating Human Attention Patterns into LLM-Based Code Summarization | Jiahao Zhang et.al. | 2602.22368 | translate | read | null |
| 2026-02-25 | E3VA: Enhancing Emotional Expressiveness in Virtual Conversational Agents | Abhishek Kulkarni et.al. | 2602.22362 | translate | read | null |
| 2026-02-25 | Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts | Arno Simons et.al. | 2602.22359 | translate | read | null |
| 2026-02-25 | STILTS-NLI: A Natural Language Interface for STILTS | R. A. Shaw et.al. | 2602.22357 | translate | read | null |
| 2026-02-25 | Decoder-based Sense Knowledge Distillation | Qitong Wang et.al. | 2602.22351 | translate | read | null |
| 2026-02-25 | Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory | Davide Ettori et.al. | 2602.22345 | translate | read | null |
| 2026-02-25 | Conversational Successes and Breakdowns in Everyday Non-Display Smart Glasses Use | Xiuqi Tommy Zhu et.al. | 2602.22340 | translate | read | null |
| 2026-02-25 | Decoding the Hook: A Multimodal LLM Framework for Analyzing the Hooking Period of Video Ads | Kunpeng Zhang et.al. | 2602.22299 | translate | read | null |
| 2026-02-25 | UpSkill: Mutual Information Skill Learning for Structured Response Diversity in LLMs | Devan Shah et.al. | 2602.22296 | translate | read | null |
| 2026-02-25 | Manifold of Failure: Behavioral Attraction Basins in Language Models | Sarthak Munshi et.al. | 2602.22291 | translate | read | null |
| 2026-02-25 | OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data | Yan Zhao et.al. | 2602.22286 | translate | read | null |
| 2026-02-25 | BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning | Mingi Kim et.al. | 2602.22284 | translate | read | null |
| 2026-02-25 | Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion | Md. Tahsin Amin et.al. | 2602.22280 | translate | read | null |
| 2026-02-25 | RETLLM: Training and Data-Free MLLMs for Multimodal Information Retrieval | Dawei Su et.al. | 2602.22278 | translate | read | null |
| 2026-02-25 | EmpiRE-Compass: A Neuro-Symbolic Dashboard for Sustainable and Dynamic Knowledge Exploration, Synthesis, and Reuse | Oliver Karras et.al. | 2602.22276 | translate | read | null |
| 2026-02-25 | Sustainable LLM Inference using Context-Aware Model Switching | Yuvarani et.al. | 2602.22261 | translate | read | null |
| 2026-02-24 | A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM | Morteza Eskandarian et.al. | 2602.22250 | translate | read | null |
| 2026-02-24 | Accelerating Incident Response: A Hybrid Approach for Data Breach Reporting | Aurora Arrus et.al. | 2602.22244 | translate | read | null |
| 2026-02-24 | Analysis of LLMs Against Prompt Injection and Jailbreak Attacks | Piyush Jaiswal et.al. | 2602.22242 | translate | read | null |
| 2026-02-24 | From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation | Linus Bantel et.al. | 2602.22240 | translate | read | null |
| 2026-02-23 | CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction | Rabeya Tus Sadia et.al. | 2602.22236 | translate | read | null |
| 2026-02-25 | Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets | Hanna Yukhymenko et.al. | 2602.22207 | translate | read | null |
| 2026-02-25 | A Taxonomy of Human–MLLM Interaction in Early-Stage Sketch-Based Design Ideation | Weiayn Shi et.al. | 2602.22171 | translate | read | null |
| 2026-02-25 | LLMTailor: A Layer-wise Tailoring Tool for Efficient Checkpointing of Large Language Models | Minqiu Sun et.al. | 2602.22158 | translate | read | null |
| 2026-02-25 | Dynamic Personality Adaptation in Large Language Models via State Machines | Leon Pielage et.al. | 2602.22157 | translate | read | null |
| 2026-02-25 | Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual | Yining Li et.al. | 2602.22146 | translate | read | null |
| 2026-02-25 | When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models | Satyam Kumar Navneet et.al. | 2602.22145 | translate | read | null |
| 2026-02-25 | WeaveTime: Stream from Earlier Frames into Emergent Memory in VideoLLMs | Yulin Zhang et.al. | 2602.22142 | translate | read | null |
| 2026-02-25 | Confidence-Driven Multi-Scale Model Selection for Cost-Efficient Inference | Bo-Wei Chen et.al. | 2602.22090 | translate | read | null |
| 2026-02-25 | ViSTAR: Virtual Skill Training with Augmented Reality with 3D Avatars and LLM coaching agent | Chunggi Lee et.al. | 2602.22077 | translate | read | null |
| 2026-02-25 | Understanding Artificial Theory of Mind: Perturbed Tasks and Reasoning in Large Language Models | Christian Nickel et.al. | 2602.22072 | translate | read | null |
| 2026-02-25 | Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts | Jessica Y. Bo et.al. | 2602.22070 | translate | read | null |
| 2026-02-25 | DLT-Corpus: A Large-Scale Text Collection for the Distributed Ledger Technology Domain | Walter Hernandez Cruz et.al. | 2602.22045 | translate | read | null |
| 2026-02-25 | RT-RMOT: A Dataset and Framework for RGB-Thermal Referring Multi-Object Tracking | Yanqiu Yu et.al. | 2602.22033 | translate | read | null |
| 2026-02-25 | Enhancing LLM-Based Test Generation by Eliminating Covered Code | WeiZhe Xu et.al. | 2602.21997 | translate | read | null |
| 2026-02-25 | CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models | Miyu Oba et.al. | 2602.21978 | translate | read | null |
| 2026-02-25 | Global-Aware Edge Prioritization for Pose Graph Initialization | Tong Wei et.al. | 2602.21963 | translate | read | null |
| 2026-02-25 | Global-Local Dual Perception for MLLMs in High-Resolution Text-Rich Image Translation | Junxin Lu et.al. | 2602.21956 | translate | read | null |
| 2026-02-25 | RADAR: Reasoning as Discrimination with Aligned Representations for LLM-based Knowledge Graph Reasoning | Bo Xue et.al. | 2602.21951 | translate | read | null |
| 2026-02-25 | MEDSYN: Benchmarking Multi-EviDence SYNthesis in Complex Clinical Cases for Multimodal Large Language Models | Boqi Chen et.al. | 2602.21950 | translate | read | null |
| 2026-02-25 | Large Language Models are Algorithmically Blind | Sohan Venkatesh et.al. | 2602.21947 | translate | read | null |
| 2026-02-25 | Hidden Topics: Measuring Sensitive AI Beliefs with List Experiments | Maxim Chupilkin et.al. | 2602.21939 | translate | read | null |
| 2026-02-25 | Small Wins Big: Comparing Large Language Models and Domain Fine-Tuned Models for Sarcasm Detection in Code-Mixed Hinglish Text | Bitan Majumder et.al. | 2602.21933 | translate | read | null |
| 2026-02-25 | EmoOmni: Bridging Emotional Understanding and Expression in Omni-Modal LLMs | Wenjie Tian et.al. | 2602.21900 | translate | read | null |
| 2026-02-25 | APFuzz: Towards Automatic Greybox Protocol Fuzzing | Yu Wang et.al. | 2602.21892 | translate | read | null |
| 2026-02-25 | How to Take a Memorable Picture? Empowering Users with Actionable Feedback | Francesco Laiti et.al. | 2602.21877 | translate | read | null |
| 2026-02-25 | Personalized Graph-Empowered Large Language Model for Proactive Information Access | Chia Cheng Chang et.al. | 2602.21862 | translate | read | null |
| 2026-02-25 | ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices | Dezhi Kong et.al. | 2602.21858 | translate | read | null |
| 2026-02-25 | FewMMBench: A Benchmark for Multimodal Few-Shot Learning | Mustafa Dogan et.al. | 2602.21854 | translate | read | null |
| 2026-02-25 | From Restructuring to Stabilization: A Large-Scale Experiment on Iterative Code Readability Refactoring with Large Language Models | Norman Peitek et.al. | 2602.21833 | translate | read | null |
| 2026-02-25 | A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios | Kimberly T. Mai et.al. | 2602.21831 | translate | read | null |
| 2026-02-25 | SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model | Guibin Chen et.al. | 2602.21818 | translate | read | null |
| 2026-02-25 | Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem | Heejin Jo et.al. | 2602.21814 | translate | read | null |
| 2026-02-25 | An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention | Madhusudan Ghosh et.al. | 2602.21800 | translate | read | null |
| 2026-02-25 | DHP: Efficient Scaling of MLLM Training with Dynamic Hybrid Parallelism | Yifan Niu et.al. | 2602.21788 | translate | read | null |
| 2026-02-25 | D-COT: Disciplined Chain-of-Thought Learning for Efficient Reasoning in Small Language Models | Shunsuke Ubukata et.al. | 2602.21786 | translate | read | null |
| 2026-02-25 | Therapist-Robot-Patient Physical Interaction is Worth a Thousand Words: Enabling Intuitive Therapist Guidance via Remote Haptic Control | Beatrice Luciani et.al. | 2602.21783 | translate | read | null |
| 2026-02-25 | Generalisation of RLHF under Reward Shift and Clipped KL Regularisation | Kenton Tang et.al. | 2602.21765 | translate | read | null |
| 2026-02-25 | Improving Implicit Discourse Relation Recognition with Natural Language Explanations from LLMs | Heng Wang et.al. | 2602.21763 | translate | read | null |
| 2026-02-25 | Offline Reasoning for Efficient Recommendation: LLM-Empowered Persona-Profiled Item Indexing | Deogyong Kim et.al. | 2602.21756 | translate | read | null |
| 2026-02-25 | From Words to Amino Acids: Does the Curse of Depth Persist? | Aleena Siji et.al. | 2602.21750 | translate | read | null |
| 2026-02-25 | Enhancing Multi-Modal LLMs Reasoning via Difficulty-Aware Group Normalization | Jinghan Li et.al. | 2602.21743 | translate | read | null |
| 2026-02-25 | Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling | Shiqi Yan et.al. | 2602.21728 | translate | read | null |
| 2026-02-25 | TranX-Adapter: Bridging Artifacts and Semantics within MLLMs for Robust AI-generated Image Detection | Wenbin Wang et.al. | 2602.21716 | translate | read | null |
| 2026-02-25 | Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach | Xu Yang et.al. | 2602.21715 | translate | read | null |
| 2026-02-25 | EditFlow: Benchmarking and Optimizing Code Edit Recommendation Systems via Reconstruction of Developer Flows | Chenyan Liu et.al. | 2602.21697 | translate | read | null |
| 2026-02-25 | Hierarchical LLM-Based Multi-Agent Framework with Prompt Optimization for Multi-Robot Task Planning | Tomoya Kawabe et.al. | 2602.21670 | translate | read | null |
| 2026-02-25 | DWA-KD: Dual-Space Weighting and Time-Warped Alignment for Cross-Tokenizer Knowledge Distillation | Duc Trung Vu et.al. | 2602.21669 | translate | read | null |
| 2026-02-25 | CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning | Zhijiang Tang et.al. | 2602.21655 | translate | read | null |
| 2026-02-25 | Irresponsible Counselors: Large Language Models and the Loneliness of Modern Humans | Abas Bertina et.al. | 2602.21653 | translate | read | null |
| 2026-02-25 | Sparsity Induction for Accurate Post-Training Pruning of Large Language Models | Minhao Jiang et.al. | 2602.21652 | translate | read | null |
| 2026-02-25 | Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration | Tangsang Chongbang et.al. | 2602.21647 | translate | read | null |
| 2026-02-25 | Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion | Yexing Du et.al. | 2602.21646 | translate | read | null |
| 2026-02-25 | RuCL: Stratified Rubric-Based Curriculum Learning for Multimodal Large Language Model Reasoning | Yukun Chen et.al. | 2602.21628 | translate | read | null |
| 2026-02-25 | Multi-Layer Scheduling for MoE-Based LLM Reasoning | Yifan Sun et.al. | 2602.21626 | translate | read | null |
| 2026-02-25 | Structurally Aligned Subtask-Level Memory for Software Engineering Agents | Kangning Shen et.al. | 2602.21611 | translate | read | null |
| 2026-02-25 | MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification | Kazi Samin Yasar Alam et.al. | 2602.21608 | translate | read | null |
| 2026-02-25 | Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation | Qiange Wang et.al. | 2602.21604 | translate | read | null |
| 2026-02-25 | AQR-HNSW: Accelerating Approximate Nearest Neighbor Search via Density-aware Quantization and Multi-stage Re-ranking | Ganap Ashit Tewary et.al. | 2602.21600 | translate | read | null |
| 2026-02-25 | SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints | Hyungmin Kim et.al. | 2602.21595 | translate | read | null |
| 2026-02-25 | Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection | Zheng Gao et.al. | 2602.21593 | translate | read | null |
| 2026-02-25 | Revisiting RAG Retrievers: An Information Theoretic Benchmark | Wenqing Zheng et.al. | 2602.21553 | translate | read | null |
| 2026-02-25 | RAC: Relation-Aware Cache Replacement for Large Language Models | Yuchong Wu et.al. | 2602.21547 | translate | read | null |
| 2026-02-25 | Muon+: Towards Better Muon via One Additional Normalization Step | Ruijie Zhang et.al. | 2602.21545 | translate | read | null |
| 2026-02-25 | Reasoning-Driven Design of Single Atom Catalysts via a Multi-Agent Large Language Model Framework | Dong Hyeon Mok et.al. | 2602.21533 | translate | read | null |
| 2026-02-25 | One Brain, Omni Modalities: Towards Unified Non-Invasive Brain Decoding with Large Language Models | Changli Tang et.al. | 2602.21522 | translate | read | null |
| 2026-02-25 | Beyond Refusal: Probing the Limits of Agentic Self-Correction for Semantic Sensitive Information | Umid Suleymanov et.al. | 2602.21496 | translate | read | null |
| 2026-02-25 | GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning | Ningyuan Yang et.al. | 2602.21492 | translate | read | null |
| 2026-02-25 | Evaluating the Usage of African-American Vernacular English in Large Language Models | Deja Dunlap et.al. | 2602.21485 | translate | read | null |
| 2026-02-25 | The Design Space of Tri-Modal Masked Diffusion Models | Louis Bethune et.al. | 2602.21472 | translate | read | null |
| 2026-02-25 | iMiGUE-Speech: A Spontaneous Speech Dataset for Affective Analysis | Sofoklis Kakouros et.al. | 2602.21464 | translate | read | null |
| 2026-02-25 | Revisiting Text Ranking in Deep Research | Chuan Meng et.al. | 2602.21456 | translate | read | null |
| 2026-02-24 | MINAR: Mechanistic Interpretability for Neural Algorithmic Reasoning | Jesse He et.al. | 2602.21442 | translate | read | null |
| 2026-02-24 | Causal Decoding for Hallucination-Resistant Multimodal Large Language Models | Shiwei Tan et.al. | 2602.21441 | translate | read | null |
| 2026-02-24 | Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning | Yuanda Xu et.al. | 2602.21420 | translate | read | null |
| 2026-02-24 | MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection | Xuan Chen et.al. | 2602.21394 | translate | read | null |
| 2026-02-24 | Interleaved Head Attention | Sai Surya Duvvuri et.al. | 2602.21371 | translate | read | null |
| 2026-02-24 | A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives | Dmitrii Pantiukhin et.al. | 2602.21351 | translate | read | null |
| 2026-02-24 | Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment | Mengxuan Hu et.al. | 2602.21346 | translate | read | null |
| 2026-02-24 | Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data | Emre Can Acikgoz et.al. | 2602.21320 | translate | read | null |
| 2026-02-24 | Shared Nature, Unique Nurture: PRISM for Pluralistic Reasoning via In-context Structure Modeling | Guancheng Tu et.al. | 2602.21317 | translate | read | null |
| 2026-02-24 | Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space | Wang Zixian et.al. | 2602.21269 | translate | read | null |
| 2026-02-24 | Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models | Sasha Robinson et.al. | 2602.21262 | translate | read | null |
| 2026-02-23 | Structured Prompt Language: Declarative Context Management for LLMs | Wen G. Gong et.al. | 2602.21257 | translate | read | null |
| 2026-02-23 | A General Equilibrium Theory of Orchestrated AI Agent Systems | Jean-Philippe Garnier et.al. | 2602.21255 | translate | read | null |
| 2026-02-24 | On Data Engineering for Scaling LLM Terminal Capabilities | Renjie Pi et.al. | 2602.21193 | translate | read | null |
| 2026-02-24 | Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training | Anas Barakat et.al. | 2602.21189 | translate | read | null |
| 2026-02-24 | Seeing Through Words: Controlling Visual Retrieval Quality with Language Models | Jianglin Lu et.al. | 2602.21175 | translate | read | null |
| 2026-02-24 | PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data | Samah Fodeh et.al. | 2602.21165 | translate | read | null |
| 2026-02-24 | ActionReasoning: Robot Action Reasoning in 3D Space with LLM for Robotic Brick Stacking | Guangming Wang et.al. | 2602.21161 | translate | read | null |
| 2026-02-24 | SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards | Dengjia Zhang et.al. | 2602.21158 | translate | read | null |
| 2026-02-24 | Scaling State-Space Models on Multiple GPUs with Tensor Parallelism | Anurag Dutt et.al. | 2602.21144 | translate | read | null |
| 2026-02-24 | A Benchmark for Deep Information Synthesis | Debjit Paul et.al. | 2602.21143 | translate | read | null |
| 2026-02-24 | SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery | David Anugraha et.al. | 2602.21136 | translate | read | null |
| 2026-02-24 | “Are You Sure?”: An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems | Xinfeng Li et.al. | 2602.21127 | translate | read | null |
| 2026-02-24 | Turning Semantics into Topology: LLM-Driven Attribute Augmentation for Collaborative Filtering | Junjie Meng et.al. | 2602.21099 | translate | read | null |
| 2026-02-24 | Can Interest-Bearing Positions Solve the Long-Horizon Problem in Prediction Markets? | Caleb Maresca et.al. | 2602.21091 | translate | read | null |
| 2026-02-24 | Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification | Vishal Patil et.al. | 2602.21082 | translate | read | null |
| 2026-02-24 | An Expert Schema for Evaluating Large Language Model Errors in Scholarly Question-Answering Systems | Anna Martin-Boyle et.al. | 2602.21059 | translate | read | null |
| 2026-02-24 | PaperTrail: A Claim-Evidence Interface for Grounding Provenance in LLM-based Scholarly Q&A | Anna Martin-Boyle et.al. | 2602.21045 | translate | read | null |
| 2026-02-24 | LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification | Yanrui Wu et.al. | 2602.21044 | translate | read | null |
| 2026-02-24 | Generative Pseudo-Labeling for Pre-Ranking with LLMs | Junyu Bi et.al. | 2602.20995 | translate | read | null |
| 2026-02-24 | CrystaL: Spontaneous Emergence of Visual Latents in MLLMs | Yang Zhang et.al. | 2602.20980 | translate | read | null |
| 2026-02-24 | Evaluating Proactive Risk Awareness of Large Language Models | Xuan Luo et.al. | 2602.20976 | translate | read | null |
| 2026-02-24 | Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving | Yuliang Ji et.al. | 2602.20973 | translate | read | null |
| 2026-02-24 | Are Multimodal Large Language Models Good Annotators for Image Tagging? | Ming-Kun Xie et.al. | 2602.20972 | translate | read | null |
| 2026-02-24 | Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models | Paola Merlo et.al. | 2602.20966 | translate | read | null |
| 2026-02-24 | The Art of Efficient Reasoning: Data, Reward, and Optimization | Taiqiang Wu et.al. | 2602.20945 | translate | read | null |
| 2026-02-24 | Extending $μ$P: Spectral Conditions for Feature Learning Across Optimizers | Akshita Gupta et.al. | 2602.20937 | translate | read | null |
| 2026-02-24 | Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence | ChengYou Li et.al. | 2602.20934 | translate | read | null |
| 2026-02-24 | HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG | Yuqi Huang et.al. | 2602.20926 | translate | read | null |
| 2026-02-24 | Predicting Sentence Acceptability Judgments in Multimodal Contexts | Hyewon Jang et.al. | 2602.20918 | translate | read | null |
| 2026-02-24 | LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding | Jihao Qiu et.al. | 2602.20913 | translate | read | null |
| 2026-02-24 | TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering | Hanshen Zhu et.al. | 2602.20903 | translate | read | null |
| 2026-02-24 | SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models | Yuechen Xie et.al. | 2602.20901 | translate | read | null |
| 2026-02-24 | Exa-PSD: a new Persian sentiment analysis dataset on Twitter | Seyed Himan Ghaderi et.al. | 2602.20892 | translate | read | null |
| 2026-02-24 | Diagnosing Causal Reasoning in Vision-Language Models via Structured Relevance Graphs | Dhita Putri Pratama et.al. | 2602.20878 | translate | read | null |
| 2026-02-24 | MUSE: Harnessing Precise and Diverse Semantics for Few-Shot Whole Slide Image Classification | Jiahao Xu et.al. | 2602.20873 | translate | read | null |
| 2026-02-24 | Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset | Jia-Rui Lin et.al. | 2602.20812 | translate | read | null |
| 2026-02-24 | Unseen-Codebases-Domain Data Synthesis and Training Based on Code Graphs | Guangsheng Ou et.al. | 2602.20799 | translate | read | null |
| 2026-02-24 | SPP-SCL: Semi-Push-Pull Supervised Contrastive Learning for Image-Text Sentiment Analysis and Beyond | Jiesheng Wu et.al. | 2602.20767 | translate | read | null |
| 2026-02-24 | Overton Pluralistic Reinforcement Learning for Large Language Models | Yu Fu et.al. | 2602.20759 | translate | read | null |
| 2026-02-24 | Balancing Multiple Objectives in Urban Traffic Control with Reinforcement Learning from AI Feedback | Chenyang Zhao et.al. | 2602.20728 | translate | read | null |
| 2026-02-24 | ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition | Xindian Ma et.al. | 2602.20727 | translate | read | null |
| 2026-02-24 | Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning | Xu Wan et.al. | 2602.20722 | translate | read | null |
| 2026-02-24 | AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs | Che Wang et.al. | 2602.20720 | translate | read | null |
| 2026-02-24 | PackMonitor: Enabling Zero Package Hallucinations Through Decoding-Time Monitoring | Xiting Liu et.al. | 2602.20717 | translate | read | null |
| 2026-02-24 | ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction | Che Wang et.al. | 2602.20708 | translate | read | null |
| 2026-02-24 | PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding | Baolong Bi et.al. | 2602.20696 | translate | read | null |
| 2026-02-24 | Grid-Mind: An LLM-Orchestrated Multi-Fidelity Agent for Automated Connection Impact Assessment | Mohamed Shamseldein et.al. | 2602.20683 | translate | read | null |
| 2026-02-24 | CAMEL: Confidence-Gated Reflection for Reward Modeling | Zirui Zhu et.al. | 2602.20670 | translate | read | null |
| 2026-02-24 | ICSSPulse: A Modular LLM-Assisted Platform for Industrial Control System Penetration Testing | Michail Takaronis et.al. | 2602.20663 | translate | read | null |
| 2026-02-24 | TOM: A Ternary Read-only Memory Accelerator for LLM-powered Edge Intelligence | Hongyi Guan et.al. | 2602.20662 | translate | read | null |
| 2026-02-24 | CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models | Anqi Li et.al. | 2602.20648 | translate | read | null |
| 2026-02-24 | An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation | Fida Khandaker Safa et.al. | 2602.20644 | translate | read | null |
| 2026-02-24 | Grounding LLMs in Scientific Discovery via Embodied Actions | Bo Zhang et.al. | 2602.20639 | translate | read | null |
| 2026-02-24 | QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs | Santiago Gonzalez et.al. | 2602.20629 | translate | read | null |
| 2026-02-24 | Physics-based phenomenological characterization of cross-modal bias in multimodal models | Hyeongmo Kim et.al. | 2602.20624 | translate | read | null |
| 2026-02-24 | SpecMind: Cognitively Inspired, Interactive Multi-Turn Framework for Postcondition Inference | Cuong Chi Le et.al. | 2602.20610 | translate | read | null |
| 2026-02-24 | Efficient and Explainable End-to-End Autonomous Driving via Masked Vision-Language-Action Diffusion | Jiaru Zhang et.al. | 2602.20577 | translate | read | null |
| 2026-02-24 | From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production | Yucheng Shi et.al. | 2602.20558 | translate | read | null |
| 2026-02-24 | Standard Transformers Achieve the Minimax Rate in Nonparametric Regression with $C^{s,λ}$ Targets | Yanming Lai et.al. | 2602.20555 | translate | read | null |
| 2026-02-24 | What Drives Students’ Use of AI Chatbots? Technology Acceptance in Conversational AI | Griffin Pitts et.al. | 2602.20547 | translate | read | null |
| 2026-02-24 | Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training | Zhengyao Gu et.al. | 2602.20532 | translate | read | null |
| 2026-02-24 | FAST-Prefill: FPGA Accelerated Sparse Attention for Long Context LLM Prefill | Rakshith Jayanth et.al. | 2602.20515 | translate | read | null |
| 2026-02-24 | From Performance to Purpose: A Sociotechnical Taxonomy for Evaluating Large Language Model Utility | Gavin Levinson et.al. | 2602.20513 | translate | read | null |
| 2026-02-24 | AWCP: A Workspace Delegation Protocol for Deep-Engagement Collaboration across Remote Agents | Xiaohang Nie et.al. | 2602.20493 | translate | read | null |
| 2026-02-24 | Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA | Nuocheng Yang et.al. | 2602.20492 | translate | read | null |
| 2026-02-24 | Application of Large Language Models for Container Throughput Forecasting: Incorporating Contextual Information in Port Logistics | Minseop Kim et.al. | 2602.20489 | translate | read | null |
| 2026-02-24 | Hybrid LLM-Embedded Dialogue Agents for Learner Reflection: Designing Responsive and Theory-Driven Interactions | Paras Sharma et.al. | 2602.20486 | translate | read | null |
| 2026-02-24 | Oracle-Robust Online Alignment for Large Language Models | Zimeng Li et.al. | 2602.20457 | translate | read | null |
| 2026-02-23 | Emergent Manifold Separability during Reasoning in Large Language Models | Alexandre Polo et.al. | 2602.20338 | translate | read | null |
| 2026-02-23 | DMCD: Semantic-Statistical Framework for Causal Discovery | Samarth KaPatel et.al. | 2602.20333 | translate | read | null |
| 2026-02-23 | No One Size Fits All: QueryBandits for Hallucination Mitigation | Nicole Cho et.al. | 2602.20332 | translate | read | null |
| 2026-02-23 | An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models | Cathy Shyr et.al. | 2602.20324 | translate | read | null |
| 2026-02-23 | What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance | William Watson et.al. | 2602.20300 | translate | read | null |
| 2026-02-23 | InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation | Yu Li et.al. | 2602.20294 | translate | read | null |
| 2026-02-23 | PhantomRun: Auto Repair of Compilation Errors in Embedded Open Source Software | Han Fu et.al. | 2602.20284 | translate | read | null |
| 2026-02-23 | The Truthfulness Spectrum Hypothesis | Zhuofan Josh Ying et.al. | 2602.20273 | translate | read | null |
| 2026-02-23 | HieraMAS: Optimizing Intra-Node LLM Mixtures and Inter-Node Topology for Multi-Agent Systems | Tianjun Yao et.al. | 2602.20229 | translate | read | null |
| 2026-02-23 | Exploring Anti-Aging Literature via ConvexTopics and Large Language Models | Lana E. Yeganova et.al. | 2602.20224 | translate | read | null |
| 2026-02-23 | An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction | Guanting Shen et.al. | 2602.20219 | translate | read | null |
| 2026-02-23 | CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions | Jingwei Shi et.al. | 2602.20213 | translate | read | null |
| 2026-02-22 | Golden Layers and Where to Find Them: Improved Knowledge Editing for Large Language Models Via Layer Gradient Analysis | Shrestha Datta et.al. | 2602.20207 | translate | read | null |
| 2026-02-22 | Mitigating “Epistemic Debt” in Generative AI-Scaffolded Novice Programming using Metacognitive Scripts | Sreecharan Sankaranarayanan et.al. | 2602.20206 | translate | read | null |
| 2026-02-22 | OTPrune: Distribution-Aligned Visual Token Pruning via Optimal Transport | Xiwen Chen et.al. | 2602.20205 | translate | read | null |
| 2026-02-22 | Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study | Jeel Piyushkumar Khatiwala et.al. | 2602.20202 | translate | read | null |
| 2026-02-22 | Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning | Zhuoxu Huang et.al. | 2602.20197 | translate | read | null |
| 2026-02-23 | Do Large Language Models Understand Data Visualization Rules? | Martin Sinnona et.al. | 2602.20137 | translate | read | null |
| 2026-02-23 | KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration | Mohammad Amanlou et.al. | 2602.20135 | translate | read | null |
| 2026-02-23 | AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization | Mert Cemri et.al. | 2602.20133 | translate | read | null |
| 2026-02-23 | To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering | Zaifu Zhan et.al. | 2602.20130 | translate | read | null |
| 2026-02-23 | NanoKnow: How to Know What Your Language Model Knows | Lingwei Gu et.al. | 2602.20122 | translate | read | null |
| 2026-02-23 | BarrierSteer: LLM Safety via Learning Barrier Steering | Thanh Q. Tran et.al. | 2602.20102 | translate | read | null |
| 2026-02-23 | CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching | Yuzhe Wang et.al. | 2602.20094 | translate | read | null |
| 2026-02-23 | How Retrieved Context Shapes Internal Representations in RAG | Samuel Yeh et.al. | 2602.20091 | translate | read | null |
| 2026-02-23 | Do Large Language Models Understand Data Visualization Principles? | Martin Sinnona et.al. | 2602.20084 | translate | read | null |
| 2026-02-23 | Multilingual Large Language Models do not comprehend all natural languages to equal degrees | Natalia Moskvina et.al. | 2602.20065 | translate | read | null |
| 2026-02-23 | The LLMbda Calculus: AI Agents, Conversations, and Information Flow | Zac Garby et.al. | 2602.20064 | translate | read | null |
| 2026-02-23 | Can You Tell It’s AI? Human Perception of Synthetic Voices in Vishing Scenarios | Zoha Hayat Bhatti et.al. | 2602.20061 | translate | read | null |
| 2026-02-23 | Entropy in Large Language Models | Marco Scharringhausen et.al. | 2602.20052 | translate | read | null |
| 2026-02-23 | Closing the gap in multimodal medical representation alignment | Eleonora Grassucci et.al. | 2602.20046 | translate | read | null |
| 2026-02-23 | Let There Be Claws: An Early Social Network Analysis of AI Agents on Moltbook | H. C. W. Price et.al. | 2602.20044 | translate | read | null |
| 2026-02-23 | Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously | Han Bao et.al. | 2602.20042 | translate | read | null |
| 2026-02-23 | AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization | Fahmida Liza Piya et.al. | 2602.20040 | translate | read | null |
| 2026-02-23 | gencat: Generative computerized adaptive testing | Wanyong Feng et.al. | 2602.20020 | translate | read | null |
| 2026-02-23 | ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting | Yuxing Tian et.al. | 2602.19969 | translate | read | null |
| 2026-02-23 | Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval | Yibo Yan et.al. | 2602.19961 | translate | read | null |
| 2026-02-23 | Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming | Ian Steenstra et.al. | 2602.19948 | translate | read | null |
| 2026-02-23 | A Replicate-and-Quantize Strategy for Plug-and-Play Load Balancing of Sparse Mixture-of-Experts LLMs | Zijie Liu et.al. | 2602.19938 | translate | read | null |
| 2026-02-23 | BeamVLM for Low-altitude Economy: Generative Beam Prediction via Vision-language Models | Chenran Kou et.al. | 2602.19929 | translate | read | null |
| 2026-02-23 | Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models | Jin Liu et.al. | 2602.19926 | translate | read | null |
| 2026-02-23 | DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning | Zhongwei Wan et.al. | 2602.19895 | translate | read | null |
| 2026-02-23 | SHIELD: Semantic Heterogeneity Integrated Embedding for Latent Discovery in Clinical Trial Safety Signals | Francois Vandenhende et.al. | 2602.19855 | translate | read | null |
| 2026-02-23 | LLM-enabled Applications Require System-Level Threat Monitoring | Yedi Zhang et.al. | 2602.19844 | translate | read | null |
| 2026-02-23 | SAMAS: A Spectrum-Guided Multi-Agent System for Achieving Style Fidelity in Literary Translation | Jingzhuo Wu et.al. | 2602.19840 | translate | read | null |
| 2026-02-23 | An Explainable Memory Forensics Approach for Malware Analysis | Silvia Lucia Sanna et.al. | 2602.19831 | translate | read | null |
| 2026-02-23 | TextShield-R1: Reinforced Reasoning for Tampered Text Detection | Chenfan Qu et.al. | 2602.19828 | translate | read | null |
| 2026-02-23 | Universal Pose Pretraining for Generalizable Vision-Language-Action Policies | Haitao Lin et.al. | 2602.19710 | translate | read | null |
| 2026-02-23 | “The explanation makes sense”: An Empirical Study on LLM Performance in News Classification and its Influence on Judgment in Human-AI Collaborative Annotation | Qile Wang et.al. | 2602.19690 | translate | read | null |
| 2026-02-23 | KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge | Alex Robertson et.al. | 2602.19643 | translate | read | null |
| 2026-02-23 | Evaluating the Impact of Data Anonymization on Image Retrieval | Marvin Chen et.al. | 2602.19641 | translate | read | null |
| 2026-02-23 | Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering | Chih-Hong Cheng et.al. | 2602.19614 | translate | read | null |
| 2026-02-23 | Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning | Borisiuk Anna et.al. | 2602.19612 | translate | read | null |
| 2026-02-23 | CLCR: Cross-Level Semantic Collaborative Representation for Multimodal Learning | Chunlei Meng et.al. | 2602.19605 | translate | read | null |
| 2026-02-23 | Tri-Subspaces Disentanglement for Multimodal Sentiment Analysis | Chunlei Meng et.al. | 2602.19585 | translate | read | null |
| 2026-02-23 | CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment | Hanwen Liu et.al. | 2602.19574 | translate | read | null |
| 2026-02-23 | Identifying, Explaining, and Correcting Ableist Language with AI | Kynnedy Simone Smith et.al. | 2602.19560 | translate | read | null |
| 2026-02-23 | Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains | Xiaochong Jiang et.al. | 2602.19555 | translate | read | null |
| 2026-02-23 | Vinedresser3D: Agentic Text-guided 3D Editing | Yankuan Chi et.al. | 2602.19542 | translate | read | null |
| 2026-02-23 | Large Language Model-Assisted UAV Operations and Communications: A Multifaceted Survey and Tutorial | Yousef Emami et.al. | 2602.19534 | translate | read | null |
| 2026-02-23 | Ada-RS: Adaptive Rejection Sampling for Selective Thinking | Yirou Ge et.al. | 2602.19519 | translate | read | null |
| 2026-02-23 | Anticipate, Adapt, Act: A Hybrid Framework for Task Planning | Nabanita Dash et.al. | 2602.19518 | translate | read | null |
| 2026-02-23 | Classroom Final Exam: An Instructor-Tested Reasoning Benchmark | Chongyang Gao et.al. | 2602.19517 | translate | read | null |
| 2026-02-23 | Pixel2Phys: Distilling Governing Laws from Visual Dynamics | Ruikun Li et.al. | 2602.19516 | translate | read | null |
| 2026-02-23 | Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference | Arindam Khaled et.al. | 2602.19509 | translate | read | null |
| 2026-02-23 | Conversational AI for Automated Patient Questionnaire Completion: Development Insights and Design Principles | David Fraile Navarro et.al. | 2602.19507 | translate | read | null |
| 2026-02-23 | Test-Time Computing for Referring Multimodal Large Language Models | Mingrui Wu et.al. | 2602.19505 | translate | read | null |
| 2026-02-23 | MICON-Bench: Benchmarking and Enhancing Multi-Image Context Image Generation in Unified Multimodal Models | Mingrui Wu et.al. | 2602.19497 | translate | read | null |
| 2026-02-23 | Botson: An Accessible and Low-Cost Platform for Social Robotics Research | Samuel Bellaire et.al. | 2602.19491 | translate | read | null |
| 2026-02-23 | Can Large Language Models Replace Human Coders? Introducing ContentBench | Michael Haman et.al. | 2602.19467 | translate | read | null |
| 2026-02-23 | SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning | Zelin He et.al. | 2602.19455 | translate | read | null |
| 2026-02-23 | Red-Teaming Claude Opus and ChatGPT-based Security Advisors for Trusted Execution Environments | Kunal Mukherjee et.al. | 2602.19450 | translate | read | null |
| 2026-02-23 | Hepato-LLaVA: An Expert MLLM with Sparse Topo-Pack Attention for Hepatocellular Pathology Analysis on Whole Slide Images | Yuxuan Yang et.al. | 2602.19424 | translate | read | null |
| 2026-02-23 | AuditoryHuM: Auditory Scene Label Generation and Clustering using Human-MLLM Collaboration | Henry Zhong et.al. | 2602.19409 | translate | read | null |
| 2026-02-23 | Multi-CoLoR: Context-Aware Localization and Reasoning across Multi-Language Codebases | Indira Vats et.al. | 2602.19407 | translate | read | null |
| 2026-02-23 | Personalized Prediction of Perceived Message Effectiveness Using Large Language Model Based Digital Twins | Jasmin Han et.al. | 2602.19403 | translate | read | null |
| 2026-02-23 | Hiding in Plain Text: Detecting Concealed Jailbreaks via Activation Disentanglement | Amirhossein Farzam et.al. | 2602.19396 | translate | read | null |
| 2026-02-22 | LLMs Can Learn to Reason Via Off-Policy RL | Daniel Ritter et.al. | 2602.19362 | translate | read | null |
| 2026-02-22 | Compliance Management for Federated Data Processing | Natallia Kokash et.al. | 2602.19360 | translate | read | null |
| 2026-02-22 | Smooth Gate Functions for Soft Advantage Policy Optimization | Egor Denisov et.al. | 2602.19345 | translate | read | null |
| 2026-02-22 | Soft Sequence Policy Optimization | Svetlana Glazyrina et.al. | 2602.19327 | translate | read | null |
| 2026-02-22 | Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations | Dongming Jiang et.al. | 2602.19320 | translate | read | null |
| 2026-02-22 | A Power Market Model with Hyperscalers and Modular Datacenters | Yihsu Chen et.al. | 2602.19310 | translate | read | null |
| 2026-02-22 | Scaling Inference-Time Computation via Opponent Simulation: Enabling Online Strategic Adaptation in Repeated Negotiation | Xiangyu Liu et.al. | 2602.19309 | translate | read | null |
| 2026-02-22 | The Path to Conversational AI Tutors: Integrating Tutoring Best Practices and Targeted Technologies to Produce Scalable AI Agents | Kirk Vanacore et.al. | 2602.19303 | translate | read | null |
| 2026-02-22 | Automated Generation of Microfluidic Netlists using Large Language Models | Jasper Davidson et.al. | 2602.19297 | translate | read | null |
| 2026-02-22 | Towards Automated Page Object Generation for Web Testing using Large Language Models | Betül Karagöz et.al. | 2602.19294 | translate | read | null |
| 2026-02-22 | Limited Reasoning Space: The cage of long-horizon reasoning in LLMs | Zhenyu Li et.al. | 2602.19281 | translate | read | null |
| 2026-02-22 | ComUICoder: Component-based Reusable UI Code Generation for Complex Websites via Semantic Segmentation and Element-wise Feedback | Jingyu Xiao et.al. | 2602.19276 | translate | read | null |
| 2026-02-22 | KUDA: Knowledge Unlearning by Deviating Representation for Large Language Models | Ce Fang et.al. | 2602.19275 | translate | read | null |
| 2026-02-22 | No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection | Zunkai Dai et.al. | 2602.19248 | translate | read | null |
| 2026-02-22 | Topology of Reasoning: Retrieved Cell Complex-Augmented Generation for Textual Graph Question Answering | Sen Zhao et.al. | 2602.19240 | translate | read | null |
| 2026-02-22 | Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations | Ahmed Karim et.al. | 2602.19239 | translate | read | null |
| 2026-02-22 | Knowledge-aware Visual Question Generation for Remote Sensing Images | Siran Li et.al. | 2602.19224 | translate | read | null |
| 2026-02-22 | Gecko: A Simulation Environment with Stateful Feedback for Refining Agent Tool Calls | Zeyu Zhang et.al. | 2602.19218 | translate | read | null |
| 2026-02-22 | Questions beyond Pixels: Integrating Commonsense Knowledge in Visual Question Generation for Remote Sensing | Siran Li et.al. | 2602.19217 | translate | read | null |
| 2026-02-22 | Statistical Measures for Explainable Aspect-Based Sentiment Analysis: A Case Study on Environmental Discourse in Reddit | Luisa Stracqualursi et.al. | 2602.19216 | translate | read | null |
| 2026-02-22 | How to Allocate, How to Learn? Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization | Yangyi Fang et.al. | 2602.19208 | translate | read | null |
| 2026-02-22 | PositionOCR: Augmenting Positional Awareness in Multi-Modal Models via Hybrid Specialist Integration | Chen Duan et.al. | 2602.19188 | translate | read | null |
| 2026-02-22 | Next Reply Prediction X Dataset: Linguistic Discrepancies in Naively Generated Content | Simon Münker et.al. | 2602.19177 | translate | read | null |
| 2026-02-22 | TurkicNLP: An NLP Toolkit for Turkic Languages | Sherzod Hakimov et.al. | 2602.19174 | translate | read | null |
| 2026-02-22 | Reasoning Capabilities of Large Language Models. Lessons Learned from General Game Playing | Maciej Świechowski et.al. | 2602.19160 | translate | read | null |
| 2026-02-22 | DoAtlas-1: A Causal Compilation Paradigm for Clinical AI | Yulong Li et.al. | 2602.19158 | translate | read | null |
| 2026-02-22 | Facet-Level Persona Control by Trait-Activated Routing with Contrastive SAE for Role-Playing LLMs | Wenqiu Tang et.al. | 2602.19157 | translate | read | null |
| 2026-02-22 | A Dataset for Named Entity Recognition and Relation Extraction from Art-historical Image Descriptions | Stefanie Schneider et.al. | 2602.19133 | translate | read | null |
| 2026-02-22 | K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model | Shiyi Cao et.al. | 2602.19128 | translate | read | null |
| 2026-02-22 | AgenticRAGTracer: A Hop-Aware Benchmark for Diagnosing Multi-Step Retrieval Reasoning in Agentic RAG | Qijie You et.al. | 2602.19127 | translate | read | null |
| 2026-02-22 | Dark and Bright Side of Participatory Red-Teaming with Targets of Stereotyping for Eliciting Harmful Behaviors from Large Language Models | Sieun Kim et.al. | 2602.19124 | translate | read | null |
| 2026-02-22 | How Do LLMs Encode Scientific Quality? An Empirical Study Using Monosemantic Features from Sparse Autoencoders | Michael McCoubrey et.al. | 2602.19115 | translate | read | null |
| 2026-02-22 | Universal 3D Shape Matching via Coarse-to-Fine Language Guidance | Qinfeng Xiao et.al. | 2602.19112 | translate | read | null |
| 2026-02-22 | Astra: Activation-Space Tail-Eigenvector Low-Rank Adaptation of Large Language Models | Kainan Liu et.al. | 2602.19111 | translate | read | null |
| 2026-02-22 | Value Entanglement: Conflation Between Different Kinds of Good In (Some) Large Language Models | Seong Hah Cho et.al. | 2602.19101 | translate | read | null |
| 2026-02-22 | CREM: Compression-Driven Representation Enhancement for Multimodal Retrieval and Comprehension | Lihao Liu et.al. | 2602.19091 | translate | read | null |
| 2026-02-22 | TriTopic: Tri-Modal Graph-Based Topic Modeling with Iterative Refinement and Archetypes | Roman Egger et.al. | 2602.19079 | translate | read | null |
| 2026-02-22 | Evaluation and Benchmarking Suite for Financial Large Language Models and Agents | Shengyuan Lin et.al. | 2602.19073 | translate | read | null |
| 2026-02-22 | IDLM: Inverse-distilled Diffusion Language Models | David Li et.al. | 2602.19066 | translate | read | null |
| 2026-02-22 | Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents | Chanjin Park et.al. | 2602.19065 | translate | read | null |
| 2026-02-22 | Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer | Chenhang Cui et.al. | 2602.19058 | translate | read | null |
| 2026-02-22 | IAPO: Information-Aware Policy Optimization for Token-Efficient Reasoning | Yinhan He et.al. | 2602.19049 | translate | read | null |
| 2026-02-22 | Uncovering Context Reliance in Unstructured Knowledge Editing | Zisheng Zhou et.al. | 2602.19043 | translate | read | null |
| 2026-02-22 | Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning | Jiahao Zhang et.al. | 2602.19041 | translate | read | null |
| 2026-02-05 | Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning | Xuejun Zhang et.al. | 2602.06041 | translate | read | null |
| 2026-02-05 | SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs | Jintao Tong et.al. | 2602.06040 | translate | read | link |
| 2026-02-05 | DyTopo: Dynamic Topology Routing for Multi-Agent Reasoning via Semantic Matching | Yuxing Lu et.al. | 2602.06039 | translate | read | null |
| 2026-02-05 | Thinking with Geometry: Active Geometry Integration for Spatial Reasoning | Haoyuan Li et.al. | 2602.06037 | translate | read | link |
| 2026-02-05 | DFlash: Block Diffusion for Flash Speculative Decoding | Jian Chen et.al. | 2602.06036 | translate | read | link |
| 2026-02-05 | V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval | Dongyang Chen et.al. | 2602.06034 | translate | read | link |
| 2026-02-05 | PhysicsAgentABM: Physics-Guided Generative Agent-Based Modeling | Kavana Venkatesh et.al. | 2602.06030 | translate | read | null |
| 2026-02-05 | Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory | Haozhen Zhang et.al. | 2602.06025 | translate | read | null |
| 2026-02-05 | Correctness-Optimized Residual Activation Lens (CORAL): Transferrable and Calibration-Aware Inference-Time Steering | Miranda Muqing Miao et.al. | 2602.06022 | translate | read | null |
| 2026-02-05 | A Systematic Evaluation of Large Language Models for PTSD Severity Estimation: The Role of Contextual Knowledge and Modeling Strategies | Panagiotis Kaliosis et.al. | 2602.06015 | translate | read | null |
| 2026-02-05 | AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions | Xianyang Liu et.al. | 2602.06008 | translate | read | null |
| 2026-02-05 | VisRefiner: Learning from Visual Differences for Screenshot-to-Code Generation | Jie Deng et.al. | 2602.05998 | translate | read | null |
| 2026-02-05 | DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs | Lizhuo Luo et.al. | 2602.05992 | translate | read | link |
| 2026-02-05 | Layer-wise LoRA fine-tuning: a similarity metric approach | Keith Ando Ogawa et.al. | 2602.05988 | translate | read | null |
| 2026-02-05 | From Human-Human Collaboration to Human-Agent Collaboration: A Vision, Design Philosophy, and an Empirical Framework for Achieving Successful Partnerships Between Humans and LLM Agents | Bingsheng Yao et.al. | 2602.05987 | translate | read | null |
| 2026-02-05 | Inverse Depth Scaling From Most Layers Being Similar | Yizhou Liu et.al. | 2602.05970 | translate | read | null |
| 2026-02-05 | Orthogonal Model Merging | Sihan Yang et.al. | 2602.05943 | translate | read | null |
| 2026-02-05 | Polyglots or Multitudes? Multilingual LLM Answers to Value-laden Multiple-Choice Questions | Léo Labat et.al. | 2602.05932 | translate | read | null |
| 2026-02-05 | Compound Deception in Elite Peer Review: A Failure Mode Taxonomy of 100 Fabricated Citations at NeurIPS 2025 | Samar Ansari et.al. | 2602.05930 | translate | read | null |
| 2026-02-05 | KV-CoRE: Benchmarking Data-Dependent Low-Rank Compressibility of KV-Caches in LLMs | Jian Chen et.al. | 2602.05929 | translate | read | null |
| 2026-02-05 | Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences | Siquan Li et.al. | 2602.05927 | translate | read | null |
| 2026-02-05 | CLIP-Map: Structured Matrix Mapping for Parameter-Efficient CLIP Compression | Kangjie Zhang et.al. | 2602.05909 | translate | read | null |
| 2026-02-05 | Codified Finite-state Machines for Role-playing | Letian Peng et.al. | 2602.05905 | translate | read | null |
| 2026-02-05 | Regularized Calibration with Successive Rounding for Post-Training Quantization | Seohyeon Cha et.al. | 2602.05902 | translate | read | null |
| 2026-02-05 | Stop Rewarding Hallucinated Steps: Faithfulness-Aware Step-Level Reinforcement Learning for Small Reasoning Models | Shuo Nie et.al. | 2602.05897 | translate | read | null |
| 2026-02-05 | When Elo Lies: Hidden Biases in Codeforces-Based Evaluation of Large Language Models | Shenyu Zheng et.al. | 2602.05891 | translate | read | null |
| 2026-02-05 | A Guide to Large Language Models in Modeling and Simulation: From Core Techniques to Critical Challenges | Philippe J. Giabbanelli et.al. | 2602.05883 | translate | read | null |
| 2026-02-05 | EuroLLM-22B: Technical Report | Miguel Moura Ramos et.al. | 2602.05879 | translate | read | null |
| 2026-02-05 | Agent2Agent Threats in Safety-Critical LLM Assistants: A Human-Centric Taxonomy | Lukas Stappen et.al. | 2602.05877 | translate | read | null |
| 2026-02-05 | xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection | Adrián Girón et.al. | 2602.05874 | translate | read | null |
| 2026-02-05 | DLM-Scope: Mechanistic Interpretability of Diffusion Language Models via Sparse Autoencoders | Xu Wang et.al. | 2602.05859 | translate | read | null |
| 2026-02-05 | BABE: Biology Arena BEnchmark | Junting Zhou et.al. | 2602.05857 | translate | read | null |
| 2026-02-05 | “It Talks Like a Patient, But Feels Different”: Co-Designing AI Standardized Patients with Medical Learners | Zhiqi Gao et.al. | 2602.05856 | translate | read | null |
| 2026-02-05 | RRAttention: Dynamic Block Sparse Attention via Per-Head Round-Robin Shifts for Long-Context Inference | Siran Liu et.al. | 2602.05853 | translate | read | null |
| 2026-02-05 | OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions | Fangzhi Xu et.al. | 2602.05843 | translate | read | null |
| 2026-02-05 | Reinforcement World Model Learning for LLM-based Agents | Xiao Yu et.al. | 2602.05842 | translate | read | null |
| 2026-02-05 | Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation | Hai Zhang et.al. | 2602.05827 | translate | read | null |
| 2026-02-05 | Whispers of the Butterfly: A Research-through-Design Exploration of In-Situ Conversational AI Guidance in Large-Scale Outdoor MR Exhibitions | Dongyijie Primo Pan et.al. | 2602.05826 | translate | read | null |
| 2026-02-05 | ToMigo: Interpretable Design Concept Graphs for Aligning Generative AI with Creative Intent | Lena Hegemann et.al. | 2602.05825 | translate | read | null |
| 2026-02-05 | Authorship Drift: How Self-Efficacy and Trust Evolve During LLM-Assisted Writing | Yeon Su Park et.al. | 2602.05819 | translate | read | null |
| 2026-02-05 | TKG-Thinker: Towards Dynamic Reasoning over Temporal Knowledge Graphs via Agentic Reinforcement Learning | Zihao Jiang et.al. | 2602.05818 | translate | read | null |
| 2026-02-05 | Where Does Warm-Up Come From? Adaptive Scheduling for Norm-Constrained Optimizers | Artem Riabinin et.al. | 2602.05813 | translate | read | null |
| 2026-02-05 | NEX: Neuron Explore-Exploit Scoring for Label-Free Chain-of-Thought Selection and Model Ranking | Kang Chen et.al. | 2602.05805 | translate | read | null |
| 2026-02-05 | Task-Oriented Robot-Human Handovers on Legged Manipulators | Andreea Tulbure et.al. | 2602.05760 | translate | read | null |
| 2026-02-05 | Towards Green AI: Decoding the Energy of LLM Inference in Software Development | Lola Solovyeva et.al. | 2602.05712 | translate | read | null |
| 2026-02-05 | Determining Energy Efficiency Sweet Spots in Production LLM Inference | Hiari Pizzini Cavagna et.al. | 2602.05695 | translate | read | null |
| 2026-02-05 | Consensus-Aligned Neuron Efficient Fine-Tuning Large Language Models for Multi-Domain Machine Translation | Shuting Jiang et.al. | 2602.05694 | translate | read | null |
| 2026-02-05 | MedErrBench: A Fine-Grained Multilingual Benchmark for Medical Error Detection and Correction with Clinical Expert Annotations | Congbo Ma et.al. | 2602.05692 | translate | read | null |
| 2026-02-05 | Exploring AI-Augmented Sensemaking of Patient-Generated Health Data: A Mixed-Method Study with Healthcare Professionals in Cardiac Risk Reduction | Pavithren V S Pakianathan et.al. | 2602.05687 | translate | read | null |
| 2026-02-05 | Graph-based Agent Memory: Taxonomy, Techniques, and Applications | Chang Yang et.al. | 2602.05665 | translate | read | null |
| 2026-02-05 | Alignment Verifiability in Large Language Models: Normative Indistinguishability under Behavioral Evaluation | Igor Santos-Grueiro et.al. | 2602.05656 | translate | read | null |
| 2026-02-05 | Generative Ontology: When Structured Knowledge Learns to Create | Benny Cheung et.al. | 2602.05636 | translate | read | null |
| 2026-02-05 | CASTLE: A Comprehensive Benchmark for Evaluating Student-Tailored Personalized Safety in Large Language Models | Rui Jia et.al. | 2602.05633 | translate | read | null |
| 2026-02-05 | Rewards as Labels: Revisiting RLVR from a Classification Perspective | Zepeng Zhai et.al. | 2602.05630 | translate | read | null |
| 2026-02-05 | AI chatbots versus human healthcare professionals: a systematic review and meta-analysis of empathy in patient care | Alastair Howcroft et.al. | 2602.05628 | translate | read | null |
| 2026-02-05 | Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents | Stephen Pilli et.al. | 2602.05597 | translate | read | null |
| 2026-02-05 | Multi-Task GRPO: Reliable LLM Reasoning Across Tasks | Shyam Sundhar Ramesh et.al. | 2602.05547 | translate | read | null |
| 2026-02-05 | Reasoning-guided Collaborative Filtering with Language Models for Explainable Recommendation | Fahad Anwaar et.al. | 2602.05544 | translate | read | null |
| 2026-02-05 | Split Personality Training: Revealing Latent Knowledge Through Alternate Personalities | Florian Dietz et.al. | 2602.05532 | translate | read | null |
| 2026-02-05 | AI Agent Systems for Supply Chains: Structured Decision Prompts and Memory Retrieval | Konosuke Yoshizato et.al. | 2602.05524 | translate | read | null |
| 2026-02-05 | Capture the Flags: Family-Based Evaluation of Agentic LLMs via Semantics-Preserving Transformations | Shahin Honarvar et.al. | 2602.05523 | translate | read | null |
| 2026-02-05 | A Human-in-the-Loop, LLM-Centered Architecture for Knowledge-Graph Question Answering | Larissa Pusch et.al. | 2602.05512 | translate | read | null |
| 2026-02-05 | Relying on LLMs: Student Practices and Instructor Norms are Changing in Computer Science Education | Xinrui Lin et.al. | 2602.05506 | translate | read | null |
| 2026-02-05 | SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration | Hanyu Wei et.al. | 2602.05499 | translate | read | null |
| 2026-02-05 | Transport and Merge: Cross-Architecture Merging for Large Language Models | Chenhang Cui et.al. | 2602.05495 | translate | read | null |
| 2026-02-05 | A Unified Framework for Rethinking Policy Divergence Measures in GRPO | Qingyuan Wu et.al. | 2602.05494 | translate | read | null |
| 2026-02-05 | LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation | Bingru Li et.al. | 2602.05493 | translate | read | null |
| 2026-02-05 | Fine-Tuning Large Language Models for Automatic Detection of Sexually Explicit Content in Spanish-Language Song Lyrics | Dolores Zamacola Sánchez de Lamadrid et.al. | 2602.05485 | translate | read | null |
| 2026-02-05 | Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection | Takashi Koide et.al. | 2602.05484 | translate | read | null |
| 2026-02-05 | LMMRec: LLM-driven Motivation-aware Multimodal Recommendation | Yicheng Di et.al. | 2602.05474 | translate | read | null |
| 2026-02-05 | ALIVE: Awakening LLM Reasoning via Adversarial Learning and Instructive Verbal Evaluation | Yiwen Duan et.al. | 2602.05472 | translate | read | null |
| 2026-02-05 | Can We Classify Flaky Tests Using Only Test Code? An LLM-Based Empirical Study | Alexander Berndt et.al. | 2602.05465 | translate | read | null |
| 2026-02-05 | DistillER: Knowledge Distillation in Entity Resolution with Large Language Models | Alexandros Zeakis et.al. | 2602.05452 | translate | read | null |
| 2026-02-05 | BLITZRANK: Principled Zero-shot Ranking Agents with Tournament Graphs | Sheshansh Agrawal et.al. | 2602.05448 | translate | read | null |
| 2026-02-05 | Structured Context Engineering for File-Native Agentic Systems: Evaluating Schema Accuracy, Format Effectiveness, and Multi-File Navigation at Scale | Damon McMillan et.al. | 2602.05447 | translate | read | null |
| 2026-02-05 | DiLLS: Interactive Diagnosis of LLM-based Multi-agent Systems via Layered Summary of Agent Behaviors | Rui Sheng et.al. | 2602.05446 | translate | read | null |
| 2026-02-05 | Causal Front-Door Adjustment for Robust Jailbreak Attacks on LLMs | Yao Zhou et.al. | 2602.05444 | translate | read | null |
| 2026-02-05 | SciDef: Automating Definition Extraction from Academic Literature with Large Language Models | Filip Kučera et.al. | 2602.05413 | translate | read | null |
| 2026-02-05 | BadTemplate: A Training-Free Backdoor Attack via Chat Template Against Large Language Models | Zihan Wang et.al. | 2602.05401 | translate | read | null |
| 2026-02-05 | OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration | Shaobo Wang et.al. | 2602.05400 | translate | read | null |
| 2026-02-05 | Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better | Ji Zhao et.al. | 2602.05393 | translate | read | null |
| 2026-02-05 | Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening | Zhenxiong Yu et.al. | 2602.05386 | translate | read | null |
| 2026-02-05 | IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models | Tao Liu et.al. | 2602.05385 | translate | read | null |
| 2026-02-05 | Clinical Validation of Medical-based Large Language Model Chatbots on Ophthalmic Patient Queries with LLM-based Evaluation | Ting Fang Tan et.al. | 2602.05381 | translate | read | null |
| 2026-02-05 | Cross-Lingual Empirical Evaluation of Large Language Models for Arabic Medical Tasks | Chaimae Abouzahir et.al. | 2602.05374 | translate | read | null |
| 2026-02-05 | Speech-XL: Towards Long-Form Speech Understanding in Large Speech Language Models | Haoqin Sun et.al. | 2602.05373 | translate | read | null |
| 2026-02-05 | PACE: Defying the Scaling Hypothesis of Exploration in Iterative Alignment for Mathematical Reasoning | Jun Rao et.al. | 2602.05370 | translate | read | null |
| 2026-02-05 | RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs | Youngcheon You et.al. | 2602.05367 | translate | read | null |
| 2026-02-05 | Multi-Field Tool Retrieval | Yichen Tang et.al. | 2602.05366 | translate | read | null |
| 2026-02-05 | Multimodal Latent Reasoning via Hierarchical Visual Cues Injection | Yiming Zhang et.al. | 2602.05359 | translate | read | null |
| 2026-02-05 | AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction | Ruijie Shi et.al. | 2602.05353 | translate | read | null |
| 2026-02-05 | SynAT: Enhancing Security Knowledge Bases via Automatic Synthesizing Attack Tree from Crowd Discussions | Ziyou Jiang et.al. | 2602.05329 | translate | read | null |
| 2026-02-05 | ProAct: Agentic Lookahead in Interactive Environments | Yangbin Yu et.al. | 2602.05327 | translate | read | null |
| 2026-02-05 | ORACL: Optimized Reasoning for Autoscaling via Chain of Thought with LLMs for Microservices | Haoyu Bai et.al. | 2602.05292 | translate | read | null |
| 2026-02-05 | Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science | Jingru Fan et.al. | 2602.05289 | translate | read | null |
| 2026-02-05 | Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities | Pengyi Li et.al. | 2602.05281 | translate | read | null |
| 2026-02-05 | Hallucination-Resistant Security Planning with a Large Language Model | Kim Hammar et.al. | 2602.05279 | translate | read | null |
| 2026-02-05 | Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs | Qi Li et.al. | 2602.05275 | translate | read | null |
| 2026-02-05 | PatchGuru: Patch Oracle Inference from Natural Language Artifacts with Large Language Models | Thanh Le-Cong et.al. | 2602.05270 | translate | read | null |
| 2026-02-05 | Hybrid Gated Flow (HGF): Stabilizing 1.58-bit LLMs via Selective Low-Rank Correction | David Alejandro Trejo Pizzo et.al. | 2602.05269 | translate | read | null |
| 2026-02-05 | Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR | Fanfan Liu et.al. | 2602.05261 | translate | read | null |
| 2026-02-05 | CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs | Haoran Li et.al. | 2602.05258 | translate | read | null |
| 2026-02-05 | EGSS: Entropy-guided Stepwise Scaling for Reliable Software Engineering | Chenhui Mao et.al. | 2602.05242 | translate | read | null |
| 2026-02-05 | FedMosaic: Federated Retrieval-Augmented Generation via Parametric Adapters | Zhilin Liang et.al. | 2602.05235 | translate | read | null |
| 2026-02-05 | Surgery: Mitigating Harmful Fine-Tuning for Large Language Models via Attention Sink | Guozhi Liu et.al. | 2602.05228 | translate | read | null |
| 2026-02-05 | E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching | Jiahao Nie et.al. | 2602.05215 | translate | read | null |
| 2026-02-05 | Aligning Large Language Model Behavior with Human Citation Preferences | Kenichiro Ando et.al. | 2602.05205 | translate | read | null |
| 2026-02-05 | Double-P: Hierarchical Top-P Sparse Attention for Long-Context LLMs | Wentao Ni et.al. | 2602.05191 | translate | read | null |
| 2026-02-05 | Are Open-Weight LLMs Ready for Social Media Moderation? A Comparative Study on Bluesky | Hsuan-Yu Chou et.al. | 2602.05189 | translate | read | null |
| 2026-02-05 | Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning | John Yan et.al. | 2602.05183 | translate | read | null |
| 2026-02-05 | EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization | Kevin Han et.al. | 2602.05165 | translate | read | null |
| 2026-02-05 | GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek | Yang Zhang et.al. | 2602.05150 | translate | read | null |
| 2026-02-05 | CoSA: Compressed Sensing-Based Adaptation of Large Language Models | Songtao Wei et.al. | 2602.05148 | translate | read | null |
| 2026-02-04 | HugRAG: Hierarchical Causal Knowledge Graph Design for RAG | Nengbo Wang et.al. | 2602.05143 | translate | read | null |
| 2026-02-04 | SemPipes – Optimizable Semantic Data Operators for Tabular Machine Learning Pipelines | Olga Ovcharenko et.al. | 2602.05134 | translate | read | null |
| 2026-02-04 | SocialVeil: Probing Social Intelligence of Language Agents under Communication Barriers | Keyang Xuan et.al. | 2602.05115 | translate | read | null |
| 2026-02-04 | Understanding LLM Evaluator Behavior: A Structured Multi-Evaluator Framework for Merchant Risk Assessment | Liang Wang et.al. | 2602.05110 | translate | read | null |
| 2026-02-04 | GAMMS: Graph based Adversarial Multiagent Modeling Simulator | Rohan Patil et.al. | 2602.05105 | translate | read | null |
| 2026-02-04 | VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health | Kate H. Bentley et.al. | 2602.05088 | translate | read | null |
| 2026-02-04 | Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents | Changdae Oh et.al. | 2602.05073 | translate | read | null |
| 2026-02-04 | Evaluating Large Language Models on Solved and Unsolved Problems in Graph Theory: Implications for Computing Education | Adithya Kulkarni et.al. | 2602.05059 | translate | read | null |
| 2026-02-04 | DeepRead: Document Structure-Aware Reasoning to Enhance Agentic Search | Zhanli Li et.al. | 2602.05014 | translate | read | null |
| 2026-02-04 | Private PoEtry: Private In-Context Learning via Product of Experts | Rob Romijnders et.al. | 2602.05012 | translate | read | null |
| 2026-02-04 | CoWork-X: Experience-Optimized Co-Evolution for Multi-Agent Collaboration System | Zexin Lin et.al. | 2602.05004 | translate | read | null |
| 2026-02-04 | Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning | Yu-Ang Lee et.al. | 2602.04998 | translate | read | null |
| 2026-02-04 | BioACE: An Automated Framework for Biomedical Answer and Citation Evaluations | Deepak Gupta et.al. | 2602.04982 | translate | read | null |
| 2026-02-04 | Learning Context Matters: Measuring and Diagnosing Personalization Gaps in LLM-Based Instructional Design | Johaun Hatchett et.al. | 2602.04972 | translate | read | null |
| 2026-02-04 | Large Language Models in Software Documentation and Modeling: A Literature Review and Findings | Lukas Radosky et.al. | 2602.04938 | translate | read | null |
| 2026-02-04 | Linear Model Merging Unlocks Simple and Scalable Multimodal Data Mixture Optimization | Davide Berasi et.al. | 2602.04937 | translate | read | null |
| 2026-02-04 | Depth-Wise Emergence of Prediction-Centric Geometry in Large Language Models | Shahar Haim et.al. | 2602.04931 | translate | read | null |
| 2026-02-04 | TurboBoA: Faster and Exact Attention-aware Quantization without Backpropagation | Junhan Kim et.al. | 2602.04929 | translate | read | link |
| 2026-02-04 | PriMod4AI: Lifecycle-Aware Privacy Threat Modeling for AI Systems using LLM | Gautam Savaliya et.al. | 2602.04927 | translate | read | null |
| 2026-02-04 | Knowing When to Answer: Adaptive Confidence Refinement for Reliable Audio-Visual Question Answering | Dinh Phu Tran et.al. | 2602.04924 | translate | read | null |
| 2026-02-04 | Gradually Compacting Large Language Models for Reasoning Like a Boiling Frog | Yiran Zhao et.al. | 2602.04919 | translate | read | null |
| 2026-02-04 | Simulated Adoption: Decoupling Magnitude and Direction in LLM In-Context Conflict Resolution | Long Zhang et.al. | 2602.04918 | translate | read | null |
| 2026-02-04 | AFD-INSTRUCTION: A Comprehensive Antibody Instruction Dataset with Functional Annotations for LLM-Based Understanding and Design | Ling Luo et.al. | 2602.04916 | translate | read | null |
| 2026-02-04 | From Literature to Lab: Closed-Loop Advancement of Perovskite Solar Cells via Domain Knowledge Guided LLM | Penglei Sun et.al. | 2602.04914 | translate | read | null |
| 2026-02-04 | A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model | Xiaolin Hu et.al. | 2602.04913 | translate | read | null |
| 2026-02-04 | Reducing the Costs of Proof Synthesis on Rust Systems by Scaling Up a Seed Training Set | Nongyu Di et.al. | 2602.04910 | translate | read | null |
| 2026-02-04 | Learning Where It Matters: Geometric Anchoring for Robust Preference Alignment | Youngjae Cho et.al. | 2602.04909 | translate | read | null |
| 2026-02-03 | Evaluating Kubernetes Performance for GenAI Inference: From Automatic Speech Recognition to LLM Summarization | Sai Sindhur Malleni et.al. | 2602.04900 | translate | read | null |
| 2026-02-03 | Steering Externalities: Benign Activation Steering Unintentionally Increases Jailbreak Risk for Large Language Models | Chen Xiong et.al. | 2602.04896 | translate | read | null |
| 2026-02-04 | Reinforced Attention Learning | Bangzheng Li et.al. | 2602.04884 | translate | read | null |
| 2026-02-04 | Rethinking the Trust Region in LLM Reinforcement Learning | Penghui Qi et.al. | 2602.04879 | translate | read | null |
| 2026-02-04 | Multi-Head LatentMoE and Head Parallel: Communication-Efficient and Deterministic MoE Parallelism | Chenwei Cui et.al. | 2602.04870 | translate | read | null |
| 2026-02-04 | Subliminal Effects in Your Data: A General Mechanism via Log-Linearity | Ishaq Aden-Ali et.al. | 2602.04863 | translate | read | null |
| 2026-02-04 | CoT is Not the Chain of Truth: An Empirical Internal Analysis of Reasoning LLMs for Fake News Generation | Zhao Tong et.al. | 2602.04856 | translate | read | null |
| 2026-02-04 | Decomposed Prompting Does Not Fix Knowledge Gaps, But Helps Models Say “I Don’t Know” | Dhruv Madhwal et.al. | 2602.04853 | translate | read | null |
| 2026-02-04 | Horizon-LM: A RAM-Centric Architecture for LLM Training | Zhengqing Yuan et.al. | 2602.04816 | translate | read | link |
| 2026-02-04 | Agentic AI in Healthcare & Medicine: A Seven-Dimensional Taxonomy for Empirical Evaluation of LLM-based Agents | Shubham Vatsal et.al. | 2602.04813 | translate | read | null |
| 2026-02-04 | OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models | Yue Ding et.al. | 2602.04804 | translate | read | null |
| 2026-02-04 | Team, Then Trim: An Assembly-Line LLM Framework for High-Quality Tabular Data Generation | Congjing Zhang et.al. | 2602.04785 | translate | read | null |
| 2026-02-04 | NeuroCanvas: VLLM-Powered Robust Seizure Detection by Reformulating Multichannel EEG as Image | Yan Chen et.al. | 2602.04769 | translate | read | null |
| 2026-02-04 | Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation | Luis Frentzen Salim et.al. | 2602.04764 | translate | read | null |
| 2026-02-04 | When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond? | Xinyu Zhou et.al. | 2602.04755 | translate | read | null |
| 2026-02-04 | Decomposing Query-Key Feature Interactions Using Contrastive Covariances | Andrew Lee et.al. | 2602.04752 | translate | read | null |
| 2026-02-04 | Exploiting contextual information to improve stance detection in informal political discourse with LLMs | Arman Engin Sucu et.al. | 2602.04750 | translate | read | null |
| 2026-02-04 | Inference-Time Reasoning Selectively Reduces Implicit Social Bias in Large Language Models | Molly Apsel et.al. | 2602.04742 | translate | read | null |
| 2026-02-04 | Alignment Drift in Multimodal LLMs: A Two-Phase, Longitudinal Evaluation of Harm Across Eight Model Releases | Casey Ford et.al. | 2602.04739 | translate | read | null |
| 2026-02-04 | From Data to Behavior: Predicting Unintended Model Behaviors Before Training | Mengru Wang et.al. | 2602.04735 | translate | read | link |
| 2026-02-04 | Less Finetuning, Better Retrieval: Rethinking LLM Adaptation for Biomedical Retrievers via Synthetic Data and Model Merging | Sameh Khattab et.al. | 2602.04731 | translate | read | null |
| 2026-02-04 | “Be My Cheese?”: Cultural Nuance Benchmarking for Machine Translation in Multilingual LLMs | Madison Van Doren et.al. | 2602.04729 | translate | read | null |
| 2026-02-04 | Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation | Marian Kica et.al. | 2602.04726 | translate | read | null |
| 2026-02-04 | SAR-RAG: ATR Visual Question Answering by Semantic Search, Retrieval, and MLLM Generation | David F. Ramirez et.al. | 2602.04712 | translate | read | null |
| 2026-02-04 | LinGO: A Linguistic Graph Optimization Framework with LLMs for Interpreting Intents of Online Uncivil Discourse | Yuan Zhang et.al. | 2602.04693 | translate | read | null |
| 2026-02-04 | UniAudio 2.0: A Unified Audio Language Model with Text-Aligned Factorized Audio Tokenization | Dongchao Yang et.al. | 2602.04683 | translate | read | link |
| 2026-02-04 | Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility | Eun Cheol Choi et.al. | 2602.04674 | translate | read | null |
| 2026-02-04 | Relational Scene Graphs for Object Grounding of Natural Language Commands | Julia Kuhn et.al. | 2602.04635 | translate | read | null |
| 2026-02-04 | WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning | Zelai Xu et.al. | 2602.04634 | translate | read | link |
| 2026-02-04 | Disentangling meaning from language in LLM-based machine translation | Théo Lasnier et.al. | 2602.04613 | translate | read | null |
| 2026-02-04 | Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection | Junhao Liu et.al. | 2602.04607 | translate | read | null |
| 2026-02-04 | Automated Extraction of Multicomponent Alloy Data Using Large Language Models for Sustainable Design | Aravindan Kamatchi Sundaram et.al. | 2602.04602 | translate | read | null |
| 2026-02-04 | Harmonia: Algorithm-Hardware Co-Design for Memory- and Compute-Efficient BFP-based LLM Inference | Xinyu Wang et.al. | 2602.04595 | translate | read | null |
| 2026-02-04 | AIANO: Enhancing Information Retrieval with AI-Augmented Annotation | Sameh Khattab et.al. | 2602.04579 | translate | read | null |
| 2026-02-04 | Semantic Self-Distillation for Language Model Uncertainty | Edward Phillips et.al. | 2602.04577 | translate | read | null |
| 2026-02-04 | Can LLMs capture stable human-generated sentence entropy measures? | Estrella Pivel-Villanueva et.al. | 2602.04570 | translate | read | null |
| 2026-02-04 | LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding | Gang Lin et.al. | 2602.04541 | translate | read | null |
| 2026-02-04 | HoliAntiSpoof: Audio LLM for Holistic Speech Anti-Spoofing | Xuenan Xu et.al. | 2602.04535 | translate | read | null |
| 2026-02-04 | Landscape-aware Automated Algorithm Design: An Efficient Framework for Real-world Optimization | Haoran Yin et.al. | 2602.04529 | translate | read | null |
| 2026-02-04 | OSCAgent: Accelerating the Discovery of Organic Solar Cells with LLM Agents | Zhaolin Hu et.al. | 2602.04510 | translate | read | null |
| 2026-02-04 | Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models | Hyeontaek Hwang et.al. | 2602.04509 | translate | read | null |
| 2026-02-04 | ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control | Zhentao Tang et.al. | 2602.04496 | translate | read | null |
| 2026-02-04 | PersoDPO: Scalable Preference Optimization for Instruction-Adherent, Persona-Grounded Dialogue via Multi-LLM Evaluation | Saleh Afzoon et.al. | 2602.04493 | translate | read | null |
| 2026-02-04 | The Supportiveness-Safety Tradeoff in LLM Well-Being Agents | Himanshi Lalwani et.al. | 2602.04487 | translate | read | null |
| 2026-02-04 | Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition | Jinlong Ma et.al. | 2602.04486 | translate | read | null |
| 2026-02-04 | Vision-aligned Latent Reasoning for Multi-modal Large Language Model | Byungwoo Jeon et.al. | 2602.04476 | translate | read | null |
| 2026-02-04 | LLM-Empowered Cooperative Content Caching in Vehicular Fog Caching-Assisted Platoon Networks | Bowen Tan et.al. | 2602.04471 | translate | read | null |
| 2026-02-04 | DOS: Dual-Flow Orthogonal Semantic IDs for Recommendation in Meituan | Junwei Yin et.al. | 2602.04460 | translate | read | null |
| 2026-02-04 | Growth First, Care Second? Tracing the Landscape of LLM Value Preferences in Everyday Dilemmas | Zhiyi Chen et.al. | 2602.04456 | translate | read | null |
| 2026-02-04 | Seg-ReSearch: Segmentation with Interleaved Reasoning and External Search | Tianming Liang et.al. | 2602.04454 | translate | read | link |
| 2026-02-04 | SDR-CIR: Semantic Debias Retrieval Framework for Training-Free Zero-Shot Composed Image Retrieval | Yi Sun et.al. | 2602.04451 | translate | read | null |
| 2026-02-04 | What’s in a Benchmark? The Case of SWE-Bench in Automated Program Repair | Matias Martinez et.al. | 2602.04449 | translate | read | null |
| 2026-02-04 | Fine-Grained Activation Steering: Steering Less, Achieving More | Zijian Feng et.al. | 2602.04428 | translate | read | null |
| 2026-02-04 | Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning | Heqing Yang et.al. | 2602.04419 | translate | read | null |
| 2026-02-04 | EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL | Lunjun Zhang et.al. | 2602.04417 | translate | read | null |
| 2026-02-04 | History-Guided Iterative Visual Reasoning with Self-Correction | Xinglong Yang et.al. | 2602.04413 | translate | read | null |
| 2026-02-04 | Bi-directional Bias Attribution: Debiasing Large Language Models without Modifying Prompts | Yujie Lin et.al. | 2602.04398 | translate | read | null |
| 2026-02-04 | Evaluating the Presence of Sex Bias in Clinical Reasoning by Large Language Models | Isabel Tsintsiper et.al. | 2602.04392 | translate | read | null |
| 2026-02-04 | Beyond Rejection Sampling: Trajectory Fusion for Scaling Mathematical Reasoning | Jie Deng et.al. | 2602.04391 | translate | read | null |
| 2026-02-04 | On the use of LLMs to generate a dataset of Neural Networks | Nadia Daoudi et.al. | 2602.04388 | translate | read | null |
| 2026-02-04 | Multi-scale hypergraph meets LLMs: Aligning large language models for time series analysis | Zongjiang Shang et.al. | 2602.04369 | translate | read | null |
| 2026-02-04 | EXaMCaP: Subset Selection with Entropy Gain Maximization for Probing Capability Gains of Large Chart Understanding Training Sets | Jiapeng Liu et.al. | 2602.04365 | translate | read | null |
| 2026-02-04 | Generative AI in Systems Engineering: A Framework for Risk Assessment of Large Language Models | Stefan Otten et.al. | 2602.04358 | translate | read | null |
| 2026-02-04 | Can Vision Replace Text in Working Memory? Evidence from Spatial n-Back in Vision-Language Models | Sichu Liang et.al. | 2602.04355 | translate | read | null |
| 2026-02-04 | UnMaskFork: Test-Time Scaling for Masked Diffusion via Deterministic Action Branching | Kou Misaki et.al. | 2602.04344 | translate | read | null |
| 2026-02-04 | From Assumptions to Actions: Turning LLM Reasoning into Uncertainty-Aware Planning for Embodied Agents | SeungWon Seo et.al. | 2602.04326 | translate | read | null |
| 2026-02-04 | A Domain-Specific Curated Benchmark for Entity and Document-Level Relation Extraction | Marco Martinelli et.al. | 2602.04320 | translate | read | null |
| 2026-02-04 | DeFrame: Debiasing Large Language Models Against Framing Effects | Kahee Lim et.al. | 2602.04306 | translate | read | null |
| 2026-02-04 | Revisiting Prompt Sensitivity in Large Language Models for Text Classification: The Role of Prompt Underspecification | Branislav Pecher et.al. | 2602.04297 | translate | read | null |
| 2026-02-04 | ProxyWar: Dynamic Assessment of LLM Code Generation in Game Arenas | Wenjun Peng et.al. | 2602.04296 | translate | read | link |
| 2026-02-04 | How Few-shot Demonstrations Affect Prompt-based Defenses Against LLM Jailbreak Attacks | Yanshu Wang et.al. | 2602.04294 | translate | read | null |
| 2026-02-04 | Disentangling Causal Importance from Emergent Structure in Multi-Expert Orchestration | Sudipto Ghosh et.al. | 2602.04291 | translate | read | null |
| 2026-02-04 | Guided Verifier: Collaborative Multimodal Reasoning via Dynamic Process Supervision | Lingzhuang Sun et.al. | 2602.04290 | translate | read | null |
| 2026-02-04 | Contextual Drag: How Errors in the Context Affect LLM Reasoning | Yun Cheng et.al. | 2602.04288 | translate | read | null |
| 2026-02-04 | ECG-R1: Protocol-Guided and Modality-Agnostic MLLM for Reliable ECG Interpretation | Jiarui Jin et.al. | 2602.04279 | translate | read | link |
| 2026-02-04 | MiniRec: Data-Efficient Reinforcement Learning for LLM-based Recommendation | Lin Wang et.al. | 2602.04278 | translate | read | null |
| 2026-02-04 | KVSmooth: Mitigating Hallucination in Multi-modal Large Language Models through Key-Value Smoothing | Siyu Jiang et.al. | 2602.04268 | translate | read | null |
| 2026-02-04 | Thickening-to-Thinning: Reward Shaping via Human-Inspired Learning Dynamics for LLM Reasoning | Wenze Lin et.al. | 2602.04265 | translate | read | null |
| 2026-02-04 | Data Agents: Levels, State of the Art, and Open Problems | Yuyu Luo et.al. | 2602.04261 | translate | read | null |
| 2026-02-04 | Scaling Agentic Verifier for Competitive Coding | Zeyao Ma et.al. | 2602.04254 | translate | read | null |
| 2026-02-04 | Empirical-MCTS: Continuous Agent Evolution via Dual-Experience Monte Carlo Tree Search | Hao Lu et.al. | 2602.04248 | translate | read | null |
| 2026-02-04 | CoLT: Reasoning with Chain of Latent Tool Calls | Fangwei Zhu et.al. | 2602.04246 | translate | read | null |
| 2026-02-04 | On the Uncertainty of Large Language Model-Based Multi-Agent Systems | Yuxuan Zhao et.al. | 2602.04234 | translate | read | null |
| 2026-02-04 | Following the TRAIL: Predicting and Explaining Tomorrow’s Hits with a Fine-Tuned LLM | Yinan Zhang et.al. | 2602.04225 | translate | read | null |
| 2026-02-04 | Language Models Struggle to Use Representations Learned In-Context | Michael A. Lepori et.al. | 2602.04212 | translate | read | null |
| 2026-02-04 | Steering LLMs via Scalable Interactive Oversight | Enyu Zhou et.al. | 2602.04210 | translate | read | null |
| 2026-02-04 | Enforcing Monotonic Progress in Legal Cross-Examination: Preventing Long-Horizon Stagnation in LLM-Based Inquiry | Hsien-Jyh Liao et.al. | 2602.04206 | translate | read | null |
| 2026-02-04 | Semantic Consensus Decoding: Backdoor Defense for Verilog Code Generation | Guang Yang et.al. | 2602.04195 | translate | read | null |
| 2026-02-04 | SOGPTSpotter: Detecting ChatGPT-Generated Answers on Stack Overflow | Suyu Ma et.al. | 2602.04185 | translate | read | null |
| 2026-02-04 | I Can’t Believe It’s Not a Valid Exploit | Derin Gezgin et.al. | 2602.04165 | translate | read | null |
| 2026-02-04 | BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models | Junyu Chen et.al. | 2602.04163 | translate | read | null |
| 2026-02-04 | Paint by Odor: An Exploration of Odor Visualization through Large Language Model and Generative AI | Gang Yu et.al. | 2602.04159 | translate | read | null |
| 2026-02-04 | A Modern System Recipe for Situated Embodied Human-Robot Conversation with Real-Time Multimodal LLMs and Tool-Calling | Dong Won Lee et.al. | 2602.04157 | translate | read | null |
| 2026-02-04 | JSynFlow: Japanese Synthesised Flowchart Visual Question Answering Dataset built with Large Language Models | Hiroshi Sasaki et.al. | 2602.04142 | translate | read | null |
| 2026-02-04 | Semantic Pilot Design for Data-Aided Channel Estimation Using a Large Language Model | Sojeong Park et.al. | 2602.04126 | translate | read | null |
| 2026-02-04 | Making Videos Accessible for Blind and Low Vision Users Using a Multimodal Agent Video Player | Adriana Olmos et.al. | 2602.04104 | translate | read | null |
| 2026-02-04 | Rethinking Perplexity: Revealing the Impact of Input Length on Perplexity Evaluation in LLMs | Letian Cheng et.al. | 2602.04099 | translate | read | null |
| 2026-02-03 | Scaling In-Context Online Learning Capability of LLMs via Cross-Episode Meta-RL | Xiaofeng Lin et.al. | 2602.04089 | translate | read | null |
| 2026-02-03 | Abstraction Induces the Brain Alignment of Language and Speech Models | Emily Cheng et.al. | 2602.04081 | translate | read | null |
| 2026-02-03 | Stroke Lesions as a Rosetta Stone for Language Model Interpretability | Julius Fridriksson et.al. | 2602.04074 | translate | read | null |
| 2026-02-03 | Data Verification is the Future of Quantum Computing Copilots | Junhao Song et.al. | 2602.04072 | translate | read | null |
| 2026-02-03 | Exploring the Potential of Large Language Models in Simulink-Stateflow Mutant Generation | Pablo Valle et.al. | 2602.04066 | translate | read | null |
| 2026-02-03 | The CitizenQuery Benchmark: A Novel Dataset and Evaluation Pipeline for Measuring LLM Performance in Citizen Query Tasks | Neil Majithia et.al. | 2602.04064 | translate | read | null |
| 2026-02-03 | RareCollab – An Agentic System Diagnosing Mendelian Disorders with Integrated Phenotypic and Molecular Evidence | Guantong Qi et.al. | 2602.04058 | translate | read | null |
| 2026-02-03 | Evaluating the Vulnerability Landscape of LLM-Generated Smart Contracts | Hoang Long Do et.al. | 2602.04039 | translate | read | null |
| 2026-02-03 | On the Credibility of Evaluating LLMs using Survey Questions | Jindřich Libovický et.al. | 2602.04033 | translate | read | null |
| 2026-02-03 | Understanding and Guiding Layer Placement in Parameter-Efficient Fine-Tuning of Large Language Models | Yichen Xu et.al. | 2602.04019 | translate | read | null |
| 2026-02-03 | Chaplains’ Reflections on the Design and Usage of AI for Conversational Care | Joel Wester et.al. | 2602.04017 | translate | read | null |
| 2026-02-03 | PromptSplit: Revealing Prompt-Level Disagreement in Generative Models | Mehdi Lotfian et.al. | 2602.04009 | translate | read | null |
| 2026-02-03 | StraTyper: Automated Semantic Type Discovery and Multi-Type Annotation for Dataset Collections | Christos Koutras et.al. | 2602.04004 | translate | read | null |
| 2026-02-03 | When AI Persuades: Adversarial Explanation Attacks on Human Trust in AI-Assisted Decision Making | Shutong Fan et.al. | 2602.04003 | translate | read | null |
| 2026-02-03 | After Talking with 1,000 Personas: Learning Preference-Aligned Proactive Assistants From Large-Scale Persona Interactions | Ziyi Xuan et.al. | 2602.04000 | translate | read | null |
| 2026-02-03 | When Chains of Thought Don’t Matter: Causal Bypass in Large Language Models | Anish Sathyanarayanan et.al. | 2602.03994 | translate | read | null |
| 2026-02-03 | Likelihood-Based Reward Designs for General LLM Reasoning | Ariel Kwiatkowski et.al. | 2602.03979 | translate | read | null |
| 2026-02-03 | Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure | Shuhui Qu et.al. | 2602.03975 | translate | read | null |
| 2026-02-03 | Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem | Shama Magnur et.al. | 2602.03969 | translate | read | null |
| 2026-02-03 | Automatic Classification of Pedagogical Materials against CS Curriculum Guidelines | Erik Saule et.al. | 2602.03962 | translate | read | null |
| 2026-02-03 | AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent | Yinyi Luo et.al. | 2602.03955 | translate | read | link |
| 2026-02-03 | SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild? | Azmine Toushik Wasi et.al. | 2602.03916 | translate | read | link |
| 2026-02-03 | Knowledge Model Prompting Increases LLM Performance on Planning Tasks | Erik Goh et.al. | 2602.03900 | translate | read | null |
| 2026-02-03 | Audit After Segmentation: Reference-Free Mask Quality Assessment for Language-Referred Audio-Visual Segmentation | Jinxing Zhou et.al. | 2602.03892 | translate | read | null |
| 2026-02-03 | 4DPC$^2$hat: Towards Dynamic Point Cloud Understanding with Failure-Aware Bootstrapping | Xindan Zhang et.al. | 2602.03890 | translate | read | null |
| 2026-02-03 | Understanding and Exploiting Weight Update Sparsity for Communication-Efficient Distributed RL | Erfan Miahi et.al. | 2602.03839 | translate | read | null |
| 2026-02-03 | Accelerating Scientific Research with Gemini: Case Studies and Common Techniques | David P. Woodruff et.al. | 2602.03837 | translate | read | null |
| 2026-02-03 | Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning | Dingkun Zhang et.al. | 2602.03815 | translate | read | null |
| 2026-02-03 | Conformal Thinking: Risk Control for Reasoning on a Compute Budget | Xi Wang et.al. | 2602.03814 | translate | read | null |
| 2026-02-03 | Antidistillation Fingerprinting | Yixuan Even Xu et.al. | 2602.03812 | translate | read | null |
| 2026-02-03 | Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation | Ziru Chen et.al. | 2602.03806 | translate | read | link |
| 2026-02-03 | Context Compression via Explicit Information Transmission | Jiangnan Ye et.al. | 2602.03784 | translate | read | null |
| 2026-02-03 | Efficient Estimation of Kernel Surrogate Models for Task Attribution | Zhenshuo Zhang et.al. | 2602.03783 | translate | read | null |
| 2026-02-03 | QVLA: Not All Channels Are Equal in Vision-Language-Action Model’s Quantization | Yuhao Xu et.al. | 2602.03782 | translate | read | null |
| 2026-02-03 | A Scene Graph Backed Approach to Open Set Semantic Mapping | Martin Günther et.al. | 2602.03781 | translate | read | null |
| 2026-02-03 | An Empirical Study of Collective Behaviors and Social Dynamics in Large Language Model Agents | Farnoosh Hashemi et.al. | 2602.03775 | translate | read | null |
| 2026-02-03 | Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL | Ian Wu et.al. | 2602.03773 | translate | read | null |
| 2026-02-03 | UniGeM: Unifying Data Mixing and Selection via Geometric Exploration and Mining | Changhao Wang et.al. | 2602.03772 | translate | read | null |
| 2026-02-03 | Training Multi-Turn Search Agent via Contrastive Dynamic Branch Sampling | Yubao Zhao et.al. | 2602.03719 | translate | read | null |
| 2026-02-03 | SWE-Refactor: A Repository-Level Benchmark for Real-World LLM-Based Code Refactoring | Yisen Xu et.al. | 2602.03712 | translate | read | null |
| 2026-02-03 | No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding | Vynska Amalia Permadi et.al. | 2602.03709 | translate | read | null |
| 2026-02-03 | Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States | Ximing Dong et.al. | 2602.03708 | translate | read | null |
| 2026-02-03 | Cognitively Diverse Multiple-Choice Question Generation: A Hybrid Multi-Agent Framework with Large Language Models | Yu Tian et.al. | 2602.03704 | translate | read | null |
| 2026-02-03 | Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging | Alexandru Meterez et.al. | 2602.03702 | translate | read | null |
| 2026-02-03 | Conflict-Resolving and Sharpness-Aware Minimization for Generalized Knowledge Editing with Multiple Updates | Duy Nguyen et.al. | 2602.03696 | translate | read | null |
| 2026-02-03 | LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization | Zishi Zhang et.al. | 2602.03690 | translate | read | null |
| 2026-02-03 | Universal One-third Time Scaling in Learning Peaked Distributions | Yizhou Liu et.al. | 2602.03685 | translate | read | null |
| 2026-02-03 | Instruction Anchors: Dissecting the Causal Dynamics of Modality Arbitration | Yu Zhang et.al. | 2602.03677 | translate | read | null |
| 2026-02-03 | Mitigating Conversational Inertia in Multi-Turn Agents | Yang Wan et.al. | 2602.03664 | translate | read | null |
| 2026-02-03 | Reinforcement Fine-Tuning for History-Aware Dense Retriever in RAG | Yicheng Zhang et.al. | 2602.03645 | translate | read | null |
| 2026-02-03 | TRE: Encouraging Exploration in the Trust Region | Chao Huang et.al. | 2602.03635 | translate | read | link |
| 2026-02-03 | Can LLMs Do Rocket Science? Exploring the Limits of Complex Reasoning with GTOC 12 | Iñaki del Campo et.al. | 2602.03630 | translate | read | null |
| 2026-02-03 | Toward a new AI winter? How diffusion of technological innovation on networks leads to chaotic boom-bust cycles | Sabin Roman et.al. | 2602.03620 | translate | read | null |
| 2026-02-03 | Controlling Output Rankings in Generative Engines for LLM-based Search | Haibo Jin et.al. | 2602.03608 | translate | read | null |
| 2026-02-03 | Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation | Haichao Jiang et.al. | 2602.03595 | translate | read | null |
| 2026-02-03 | SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM | Ming Nie et.al. | 2602.03589 | translate | read | null |
| 2026-02-03 | $V_0$: A Generalist Value Model for Any Policy at State Zero | Yi-Kai Zhang et.al. | 2602.03584 | translate | read | null |
| 2026-02-03 | Don’t believe everything you read: Understanding and Measuring MCP Behavior under Misleading Tool Descriptions | Zhihao Li et.al. | 2602.03580 | translate | read | null |
| 2026-02-03 | Use Graph When It Needs: Efficiently and Adaptively Integrating Retrieval-Augmented Generation with Graphs | Su Dong et.al. | 2602.03578 | translate | read | null |
| 2026-02-03 | EHRWorld: A Patient-Centric Medical World Model for Long-Horizon Clinical Trajectories | Linjie Mu et.al. | 2602.03569 | translate | read | null |
| 2026-02-03 | CoGenCast: A Coupled Autoregressive-Flow Generative Framework for Time Series Forecasting | Yaguo Liu et.al. | 2602.03564 | translate | read | null |
| 2026-02-03 | Scaling Test-Driven Code Generation from Functions to Classes: An Empirical Study | Yunhao Liang et.al. | 2602.03557 | translate | read | null |
| 2026-02-03 | When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs | Bogdan Zagribelnyy et.al. | 2602.03554 | translate | read | null |
| 2026-02-03 | Assessing the Impact of Typological Features on Multilingual Machine Translation in the Age of Large Language Models | Vitalii Hirak et.al. | 2602.03551 | translate | read | null |
| 2026-02-03 | SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue | Yuqin Dai et.al. | 2602.03548 | translate | read | link |
| 2026-02-03 | Persona Generators: Generating Diverse Synthetic Personas at Scale | Davide Paglieri et.al. | 2602.03545 | translate | read | null |
| 2026-02-03 | Can Large Language Models Generalize Procedures Across Representations? | Fangru Lin et.al. | 2602.03542 | translate | read | null |
| 2026-02-03 | PnP-U3D: Plug-and-Play 3D Framework Bridging Autoregression and Diffusion for Unified Understanding and Generation | Yongwei Chen et.al. | 2602.03533 | translate | read | null |
| 2026-02-03 | Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning | Zixiang Di et.al. | 2602.03516 | translate | read | null |
| 2026-02-03 | Learning to Reason Faithfully through Step-Level Faithfulness Maximization | Runquan Gui et.al. | 2602.03507 | translate | read | null |
| 2026-02-03 | Lookahead Path Likelihood Optimization for Diffusion LLMs | Xuejie Liu et.al. | 2602.03496 | translate | read | null |
| 2026-02-03 | IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning | Haohao Luo et.al. | 2602.03468 | translate | read | null |
| 2026-02-03 | Quantum Circuit Generation via test-time learning with large language models | Adriano Macarone-Palmieri et.al. | 2602.03466 | translate | read | null |
| 2026-02-03 | RAL-Bench: Benchmarking for Application-Level Functional Correctness and Non-Functional Quality Attributes | Ruwei Pan et.al. | 2602.03462 | translate | read | null |
| 2026-02-03 | Contextualized Visual Personalization in Vision-Language Models | Yeongtak Oh et.al. | 2602.03454 | translate | read | null |
| 2026-02-03 | Beyond Variance: Prompt-Efficient RLVR via Rare-Event Amplification and Bidirectional Pairing | Xin Sheng et.al. | 2602.03452 | translate | read | null |
| 2026-02-03 | Ontology-to-tools compilation for executable semantic constraint enforcement in LLM agents | Xiaochi Zhou et.al. | 2602.03439 | translate | read | null |
| 2026-02-03 | When control meets large language models: From words to dynamics | Komeil Nosrati et.al. | 2602.03433 | translate | read | null |
| 2026-02-03 | ProAct: A Benchmark and Multimodal Framework for Structure-Aware Proactive Response | Xiaomeng Zhu et.al. | 2602.03430 | translate | read | null |
| 2026-02-03 | DiscoverLLM: From Executing Intents to Discovering Them | Tae Soo Kim et.al. | 2602.03429 | translate | read | null |
| 2026-02-03 | RankSteer: Activation Steering for Pointwise LLM Ranking | Yumeng Wang et.al. | 2602.03422 | translate | read | null |
| 2026-02-03 | SWE-World: Building Software Engineering Agents in Docker-Free Environments | Shuang Sun et.al. | 2602.03419 | translate | read | link |
| 2026-02-03 | Socratic-Geo: Synthetic Data Generation and Geometric Reasoning via Multi-Agent Interaction | Zhengbo Jiao et.al. | 2602.03414 | translate | read | null |
| 2026-02-03 | Verified Critical Step Optimization for LLM Agents | Mukai Li et.al. | 2602.03412 | translate | read | null |
| 2026-02-03 | Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility | Mengxuan Wang et.al. | 2602.03402 | translate | read | null |
| 2026-02-03 | Precision in Practice: Knowledge Guided Code Summarizing Grounded in Industrial Expectations | Jintai Li et.al. | 2602.03400 | translate | read | null |
| 2026-02-03 | Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective | Hao Fang et.al. | 2602.03396 | translate | read | null |
| 2026-02-03 | On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models | Shumin Wang et.al. | 2602.03392 | translate | read | null |
| 2026-02-03 | Pursuing Best Industrial Practices for Retrieval-Augmented Generation in the Medical Domain | Wei Zhu et.al. | 2602.03368 | translate | read | null |
| 2026-02-03 | MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling | Ning Ding et.al. | 2602.03359 | translate | read | null |
| 2026-02-03 | MentalSeek-Dx: Towards Progressive Hypothetico-Deductive Reasoning for Real-world Psychiatric Diagnosis | Xiao Sun et.al. | 2602.03340 | translate | read | null |
| 2026-02-03 | The Personality Trap: How LLMs Embed Bias When Generating Human-Like Personas | Jacopo Amidei et.al. | 2602.03334 | translate | read | null |
| 2026-02-03 | MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning | Shengyuan Liu et.al. | 2602.03320 | translate | read | null |
| 2026-02-03 | MIRROR: A Multi-Agent Framework with Iterative Adaptive Revision and Hierarchical Retrieval for Optimization Modeling in Operations Research | Yifan Shi et.al. | 2602.03318 | translate | read | null |
| 2026-02-03 | Multi-Level Testing of Conversational AI Systems | Elena Masserini et.al. | 2602.03311 | translate | read | null |
| 2026-02-03 | Entropy-Gated Selective Policy Optimization: Token-Level Gradient Allocation for Hybrid Training of Large Language Models | Yuelin Hu et.al. | 2602.03309 | translate | read | null |
| 2026-02-03 | medR: Reward Engineering for Clinical Offline Reinforcement Learning via Tri-Drive Potential Functions | Qianyi Xu et.al. | 2602.03305 | translate | read | null |
| 2026-02-03 | R1-SyntheticVL: Is Synthetic Data from Generative Models Ready for Multimodal Large Language Model? | Jingyi Zhang et.al. | 2602.03300 | translate | read | null |
| 2026-02-03 | POP: Prefill-Only Pruning for Efficient Large Model Inference | Junhui He et.al. | 2602.03295 | translate | read | null |
| 2026-02-03 | Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis | Zhengbo Jiao et.al. | 2602.03279 | translate | read | null |
| 2026-02-03 | LogicScan: An LLM-driven Framework for Detecting Business Logic Vulnerabilities in Smart Contracts | Jiaqi Gao et.al. | 2602.03271 | translate | read | null |
| 2026-02-03 | Beyond Suffixes: Token Position in GCG Adversarial Attacks on Large Language Models | Hicham Eddoubi et.al. | 2602.03265 | translate | read | null |
| 2026-02-03 | CSR-Bench: A Benchmark for Evaluating the Cross-modal Safety and Reliability of MLLMs | Yuxuan Liu et.al. | 2602.03263 | translate | read | null |
| 2026-02-03 | The Necessity of a Unified Framework for LLM-Based Agent Evaluation | Pengyu Zhu et.al. | 2602.03238 | translate | read | null |
| 2026-02-03 | Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations | Yuxuan Yao et.al. | 2602.03237 | translate | read | null |
| 2026-02-03 | EventFlash: Towards Efficient MLLMs for Event-Based Vision | Shaoyu Liu et.al. | 2602.03230 | translate | read | null |
| 2026-02-03 | Spiral RoPE: Rotate Your Rotary Positional Embeddings in the 2D Plane | Haoyu Liu et.al. | 2602.03227 | translate | read | null |
| 2026-02-03 | ATACompressor: Adaptive Task-Aware Compression for Efficient Long-Context Processing in LLMs | Xuancheng Li et.al. | 2602.03226 | translate | read | null |
| 2026-02-03 | Beyond Quantity: Trajectory Diversity Scaling for Code Agents | Guhong Chen et.al. | 2602.03219 | translate | read | null |
| 2026-02-03 | Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection | Dongwon Jo et.al. | 2602.03216 | translate | read | null |
| 2026-02-03 | ForesightKV: Optimizing KV Cache Eviction for Reasoning Models by Learning Long-Term Contribution | Zican Dong et.al. | 2602.03203 | translate | read | null |
| 2026-02-03 | Reinforcement Learning with Promising Tokens for Large Language Models | Jing-Cheng Pang et.al. | 2602.03195 | translate | read | null |
| 2026-02-03 | Prompt Augmentation Scales up GRPO Training on Mathematical Reasoning | Wenquan Lu et.al. | 2602.03190 | translate | read | null |
| 2026-02-03 | DynSplit-KV: Dynamic Semantic Splitting for KVCache Compression in Efficient Long-Context LLM Inference | Jiancai Ye et.al. | 2602.03184 | translate | read | null |
| 2026-02-03 | Privasis: Synthesizing the Largest “Public” Private Dataset from Scratch | Hyunwoo Kim et.al. | 2602.03183 | translate | read | null |
| 2026-02-03 | VALUEFLOW: Toward Pluralistic and Steerable Value-based Alignment in Large Language Models | Woojin Kim et.al. | 2602.03160 | translate | read | null |
| 2026-02-03 | PAMAS: Self-Adaptive Multi-Agent System with Perspective Aggregation for Misinformation Detection | Zongwei Wang et.al. | 2602.03158 | translate | read | null |
| 2026-02-03 | Is It Possible to Make Chatbots Virtuous? Investigating a Virtue-Based Design Methodology Applied to LLMs | Matthew P. Lad et.al. | 2602.03155 | translate | read | null |
| 2026-02-03 | FASA: Frequency-aware Sparse Attention | Yifei Wang et.al. | 2602.03152 | translate | read | null |
| 2026-02-03 | Internet of Agentic AI: Incentive-Compatible Distributed Teaming and Workflow | Ya-Ting Yang et.al. | 2602.03145 | translate | read | null |
| 2026-02-03 | Self-Hinting Language Models Enhance Reinforcement Learning | Baohao Liao et.al. | 2602.03143 | translate | read | null |
| 2026-02-03 | Contrastive Concept-Tree Search for LLM-Assisted Algorithm Discovery | Timothee Leleu et.al. | 2602.03132 | translate | read | null |
| 2026-02-03 | Understanding Multi-Agent LLM Frameworks: A Unified Benchmark and Experimental Analysis | Abdelghny Orogat et.al. | 2602.03128 | translate | read | null |
| 2026-02-03 | Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost | Yinggan Xu et.al. | 2602.03120 | translate | read | null |
| 2026-02-03 | Digital Lifelong Learning in the Age of AI: Trends and Insights | Geeta Puri et.al. | 2602.03114 | translate | read | null |
| 2026-02-03 | ChemPro: A Progressive Chemistry Benchmark for Large Language Models | Aaditya Baranwal et.al. | 2602.03108 | translate | read | null |
| 2026-02-03 | The Mask of Civility: Benchmarking Chinese Mock Politeness Comprehension in Large Language Models | Yitong Zhang et.al. | 2602.03107 | translate | read | null |
| 2026-02-03 | Task–Specificity Score: Measuring How Much Instructions Really Matter for Supervision | Pritam Kadasi et.al. | 2602.03103 | translate | read | null |
| 2026-02-03 | Consensus Group Relative Policy Optimization for Text Generation | Yuki Ichihara et.al. | 2602.03102 | translate | read | null |
| 2026-02-03 | Risky-Bench: Probing Agentic Safety Risks under Real-World Deployment | Jingnan Zheng et.al. | 2602.03100 | translate | read | null |
| 2026-02-03 | De-conflating Preference and Qualification: Constrained Dual-Perspective Reasoning for Job Recommendation with Large Language Models | Bryce Kan et.al. | 2602.03097 | translate | read | null |
| 2026-02-03 | Test-time Recursive Thinking: Self-Improvement without External Feedback | Yufan Zhuang et.al. | 2602.03094 | translate | read | null |
| 2026-02-03 | AERO: Autonomous Evolutionary Reasoning Optimization via Endogenous Dual-Loop Feedback | Zhitao Gao et.al. | 2602.03084 | translate | read | null |
| 2026-02-03 | ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution | Junjie Huang et.al. | 2602.03075 | translate | read | null |
| 2026-02-03 | TMS: Trajectory-Mixed Supervision for Reward-Free, On-Policy SFT | Rana Muhammad Shahroz Khan et.al. | 2602.03073 | translate | read | null |
| 2026-02-03 | ProOPF: Benchmarking and Improving LLMs for Professional-Grade Power Systems Optimization Modeling | Chao Shen et.al. | 2602.03070 | translate | read | null |
| 2026-02-03 | Skill-Based Autonomous Agents for Material Creep Database Construction | Yue Wu et.al. | 2602.03069 | translate | read | null |
| 2026-02-03 | ALPBench: A Benchmark for Attribution-level Long-term Personal Behavior Understanding | Lu Ren et.al. | 2602.03056 | translate | read | null |
| 2026-02-03 | MAS-ProVe: Understanding the Process Verification of Multi-Agent Systems | Vishal Venkataramani et.al. | 2602.03053 | translate | read | null |
| 2026-02-03 | SAES-SVD: Self-Adaptive Suppression of Accumulated and Local Errors for SVD-based LLM Compression | Xing Hu et.al. | 2602.03051 | translate | read | null |
| 2026-02-03 | Clarify Before You Draw: Proactive Agents for Robust Text-to-CAD Generation | Bo Yuan et.al. | 2602.03045 | translate | read | null |
| 2026-02-03 | LatentMem: Customizing Latent Memory for Multi-Agent Systems | Muxin Fu et.al. | 2602.03036 | translate | read | null |
| 2026-02-03 | Generalizable and Interpretable RF Fingerprinting with Shapelet-Enhanced Large Language Models | Tianya Zhao et.al. | 2602.03035 | translate | read | null |
| 2026-02-03 | RC-GRPO: Reward-Conditioned Group Relative Policy Optimization for Multi-Turn Tool Calling Agents | Haitian Zhong et.al. | 2602.03025 | translate | read | null |
| 2026-02-03 | Rethinking Music Captioning with Music Metadata LLMs | Irmak Bukey et.al. | 2602.03023 | translate | read | null |
| 2026-02-03 | STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models | Jiliang Ni et.al. | 2602.03022 | translate | read | null |
| 2026-02-03 | FedKRSO: Communication and Memory Efficient Federated Fine-Tuning of Large Language Models | Guohao Yang et.al. | 2602.03019 | translate | read | null |
| 2026-02-03 | VOILA: Value-of-Information Guided Fidelity Selection for Cost-Aware Multimodal Question Answering | Rahul Atul Bhope et.al. | 2602.03007 | translate | read | null |
| 2026-02-03 | Distilling LLM Reasoning into Graph of Concept Predictors | Ziyang Yu et.al. | 2602.03006 | translate | read | null |
| 2026-02-03 | Methods and Open Problems in Differentiable Social Choice: Learning Mechanisms, Decisions, and Alignment | Zhiyu An et.al. | 2602.03003 | translate | read | null |
| 2026-02-03 | Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation | Jiaze Li et.al. | 2602.02994 | translate | read | null |
| 2026-02-03 | Large Language Models Can Take False First Steps at Inference-time Planning | Haijiang Yan et.al. | 2602.02991 | translate | read | null |
| 2026-02-03 | NLI: Non-uniform Linear Interpolation Approximation of Nonlinear Operations for Efficient LLMs Inference | Jiangyong Yu et.al. | 2602.02988 | translate | read | null |
| 2026-02-03 | Large-Scale LLM Inference with Heterogeneous Workloads: Prefill-Decode Contention and Asymptotically Optimal Control | Ruihan Lin et.al. | 2602.02987 | translate | read | null |
| 2026-02-03 | Are LLMs Biased Like Humans? Causal Reasoning as a Function of Prior Knowledge, Irrelevant Information, and Reasoning Budget | Hanna M. Dettki et.al. | 2602.02983 | translate | read | null |
| 2026-02-03 | CPMobius: Iterative Coach-Player Reasoning for Data-Free Reinforcement Learning | Ran Li et.al. | 2602.02979 | translate | read | null |
| 2026-02-03 | Where Norms and References Collide: Evaluating LLMs on Normative Reasoning | Mitchell Abrams et.al. | 2602.02975 | translate | read | null |
| 2026-02-03 | Testing Framework Migration with Large Language Models | Altino Alves et.al. | 2602.02964 | translate | read | null |
| 2026-02-03 | Generative Engine Optimization: A VLM and Agent Framework for Pinterest Acquisition Growth | Faye Zhang et.al. | 2602.02961 | translate | read | null |
| 2026-02-03 | Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning | Yihong Huang et.al. | 2602.02951 | translate | read | null |
| 2026-02-03 | Equal Access, Unequal Interaction: A Counterfactual Audit of LLM Fairness | Alireza Amiri-Margavi et.al. | 2602.02932 | translate | read | null |
| 2026-02-02 | FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights | Zhen Wang et.al. | 2602.02905 | translate | read | null |
| 2026-02-02 | Failure-Aware Enhancements for Large Language Model (LLM) Code Generation: An Empirical Study on Decision Framework | Jianru Shen et.al. | 2602.02896 | translate | read | null |

(<a href="../LLM.md">back to LLM</a>)