LLM - 2026-02 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF	Translate	Read	Code
2026-02-28	Learning Nested Named Entity Recognition from Flat Annotations	Igor Rozhkov et.al.	2603.00840	translate	read	null
2026-02-28	Constitutional Black-Box Monitoring for Scheming in LLM Agents	Simon Storf et.al.	2603.00829	translate	read	null
2026-02-28	A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations	Hossein Javidnia et.al.	2603.00824	translate	read	null
2026-02-28	A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction	Ruihao Pan et.al.	2603.00823	translate	read	null
2026-02-28	ContextCov: Deriving and Enforcing Executable Constraints from Agent Instruction Files	Reshabh K Sharma et.al.	2603.00822	translate	read	null
2026-02-28	From Dyads to Groups: Rethinking Emotional Support with Conversational AI	Yuqing Hu et.al.	2603.00797	translate	read	null
2026-02-28	Identifying the Geographic Foci of US Local News	Gangani Ariyarathne et.al.	2603.00787	translate	read	null
2026-02-28	Structure Matters: Evaluating Multi-Agents Orchestration in Generative Therapeutic Chatbots	Sina Elahimanesh et.al.	2603.00774	translate	read	null
2026-02-28	LLM-Powered Automatic Theorem Proving and Synthesis for Hybrid Systems and Game	Aditi Kabra et.al.	2603.00737	translate	read	null
2026-02-28	RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models	Andrew Zhuoer Feng et.al.	2603.00724	translate	read	null
2026-02-28	MARS: Harmonizing Multimodal Convergence via Adaptive Rank Search	Minkyoung Cho et.al.	2603.00720	translate	read	null
2026-02-28	DRIV-EX: Counterfactual Explanations for Driving LLMs	Amaia Cardiel et.al.	2603.00696	translate	read	null
2026-02-28	Wild-Drive: Off-Road Scene Captioning and Path Planning via Robust Multi-modal Routing and Efficient Large Language Model	Zihang Wang et.al.	2603.00694	translate	read	null
2026-02-28	RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis	Andrew Zhuoer Feng et.al.	2603.00686	translate	read	null
2026-02-28	Stateful Cross-layer Vision Modulation	Ying Liu et.al.	2603.00655	translate	read	null
2026-02-28	Historian: Reducing Manual Validation in APR Benchmarking via Evidence-Based Assessment	Sahand Moslemi et.al.	2603.00649	translate	read	null
2026-02-28	RAIE: Region-Aware Incremental Preference Editing with LoRA for LLM-based Recommendation	Jin Zeng et.al.	2603.00638	translate	read	null
2026-02-28	TraceSIR: A Multi-Agent Framework for Structured Analysis and Reporting of Agentic Execution Traces	Shu-Xun Yang et.al.	2603.00623	translate	read	null
2026-02-28	PlantWhisperer: Designing Conversational AI to Support Plant Care	Daniel Mejer Christensen et.al.	2603.00598	translate	read	null
2026-02-28	UNICBench: UNIfied Counting Benchmark for MLLM	Chenggang Rong et.al.	2603.00595	translate	read	null
2026-02-28	Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs	Yiran Zhao et.al.	2603.00590	translate	read	null
2026-02-28	Energy-Efficient Information Representation in MNIST Classification Using Biologically Inspired Learning	Patrick Stricker et.al.	2603.00588	translate	read	null
2026-02-28	Super Research: Answering Highly Complex Questions with Large Language Models through Super Deep and Super Wide Research	Yubo Dong et.al.	2603.00582	translate	read	null
2026-02-28	CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging	Jie Cao et.al.	2603.00573	translate	read	null
2026-02-28	MIDAS: Multi-Image Dispersion and Semantic Reconstruction for Jailbreaking MLLMs	Yilian Liu et.al.	2603.00565	translate	read	null
2026-02-28	Advancing Multimodal Judge Models through a Capability-Oriented Benchmark and MCTS-Driven Data Generation	Zeyu Chen et.al.	2603.00546	translate	read	null
2026-02-28	LOGIGEN: Logic-Driven Generation of Verifiable Agentic Tasks	Yucheng Zeng et.al.	2603.00540	translate	read	null
2026-02-28	Are LLMs Reliable Code Reviewers? Systematic Overcorrection in Requirement Conformance Judgement	Haolin Jin et.al.	2603.00539	translate	read	null
2026-02-28	CaptionFool: Universal Image Captioning Model Attacks	Swapnil Parekh et.al.	2603.00529	translate	read	null
2026-02-28	ProtegoFed: Backdoor-Free Federated Instruction Tuning with Interspersed Poisoned Data	Haodong Zhao et.al.	2603.00516	translate	read	null
2026-02-28	MLLM-4D: Towards Visual-based Spatial-Temporal Intelligence	Xingyilang Yin et.al.	2603.00515	translate	read	null
2026-02-28	Multimodal Adaptive Retrieval Augmented Generation through Internal Representation Learning	Ruoshuang Du et.al.	2603.00511	translate	read	null
2026-02-28	What Do Visual Tokens Really Encode? Uncovering Sparsity and Redundancy in Multimodal Large Language Models	Yingqi Fan et.al.	2603.00510	translate	read	null
2026-02-28	M$^2$: Dual-Memory Augmentation for Long-Horizon Web Agents via Trajectory Summarization and Insight Retrieval	Dawei Yan et.al.	2603.00503	translate	read	null
2026-02-28	WirelessAgent++: Automated Agentic Workflow Design and Benchmarking for Wireless Networks	Jingwen Tong et.al.	2603.00501	translate	read	null
2026-02-28	Zero-Shot Robotic Manipulation via 3D Gaussian Splatting-Enhanced Multimodal Retrieval-Augmented Generation	Zilong Xie et.al.	2603.00500	translate	read	null
2026-02-28	Antibody: Strengthening Defense Against Harmful Fine-Tuning for Large Language Models via Attenuating Harmful Gradient Influence	Quoc Minh Nguyen et.al.	2603.00498	translate	read	null
2026-02-28	LifeEval: A Multimodal Benchmark for Assistive AI in Egocentric Daily Life Tasks	Hengjian Gao et.al.	2603.00490	translate	read	null
2026-02-28	Does My README File Need To Be Updated? Exploring LLM-Based README Maintenance	Haoyu Gao et.al.	2603.00489	translate	read	null
2026-02-28	Wireless Power Control Based on Large Language Models	Jiacheng Wang et.al.	2603.00474	translate	read	null
2026-02-28	Optimizing In-Context Demonstrations for LLM-based Automated Grading	Yucheng Chu et.al.	2603.00465	translate	read	null
2026-02-28	MED-COPILOT: A Medical Assistant Powered by GraphRAG and Similar Patient Case Retrieval	Shuheng Chen et.al.	2603.00460	translate	read	null
2026-02-28	Rooted Absorbed Prefix Trajectory Balance with Submodular Replay for GFlowNet Training	Xi Wang et.al.	2603.00454	translate	read	null
2026-02-28	Confusion-Aware Rubric Optimization for LLM-based Automated Grading	Yucheng Chu et.al.	2603.00451	translate	read	null
2026-02-28	SesaHand: Enhancing 3D Hand Reconstruction via Controllable Generation with Semantic and Structural Alignment	Zhuoran Zhao et.al.	2603.00443	translate	read	null
2026-02-28	ROKA: Robust Knowledge Unlearning against Adversaries	Jinmyeong Shin et.al.	2603.00436	translate	read	null
2026-02-28	Personalities at Play: Probing Alignment in AI Teammates	Mohammad Amin Samadi et.al.	2603.00429	translate	read	null
2026-02-28	LLM-Bootstrapped Targeted Finding Guidance for Factual MLLM-based Medical Report Generation	Cunyuan Yang et.al.	2603.00426	translate	read	null
2026-02-28	SSR: Pushing the Limit of Spatial Intelligence with Structured Scene Reasoning	Yi Zhang et.al.	2603.00409	translate	read	null
2026-02-28	A Data-Driven Analysis for Engineering Conferences: The Institute of Industrial and Systems Engineering (IISE) Annual Conference Proceedings (2002-2005)	H. Sinan Bank et.al.	2603.00399	translate	read	null
2026-02-26	MediX-R1: Open Ended Medical Reinforcement Learning	Sahal Shaji Mullappilly et.al.	2602.23363	translate	read	null
2026-02-26	Utilizing LLMs for Industrial Process Automation	Salim Fares et.al.	2602.23331	translate	read	null
2026-02-26	Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks	Kunihiro Miyazaki et.al.	2602.23330	translate	read	null
2026-02-26	LLM Novice Uplift on Dual-Use, In Silico Biology Tasks	Chen Bo Calvin Zhang et.al.	2602.23329	translate	read	null
2026-02-26	Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction	Rafael R. Baptista et.al.	2602.23312	translate	read	null
2026-02-26	ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding	Yiran Guan et.al.	2602.23306	translate	read	null
2026-02-26	A Mixture-of-Experts Model for Multimodal Emotion Recognition in Conversations	Soumya Dutta et.al.	2602.23300	translate	read	null
2026-02-26	CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays	Hyungyung Lee et.al.	2602.23276	translate	read	null
2026-02-26	Mitigating Legibility Tax with Decoupled Prover-Verifier Games	Yegon Kim et.al.	2602.23248	translate	read	null
2026-02-26	Agency and Architectural Limits: Why Optimization-Based Systems Cannot Be Norm-Responsive	Radha Sarma et.al.	2602.23239	translate	read	null
2026-02-26	MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction	Yizhi Li et.al.	2602.23228	translate	read	null
2026-02-26	STELLAR: Storage Tuning Engine Leveraging LLM Autonomous Reasoning for High Performance Parallel File Systems	Chris Egersdoerfer et.al.	2602.23220	translate	read	null
2026-02-26	InnerQ: Hardware-aware Tuning-free Quantization of KV Cache for Large Language Models	Sayed Mohammadreza Tayaranian Hosseini et.al.	2602.23200	translate	read	null
2026-02-26	SC-Arena: A Natural Language Benchmark for Single-Cell Reasoning with Knowledge-Augmented Evaluation	Jiahao Zhao et.al.	2602.23199	translate	read	null
2026-02-26	Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models	Chungpa Lee et.al.	2602.23197	translate	read	null
2026-02-26	ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering	Elzo Brito dos Santos Filho et.al.	2602.23193	translate	read	null
2026-02-26	MTRAG-UN: A Benchmark for Open Challenges in Multi-Turn RAG Conversations	Sara Rosenthal et.al.	2602.23184	translate	read	null
2026-02-26	A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring	Usman Anwar et.al.	2602.23163	translate	read	null
2026-02-26	Multi-Agent Large Language Model Based Emotional Detoxification Through Personalized Intensity Control for Consumer Protection	Keito Inoshita et.al.	2602.23123	translate	read	null
2026-02-26	Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design	Zhuoliang Xie et.al.	2602.23092	translate	read	null
2026-02-26	Cytoarchitecture in Words: Weakly Supervised Vision-Language Modeling for Human Brain Microscopy	Matthew Sutton et.al.	2602.23088	translate	read	null
2026-02-26	Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent	Boyang Zhang et.al.	2602.23079	translate	read	null
2026-02-26	CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery	Mengze Hong et.al.	2602.23075	translate	read	null
2026-02-26	TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment	Trung Dang et.al.	2602.23068	translate	read	null
2026-02-26	LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer	Kunpeng Zhang et.al.	2602.23065	translate	read	null
2026-02-26	Toward Automatic Filling of Case Report Forms: A Case Study on Data from an Italian Emergency Department	Gabriela Anna Kaczmarek et.al.	2602.23062	translate	read	null
2026-02-26	CL4SE: A Context Learning Benchmark For Software Engineering Tasks	Haichuan Hu et.al.	2602.23047	translate	read	null
2026-02-26	LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure	Jaehong Cho et.al.	2602.23036	translate	read	null
2026-02-26	WISER: Wider Search, Deeper Thinking, and Adaptive Fusion for Training-Free Zero-Shot Composed Image Retrieval	Tianyue Wang et.al.	2602.23029	translate	read	null
2026-02-26	Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization	Zeyuan Liu et.al.	2602.23008	translate	read	null
2026-02-26	Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search	Xun Huang et.al.	2602.22983	translate	read	null
2026-02-26	Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots	Dimitrios P. Panagoulias et.al.	2602.22973	translate	read	null
2026-02-26	SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy	Peiyao Xiao et.al.	2602.22971	translate	read	null
2026-02-26	Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression	Yifeng Guan et.al.	2602.22967	translate	read	null
2026-02-26	FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning	Zehao Li et.al.	2602.22963	translate	read	null
2026-02-26	Can Agents Distinguish Visually Hard-to-Separate Diseases in a Zero-Shot Setting? A Pilot Study	Zihao Zhao et.al.	2602.22959	translate	read	null
2026-02-26	ClawMobile: Rethinking Smartphone-Native Agentic Systems	Hongchao Du et.al.	2602.22942	translate	read	null
2026-02-26	MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding	Wenhui Tan et.al.	2602.22932	translate	read	null
2026-02-26	SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress	Yang Yu et.al.	2602.22913	translate	read	null
2026-02-26	PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA	Yunpeng Hong et.al.	2602.22903	translate	read	null
2026-02-26	Towards LLM-Empowered Knowledge Tracing via LLM-Student Hierarchical Behavior Alignment in Hyperbolic Space	Xingcheng Fu et.al.	2602.22879	translate	read	null
2026-02-26	Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching	Roy Miles et.al.	2602.22871	translate	read	null
2026-02-26	Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference	Yushi Ye et.al.	2602.22868	translate	read	null
2026-02-26	TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought	Jianmin Li et.al.	2602.22828	translate	read	null
2026-02-26	TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models	Reihaneh Iranmanesh et.al.	2602.22827	translate	read	null
2026-02-26	Hierarchy-of-Groups Policy Optimization for Long-Horizon Agentic Tasks	Shuo He et.al.	2602.22817	translate	read	null
2026-02-26	MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks	Shiqian Su et.al.	2602.22808	translate	read	null
2026-02-26	Natural Language Declarative Prompting (NLD-P): A Modular Governance Method for Prompt Design Under Model Drift	Hyunwoo Kim et.al.	2602.22790	translate	read	null
2026-02-26	Probing for Knowledge Attribution in Large Language Models	Ivo Brink et.al.	2602.22787	translate	read	null
2026-02-26	ClinDet-Bench: Beyond Abstention, Evaluating Judgment Determinability of LLMs in Clinical Decision-Making	Yusuke Watanabe et.al.	2602.22771	translate	read	null
2026-02-26	AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications	Yujie Zhao et.al.	2602.22769	translate	read	null
2026-02-26	Imagination Helps Visual Reasoning, But Not Yet in Latent Space	You Li et.al.	2602.22766	translate	read	null
2026-02-26	Towards Better RL Training Data Utilization via Second-Order Rollout	Zhe Yang et.al.	2602.22765	translate	read	null
2026-02-26	Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study	Philipp Wiesner et.al.	2602.22760	translate	read	null
2026-02-26	Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction	Nils Schwager et.al.	2602.22752	translate	read	null
2026-02-26	Generative Recommendation for Large-Scale Advertising	Ben Xue et.al.	2602.22732	translate	read	null
2026-02-26	Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks	Jakub Šmíd et.al.	2602.22730	translate	read	null
2026-02-26	AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification	Tian Zhang et.al.	2602.22724	translate	read	null
2026-02-26	Replacing Multi-Step Assembly of Data Preparation Pipelines with One-Step LLM Pipeline Generation for Table QA	Fengyu Li et.al.	2602.22721	translate	read	null
2026-02-26	RLHFless: Serverless Computing for Efficient RLHF	Rui Wei et.al.	2602.22718	translate	read	null
2026-02-26	SoPE: Spherical Coordinate-Based Positional Embedding for Enhancing Spatial Perception of 3D LVLMs	Guanting Ye et.al.	2602.22716	translate	read	null
2026-02-26	LLM-driven discovery for carbon allotropes with bond-network entropy	Yuzhou Hao et.al.	2602.22706	translate	read	null
2026-02-26	IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation	Yanpei Guo et.al.	2602.22700	translate	read	null
2026-02-26	Tokenization, Fusion and Decoupling: Bridging the Granularity Mismatch Between Large Language Models and Knowledge Graphs	Siyue Su et.al.	2602.22698	translate	read	null
2026-02-26	Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue	Ning Gao et.al.	2602.22697	translate	read	null
2026-02-26	SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses	Zhuohang Jiang et.al.	2602.22683	translate	read	null
2026-02-26	Accelerating LLM Pre-Training through Flat-Direction Dynamics Enhancement	Shuchen Zhu et.al.	2602.22681	translate	read	null
2026-02-26	Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions	Yue Xu et.al.	2602.22680	translate	read	null
2026-02-26	Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning	Qin-Wen Luo et.al.	2602.22642	translate	read	null
2026-02-26	MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios	Zhiheng Song et.al.	2602.22638	translate	read	null
2026-02-26	Fine-grained Semantics Integration for Large Language Model-based Recommendation	Jiawen Feng et.al.	2602.22632	translate	read	null
2026-02-26	Instruction-based Image Editing with Planning, Reasoning, and Generation	Liya Ji et.al.	2602.22624	translate	read	null
2026-02-26	Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA	Hai Huang et.al.	2602.22617	translate	read	null
2026-02-26	Transformers converge to invariant algorithmic cores	Joshua S. Schiffman et.al.	2602.22600	translate	read	null
2026-02-26	FLYING SERVING: On-the-Fly Parallelism Switching for Large Language Model Serving	Shouwei Gao et.al.	2602.22593	translate	read	null
2026-02-26	pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training	Wenzheng Zhang et.al.	2602.22592	translate	read	null
2026-02-26	Where Relevance Emerges: A Layer-Wise Study of Internal Attention for Zero-Shot Re-Ranking	Haodong Chen et.al.	2602.22591	translate	read	null
2026-02-26	Search-P1: Path-Centric Reward Shaping for Stable and Efficient Agentic RAG Training	Tianle Xia et.al.	2602.22576	translate	read	null
2026-02-26	Addressing Climate Action Misperceptions with Generative AI	Miriam Remshard et.al.	2602.22564	translate	read	null
2026-02-26	Layer-Targeted Multilingual Knowledge Erasure in Large Language Models	Taoran Li et.al.	2602.22562	translate	read	null
2026-02-26	CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety	Umid Suleymanov et.al.	2602.22557	translate	read	null
2026-02-26	Autoregressive Visual Decoding from EEG Signals	Sicheng Dai et.al.	2602.22555	translate	read	null
2026-02-26	Multilingual Safety Alignment Via Sparse Weight Editing	Jiaming Liang et.al.	2602.22554	translate	read	null
2026-02-26	Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention	Zhiming Wang et.al.	2602.22546	translate	read	null
2026-02-26	Ruyi2 Technical Report	Huan Song et.al.	2602.22543	translate	read	null
2026-02-26	Agentic AI for Intent-driven Optimization in Cell-free O-RAN	Mohammad Hossein Shokouhi et.al.	2602.22539	translate	read	null
2026-02-26	Generative Agents Navigating Digital Libraries	Saber Zerhoudi et.al.	2602.22529	translate	read	null
2026-02-26	Iterative Prompt Refinement for Dyslexia-Friendly Text Summarization Using GPT-4o	Samay Bhojwani et.al.	2602.22524	translate	read	null
2026-02-26	Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents	Ryan Liu et.al.	2602.22523	translate	read	null
2026-02-26	Pix2Key: Controllable Open-Vocabulary Retrieval with Semantic Decomposition and Self-Supervised Visual Dictionary Learning	Guoyizhe Wei et.al.	2602.22510	translate	read	null
2026-02-26	Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models	Ik-hwan Kim et.al.	2602.22508	translate	read	null
2026-02-26	Mapping the Landscape of Artificial Intelligence in Life Cycle Assessment Using Large Language Models	Anastasija Mensikova et.al.	2602.22500	translate	read	null
2026-02-26	Reinforcement-aware Knowledge Distillation for LLM Reasoning	Zhaoyang Zhang et.al.	2602.22495	translate	read	null
2026-02-25	Importance of Prompt Optimisation for Error Detection in Medical Notes Using Language Models	Craig Myles et.al.	2602.22483	translate	read	null
2026-02-25	Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models	Binchi Zhang et.al.	2602.22475	translate	read	null
2026-02-25	ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization	Joseph Tso et.al.	2602.22465	translate	read	null
2026-02-25	CCCL: Node-Spanning GPU Collectives with CXL Memory Pooling	Dong Xu et.al.	2602.22457	translate	read	null
2026-02-25	Automating the Detection of Requirement Dependencies Using Large Language Models	Ikram Darif et.al.	2602.22456	translate	read	null
2026-02-25	Exploring Multimodal LMMs for Online Episodic Memory Question Answering on the Edge	Giuseppe Lando et.al.	2602.22455	translate	read	null
2026-02-25	CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines	Chayan Banerjee et.al.	2602.22452	translate	read	null
2026-02-25	Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace	Qianlong Lan et.al.	2602.22450	translate	read	null
2026-02-25	A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines	Gaoyuan Du et.al.	2602.22442	translate	read	null
2026-02-25	HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems	Idan Habler et.al.	2602.22427	translate	read	null
2026-02-25	SimpleOCR: Rendering Visualized Questions to Teach MLLMs to Read	Yibo Peng et.al.	2602.22426	translate	read	null
2026-02-25	Causality $\neq$ Invariance: Function and Concept Vectors in LLMs	Gustaw Opiełka et.al.	2602.22424	translate	read	null
2026-02-25	Seeing Graphs Like Humans: Benchmarking Computational Measures and MLLMs for Similarity Assessment	Seokweon Jung et.al.	2602.22416	translate	read	null
2026-02-25	Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents	Cosmo Santoni et.al.	2602.22402	translate	read	null
2026-02-25	VoiceAlign: A Shimming Layer for Enhancing the Usability of Legacy Voice User Interface Systems	Md Ehtesham-Ul-Haque et.al.	2602.22374	translate	read	null
2026-02-25	EyeLayer: Integrating Human Attention Patterns into LLM-Based Code Summarization	Jiahao Zhang et.al.	2602.22368	translate	read	null
2026-02-25	E3VA: Enhancing Emotional Expressiveness in Virtual Conversational Agents	Abhishek Kulkarni et.al.	2602.22362	translate	read	null
2026-02-25	Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts	Arno Simons et.al.	2602.22359	translate	read	null
2026-02-25	STILTS-NLI: A Natural Language Interface for STILTS	R. A. Shaw et.al.	2602.22357	translate	read	null
2026-02-25	Decoder-based Sense Knowledge Distillation	Qitong Wang et.al.	2602.22351	translate	read	null
2026-02-25	Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory	Davide Ettori et.al.	2602.22345	translate	read	null
2026-02-25	Conversational Successes and Breakdowns in Everyday Non-Display Smart Glasses Use	Xiuqi Tommy Zhu et.al.	2602.22340	translate	read	null
2026-02-25	Decoding the Hook: A Multimodal LLM Framework for Analyzing the Hooking Period of Video Ads	Kunpeng Zhang et.al.	2602.22299	translate	read	null
2026-02-25	UpSkill: Mutual Information Skill Learning for Structured Response Diversity in LLMs	Devan Shah et.al.	2602.22296	translate	read	null
2026-02-25	Manifold of Failure: Behavioral Attraction Basins in Language Models	Sarthak Munshi et.al.	2602.22291	translate	read	null
2026-02-25	OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data	Yan Zhao et.al.	2602.22286	translate	read	null
2026-02-25	BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning	Mingi Kim et.al.	2602.22284	translate	read	null
2026-02-25	Integrating Machine Learning Ensembles and Large Language Models for Heart Disease Prediction Using Voting Fusion	Md. Tahsin Amin et.al.	2602.22280	translate	read	null
2026-02-25	RETLLM: Training and Data-Free MLLMs for Multimodal Information Retrieval	Dawei Su et.al.	2602.22278	translate	read	null
2026-02-25	EmpiRE-Compass: A Neuro-Symbolic Dashboard for Sustainable and Dynamic Knowledge Exploration, Synthesis, and Reuse	Oliver Karras et.al.	2602.22276	translate	read	null
2026-02-25	Sustainable LLM Inference using Context-Aware Model Switching	Yuvarani et.al.	2602.22261	translate	read	null
2026-02-24	A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM	Morteza Eskandarian et.al.	2602.22250	translate	read	null
2026-02-24	Accelerating Incident Response: A Hybrid Approach for Data Breach Reporting	Aurora Arrus et.al.	2602.22244	translate	read	null
2026-02-24	Analysis of LLMs Against Prompt Injection and Jailbreak Attacks	Piyush Jaiswal et.al.	2602.22242	translate	read	null
2026-02-24	From Prompts to Performance: Evaluating LLMs for Task-based Parallel Code Generation	Linus Bantel et.al.	2602.22240	translate	read	null
2026-02-23	CrossLLM-Mamba: Multimodal State Space Fusion of LLMs for RNA Interaction Prediction	Rabeya Tus Sadia et.al.	2602.22236	translate	read	null
2026-02-25	Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets	Hanna Yukhymenko et.al.	2602.22207	translate	read	null
2026-02-25	A Taxonomy of Human–MLLM Interaction in Early-Stage Sketch-Based Design Ideation	Weiayn Shi et.al.	2602.22171	translate	read	null
2026-02-25	LLMTailor: A Layer-wise Tailoring Tool for Efficient Checkpointing of Large Language Models	Minqiu Sun et.al.	2602.22158	translate	read	null
2026-02-25	Dynamic Personality Adaptation in Large Language Models via State Machines	Leon Pielage et.al.	2602.22157	translate	read	null
2026-02-25	Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual	Yining Li et.al.	2602.22146	translate	read	null
2026-02-25	When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models	Satyam Kumar Navneet et.al.	2602.22145	translate	read	null
2026-02-25	WeaveTime: Stream from Earlier Frames into Emergent Memory in VideoLLMs	Yulin Zhang et.al.	2602.22142	translate	read	null
2026-02-25	Confidence-Driven Multi-Scale Model Selection for Cost-Efficient Inference	Bo-Wei Chen et.al.	2602.22090	translate	read	null
2026-02-25	ViSTAR: Virtual Skill Training with Augmented Reality with 3D Avatars and LLM coaching agent	Chunggi Lee et.al.	2602.22077	translate	read	null
2026-02-25	Understanding Artificial Theory of Mind: Perturbed Tasks and Reasoning in Large Language Models	Christian Nickel et.al.	2602.22072	translate	read	null
2026-02-25	Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts	Jessica Y. Bo et.al.	2602.22070	translate	read	null
2026-02-25	DLT-Corpus: A Large-Scale Text Collection for the Distributed Ledger Technology Domain	Walter Hernandez Cruz et.al.	2602.22045	translate	read	null
2026-02-25	RT-RMOT: A Dataset and Framework for RGB-Thermal Referring Multi-Object Tracking	Yanqiu Yu et.al.	2602.22033	translate	read	null
2026-02-25	Enhancing LLM-Based Test Generation by Eliminating Covered Code	WeiZhe Xu et.al.	2602.21997	translate	read	null
2026-02-25	CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models	Miyu Oba et.al.	2602.21978	translate	read	null
2026-02-25	Global-Aware Edge Prioritization for Pose Graph Initialization	Tong Wei et.al.	2602.21963	translate	read	null
2026-02-25	Global-Local Dual Perception for MLLMs in High-Resolution Text-Rich Image Translation	Junxin Lu et.al.	2602.21956	translate	read	null
2026-02-25	RADAR: Reasoning as Discrimination with Aligned Representations for LLM-based Knowledge Graph Reasoning	Bo Xue et.al.	2602.21951	translate	read	null
2026-02-25	MEDSYN: Benchmarking Multi-EviDence SYNthesis in Complex Clinical Cases for Multimodal Large Language Models	Boqi Chen et.al.	2602.21950	translate	read	null
2026-02-25	Large Language Models are Algorithmically Blind	Sohan Venkatesh et.al.	2602.21947	translate	read	null
2026-02-25	Hidden Topics: Measuring Sensitive AI Beliefs with List Experiments	Maxim Chupilkin et.al.	2602.21939	translate	read	null
2026-02-25	Small Wins Big: Comparing Large Language Models and Domain Fine-Tuned Models for Sarcasm Detection in Code-Mixed Hinglish Text	Bitan Majumder et.al.	2602.21933	translate	read	null
2026-02-25	EmoOmni: Bridging Emotional Understanding and Expression in Omni-Modal LLMs	Wenjie Tian et.al.	2602.21900	translate	read	null
2026-02-25	APFuzz: Towards Automatic Greybox Protocol Fuzzing	Yu Wang et.al.	2602.21892	translate	read	null
2026-02-25	How to Take a Memorable Picture? Empowering Users with Actionable Feedback	Francesco Laiti et.al.	2602.21877	translate	read	null
2026-02-25	Personalized Graph-Empowered Large Language Model for Proactive Information Access	Chia Cheng Chang et.al.	2602.21862	translate	read	null
2026-02-25	ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices	Dezhi Kong et.al.	2602.21858	translate	read	null
2026-02-25	FewMMBench: A Benchmark for Multimodal Few-Shot Learning	Mustafa Dogan et.al.	2602.21854	translate	read	null
2026-02-25	From Restructuring to Stabilization: A Large-Scale Experiment on Iterative Code Readability Refactoring with Large Language Models	Norman Peitek et.al.	2602.21833	translate	read	null
2026-02-25	A Multi-Turn Framework for Evaluating AI Misuse in Fraud and Cybercrime Scenarios	Kimberly T. Mai et.al.	2602.21831	translate	read	null
2026-02-25	SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model	Guibin Chen et.al.	2602.21818	translate	read	null
2026-02-25	Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem	Heejin Jo et.al.	2602.21814	translate	read	null
2026-02-25	An Evaluation of Context Length Extrapolation in Long Code via Positional Embeddings and Efficient Attention	Madhusudan Ghosh et.al.	2602.21800	translate	read	null
2026-02-25	DHP: Efficient Scaling of MLLM Training with Dynamic Hybrid Parallelism	Yifan Niu et.al.	2602.21788	translate	read	null
2026-02-25	D-COT: Disciplined Chain-of-Thought Learning for Efficient Reasoning in Small Language Models	Shunsuke Ubukata et.al.	2602.21786	translate	read	null
2026-02-25	Therapist-Robot-Patient Physical Interaction is Worth a Thousand Words: Enabling Intuitive Therapist Guidance via Remote Haptic Control	Beatrice Luciani et.al.	2602.21783	translate	read	null
2026-02-25	Generalisation of RLHF under Reward Shift and Clipped KL Regularisation	Kenton Tang et.al.	2602.21765	translate	read	null
2026-02-25	Improving Implicit Discourse Relation Recognition with Natural Language Explanations from LLMs	Heng Wang et.al.	2602.21763	translate	read	null
2026-02-25	Offline Reasoning for Efficient Recommendation: LLM-Empowered Persona-Profiled Item Indexing	Deogyong Kim et.al.	2602.21756	translate	read	null
2026-02-25	From Words to Amino Acids: Does the Curse of Depth Persist?	Aleena Siji et.al.	2602.21750	translate	read	null
2026-02-25	Enhancing Multi-Modal LLMs Reasoning via Difficulty-Aware Group Normalization	Jinghan Li et.al.	2602.21743	translate	read	null
2026-02-25	Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling	Shiqi Yan et.al.	2602.21728	translate	read	null
2026-02-25	TranX-Adapter: Bridging Artifacts and Semantics within MLLMs for Robust AI-generated Image Detection	Wenbin Wang et.al.	2602.21716	translate	read	null
2026-02-25	Two-Stage Active Distribution Network Voltage Control via LLM-RL Collaboration: A Hybrid Knowledge-Data-Driven Approach	Xu Yang et.al.	2602.21715	translate	read	null
2026-02-25	EditFlow: Benchmarking and Optimizing Code Edit Recommendation Systems via Reconstruction of Developer Flows	Chenyan Liu et.al.	2602.21697	translate	read	null
2026-02-25	Hierarchical LLM-Based Multi-Agent Framework with Prompt Optimization for Multi-Robot Task Planning	Tomoya Kawabe et.al.	2602.21670	translate	read	null
2026-02-25	DWA-KD: Dual-Space Weighting and Time-Warped Alignment for Cross-Tokenizer Knowledge Distillation	Duc Trung Vu et.al.	2602.21669	translate	read	null
2026-02-25	CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning	Zhijiang Tang et.al.	2602.21655	translate	read	null
2026-02-25	Irresponsible Counselors: Large Language Models and the Loneliness of Modern Humans	Abas Bertina et.al.	2602.21653	translate	read	null
2026-02-25	Sparsity Induction for Accurate Post-Training Pruning of Large Language Models	Minhao Jiang et.al.	2602.21652	translate	read	null
2026-02-25	Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration	Tangsang Chongbang et.al.	2602.21647	translate	read	null
2026-02-25	Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion	Yexing Du et.al.	2602.21646	translate	read	null
2026-02-25	RuCL: Stratified Rubric-Based Curriculum Learning for Multimodal Large Language Model Reasoning	Yukun Chen et.al.	2602.21628	translate	read	null
2026-02-25	Multi-Layer Scheduling for MoE-Based LLM Reasoning	Yifan Sun et.al.	2602.21626	translate	read	null
2026-02-25	Structurally Aligned Subtask-Level Memory for Software Engineering Agents	Kangning Shen et.al.	2602.21611	translate	read	null
2026-02-25	MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification	Kazi Samin Yasar Alam et.al.	2602.21608	translate	read	null
2026-02-25	Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation	Qiange Wang et.al.	2602.21604	translate	read	null
2026-02-25	AQR-HNSW: Accelerating Approximate Nearest Neighbor Search via Density-aware Quantization and Multi-stage Re-ranking	Ganap Ashit Tewary et.al.	2602.21600	translate	read	null
2026-02-25	SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints	Hyungmin Kim et.al.	2602.21595	translate	read	null
2026-02-25	Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection	Zheng Gao et.al.	2602.21593	translate	read	null
2026-02-25	Revisiting RAG Retrievers: An Information Theoretic Benchmark	Wenqing Zheng et.al.	2602.21553	translate	read	null
2026-02-25	RAC: Relation-Aware Cache Replacement for Large Language Models	Yuchong Wu et.al.	2602.21547	translate	read	null
2026-02-25	Muon+: Towards Better Muon via One Additional Normalization Step	Ruijie Zhang et.al.	2602.21545	translate	read	null
2026-02-25	Reasoning-Driven Design of Single Atom Catalysts via a Multi-Agent Large Language Model Framework	Dong Hyeon Mok et.al.	2602.21533	translate	read	null
2026-02-25	One Brain, Omni Modalities: Towards Unified Non-Invasive Brain Decoding with Large Language Models	Changli Tang et.al.	2602.21522	translate	read	null
2026-02-25	Beyond Refusal: Probing the Limits of Agentic Self-Correction for Semantic Sensitive Information	Umid Suleymanov et.al.	2602.21496	translate	read	null
2026-02-25	GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning	Ningyuan Yang et.al.	2602.21492	translate	read	null
2026-02-25	Evaluating the Usage of African-American Vernacular English in Large Language Models	Deja Dunlap et.al.	2602.21485	translate	read	null
2026-02-25	The Design Space of Tri-Modal Masked Diffusion Models	Louis Bethune et.al.	2602.21472	translate	read	null
2026-02-25	iMiGUE-Speech: A Spontaneous Speech Dataset for Affective Analysis	Sofoklis Kakouros et.al.	2602.21464	translate	read	null
2026-02-25	Revisiting Text Ranking in Deep Research	Chuan Meng et.al.	2602.21456	translate	read	null
2026-02-24	MINAR: Mechanistic Interpretability for Neural Algorithmic Reasoning	Jesse He et.al.	2602.21442	translate	read	null
2026-02-24	Causal Decoding for Hallucination-Resistant Multimodal Large Language Models	Shiwei Tan et.al.	2602.21441	translate	read	null
2026-02-24	Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning	Yuanda Xu et.al.	2602.21420	translate	read	null
2026-02-24	MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection	Xuan Chen et.al.	2602.21394	translate	read	null
2026-02-24	Interleaved Head Attention	Sai Surya Duvvuri et.al.	2602.21371	translate	read	null
2026-02-24	A Hierarchical Multi-Agent System for Autonomous Discovery in Geoscientific Data Archives	Dmitrii Pantiukhin et.al.	2602.21351	translate	read	null
2026-02-24	Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment	Mengxuan Hu et.al.	2602.21346	translate	read	null
2026-02-24	Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data	Emre Can Acikgoz et.al.	2602.21320	translate	read	null
2026-02-24	Shared Nature, Unique Nurture: PRISM for Pluralistic Reasoning via In-context Structure Modeling	Guancheng Tu et.al.	2602.21317	translate	read	null
2026-02-24	Group Orthogonalized Policy Optimization: Group Policy Optimization as Orthogonal Projection in Hilbert Space	Wang Zixian et.al.	2602.21269	translate	read	null
2026-02-24	Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models	Sasha Robinson et.al.	2602.21262	translate	read	null
2026-02-23	Structured Prompt Language: Declarative Context Management for LLMs	Wen G. Gong et.al.	2602.21257	translate	read	null
2026-02-23	A General Equilibrium Theory of Orchestrated AI Agent Systems	Jean-Philippe Garnier et.al.	2602.21255	translate	read	null
2026-02-24	On Data Engineering for Scaling LLM Terminal Capabilities	Renjie Pi et.al.	2602.21193	translate	read	null
2026-02-24	Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training	Anas Barakat et.al.	2602.21189	translate	read	null
2026-02-24	Seeing Through Words: Controlling Visual Retrieval Quality with Language Models	Jianglin Lu et.al.	2602.21175	translate	read	null
2026-02-24	PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data	Samah Fodeh et.al.	2602.21165	translate	read	null
2026-02-24	ActionReasoning: Robot Action Reasoning in 3D Space with LLM for Robotic Brick Stacking	Guangming Wang et.al.	2602.21161	translate	read	null
2026-02-24	SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards	Dengjia Zhang et.al.	2602.21158	translate	read	null
2026-02-24	Scaling State-Space Models on Multiple GPUs with Tensor Parallelism	Anurag Dutt et.al.	2602.21144	translate	read	null
2026-02-24	A Benchmark for Deep Information Synthesis	Debjit Paul et.al.	2602.21143	translate	read	null
2026-02-24	SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery	David Anugraha et.al.	2602.21136	translate	read	null
2026-02-24	“Are You Sure?”: An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems	Xinfeng Li et.al.	2602.21127	translate	read	null
2026-02-24	Turning Semantics into Topology: LLM-Driven Attribute Augmentation for Collaborative Filtering	Junjie Meng et.al.	2602.21099	translate	read	null
2026-02-24	Can Interest-Bearing Positions Solve the Long-Horizon Problem in Prediction Markets?	Caleb Maresca et.al.	2602.21091	translate	read	null
2026-02-24	Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification	Vishal Patil et.al.	2602.21082	translate	read	null
2026-02-24	An Expert Schema for Evaluating Large Language Model Errors in Scholarly Question-Answering Systems	Anna Martin-Boyle et.al.	2602.21059	translate	read	null
2026-02-24	PaperTrail: A Claim-Evidence Interface for Grounding Provenance in LLM-based Scholarly Q&A	Anna Martin-Boyle et.al.	2602.21045	translate	read	null
2026-02-24	LogicGraph: Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification	Yanrui Wu et.al.	2602.21044	translate	read	null
2026-02-24	Generative Pseudo-Labeling for Pre-Ranking with LLMs	Junyu Bi et.al.	2602.20995	translate	read	null
2026-02-24	CrystaL: Spontaneous Emergence of Visual Latents in MLLMs	Yang Zhang et.al.	2602.20980	translate	read	null
2026-02-24	Evaluating Proactive Risk Awareness of Large Language Models	Xuan Luo et.al.	2602.20976	translate	read	null
2026-02-24	Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving	Yuliang Ji et.al.	2602.20973	translate	read	null
2026-02-24	Are Multimodal Large Language Models Good Annotators for Image Tagging?	Ming-Kun Xie et.al.	2602.20972	translate	read	null
2026-02-24	Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models	Paola Merlo et.al.	2602.20966	translate	read	null
2026-02-24	The Art of Efficient Reasoning: Data, Reward, and Optimization	Taiqiang Wu et.al.	2602.20945	translate	read	null
2026-02-24	Extending $μ$P: Spectral Conditions for Feature Learning Across Optimizers	Akshita Gupta et.al.	2602.20937	translate	read	null
2026-02-24	Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence	ChengYou Li et.al.	2602.20934	translate	read	null
2026-02-24	HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG	Yuqi Huang et.al.	2602.20926	translate	read	null
2026-02-24	Predicting Sentence Acceptability Judgments in Multimodal Contexts	Hyewon Jang et.al.	2602.20918	translate	read	null
2026-02-24	LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding	Jihao Qiu et.al.	2602.20913	translate	read	null
2026-02-24	TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering	Hanshen Zhu et.al.	2602.20903	translate	read	null
2026-02-24	SpatiaLQA: A Benchmark for Evaluating Spatial Logical Reasoning in Vision-Language Models	Yuechen Xie et.al.	2602.20901	translate	read	null
2026-02-24	Exa-PSD: a new Persian sentiment analysis dataset on Twitter	Seyed Himan Ghaderi et.al.	2602.20892	translate	read	null
2026-02-24	Diagnosing Causal Reasoning in Vision-Language Models via Structured Relevance Graphs	Dhita Putri Pratama et.al.	2602.20878	translate	read	null
2026-02-24	MUSE: Harnessing Precise and Diverse Semantics for Few-Shot Whole Slide Image Classification	Jiahao Xu et.al.	2602.20873	translate	read	null
2026-02-24	Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset	Jia-Rui Lin et.al.	2602.20812	translate	read	null
2026-02-24	Unseen-Codebases-Domain Data Synthesis and Training Based on Code Graphs	Guangsheng Ou et.al.	2602.20799	translate	read	null
2026-02-24	SPP-SCL: Semi-Push-Pull Supervised Contrastive Learning for Image-Text Sentiment Analysis and Beyond	Jiesheng Wu et.al.	2602.20767	translate	read	null
2026-02-24	Overton Pluralistic Reinforcement Learning for Large Language Models	Yu Fu et.al.	2602.20759	translate	read	null
2026-02-24	Balancing Multiple Objectives in Urban Traffic Control with Reinforcement Learning from AI Feedback	Chenyang Zhao et.al.	2602.20728	translate	read	null
2026-02-24	ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition	Xindian Ma et.al.	2602.20727	translate	read	null
2026-02-24	Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning	Xu Wan et.al.	2602.20722	translate	read	null
2026-02-24	AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs	Che Wang et.al.	2602.20720	translate	read	null
2026-02-24	PackMonitor: Enabling Zero Package Hallucinations Through Decoding-Time Monitoring	Xiting Liu et.al.	2602.20717	translate	read	null
2026-02-24	ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction	Che Wang et.al.	2602.20708	translate	read	null
2026-02-24	PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding	Baolong Bi et.al.	2602.20696	translate	read	null
2026-02-24	Grid-Mind: An LLM-Orchestrated Multi-Fidelity Agent for Automated Connection Impact Assessment	Mohamed Shamseldein et.al.	2602.20683	translate	read	null
2026-02-24	CAMEL: Confidence-Gated Reflection for Reward Modeling	Zirui Zhu et.al.	2602.20670	translate	read	null
2026-02-24	ICSSPulse: A Modular LLM-Assisted Platform for Industrial Control System Penetration Testing	Michail Takaronis et.al.	2602.20663	translate	read	null
2026-02-24	TOM: A Ternary Read-only Memory Accelerator for LLM-powered Edge Intelligence	Hongyi Guan et.al.	2602.20662	translate	read	null
2026-02-24	CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models	Anqi Li et.al.	2602.20648	translate	read	null
2026-02-24	An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation	Fida Khandaker Safa et.al.	2602.20644	translate	read	null
2026-02-24	Grounding LLMs in Scientific Discovery via Embodied Actions	Bo Zhang et.al.	2602.20639	translate	read	null
2026-02-24	QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs	Santiago Gonzalez et.al.	2602.20629	translate	read	null
2026-02-24	Physics-based phenomenological characterization of cross-modal bias in multimodal models	Hyeongmo Kim et.al.	2602.20624	translate	read	null
2026-02-24	SpecMind: Cognitively Inspired, Interactive Multi-Turn Framework for Postcondition Inference	Cuong Chi Le et.al.	2602.20610	translate	read	null
2026-02-24	Efficient and Explainable End-to-End Autonomous Driving via Masked Vision-Language-Action Diffusion	Jiaru Zhang et.al.	2602.20577	translate	read	null
2026-02-24	From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production	Yucheng Shi et.al.	2602.20558	translate	read	null
2026-02-24	Standard Transformers Achieve the Minimax Rate in Nonparametric Regression with $C^{s,λ}$ Targets	Yanming Lai et.al.	2602.20555	translate	read	null
2026-02-24	What Drives Students’ Use of AI Chatbots? Technology Acceptance in Conversational AI	Griffin Pitts et.al.	2602.20547	translate	read	null
2026-02-24	Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training	Zhengyao Gu et.al.	2602.20532	translate	read	null
2026-02-24	FAST-Prefill: FPGA Accelerated Sparse Attention for Long Context LLM Prefill	Rakshith Jayanth et.al.	2602.20515	translate	read	null
2026-02-24	From Performance to Purpose: A Sociotechnical Taxonomy for Evaluating Large Language Model Utility	Gavin Levinson et.al.	2602.20513	translate	read	null
2026-02-24	AWCP: A Workspace Delegation Protocol for Deep-Engagement Collaboration across Remote Agents	Xiaohang Nie et.al.	2602.20493	translate	read	null
2026-02-24	Wireless Federated Multi-Task LLM Fine-Tuning via Sparse-and-Orthogonal LoRA	Nuocheng Yang et.al.	2602.20492	translate	read	null
2026-02-24	Application of Large Language Models for Container Throughput Forecasting: Incorporating Contextual Information in Port Logistics	Minseop Kim et.al.	2602.20489	translate	read	null
2026-02-24	Hybrid LLM-Embedded Dialogue Agents for Learner Reflection: Designing Responsive and Theory-Driven Interactions	Paras Sharma et.al.	2602.20486	translate	read	null
2026-02-24	Oracle-Robust Online Alignment for Large Language Models	Zimeng Li et.al.	2602.20457	translate	read	null
2026-02-23	Emergent Manifold Separability during Reasoning in Large Language Models	Alexandre Polo et.al.	2602.20338	translate	read	null
2026-02-23	DMCD: Semantic-Statistical Framework for Causal Discovery	Samarth KaPatel et.al.	2602.20333	translate	read	null
2026-02-23	No One Size Fits All: QueryBandits for Hallucination Mitigation	Nicole Cho et.al.	2602.20332	translate	read	null
2026-02-23	An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models	Cathy Shyr et.al.	2602.20324	translate	read	null
2026-02-23	What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance	William Watson et.al.	2602.20300	translate	read	null
2026-02-23	InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation	Yu Li et.al.	2602.20294	translate	read	null
2026-02-23	PhantomRun: Auto Repair of Compilation Errors in Embedded Open Source Software	Han Fu et.al.	2602.20284	translate	read	null
2026-02-23	The Truthfulness Spectrum Hypothesis	Zhuofan Josh Ying et.al.	2602.20273	translate	read	null
2026-02-23	HieraMAS: Optimizing Intra-Node LLM Mixtures and Inter-Node Topology for Multi-Agent Systems	Tianjun Yao et.al.	2602.20229	translate	read	null
2026-02-23	Exploring Anti-Aging Literature via ConvexTopics and Large Language Models	Lana E. Yeganova et.al.	2602.20224	translate	read	null
2026-02-23	An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction	Guanting Shen et.al.	2602.20219	translate	read	null
2026-02-23	CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions	Jingwei Shi et.al.	2602.20213	translate	read	null
2026-02-22	Golden Layers and Where to Find Them: Improved Knowledge Editing for Large Language Models Via Layer Gradient Analysis	Shrestha Datta et.al.	2602.20207	translate	read	null
2026-02-22	Mitigating “Epistemic Debt” in Generative AI-Scaffolded Novice Programming using Metacognitive Scripts	Sreecharan Sankaranarayanan et.al.	2602.20206	translate	read	null
2026-02-22	OTPrune: Distribution-Aligned Visual Token Pruning via Optimal Transport	Xiwen Chen et.al.	2602.20205	translate	read	null
2026-02-22	Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study	Jeel Piyushkumar Khatiwala et.al.	2602.20202	translate	read	null
2026-02-22	Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning	Zhuoxu Huang et.al.	2602.20197	translate	read	null
2026-02-23	Do Large Language Models Understand Data Visualization Rules?	Martin Sinnona et.al.	2602.20137	translate	read	null
2026-02-23	KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration	Mohammad Amanlou et.al.	2602.20135	translate	read	null
2026-02-23	AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization	Mert Cemri et.al.	2602.20133	translate	read	null
2026-02-23	To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering	Zaifu Zhan et.al.	2602.20130	translate	read	null
2026-02-23	NanoKnow: How to Know What Your Language Model Knows	Lingwei Gu et.al.	2602.20122	translate	read	null
2026-02-23	BarrierSteer: LLM Safety via Learning Barrier Steering	Thanh Q. Tran et.al.	2602.20102	translate	read	null
2026-02-23	CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching	Yuzhe Wang et.al.	2602.20094	translate	read	null
2026-02-23	How Retrieved Context Shapes Internal Representations in RAG	Samuel Yeh et.al.	2602.20091	translate	read	null
2026-02-23	Do Large Language Models Understand Data Visualization Principles?	Martin Sinnona et.al.	2602.20084	translate	read	null
2026-02-23	Multilingual Large Language Models do not comprehend all natural languages to equal degrees	Natalia Moskvina et.al.	2602.20065	translate	read	null
2026-02-23	The LLMbda Calculus: AI Agents, Conversations, and Information Flow	Zac Garby et.al.	2602.20064	translate	read	null
2026-02-23	Can You Tell It’s AI? Human Perception of Synthetic Voices in Vishing Scenarios	Zoha Hayat Bhatti et.al.	2602.20061	translate	read	null
2026-02-23	Entropy in Large Language Models	Marco Scharringhausen et.al.	2602.20052	translate	read	null
2026-02-23	Closing the gap in multimodal medical representation alignment	Eleonora Grassucci et.al.	2602.20046	translate	read	null
2026-02-23	Let There Be Claws: An Early Social Network Analysis of AI Agents on Moltbook	H. C. W. Price et.al.	2602.20044	translate	read	null
2026-02-23	Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously	Han Bao et.al.	2602.20042	translate	read	null
2026-02-23	AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization	Fahmida Liza Piya et.al.	2602.20040	translate	read	null
2026-02-23	gencat: Generative computerized adaptive testing	Wanyong Feng et.al.	2602.20020	translate	read	null
2026-02-23	ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting	Yuxing Tian et.al.	2602.19969	translate	read	null
2026-02-23	Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval	Yibo Yan et.al.	2602.19961	translate	read	null
2026-02-23	Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming	Ian Steenstra et.al.	2602.19948	translate	read	null
2026-02-23	A Replicate-and-Quantize Strategy for Plug-and-Play Load Balancing of Sparse Mixture-of-Experts LLMs	Zijie Liu et.al.	2602.19938	translate	read	null
2026-02-23	BeamVLM for Low-altitude Economy: Generative Beam Prediction via Vision-language Models	Chenran Kou et.al.	2602.19929	translate	read	null
2026-02-23	Rethinking LoRA for Privacy-Preserving Federated Learning in Large Models	Jin Liu et.al.	2602.19926	translate	read	null
2026-02-23	DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning	Zhongwei Wan et.al.	2602.19895	translate	read	null
2026-02-23	SHIELD: Semantic Heterogeneity Integrated Embedding for Latent Discovery in Clinical Trial Safety Signals	Francois Vandenhende et.al.	2602.19855	translate	read	null
2026-02-23	LLM-enabled Applications Require System-Level Threat Monitoring	Yedi Zhang et.al.	2602.19844	translate	read	null
2026-02-23	SAMAS: A Spectrum-Guided Multi-Agent System for Achieving Style Fidelity in Literary Translation	Jingzhuo Wu et.al.	2602.19840	translate	read	null
2026-02-23	An Explainable Memory Forensics Approach for Malware Analysis	Silvia Lucia Sanna et.al.	2602.19831	translate	read	null
2026-02-23	TextShield-R1: Reinforced Reasoning for Tampered Text Detection	Chenfan Qu et.al.	2602.19828	translate	read	null
2026-02-23	Universal Pose Pretraining for Generalizable Vision-Language-Action Policies	Haitao Lin et.al.	2602.19710	translate	read	null
2026-02-23	“The explanation makes sense”: An Empirical Study on LLM Performance in News Classification and its Influence on Judgment in Human-AI Collaborative Annotation	Qile Wang et.al.	2602.19690	translate	read	null
2026-02-23	KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge	Alex Robertson et.al.	2602.19643	translate	read	null
2026-02-23	Evaluating the Impact of Data Anonymization on Image Retrieval	Marvin Chen et.al.	2602.19641	translate	read	null
2026-02-23	Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering	Chih-Hong Cheng et.al.	2602.19614	translate	read	null
2026-02-23	Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning	Borisiuk Anna et.al.	2602.19612	translate	read	null
2026-02-23	CLCR: Cross-Level Semantic Collaborative Representation for Multimodal Learning	Chunlei Meng et.al.	2602.19605	translate	read	null
2026-02-23	Tri-Subspaces Disentanglement for Multimodal Sentiment Analysis	Chunlei Meng et.al.	2602.19585	translate	read	null
2026-02-23	CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment	Hanwen Liu et.al.	2602.19574	translate	read	null
2026-02-23	Identifying, Explaining, and Correcting Ableist Language with AI	Kynnedy Simone Smith et.al.	2602.19560	translate	read	null
2026-02-23	Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains	Xiaochong Jiang et.al.	2602.19555	translate	read	null
2026-02-23	Vinedresser3D: Agentic Text-guided 3D Editing	Yankuan Chi et.al.	2602.19542	translate	read	null
2026-02-23	Large Language Model-Assisted UAV Operations and Communications: A Multifaceted Survey and Tutorial	Yousef Emami et.al.	2602.19534	translate	read	null
2026-02-23	Ada-RS: Adaptive Rejection Sampling for Selective Thinking	Yirou Ge et.al.	2602.19519	translate	read	null
2026-02-23	Anticipate, Adapt, Act: A Hybrid Framework for Task Planning	Nabanita Dash et.al.	2602.19518	translate	read	null
2026-02-23	Classroom Final Exam: An Instructor-Tested Reasoning Benchmark	Chongyang Gao et.al.	2602.19517	translate	read	null
2026-02-23	Pixel2Phys: Distilling Governing Laws from Visual Dynamics	Ruikun Li et.al.	2602.19516	translate	read	null
2026-02-23	Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference	Arindam Khaled et.al.	2602.19509	translate	read	null
2026-02-23	Conversational AI for Automated Patient Questionnaire Completion: Development Insights and Design Principles	David Fraile Navarro et.al.	2602.19507	translate	read	null
2026-02-23	Test-Time Computing for Referring Multimodal Large Language Models	Mingrui Wu et.al.	2602.19505	translate	read	null
2026-02-23	MICON-Bench: Benchmarking and Enhancing Multi-Image Context Image Generation in Unified Multimodal Models	Mingrui Wu et.al.	2602.19497	translate	read	null
2026-02-23	Botson: An Accessible and Low-Cost Platform for Social Robotics Research	Samuel Bellaire et.al.	2602.19491	translate	read	null
2026-02-23	Can Large Language Models Replace Human Coders? Introducing ContentBench	Michael Haman et.al.	2602.19467	translate	read	null
2026-02-23	SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning	Zelin He et.al.	2602.19455	translate	read	null
2026-02-23	Red-Teaming Claude Opus and ChatGPT-based Security Advisors for Trusted Execution Environments	Kunal Mukherjee et.al.	2602.19450	translate	read	null
2026-02-23	Hepato-LLaVA: An Expert MLLM with Sparse Topo-Pack Attention for Hepatocellular Pathology Analysis on Whole Slide Images	Yuxuan Yang et.al.	2602.19424	translate	read	null
2026-02-23	AuditoryHuM: Auditory Scene Label Generation and Clustering using Human-MLLM Collaboration	Henry Zhong et.al.	2602.19409	translate	read	null
2026-02-23	Multi-CoLoR: Context-Aware Localization and Reasoning across Multi-Language Codebases	Indira Vats et.al.	2602.19407	translate	read	null
2026-02-23	Personalized Prediction of Perceived Message Effectiveness Using Large Language Model Based Digital Twins	Jasmin Han et.al.	2602.19403	translate	read	null
2026-02-23	Hiding in Plain Text: Detecting Concealed Jailbreaks via Activation Disentanglement	Amirhossein Farzam et.al.	2602.19396	translate	read	null
2026-02-22	LLMs Can Learn to Reason Via Off-Policy RL	Daniel Ritter et.al.	2602.19362	translate	read	null
2026-02-22	Compliance Management for Federated Data Processing	Natallia Kokash et.al.	2602.19360	translate	read	null
2026-02-22	Smooth Gate Functions for Soft Advantage Policy Optimization	Egor Denisov et.al.	2602.19345	translate	read	null
2026-02-22	Soft Sequence Policy Optimization	Svetlana Glazyrina et.al.	2602.19327	translate	read	null
2026-02-22	Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations	Dongming Jiang et.al.	2602.19320	translate	read	null
2026-02-22	A Power Market Model with Hyperscalers and Modular Datacenters	Yihsu Chen et.al.	2602.19310	translate	read	null
2026-02-22	Scaling Inference-Time Computation via Opponent Simulation: Enabling Online Strategic Adaptation in Repeated Negotiation	Xiangyu Liu et.al.	2602.19309	translate	read	null
2026-02-22	The Path to Conversational AI Tutors: Integrating Tutoring Best Practices and Targeted Technologies to Produce Scalable AI Agents	Kirk Vanacore et.al.	2602.19303	translate	read	null
2026-02-22	Automated Generation of Microfluidic Netlists using Large Language Models	Jasper Davidson et.al.	2602.19297	translate	read	null
2026-02-22	Towards Automated Page Object Generation for Web Testing using Large Language Models	Betül Karagöz et.al.	2602.19294	translate	read	null
2026-02-22	Limited Reasoning Space: The cage of long-horizon reasoning in LLMs	Zhenyu Li et.al.	2602.19281	translate	read	null
2026-02-22	ComUICoder: Component-based Reusable UI Code Generation for Complex Websites via Semantic Segmentation and Element-wise Feedback	Jingyu Xiao et.al.	2602.19276	translate	read	null
2026-02-22	KUDA: Knowledge Unlearning by Deviating Representation for Large Language Models	Ce Fang et.al.	2602.19275	translate	read	null
2026-02-22	No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection	Zunkai Dai et.al.	2602.19248	translate	read	null
2026-02-22	Topology of Reasoning: Retrieved Cell Complex-Augmented Generation for Textual Graph Question Answering	Sen Zhao et.al.	2602.19240	translate	read	null
2026-02-22	Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations	Ahmed Karim et.al.	2602.19239	translate	read	null
2026-02-22	Knowledge-aware Visual Question Generation for Remote Sensing Images	Siran Li et.al.	2602.19224	translate	read	null
2026-02-22	Gecko: A Simulation Environment with Stateful Feedback for Refining Agent Tool Calls	Zeyu Zhang et.al.	2602.19218	translate	read	null
2026-02-22	Questions beyond Pixels: Integrating Commonsense Knowledge in Visual Question Generation for Remote Sensing	Siran Li et.al.	2602.19217	translate	read	null
2026-02-22	Statistical Measures for Explainable Aspect-Based Sentiment Analysis: A Case Study on Environmental Discourse in Reddit	Luisa Stracqualursi et.al.	2602.19216	translate	read	null
2026-02-22	How to Allocate, How to Learn? Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization	Yangyi Fang et.al.	2602.19208	translate	read	null
2026-02-22	PositionOCR: Augmenting Positional Awareness in Multi-Modal Models via Hybrid Specialist Integration	Chen Duan et.al.	2602.19188	translate	read	null
2026-02-22	Next Reply Prediction X Dataset: Linguistic Discrepancies in Naively Generated Content	Simon Münker et.al.	2602.19177	translate	read	null
2026-02-22	TurkicNLP: An NLP Toolkit for Turkic Languages	Sherzod Hakimov et.al.	2602.19174	translate	read	null
2026-02-22	Reasoning Capabilities of Large Language Models. Lessons Learned from General Game Playing	Maciej Świechowski et.al.	2602.19160	translate	read	null
2026-02-22	DoAtlas-1: A Causal Compilation Paradigm for Clinical AI	Yulong Li et.al.	2602.19158	translate	read	null
2026-02-22	Facet-Level Persona Control by Trait-Activated Routing with Contrastive SAE for Role-Playing LLMs	Wenqiu Tang et.al.	2602.19157	translate	read	null
2026-02-22	A Dataset for Named Entity Recognition and Relation Extraction from Art-historical Image Descriptions	Stefanie Schneider et.al.	2602.19133	translate	read	null
2026-02-22	K-Search: LLM Kernel Generation via Co-Evolving Intrinsic World Model	Shiyi Cao et.al.	2602.19128	translate	read	null
2026-02-22	AgenticRAGTracer: A Hop-Aware Benchmark for Diagnosing Multi-Step Retrieval Reasoning in Agentic RAG	Qijie You et.al.	2602.19127	translate	read	null
2026-02-22	Dark and Bright Side of Participatory Red-Teaming with Targets of Stereotyping for Eliciting Harmful Behaviors from Large Language Models	Sieun Kim et.al.	2602.19124	translate	read	null
2026-02-22	How Do LLMs Encode Scientific Quality? An Empirical Study Using Monosemantic Features from Sparse Autoencoders	Michael McCoubrey et.al.	2602.19115	translate	read	null
2026-02-22	Universal 3D Shape Matching via Coarse-to-Fine Language Guidance	Qinfeng Xiao et.al.	2602.19112	translate	read	null
2026-02-22	Astra: Activation-Space Tail-Eigenvector Low-Rank Adaptation of Large Language Models	Kainan Liu et.al.	2602.19111	translate	read	null
2026-02-22	Value Entanglement: Conflation Between Different Kinds of Good In (Some) Large Language Models	Seong Hah Cho et.al.	2602.19101	translate	read	null
2026-02-22	CREM: Compression-Driven Representation Enhancement for Multimodal Retrieval and Comprehension	Lihao Liu et.al.	2602.19091	translate	read	null
2026-02-22	TriTopic: Tri-Modal Graph-Based Topic Modeling with Iterative Refinement and Archetypes	Roman Egger et.al.	2602.19079	translate	read	null
2026-02-22	Evaluation and Benchmarking Suite for Financial Large Language Models and Agents	Shengyuan Lin et.al.	2602.19073	translate	read	null
2026-02-22	IDLM: Inverse-distilled Diffusion Language Models	David Li et.al.	2602.19066	translate	read	null
2026-02-22	Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents	Chanjin Park et.al.	2602.19065	translate	read	null
2026-02-22	Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer	Chenhang Cui et.al.	2602.19058	translate	read	null
2026-02-22	IAPO: Information-Aware Policy Optimization for Token-Efficient Reasoning	Yinhan He et.al.	2602.19049	translate	read	null
2026-02-22	Uncovering Context Reliance in Unstructured Knowledge Editing	Zisheng Zhou et.al.	2602.19043	translate	read	null
2026-02-22	Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning	Jiahao Zhang et.al.	2602.19041	translate	read	null
2026-02-05	Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning	Xuejun Zhang et.al.	2602.06041	translate	read	null
2026-02-05	SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs	Jintao Tong et.al.	2602.06040	translate	read	link
2026-02-05	DyTopo: Dynamic Topology Routing for Multi-Agent Reasoning via Semantic Matching	Yuxing Lu et.al.	2602.06039	translate	read	null
2026-02-05	Thinking with Geometry: Active Geometry Integration for Spatial Reasoning	Haoyuan Li et.al.	2602.06037	translate	read	link
2026-02-05	DFlash: Block Diffusion for Flash Speculative Decoding	Jian Chen et.al.	2602.06036	translate	read	link
2026-02-05	V-Retrver: Evidence-Driven Agentic Reasoning for Universal Multimodal Retrieval	Dongyang Chen et.al.	2602.06034	translate	read	link
2026-02-05	PhysicsAgentABM: Physics-Guided Generative Agent-Based Modeling	Kavana Venkatesh et.al.	2602.06030	translate	read	null
2026-02-05	Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory	Haozhen Zhang et.al.	2602.06025	translate	read	null
2026-02-05	Correctness-Optimized Residual Activation Lens (CORAL): Transferrable and Calibration-Aware Inference-Time Steering	Miranda Muqing Miao et.al.	2602.06022	translate	read	null
2026-02-05	A Systematic Evaluation of Large Language Models for PTSD Severity Estimation: The Role of Contextual Knowledge and Modeling Strategies	Panagiotis Kaliosis et.al.	2602.06015	translate	read	null
2026-02-05	AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions	Xianyang Liu et.al.	2602.06008	translate	read	null
2026-02-05	VisRefiner: Learning from Visual Differences for Screenshot-to-Code Generation	Jie Deng et.al.	2602.05998	translate	read	null
2026-02-05	DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs	Lizhuo Luo et.al.	2602.05992	translate	read	link
2026-02-05	Layer-wise LoRA fine-tuning: a similarity metric approach	Keith Ando Ogawa et.al.	2602.05988	translate	read	null
2026-02-05	From Human-Human Collaboration to Human-Agent Collaboration: A Vision, Design Philosophy, and an Empirical Framework for Achieving Successful Partnerships Between Humans and LLM Agents	Bingsheng Yao et.al.	2602.05987	translate	read	null
2026-02-05	Inverse Depth Scaling From Most Layers Being Similar	Yizhou Liu et.al.	2602.05970	translate	read	null
2026-02-05	Orthogonal Model Merging	Sihan Yang et.al.	2602.05943	translate	read	null
2026-02-05	Polyglots or Multitudes? Multilingual LLM Answers to Value-laden Multiple-Choice Questions	Léo Labat et.al.	2602.05932	translate	read	null
2026-02-05	Compound Deception in Elite Peer Review: A Failure Mode Taxonomy of 100 Fabricated Citations at NeurIPS 2025	Samar Ansari et.al.	2602.05930	translate	read	null
2026-02-05	KV-CoRE: Benchmarking Data-Dependent Low-Rank Compressibility of KV-Caches in LLMs	Jian Chen et.al.	2602.05929	translate	read	null
2026-02-05	Transformers Are Born Biased: Structural Inductive Biases at Random Initialization and Their Practical Consequences	Siquan Li et.al.	2602.05927	translate	read	null
2026-02-05	CLIP-Map: Structured Matrix Mapping for Parameter-Efficient CLIP Compression	Kangjie Zhang et.al.	2602.05909	translate	read	null
2026-02-05	Codified Finite-state Machines for Role-playing	Letian Peng et.al.	2602.05905	translate	read	null
2026-02-05	Regularized Calibration with Successive Rounding for Post-Training Quantization	Seohyeon Cha et.al.	2602.05902	translate	read	null
2026-02-05	Stop Rewarding Hallucinated Steps: Faithfulness-Aware Step-Level Reinforcement Learning for Small Reasoning Models	Shuo Nie et.al.	2602.05897	translate	read	null
2026-02-05	When Elo Lies: Hidden Biases in Codeforces-Based Evaluation of Large Language Models	Shenyu Zheng et.al.	2602.05891	translate	read	null
2026-02-05	A Guide to Large Language Models in Modeling and Simulation: From Core Techniques to Critical Challenges	Philippe J. Giabbanelli et.al.	2602.05883	translate	read	null
2026-02-05	EuroLLM-22B: Technical Report	Miguel Moura Ramos et.al.	2602.05879	translate	read	null
2026-02-05	Agent2Agent Threats in Safety-Critical LLM Assistants: A Human-Centric Taxonomy	Lukas Stappen et.al.	2602.05877	translate	read	null
2026-02-05	xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection	Adrián Girón et.al.	2602.05874	translate	read	null
2026-02-05	DLM-Scope: Mechanistic Interpretability of Diffusion Language Models via Sparse Autoencoders	Xu Wang et.al.	2602.05859	translate	read	null
2026-02-05	BABE: Biology Arena BEnchmark	Junting Zhou et.al.	2602.05857	translate	read	null
2026-02-05	“It Talks Like a Patient, But Feels Different”: Co-Designing AI Standardized Patients with Medical Learners	Zhiqi Gao et.al.	2602.05856	translate	read	null
2026-02-05	RRAttention: Dynamic Block Sparse Attention via Per-Head Round-Robin Shifts for Long-Context Inference	Siran Liu et.al.	2602.05853	translate	read	null
2026-02-05	OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions	Fangzhi Xu et.al.	2602.05843	translate	read	null
2026-02-05	Reinforcement World Model Learning for LLM-based Agents	Xiao Yu et.al.	2602.05842	translate	read	null
2026-02-05	Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation	Hai Zhang et.al.	2602.05827	translate	read	null
2026-02-05	Whispers of the Butterfly: A Research-through-Design Exploration of In-Situ Conversational AI Guidance in Large-Scale Outdoor MR Exhibitions	Dongyijie Primo Pan et.al.	2602.05826	translate	read	null
2026-02-05	ToMigo: Interpretable Design Concept Graphs for Aligning Generative AI with Creative Intent	Lena Hegemann et.al.	2602.05825	translate	read	null
2026-02-05	Authorship Drift: How Self-Efficacy and Trust Evolve During LLM-Assisted Writing	Yeon Su Park et.al.	2602.05819	translate	read	null
2026-02-05	TKG-Thinker: Towards Dynamic Reasoning over Temporal Knowledge Graphs via Agentic Reinforcement Learning	Zihao Jiang et.al.	2602.05818	translate	read	null
2026-02-05	Where Does Warm-Up Come From? Adaptive Scheduling for Norm-Constrained Optimizers	Artem Riabinin et.al.	2602.05813	translate	read	null
2026-02-05	NEX: Neuron Explore-Exploit Scoring for Label-Free Chain-of-Thought Selection and Model Ranking	Kang Chen et.al.	2602.05805	translate	read	null
2026-02-05	Task-Oriented Robot-Human Handovers on Legged Manipulators	Andreea Tulbure et.al.	2602.05760	translate	read	null
2026-02-05	Towards Green AI: Decoding the Energy of LLM Inference in Software Development	Lola Solovyeva et.al.	2602.05712	translate	read	null
2026-02-05	Determining Energy Efficiency Sweet Spots in Production LLM Inference	Hiari Pizzini Cavagna et.al.	2602.05695	translate	read	null
2026-02-05	Consensus-Aligned Neuron Efficient Fine-Tuning Large Language Models for Multi-Domain Machine Translation	Shuting Jiang et.al.	2602.05694	translate	read	null
2026-02-05	MedErrBench: A Fine-Grained Multilingual Benchmark for Medical Error Detection and Correction with Clinical Expert Annotations	Congbo Ma et.al.	2602.05692	translate	read	null
2026-02-05	Exploring AI-Augmented Sensemaking of Patient-Generated Health Data: A Mixed-Method Study with Healthcare Professionals in Cardiac Risk Reduction	Pavithren V S Pakianathan et.al.	2602.05687	translate	read	null
2026-02-05	Graph-based Agent Memory: Taxonomy, Techniques, and Applications	Chang Yang et.al.	2602.05665	translate	read	null
2026-02-05	Alignment Verifiability in Large Language Models: Normative Indistinguishability under Behavioral Evaluation	Igor Santos-Grueiro et.al.	2602.05656	translate	read	null
2026-02-05	Generative Ontology: When Structured Knowledge Learns to Create	Benny Cheung et.al.	2602.05636	translate	read	null
2026-02-05	CASTLE: A Comprehensive Benchmark for Evaluating Student-Tailored Personalized Safety in Large Language Models	Rui Jia et.al.	2602.05633	translate	read	null
2026-02-05	Rewards as Labels: Revisiting RLVR from a Classification Perspective	Zepeng Zhai et.al.	2602.05630	translate	read	null
2026-02-05	AI chatbots versus human healthcare professionals: a systematic review and meta-analysis of empathy in patient care	Alastair Howcroft et.al.	2602.05628	translate	read	null
2026-02-05	Emulating Aggregate Human Choice Behavior and Biases with GPT Conversational Agents	Stephen Pilli et.al.	2602.05597	translate	read	null
2026-02-05	Multi-Task GRPO: Reliable LLM Reasoning Across Tasks	Shyam Sundhar Ramesh et.al.	2602.05547	translate	read	null
2026-02-05	Reasoning-guided Collaborative Filtering with Language Models for Explainable Recommendation	Fahad Anwaar et.al.	2602.05544	translate	read	null
2026-02-05	Split Personality Training: Revealing Latent Knowledge Through Alternate Personalities	Florian Dietz et.al.	2602.05532	translate	read	null
2026-02-05	AI Agent Systems for Supply Chains: Structured Decision Prompts and Memory Retrieval	Konosuke Yoshizato et.al.	2602.05524	translate	read	null
2026-02-05	Capture the Flags: Family-Based Evaluation of Agentic LLMs via Semantics-Preserving Transformations	Shahin Honarvar et.al.	2602.05523	translate	read	null
2026-02-05	A Human-in-the-Loop, LLM-Centered Architecture for Knowledge-Graph Question Answering	Larissa Pusch et.al.	2602.05512	translate	read	null
2026-02-05	Relying on LLMs: Student Practices and Instructor Norms are Changing in Computer Science Education	Xinrui Lin et.al.	2602.05506	translate	read	null
2026-02-05	SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration	Hanyu Wei et.al.	2602.05499	translate	read	null
2026-02-05	Transport and Merge: Cross-Architecture Merging for Large Language Models	Chenhang Cui et.al.	2602.05495	translate	read	null
2026-02-05	A Unified Framework for Rethinking Policy Divergence Measures in GRPO	Qingyuan Wu et.al.	2602.05494	translate	read	null
2026-02-05	LinguistAgent: A Reflective Multi-Model Platform for Automated Linguistic Annotation	Bingru Li et.al.	2602.05493	translate	read	null
2026-02-05	Fine-Tuning Large Language Models for Automatic Detection of Sexually Explicit Content in Spanish-Language Song Lyrics	Dolores Zamacola Sánchez de Lamadrid et.al.	2602.05485	translate	read	null
2026-02-05	Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection	Takashi Koide et.al.	2602.05484	translate	read	null
2026-02-05	LMMRec: LLM-driven Motivation-aware Multimodal Recommendation	Yicheng Di et.al.	2602.05474	translate	read	null
2026-02-05	ALIVE: Awakening LLM Reasoning via Adversarial Learning and Instructive Verbal Evaluation	Yiwen Duan et.al.	2602.05472	translate	read	null
2026-02-05	Can We Classify Flaky Tests Using Only Test Code? An LLM-Based Empirical Study	Alexander Berndt et.al.	2602.05465	translate	read	null
2026-02-05	DistillER: Knowledge Distillation in Entity Resolution with Large Language Models	Alexandros Zeakis et.al.	2602.05452	translate	read	null
2026-02-05	BLITZRANK: Principled Zero-shot Ranking Agents with Tournament Graphs	Sheshansh Agrawal et.al.	2602.05448	translate	read	null
2026-02-05	Structured Context Engineering for File-Native Agentic Systems: Evaluating Schema Accuracy, Format Effectiveness, and Multi-File Navigation at Scale	Damon McMillan et.al.	2602.05447	translate	read	null
2026-02-05	DiLLS: Interactive Diagnosis of LLM-based Multi-agent Systems via Layered Summary of Agent Behaviors	Rui Sheng et.al.	2602.05446	translate	read	null
2026-02-05	Causal Front-Door Adjustment for Robust Jailbreak Attacks on LLMs	Yao Zhou et.al.	2602.05444	translate	read	null
2026-02-05	SciDef: Automating Definition Extraction from Academic Literature with Large Language Models	Filip Kučera et.al.	2602.05413	translate	read	null
2026-02-05	BadTemplate: A Training-Free Backdoor Attack via Chat Template Against Large Language Models	Zihan Wang et.al.	2602.05401	translate	read	null
2026-02-05	OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration	Shaobo Wang et.al.	2602.05400	translate	read	null
2026-02-05	Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better	Ji Zhao et.al.	2602.05393	translate	read	null
2026-02-05	Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening	Zhenxiong Yu et.al.	2602.05386	translate	read	null
2026-02-05	IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models	Tao Liu et.al.	2602.05385	translate	read	null
2026-02-05	Clinical Validation of Medical-based Large Language Model Chatbots on Ophthalmic Patient Queries with LLM-based Evaluation	Ting Fang Tan et.al.	2602.05381	translate	read	null
2026-02-05	Cross-Lingual Empirical Evaluation of Large Language Models for Arabic Medical Tasks	Chaimae Abouzahir et.al.	2602.05374	translate	read	null
2026-02-05	Speech-XL: Towards Long-Form Speech Understanding in Large Speech Language Models	Haoqin Sun et.al.	2602.05373	translate	read	null
2026-02-05	PACE: Defying the Scaling Hypothesis of Exploration in Iterative Alignment for Mathematical Reasoning	Jun Rao et.al.	2602.05370	translate	read	null
2026-02-05	RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs	Youngcheon You et.al.	2602.05367	translate	read	null
2026-02-05	Multi-Field Tool Retrieval	Yichen Tang et.al.	2602.05366	translate	read	null
2026-02-05	Multimodal Latent Reasoning via Hierarchical Visual Cues Injection	Yiming Zhang et.al.	2602.05359	translate	read	null
2026-02-05	AgentXRay: White-Boxing Agentic Systems via Workflow Reconstruction	Ruijie Shi et.al.	2602.05353	translate	read	null
2026-02-05	SynAT: Enhancing Security Knowledge Bases via Automatic Synthesizing Attack Tree from Crowd Discussions	Ziyou Jiang et.al.	2602.05329	translate	read	null
2026-02-05	ProAct: Agentic Lookahead in Interactive Environments	Yangbin Yu et.al.	2602.05327	translate	read	null
2026-02-05	ORACL: Optimized Reasoning for Autoscaling via Chain of Thought with LLMs for Microservices	Haoyu Bai et.al.	2602.05292	translate	read	null
2026-02-05	Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science	Jingru Fan et.al.	2602.05289	translate	read	null
2026-02-05	Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities	Pengyi Li et.al.	2602.05281	translate	read	null
2026-02-05	Hallucination-Resistant Security Planning with a Large Language Model	Kim Hammar et.al.	2602.05279	translate	read	null
2026-02-05	Magic-MM-Embedding: Towards Visual-Token-Efficient Universal Multimodal Embedding with MLLMs	Qi Li et.al.	2602.05275	translate	read	null
2026-02-05	PatchGuru: Patch Oracle Inference from Natural Language Artifacts with Large Language Models	Thanh Le-Cong et.al.	2602.05270	translate	read	null
2026-02-05	Hybrid Gated Flow (HGF): Stabilizing 1.58-bit LLMs via Selective Low-Rank Correction	David Alejandro Trejo Pizzo et.al.	2602.05269	translate	read	null
2026-02-05	Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR	Fanfan Liu et.al.	2602.05261	translate	read	null
2026-02-05	CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs	Haoran Li et.al.	2602.05258	translate	read	null
2026-02-05	EGSS: Entropy-guided Stepwise Scaling for Reliable Software Engineering	Chenhui Mao et.al.	2602.05242	translate	read	null
2026-02-05	FedMosaic: Federated Retrieval-Augmented Generation via Parametric Adapters	Zhilin Liang et.al.	2602.05235	translate	read	null
2026-02-05	Surgery: Mitigating Harmful Fine-Tuning for Large Language Models via Attention Sink	Guozhi Liu et.al.	2602.05228	translate	read	null
2026-02-05	E.M.Ground: A Temporal Grounding Vid-LLM with Holistic Event Perception and Matching	Jiahao Nie et.al.	2602.05215	translate	read	null
2026-02-05	Aligning Large Language Model Behavior with Human Citation Preferences	Kenichiro Ando et.al.	2602.05205	translate	read	null
2026-02-05	Double-P: Hierarchical Top-P Sparse Attention for Long-Context LLMs	Wentao Ni et.al.	2602.05191	translate	read	null
2026-02-05	Are Open-Weight LLMs Ready for Social Media Moderation? A Comparative Study on Bluesky	Hsuan-Yu Chou et.al.	2602.05189	translate	read	null
2026-02-05	Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning	John Yan et.al.	2602.05183	translate	read	null
2026-02-05	EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization	Kevin Han et.al.	2602.05165	translate	read	null
2026-02-05	GreekMMLU: A Native-Sourced Multitask Benchmark for Evaluating Language Models in Greek	Yang Zhang et.al.	2602.05150	translate	read	null
2026-02-05	CoSA: Compressed Sensing-Based Adaptation of Large Language Models	Songtao Wei et.al.	2602.05148	translate	read	null
2026-02-04	HugRAG: Hierarchical Causal Knowledge Graph Design for RAG	Nengbo Wang et.al.	2602.05143	translate	read	null
2026-02-04	SemPipes – Optimizable Semantic Data Operators for Tabular Machine Learning Pipelines	Olga Ovcharenko et.al.	2602.05134	translate	read	null
2026-02-04	SocialVeil: Probing Social Intelligence of Language Agents under Communication Barriers	Keyang Xuan et.al.	2602.05115	translate	read	null
2026-02-04	Understanding LLM Evaluator Behavior: A Structured Multi-Evaluator Framework for Merchant Risk Assessment	Liang Wang et.al.	2602.05110	translate	read	null
2026-02-04	GAMMS: Graph based Adversarial Multiagent Modeling Simulator	Rohan Patil et.al.	2602.05105	translate	read	null
2026-02-04	VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health	Kate H. Bentley et.al.	2602.05088	translate	read	null
2026-02-04	Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents	Changdae Oh et.al.	2602.05073	translate	read	null
2026-02-04	Evaluating Large Language Models on Solved and Unsolved Problems in Graph Theory: Implications for Computing Education	Adithya Kulkarni et.al.	2602.05059	translate	read	null
2026-02-04	DeepRead: Document Structure-Aware Reasoning to Enhance Agentic Search	Zhanli Li et.al.	2602.05014	translate	read	null
2026-02-04	Private PoEtry: Private In-Context Learning via Product of Experts	Rob Romijnders et.al.	2602.05012	translate	read	null
2026-02-04	CoWork-X: Experience-Optimized Co-Evolution for Multi-Agent Collaboration System	Zexin Lin et.al.	2602.05004	translate	read	null
2026-02-04	Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning	Yu-Ang Lee et.al.	2602.04998	translate	read	null
2026-02-04	BioACE: An Automated Framework for Biomedical Answer and Citation Evaluations	Deepak Gupta et.al.	2602.04982	translate	read	null
2026-02-04	Learning Context Matters: Measuring and Diagnosing Personalization Gaps in LLM-Based Instructional Design	Johaun Hatchett et.al.	2602.04972	translate	read	null
2026-02-04	Large Language Models in Software Documentation and Modeling: A Literature Review and Findings	Lukas Radosky et.al.	2602.04938	translate	read	null
2026-02-04	Linear Model Merging Unlocks Simple and Scalable Multimodal Data Mixture Optimization	Davide Berasi et.al.	2602.04937	translate	read	null
2026-02-04	Depth-Wise Emergence of Prediction-Centric Geometry in Large Language Models	Shahar Haim et.al.	2602.04931	translate	read	null
2026-02-04	TurboBoA: Faster and Exact Attention-aware Quantization without Backpropagation	Junhan Kim et.al.	2602.04929	translate	read	link
2026-02-04	PriMod4AI: Lifecycle-Aware Privacy Threat Modeling for AI Systems using LLM	Gautam Savaliya et.al.	2602.04927	translate	read	null
2026-02-04	Knowing When to Answer: Adaptive Confidence Refinement for Reliable Audio-Visual Question Answering	Dinh Phu Tran et.al.	2602.04924	translate	read	null
2026-02-04	Gradually Compacting Large Language Models for Reasoning Like a Boiling Frog	Yiran Zhao et.al.	2602.04919	translate	read	null
2026-02-04	Simulated Adoption: Decoupling Magnitude and Direction in LLM In-Context Conflict Resolution	Long Zhang et.al.	2602.04918	translate	read	null
2026-02-04	AFD-INSTRUCTION: A Comprehensive Antibody Instruction Dataset with Functional Annotations for LLM-Based Understanding and Design	Ling Luo et.al.	2602.04916	translate	read	null
2026-02-04	From Literature to Lab: Closed-Loop Advancement of Perovskite Solar Cells via Domain Knowledge Guided LLM	Penglei Sun et.al.	2602.04914	translate	read	null
2026-02-04	A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model	Xiaolin Hu et.al.	2602.04913	translate	read	null
2026-02-04	Reducing the Costs of Proof Synthesis on Rust Systems by Scaling Up a Seed Training Set	Nongyu Di et.al.	2602.04910	translate	read	null
2026-02-04	Learning Where It Matters: Geometric Anchoring for Robust Preference Alignment	Youngjae Cho et.al.	2602.04909	translate	read	null
2026-02-03	Evaluating Kubernetes Performance for GenAI Inference: From Automatic Speech Recognition to LLM Summarization	Sai Sindhur Malleni et.al.	2602.04900	translate	read	null
2026-02-03	Steering Externalities: Benign Activation Steering Unintentionally Increases Jailbreak Risk for Large Language Models	Chen Xiong et.al.	2602.04896	translate	read	null
2026-02-04	Reinforced Attention Learning	Bangzheng Li et.al.	2602.04884	translate	read	null
2026-02-04	Rethinking the Trust Region in LLM Reinforcement Learning	Penghui Qi et.al.	2602.04879	translate	read	null
2026-02-04	Multi-Head LatentMoE and Head Parallel: Communication-Efficient and Deterministic MoE Parallelism	Chenwei Cui et.al.	2602.04870	translate	read	null
2026-02-04	Subliminal Effects in Your Data: A General Mechanism via Log-Linearity	Ishaq Aden-Ali et.al.	2602.04863	translate	read	null
2026-02-04	CoT is Not the Chain of Truth: An Empirical Internal Analysis of Reasoning LLMs for Fake News Generation	Zhao Tong et.al.	2602.04856	translate	read	null
2026-02-04	Decomposed Prompting Does Not Fix Knowledge Gaps, But Helps Models Say “I Don’t Know”	Dhruv Madhwal et.al.	2602.04853	translate	read	null
2026-02-04	Horizon-LM: A RAM-Centric Architecture for LLM Training	Zhengqing Yuan et.al.	2602.04816	translate	read	link
2026-02-04	Agentic AI in Healthcare & Medicine: A Seven-Dimensional Taxonomy for Empirical Evaluation of LLM-based Agents	Shubham Vatsal et.al.	2602.04813	translate	read	null
2026-02-04	OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models	Yue Ding et.al.	2602.04804	translate	read	null
2026-02-04	Team, Then Trim: An Assembly-Line LLM Framework for High-Quality Tabular Data Generation	Congjing Zhang et.al.	2602.04785	translate	read	null
2026-02-04	NeuroCanvas: VLLM-Powered Robust Seizure Detection by Reformulating Multichannel EEG as Image	Yan Chen et.al.	2602.04769	translate	read	null
2026-02-04	Beyond Many-Shot Translation: Scaling In-Context Demonstrations For Low-Resource Machine Translation	Luis Frentzen Salim et.al.	2602.04764	translate	read	null
2026-02-04	When Silence Is Golden: Can LLMs Learn to Abstain in Temporal QA and Beyond?	Xinyu Zhou et.al.	2602.04755	translate	read	null
2026-02-04	Decomposing Query-Key Feature Interactions Using Contrastive Covariances	Andrew Lee et.al.	2602.04752	translate	read	null
2026-02-04	Exploiting contextual information to improve stance detection in informal political discourse with LLMs	Arman Engin Sucu et.al.	2602.04750	translate	read	null
2026-02-04	Inference-Time Reasoning Selectively Reduces Implicit Social Bias in Large Language Models	Molly Apsel et.al.	2602.04742	translate	read	null
2026-02-04	Alignment Drift in Multimodal LLMs: A Two-Phase, Longitudinal Evaluation of Harm Across Eight Model Releases	Casey Ford et.al.	2602.04739	translate	read	null
2026-02-04	From Data to Behavior: Predicting Unintended Model Behaviors Before Training	Mengru Wang et.al.	2602.04735	translate	read	link
2026-02-04	Less Finetuning, Better Retrieval: Rethinking LLM Adaptation for Biomedical Retrievers via Synthetic Data and Model Merging	Sameh Khattab et.al.	2602.04731	translate	read	null
2026-02-04	“Be My Cheese?”: Cultural Nuance Benchmarking for Machine Translation in Multilingual LLMs	Madison Van Doren et.al.	2602.04729	translate	read	null
2026-02-04	Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation	Marian Kica et.al.	2602.04726	translate	read	null
2026-02-04	SAR-RAG: ATR Visual Question Answering by Semantic Search, Retrieval, and MLLM Generation	David F. Ramirez et.al.	2602.04712	translate	read	null
2026-02-04	LinGO: A Linguistic Graph Optimization Framework with LLMs for Interpreting Intents of Online Uncivil Discourse	Yuan Zhang et.al.	2602.04693	translate	read	null
2026-02-04	UniAudio 2.0: A Unified Audio Language Model with Text-Aligned Factorized Audio Tokenization	Dongchao Yang et.al.	2602.04683	translate	read	link
2026-02-04	Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility	Eun Cheol Choi et.al.	2602.04674	translate	read	null
2026-02-04	Relational Scene Graphs for Object Grounding of Natural Language Commands	Julia Kuhn et.al.	2602.04635	translate	read	null
2026-02-04	WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning	Zelai Xu et.al.	2602.04634	translate	read	link
2026-02-04	Disentangling meaning from language in LLM-based machine translation	Théo Lasnier et.al.	2602.04613	translate	read	null
2026-02-04	Focus-LIME: Surgical Interpretation of Long-Context Large Language Models via Proxy-Based Neighborhood Selection	Junhao Liu et.al.	2602.04607	translate	read	null
2026-02-04	Automated Extraction of Multicomponent Alloy Data Using Large Language Models for Sustainable Design	Aravindan Kamatchi Sundaram et.al.	2602.04602	translate	read	null
2026-02-04	Harmonia: Algorithm-Hardware Co-Design for Memory- and Compute-Efficient BFP-based LLM Inference	Xinyu Wang et.al.	2602.04595	translate	read	null
2026-02-04	AIANO: Enhancing Information Retrieval with AI-Augmented Annotation	Sameh Khattab et.al.	2602.04579	translate	read	null
2026-02-04	Semantic Self-Distillation for Language Model Uncertainty	Edward Phillips et.al.	2602.04577	translate	read	null
2026-02-04	Can LLMs capture stable human-generated sentence entropy measures?	Estrella Pivel-Villanueva et.al.	2602.04570	translate	read	null
2026-02-04	LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding	Gang Lin et.al.	2602.04541	translate	read	null
2026-02-04	HoliAntiSpoof: Audio LLM for Holistic Speech Anti-Spoofing	Xuenan Xu et.al.	2602.04535	translate	read	null
2026-02-04	Landscape-aware Automated Algorithm Design: An Efficient Framework for Real-world Optimization	Haoran Yin et.al.	2602.04529	translate	read	null
2026-02-04	OSCAgent: Accelerating the Discovery of Organic Solar Cells with LLM Agents	Zhaolin Hu et.al.	2602.04510	translate	read	null
2026-02-04	Model-Dowser: Data-Free Importance Probing to Mitigate Catastrophic Forgetting in Multimodal Large Language Models	Hyeontaek Hwang et.al.	2602.04509	translate	read	null
2026-02-04	ReThinker: Scientific Reasoning by Rethinking with Guided Reflection and Confidence Control	Zhentao Tang et.al.	2602.04496	translate	read	null
2026-02-04	PersoDPO: Scalable Preference Optimization for Instruction-Adherent, Persona-Grounded Dialogue via Multi-LLM Evaluation	Saleh Afzoon et.al.	2602.04493	translate	read	null
2026-02-04	The Supportiveness-Safety Tradeoff in LLM Well-Being Agents	Himanshi Lalwani et.al.	2602.04487	translate	read	null
2026-02-04	Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition	Jinlong Ma et.al.	2602.04486	translate	read	null
2026-02-04	Vision-aligned Latent Reasoning for Multi-modal Large Language Model	Byungwoo Jeon et.al.	2602.04476	translate	read	null
2026-02-04	LLM-Empowered Cooperative Content Caching in Vehicular Fog Caching-Assisted Platoon Networks	Bowen Tan et.al.	2602.04471	translate	read	null
2026-02-04	DOS: Dual-Flow Orthogonal Semantic IDs for Recommendation in Meituan	Junwei Yin et.al.	2602.04460	translate	read	null
2026-02-04	Growth First, Care Second? Tracing the Landscape of LLM Value Preferences in Everyday Dilemmas	Zhiyi Chen et.al.	2602.04456	translate	read	null
2026-02-04	Seg-ReSearch: Segmentation with Interleaved Reasoning and External Search	Tianming Liang et.al.	2602.04454	translate	read	link
2026-02-04	SDR-CIR: Semantic Debias Retrieval Framework for Training-Free Zero-Shot Composed Image Retrieval	Yi Sun et.al.	2602.04451	translate	read	null
2026-02-04	What’s in a Benchmark? The Case of SWE-Bench in Automated Program Repair	Matias Martinez et.al.	2602.04449	translate	read	null
2026-02-04	Fine-Grained Activation Steering: Steering Less, Achieving More	Zijian Feng et.al.	2602.04428	translate	read	null
2026-02-04	Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning	Heqing Yang et.al.	2602.04419	translate	read	null
2026-02-04	EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL	Lunjun Zhang et.al.	2602.04417	translate	read	null
2026-02-04	History-Guided Iterative Visual Reasoning with Self-Correction	Xinglong Yang et.al.	2602.04413	translate	read	null
2026-02-04	Bi-directional Bias Attribution: Debiasing Large Language Models without Modifying Prompts	Yujie Lin et.al.	2602.04398	translate	read	null
2026-02-04	Evaluating the Presence of Sex Bias in Clinical Reasoning by Large Language Models	Isabel Tsintsiper et.al.	2602.04392	translate	read	null
2026-02-04	Beyond Rejection Sampling: Trajectory Fusion for Scaling Mathematical Reasoning	Jie Deng et.al.	2602.04391	translate	read	null
2026-02-04	On the use of LLMs to generate a dataset of Neural Networks	Nadia Daoudi et.al.	2602.04388	translate	read	null
2026-02-04	Multi-scale hypergraph meets LLMs: Aligning large language models for time series analysis	Zongjiang Shang et.al.	2602.04369	translate	read	null
2026-02-04	EXaMCaP: Subset Selection with Entropy Gain Maximization for Probing Capability Gains of Large Chart Understanding Training Sets	Jiapeng Liu et.al.	2602.04365	translate	read	null
2026-02-04	Generative AI in Systems Engineering: A Framework for Risk Assessment of Large Language Models	Stefan Otten et.al.	2602.04358	translate	read	null
2026-02-04	Can Vision Replace Text in Working Memory? Evidence from Spatial n-Back in Vision-Language Models	Sichu Liang et.al.	2602.04355	translate	read	null
2026-02-04	UnMaskFork: Test-Time Scaling for Masked Diffusion via Deterministic Action Branching	Kou Misaki et.al.	2602.04344	translate	read	null
2026-02-04	From Assumptions to Actions: Turning LLM Reasoning into Uncertainty-Aware Planning for Embodied Agents	SeungWon Seo et.al.	2602.04326	translate	read	null
2026-02-04	A Domain-Specific Curated Benchmark for Entity and Document-Level Relation Extraction	Marco Martinelli et.al.	2602.04320	translate	read	null
2026-02-04	DeFrame: Debiasing Large Language Models Against Framing Effects	Kahee Lim et.al.	2602.04306	translate	read	null
2026-02-04	Revisiting Prompt Sensitivity in Large Language Models for Text Classification: The Role of Prompt Underspecification	Branislav Pecher et.al.	2602.04297	translate	read	null
2026-02-04	ProxyWar: Dynamic Assessment of LLM Code Generation in Game Arenas	Wenjun Peng et.al.	2602.04296	translate	read	link
2026-02-04	How Few-shot Demonstrations Affect Prompt-based Defenses Against LLM Jailbreak Attacks	Yanshu Wang et.al.	2602.04294	translate	read	null
2026-02-04	Disentangling Causal Importance from Emergent Structure in Multi-Expert Orchestration	Sudipto Ghosh et.al.	2602.04291	translate	read	null
2026-02-04	Guided Verifier: Collaborative Multimodal Reasoning via Dynamic Process Supervision	Lingzhuang Sun et.al.	2602.04290	translate	read	null
2026-02-04	Contextual Drag: How Errors in the Context Affect LLM Reasoning	Yun Cheng et.al.	2602.04288	translate	read	null
2026-02-04	ECG-R1: Protocol-Guided and Modality-Agnostic MLLM for Reliable ECG Interpretation	Jiarui Jin et.al.	2602.04279	translate	read	link
2026-02-04	MiniRec: Data-Efficient Reinforcement Learning for LLM-based Recommendation	Lin Wang et.al.	2602.04278	translate	read	null
2026-02-04	KVSmooth: Mitigating Hallucination in Multi-modal Large Language Models through Key-Value Smoothing	Siyu Jiang et.al.	2602.04268	translate	read	null
2026-02-04	Thickening-to-Thinning: Reward Shaping via Human-Inspired Learning Dynamics for LLM Reasoning	Wenze Lin et.al.	2602.04265	translate	read	null
2026-02-04	Data Agents: Levels, State of the Art, and Open Problems	Yuyu Luo et.al.	2602.04261	translate	read	null
2026-02-04	Scaling Agentic Verifier for Competitive Coding	Zeyao Ma et.al.	2602.04254	translate	read	null
2026-02-04	Empirical-MCTS: Continuous Agent Evolution via Dual-Experience Monte Carlo Tree Search	Hao Lu et.al.	2602.04248	translate	read	null
2026-02-04	CoLT: Reasoning with Chain of Latent Tool Calls	Fangwei Zhu et.al.	2602.04246	translate	read	null
2026-02-04	On the Uncertainty of Large Language Model-Based Multi-Agent Systems	Yuxuan Zhao et.al.	2602.04234	translate	read	null
2026-02-04	Following the TRAIL: Predicting and Explaining Tomorrow’s Hits with a Fine-Tuned LLM	Yinan Zhang et.al.	2602.04225	translate	read	null
2026-02-04	Language Models Struggle to Use Representations Learned In-Context	Michael A. Lepori et.al.	2602.04212	translate	read	null
2026-02-04	Steering LLMs via Scalable Interactive Oversight	Enyu Zhou et.al.	2602.04210	translate	read	null
2026-02-04	Enforcing Monotonic Progress in Legal Cross-Examination: Preventing Long-Horizon Stagnation in LLM-Based Inquiry	Hsien-Jyh Liao et.al.	2602.04206	translate	read	null
2026-02-04	Semantic Consensus Decoding: Backdoor Defense for Verilog Code Generation	Guang Yang et.al.	2602.04195	translate	read	null
2026-02-04	SOGPTSpotter: Detecting ChatGPT-Generated Answers on Stack Overflow	Suyu Ma et.al.	2602.04185	translate	read	null
2026-02-04	I Can’t Believe It’s Not a Valid Exploit	Derin Gezgin et.al.	2602.04165	translate	read	null
2026-02-04	BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models	Junyu Chen et.al.	2602.04163	translate	read	null
2026-02-04	Paint by Odor: An Exploration of Odor Visualization through Large Language Model and Generative AI	Gang Yu et.al.	2602.04159	translate	read	null
2026-02-04	A Modern System Recipe for Situated Embodied Human-Robot Conversation with Real-Time Multimodal LLMs and Tool-Calling	Dong Won Lee et.al.	2602.04157	translate	read	null
2026-02-04	JSynFlow: Japanese Synthesised Flowchart Visual Question Answering Dataset built with Large Language Models	Hiroshi Sasaki et.al.	2602.04142	translate	read	null
2026-02-04	Semantic Pilot Design for Data-Aided Channel Estimation Using a Large Language Model	Sojeong Park et.al.	2602.04126	translate	read	null
2026-02-04	Making Videos Accessible for Blind and Low Vision Users Using a Multimodal Agent Video Player	Adriana Olmos et.al.	2602.04104	translate	read	null
2026-02-04	Rethinking Perplexity: Revealing the Impact of Input Length on Perplexity Evaluation in LLMs	Letian Cheng et.al.	2602.04099	translate	read	null
2026-02-03	Scaling In-Context Online Learning Capability of LLMs via Cross-Episode Meta-RL	Xiaofeng Lin et.al.	2602.04089	translate	read	null
2026-02-03	Abstraction Induces the Brain Alignment of Language and Speech Models	Emily Cheng et.al.	2602.04081	translate	read	null
2026-02-03	Stroke Lesions as a Rosetta Stone for Language Model Interpretability	Julius Fridriksson et.al.	2602.04074	translate	read	null
2026-02-03	Data Verification is the Future of Quantum Computing Copilots	Junhao Song et.al.	2602.04072	translate	read	null
2026-02-03	Exploring the Potential of Large Language Models in Simulink-Stateflow Mutant Generation	Pablo Valle et.al.	2602.04066	translate	read	null
2026-02-03	The CitizenQuery Benchmark: A Novel Dataset and Evaluation Pipeline for Measuring LLM Performance in Citizen Query Tasks	Neil Majithia et.al.	2602.04064	translate	read	null
2026-02-03	RareCollab – An Agentic System Diagnosing Mendelian Disorders with Integrated Phenotypic and Molecular Evidence	Guantong Qi et.al.	2602.04058	translate	read	null
2026-02-03	Evaluating the Vulnerability Landscape of LLM-Generated Smart Contracts	Hoang Long Do et.al.	2602.04039	translate	read	null
2026-02-03	On the Credibility of Evaluating LLMs using Survey Questions	Jindřich Libovický et.al.	2602.04033	translate	read	null
2026-02-03	Understanding and Guiding Layer Placement in Parameter-Efficient Fine-Tuning of Large Language Models	Yichen Xu et.al.	2602.04019	translate	read	null
2026-02-03	Chaplains’ Reflections on the Design and Usage of AI for Conversational Care	Joel Wester et.al.	2602.04017	translate	read	null
2026-02-03	PromptSplit: Revealing Prompt-Level Disagreement in Generative Models	Mehdi Lotfian et.al.	2602.04009	translate	read	null
2026-02-03	StraTyper: Automated Semantic Type Discovery and Multi-Type Annotation for Dataset Collections	Christos Koutras et.al.	2602.04004	translate	read	null
2026-02-03	When AI Persuades: Adversarial Explanation Attacks on Human Trust in AI-Assisted Decision Making	Shutong Fan et.al.	2602.04003	translate	read	null
2026-02-03	After Talking with 1,000 Personas: Learning Preference-Aligned Proactive Assistants From Large-Scale Persona Interactions	Ziyi Xuan et.al.	2602.04000	translate	read	null
2026-02-03	When Chains of Thought Don’t Matter: Causal Bypass in Large Language Models	Anish Sathyanarayanan et.al.	2602.03994	translate	read	null
2026-02-03	Likelihood-Based Reward Designs for General LLM Reasoning	Ariel Kwiatkowski et.al.	2602.03979	translate	read	null
2026-02-03	Adaptive Test-Time Compute Allocation via Learned Heuristics over Categorical Structure	Shuhui Qu et.al.	2602.03975	translate	read	null
2026-02-03	Structural shifts in institutional participation and collaboration within the AI arXiv preprint research ecosystem	Shama Magnur et.al.	2602.03969	translate	read	null
2026-02-03	Automatic Classification of Pedagogical Materials against CS Curriculum Guidelines	Erik Saule et.al.	2602.03962	translate	read	null
2026-02-03	AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent	Yinyi Luo et.al.	2602.03955	translate	read	link
2026-02-03	SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild?	Azmine Toushik Wasi et.al.	2602.03916	translate	read	link
2026-02-03	Knowledge Model Prompting Increases LLM Performance on Planning Tasks	Erik Goh et.al.	2602.03900	translate	read	null
2026-02-03	Audit After Segmentation: Reference-Free Mask Quality Assessment for Language-Referred Audio-Visual Segmentation	Jinxing Zhou et.al.	2602.03892	translate	read	null
2026-02-03	4DPC$^2$hat: Towards Dynamic Point Cloud Understanding with Failure-Aware Bootstrapping	Xindan Zhang et.al.	2602.03890	translate	read	null
2026-02-03	Understanding and Exploiting Weight Update Sparsity for Communication-Efficient Distributed RL	Erfan Miahi et.al.	2602.03839	translate	read	null
2026-02-03	Accelerating Scientific Research with Gemini: Case Studies and Common Techniques	David P. Woodruff et.al.	2602.03837	translate	read	null
2026-02-03	Fast-Slow Efficient Training for Multimodal Large Language Models via Visual Token Pruning	Dingkun Zhang et.al.	2602.03815	translate	read	null
2026-02-03	Conformal Thinking: Risk Control for Reasoning on a Compute Budget	Xi Wang et.al.	2602.03814	translate	read	null
2026-02-03	Antidistillation Fingerprinting	Yixuan Even Xu et.al.	2602.03812	translate	read	null
2026-02-03	Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation	Ziru Chen et.al.	2602.03806	translate	read	link
2026-02-03	Context Compression via Explicit Information Transmission	Jiangnan Ye et.al.	2602.03784	translate	read	null
2026-02-03	Efficient Estimation of Kernel Surrogate Models for Task Attribution	Zhenshuo Zhang et.al.	2602.03783	translate	read	null
2026-02-03	QVLA: Not All Channels Are Equal in Vision-Language-Action Model’s Quantization	Yuhao Xu et.al.	2602.03782	translate	read	null
2026-02-03	A Scene Graph Backed Approach to Open Set Semantic Mapping	Martin Günther et.al.	2602.03781	translate	read	null
2026-02-03	An Empirical Study of Collective Behaviors and Social Dynamics in Large Language Model Agents	Farnoosh Hashemi et.al.	2602.03775	translate	read	null
2026-02-03	Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL	Ian Wu et.al.	2602.03773	translate	read	null
2026-02-03	UniGeM: Unifying Data Mixing and Selection via Geometric Exploration and Mining	Changhao Wang et.al.	2602.03772	translate	read	null
2026-02-03	Training Multi-Turn Search Agent via Contrastive Dynamic Branch Sampling	Yubao Zhao et.al.	2602.03719	translate	read	null
2026-02-03	SWE-Refactor: A Repository-Level Benchmark for Real-World LLM-Based Code Refactoring	Yisen Xu et.al.	2602.03712	translate	read	null
2026-02-03	No Shortcuts to Culture: Indonesian Multi-hop Question Answering for Complex Cultural Understanding	Vynska Amalia Permadi et.al.	2602.03709	translate	read	null
2026-02-03	Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States	Ximing Dong et.al.	2602.03708	translate	read	null
2026-02-03	Cognitively Diverse Multiple-Choice Question Generation: A Hybrid Multi-Agent Framework with Large Language Models	Yu Tian et.al.	2602.03704	translate	read	null
2026-02-03	Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging	Alexandru Meterez et.al.	2602.03702	translate	read	null
2026-02-03	Conflict-Resolving and Sharpness-Aware Minimization for Generalized Knowledge Editing with Multiple Updates	Duy Nguyen et.al.	2602.03696	translate	read	null
2026-02-03	LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization	Zishi Zhang et.al.	2602.03690	translate	read	null
2026-02-03	Universal One-third Time Scaling in Learning Peaked Distributions	Yizhou Liu et.al.	2602.03685	translate	read	null
2026-02-03	Instruction Anchors: Dissecting the Causal Dynamics of Modality Arbitration	Yu Zhang et.al.	2602.03677	translate	read	null
2026-02-03	Mitigating Conversational Inertia in Multi-Turn Agents	Yang Wan et.al.	2602.03664	translate	read	null
2026-02-03	Reinforcement Fine-Tuning for History-Aware Dense Retriever in RAG	Yicheng Zhang et.al.	2602.03645	translate	read	null
2026-02-03	TRE: Encouraging Exploration in the Trust Region	Chao Huang et.al.	2602.03635	translate	read	link
2026-02-03	Can LLMs Do Rocket Science? Exploring the Limits of Complex Reasoning with GTOC 12	Iñaki del Campo et.al.	2602.03630	translate	read	null
2026-02-03	Toward a new AI winter? How diffusion of technological innovation on networks leads to chaotic boom-bust cycles	Sabin Roman et.al.	2602.03620	translate	read	null
2026-02-03	Controlling Output Rankings in Generative Engines for LLM-based Search	Haibo Jin et.al.	2602.03608	translate	read	null
2026-02-03	Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation	Haichao Jiang et.al.	2602.03595	translate	read	null
2026-02-03	SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM	Ming Nie et.al.	2602.03589	translate	read	null
2026-02-03	$V_0$: A Generalist Value Model for Any Policy at State Zero	Yi-Kai Zhang et.al.	2602.03584	translate	read	null
2026-02-03	Don’t believe everything you read: Understanding and Measuring MCP Behavior under Misleading Tool Descriptions	Zhihao Li et.al.	2602.03580	translate	read	null
2026-02-03	Use Graph When It Needs: Efficiently and Adaptively Integrating Retrieval-Augmented Generation with Graphs	Su Dong et.al.	2602.03578	translate	read	null
2026-02-03	EHRWorld: A Patient-Centric Medical World Model for Long-Horizon Clinical Trajectories	Linjie Mu et.al.	2602.03569	translate	read	null
2026-02-03	CoGenCast: A Coupled Autoregressive-Flow Generative Framework for Time Series Forecasting	Yaguo Liu et.al.	2602.03564	translate	read	null
2026-02-03	Scaling Test-Driven Code Generation from Functions to Classes: An Empirical Study	Yunhao Liang et.al.	2602.03557	translate	read	null
2026-02-03	When Single Answer Is Not Enough: Rethinking Single-Step Retrosynthesis Benchmarks for LLMs	Bogdan Zagribelnyy et.al.	2602.03554	translate	read	null
2026-02-03	Assessing the Impact of Typological Features on Multilingual Machine Translation in the Age of Large Language Models	Vitalii Hirak et.al.	2602.03551	translate	read	null
2026-02-03	SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue	Yuqin Dai et.al.	2602.03548	translate	read	link
2026-02-03	Persona Generators: Generating Diverse Synthetic Personas at Scale	Davide Paglieri et.al.	2602.03545	translate	read	null
2026-02-03	Can Large Language Models Generalize Procedures Across Representations?	Fangru Lin et.al.	2602.03542	translate	read	null
2026-02-03	PnP-U3D: Plug-and-Play 3D Framework Bridging Autoregression and Diffusion for Unified Understanding and Generation	Yongwei Chen et.al.	2602.03533	translate	read	null
2026-02-03	Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning	Zixiang Di et.al.	2602.03516	translate	read	null
2026-02-03	Learning to Reason Faithfully through Step-Level Faithfulness Maximization	Runquan Gui et.al.	2602.03507	translate	read	null
2026-02-03	Lookahead Path Likelihood Optimization for Diffusion LLMs	Xuejie Liu et.al.	2602.03496	translate	read	null
2026-02-03	IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning	Haohao Luo et.al.	2602.03468	translate	read	null
2026-02-03	Quantum Circuit Generation via test-time learning with large language models	Adriano Macarone-Palmieri et.al.	2602.03466	translate	read	null
2026-02-03	RAL-Bench: Benchmarking for Application-Level Functional Correctness and Non-Functional Quality Attributes	Ruwei Pan et.al.	2602.03462	translate	read	null
2026-02-03	Contextualized Visual Personalization in Vision-Language Models	Yeongtak Oh et.al.	2602.03454	translate	read	null
2026-02-03	Beyond Variance: Prompt-Efficient RLVR via Rare-Event Amplification and Bidirectional Pairing	Xin Sheng et.al.	2602.03452	translate	read	null
2026-02-03	Ontology-to-tools compilation for executable semantic constraint enforcement in LLM agents	Xiaochi Zhou et.al.	2602.03439	translate	read	null
2026-02-03	When control meets large language models: From words to dynamics	Komeil Nosrati et.al.	2602.03433	translate	read	null
2026-02-03	ProAct: A Benchmark and Multimodal Framework for Structure-Aware Proactive Response	Xiaomeng Zhu et.al.	2602.03430	translate	read	null
2026-02-03	DiscoverLLM: From Executing Intents to Discovering Them	Tae Soo Kim et.al.	2602.03429	translate	read	null
2026-02-03	RankSteer: Activation Steering for Pointwise LLM Ranking	Yumeng Wang et.al.	2602.03422	translate	read	null
2026-02-03	SWE-World: Building Software Engineering Agents in Docker-Free Environments	Shuang Sun et.al.	2602.03419	translate	read	link
2026-02-03	Socratic-Geo: Synthetic Data Generation and Geometric Reasoning via Multi-Agent Interaction	Zhengbo Jiao et.al.	2602.03414	translate	read	null
2026-02-03	Verified Critical Step Optimization for LLM Agents	Mukai Li et.al.	2602.03412	translate	read	null
2026-02-03	Risk Awareness Injection: Calibrating Vision-Language Models for Safety without Compromising Utility	Mengxuan Wang et.al.	2602.03402	translate	read	null
2026-02-03	Precision in Practice: Knowledge Guided Code Summarizing Grounded in Industrial Expectations	Jintai Li et.al.	2602.03400	translate	read	null
2026-02-03	Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective	Hao Fang et.al.	2602.03396	translate	read	null
2026-02-03	On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models	Shumin Wang et.al.	2602.03392	translate	read	null
2026-02-03	Pursuing Best Industrial Practices for Retrieval-Augmented Generation in the Medical Domain	Wei Zhu et.al.	2602.03368	translate	read	null
2026-02-03	MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling	Ning Ding et.al.	2602.03359	translate	read	null
2026-02-03	MentalSeek-Dx: Towards Progressive Hypothetico-Deductive Reasoning for Real-world Psychiatric Diagnosis	Xiao Sun et.al.	2602.03340	translate	read	null
2026-02-03	The Personality Trap: How LLMs Embed Bias When Generating Human-Like Personas	Jacopo Amidei et.al.	2602.03334	translate	read	null
2026-02-03	MedSAM-Agent: Empowering Interactive Medical Image Segmentation with Multi-turn Agentic Reinforcement Learning	Shengyuan Liu et.al.	2602.03320	translate	read	null
2026-02-03	MIRROR: A Multi-Agent Framework with Iterative Adaptive Revision and Hierarchical Retrieval for Optimization Modeling in Operations Research	Yifan Shi et.al.	2602.03318	translate	read	null
2026-02-03	Multi-Level Testing of Conversational AI Systems	Elena Masserini et.al.	2602.03311	translate	read	null
2026-02-03	Entropy-Gated Selective Policy Optimization: Token-Level Gradient Allocation for Hybrid Training of Large Language Models	Yuelin Hu et.al.	2602.03309	translate	read	null
2026-02-03	medR: Reward Engineering for Clinical Offline Reinforcement Learning via Tri-Drive Potential Functions	Qianyi Xu et.al.	2602.03305	translate	read	null
2026-02-03	R1-SyntheticVL: Is Synthetic Data from Generative Models Ready for Multimodal Large Language Model?	Jingyi Zhang et.al.	2602.03300	translate	read	null
2026-02-03	POP: Prefill-Only Pruning for Efficient Large Model Inference	Junhui He et.al.	2602.03295	translate	read	null
2026-02-03	Agentic Proposing: Enhancing Large Language Model Reasoning via Compositional Skill Synthesis	Zhengbo Jiao et.al.	2602.03279	translate	read	null
2026-02-03	LogicScan: An LLM-driven Framework for Detecting Business Logic Vulnerabilities in Smart Contracts	Jiaqi Gao et.al.	2602.03271	translate	read	null
2026-02-03	Beyond Suffixes: Token Position in GCG Adversarial Attacks on Large Language Models	Hicham Eddoubi et.al.	2602.03265	translate	read	null
2026-02-03	CSR-Bench: A Benchmark for Evaluating the Cross-modal Safety and Reliability of MLLMs	Yuxuan Liu et.al.	2602.03263	translate	read	null
2026-02-03	The Necessity of a Unified Framework for LLM-Based Agent Evaluation	Pengyu Zhu et.al.	2602.03238	translate	read	null
2026-02-03	Merging Beyond: Streaming LLM Updates via Activation-Guided Rotations	Yuxuan Yao et.al.	2602.03237	translate	read	null
2026-02-03	EventFlash: Towards Efficient MLLMs for Event-Based Vision	Shaoyu Liu et.al.	2602.03230	translate	read	null
2026-02-03	Spiral RoPE: Rotate Your Rotary Positional Embeddings in the 2D Plane	Haoyu Liu et.al.	2602.03227	translate	read	null
2026-02-03	ATACompressor: Adaptive Task-Aware Compression for Efficient Long-Context Processing in LLMs	Xuancheng Li et.al.	2602.03226	translate	read	null
2026-02-03	Beyond Quantity: Trajectory Diversity Scaling for Code Agents	Guhong Chen et.al.	2602.03219	translate	read	null
2026-02-03	Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection	Dongwon Jo et.al.	2602.03216	translate	read	null
2026-02-03	ForesightKV: Optimizing KV Cache Eviction for Reasoning Models by Learning Long-Term Contribution	Zican Dong et.al.	2602.03203	translate	read	null
2026-02-03	Reinforcement Learning with Promising Tokens for Large Language Models	Jing-Cheng Pang et.al.	2602.03195	translate	read	null
2026-02-03	Prompt Augmentation Scales up GRPO Training on Mathematical Reasoning	Wenquan Lu et.al.	2602.03190	translate	read	null
2026-02-03	DynSplit-KV: Dynamic Semantic Splitting for KVCache Compression in Efficient Long-Context LLM Inference	Jiancai Ye et.al.	2602.03184	translate	read	null
2026-02-03	Privasis: Synthesizing the Largest “Public” Private Dataset from Scratch	Hyunwoo Kim et.al.	2602.03183	translate	read	null
2026-02-03	VALUEFLOW: Toward Pluralistic and Steerable Value-based Alignment in Large Language Models	Woojin Kim et.al.	2602.03160	translate	read	null
2026-02-03	PAMAS: Self-Adaptive Multi-Agent System with Perspective Aggregation for Misinformation Detection	Zongwei Wang et.al.	2602.03158	translate	read	null
2026-02-03	Is It Possible to Make Chatbots Virtuous? Investigating a Virtue-Based Design Methodology Applied to LLMs	Matthew P. Lad et.al.	2602.03155	translate	read	null
2026-02-03	FASA: Frequency-aware Sparse Attention	Yifei Wang et.al.	2602.03152	translate	read	null
2026-02-03	Internet of Agentic AI: Incentive-Compatible Distributed Teaming and Workflow	Ya-Ting Yang et.al.	2602.03145	translate	read	null
2026-02-03	Self-Hinting Language Models Enhance Reinforcement Learning	Baohao Liao et.al.	2602.03143	translate	read	null
2026-02-03	Contrastive Concept-Tree Search for LLM-Assisted Algorithm Discovery	Timothee Leleu et.al.	2602.03132	translate	read	null
2026-02-03	Understanding Multi-Agent LLM Frameworks: A Unified Benchmark and Experimental Analysis	Abdelghny Orogat et.al.	2602.03128	translate	read	null
2026-02-03	Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost	Yinggan Xu et.al.	2602.03120	translate	read	null
2026-02-03	Digital Lifelong Learning in the Age of AI: Trends and Insights	Geeta Puri et.al.	2602.03114	translate	read	null
2026-02-03	ChemPro: A Progressive Chemistry Benchmark for Large Language Models	Aaditya Baranwal et.al.	2602.03108	translate	read	null
2026-02-03	The Mask of Civility: Benchmarking Chinese Mock Politeness Comprehension in Large Language Models	Yitong Zhang et.al.	2602.03107	translate	read	null
2026-02-03	Task–Specificity Score: Measuring How Much Instructions Really Matter for Supervision	Pritam Kadasi et.al.	2602.03103	translate	read	null
2026-02-03	Consensus Group Relative Policy Optimization for Text Generation	Yuki Ichihara et.al.	2602.03102	translate	read	null
2026-02-03	Risky-Bench: Probing Agentic Safety Risks under Real-World Deployment	Jingnan Zheng et.al.	2602.03100	translate	read	null
2026-02-03	De-conflating Preference and Qualification: Constrained Dual-Perspective Reasoning for Job Recommendation with Large Language Models	Bryce Kan et.al.	2602.03097	translate	read	null
2026-02-03	Test-time Recursive Thinking: Self-Improvement without External Feedback	Yufan Zhuang et.al.	2602.03094	translate	read	null
2026-02-03	AERO: Autonomous Evolutionary Reasoning Optimization via Endogenous Dual-Loop Feedback	Zhitao Gao et.al.	2602.03084	translate	read	null
2026-02-03	ReMiT: RL-Guided Mid-Training for Iterative LLM Evolution	Junjie Huang et.al.	2602.03075	translate	read	null
2026-02-03	TMS: Trajectory-Mixed Supervision for Reward-Free, On-Policy SFT	Rana Muhammad Shahroz Khan et.al.	2602.03073	translate	read	null
2026-02-03	ProOPF: Benchmarking and Improving LLMs for Professional-Grade Power Systems Optimization Modeling	Chao Shen et.al.	2602.03070	translate	read	null
2026-02-03	Skill-Based Autonomous Agents for Material Creep Database Construction	Yue Wu et.al.	2602.03069	translate	read	null
2026-02-03	ALPBench: A Benchmark for Attribution-level Long-term Personal Behavior Understanding	Lu Ren et.al.	2602.03056	translate	read	null
2026-02-03	MAS-ProVe: Understanding the Process Verification of Multi-Agent Systems	Vishal Venkataramani et.al.	2602.03053	translate	read	null
2026-02-03	SAES-SVD: Self-Adaptive Suppression of Accumulated and Local Errors for SVD-based LLM Compression	Xing Hu et.al.	2602.03051	translate	read	null
2026-02-03	Clarify Before You Draw: Proactive Agents for Robust Text-to-CAD Generation	Bo Yuan et.al.	2602.03045	translate	read	null
2026-02-03	LatentMem: Customizing Latent Memory for Multi-Agent Systems	Muxin Fu et.al.	2602.03036	translate	read	null
2026-02-03	Generalizable and Interpretable RF Fingerprinting with Shapelet-Enhanced Large Language Models	Tianya Zhao et.al.	2602.03035	translate	read	null
2026-02-03	RC-GRPO: Reward-Conditioned Group Relative Policy Optimization for Multi-Turn Tool Calling Agents	Haitian Zhong et.al.	2602.03025	translate	read	null
2026-02-03	Rethinking Music Captioning with Music Metadata LLMs	Irmak Bukey et.al.	2602.03023	translate	read	null
2026-02-03	STAR: Similarity-guided Teacher-Assisted Refinement for Super-Tiny Function Calling Models	Jiliang Ni et.al.	2602.03022	translate	read	null
2026-02-03	FedKRSO: Communication and Memory Efficient Federated Fine-Tuning of Large Language Models	Guohao Yang et.al.	2602.03019	translate	read	null
2026-02-03	VOILA: Value-of-Information Guided Fidelity Selection for Cost-Aware Multimodal Question Answering	Rahul Atul Bhope et.al.	2602.03007	translate	read	null
2026-02-03	Distilling LLM Reasoning into Graph of Concept Predictors	Ziyang Yu et.al.	2602.03006	translate	read	null
2026-02-03	Methods and Open Problems in Differentiable Social Choice: Learning Mechanisms, Decisions, and Alignment	Zhiyu An et.al.	2602.03003	translate	read	null
2026-02-03	Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation	Jiaze Li et.al.	2602.02994	translate	read	null
2026-02-03	Large Language Models Can Take False First Steps at Inference-time Planning	Haijiang Yan et.al.	2602.02991	translate	read	null
2026-02-03	NLI: Non-uniform Linear Interpolation Approximation of Nonlinear Operations for Efficient LLMs Inference	Jiangyong Yu et.al.	2602.02988	translate	read	null
2026-02-03	Large-Scale LLM Inference with Heterogeneous Workloads: Prefill-Decode Contention and Asymptotically Optimal Control	Ruihan Lin et.al.	2602.02987	translate	read	null
2026-02-03	Are LLMs Biased Like Humans? Causal Reasoning as a Function of Prior Knowledge, Irrelevant Information, and Reasoning Budget	Hanna M. Dettki et.al.	2602.02983	translate	read	null
2026-02-03	CPMobius: Iterative Coach-Player Reasoning for Data-Free Reinforcement Learning	Ran Li et.al.	2602.02979	translate	read	null
2026-02-03	Where Norms and References Collide: Evaluating LLMs on Normative Reasoning	Mitchell Abrams et.al.	2602.02975	translate	read	null
2026-02-03	Testing Framework Migration with Large Language Models	Altino Alves et.al.	2602.02964	translate	read	null
2026-02-03	Generative Engine Optimization: A VLM and Agent Framework for Pinterest Acquisition Growth	Faye Zhang et.al.	2602.02961	translate	read	null
2026-02-03	Nüwa: Mending the Spatial Integrity Torn by VLM Token Pruning	Yihong Huang et.al.	2602.02951	translate	read	null
2026-02-03	Equal Access, Unequal Interaction: A Counterfactual Audit of LLM Fairness	Alireza Amiri-Margavi et.al.	2602.02932	translate	read	null
2026-02-02	FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights	Zhen Wang et.al.	2602.02905	translate	read	null
2026-02-02	Failure-Aware Enhancements for Large Language Model (LLM) Code Generation: An Empirical Study on Decision Framework	Jianru Shen et.al.	2602.02896	translate	read	null
