LLM - 2025-03
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-03-31 | Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation | Shengqiong Wu et.al. | 2503.24379 | translate | read | link |
| 2025-03-31 | Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models | Rui Wang et.al. | 2503.24377 | translate | read | link |
| 2025-03-31 | Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1 | Yi Chen et.al. | 2503.24376 | translate | read | link |
| 2025-03-31 | Effectively Controlling Reasoning Models through Thinking Intervention | Tong Wu et.al. | 2503.24370 | translate | read | null |
| 2025-03-31 | ORAL: Prompting Your Large-Scale LoRAs via Conditional Recurrent Diffusion | Rana Muhammad Shahroz Khan et.al. | 2503.24354 | translate | read | null |
| 2025-03-31 | BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models | Alok Abhishek et.al. | 2503.24310 | translate | read | null |
| 2025-03-31 | A Systematic Evaluation of LLM Strategies for Mental Health Text Analysis: Fine-tuning vs. Prompt Engineering vs. RAG | Arshia Kermani et.al. | 2503.24307 | translate | read | null |
| 2025-03-31 | Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning | Jiacheng Lin et.al. | 2503.24289 | translate | read | link |
| 2025-03-31 | Evaluating and Designing Sparse Autoencoders by Approximating Quasi-Orthogonality | Sewoong Lee et.al. | 2503.24277 | translate | read | link |
| 2025-03-31 | Enhancing Large Language Models (LLMs) for Telecommunications using Knowledge Graphs and Retrieval-Augmented Generation | Dun Yuan et.al. | 2503.24245 | translate | read | null |
| 2025-03-28 | Q-Insight: Understanding Image Quality via Visual Reinforcement Learning | Weiqi Li et.al. | 2503.22679 | translate | read | link |
| 2025-03-28 | QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks? | Belinda Z. Li et.al. | 2503.22674 | translate | read | link |
| 2025-03-28 | Exploring the Effectiveness of Multi-stage Fine-tuning for Cross-encoder Re-rankers | Francesca Pezzuti et.al. | 2503.22672 | translate | read | link |
| 2025-03-28 | Unicorn: Text-Only Data Synthesis for Vision Language Model Training | Xiaomin Yu et.al. | 2503.22655 | translate | read | link |
| 2025-03-28 | Sentiment Classification of Thai Central Bank Press Releases Using Supervised Learning | Stefano Grassi et.al. | 2503.22629 | translate | read | null |
| 2025-03-28 | Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users | Antonia Karamolegkou et.al. | 2503.22610 | translate | read | null |
| 2025-03-28 | On the Alignment of Post-Publication Reviews & Bibliometric and Altmetric Impact – A Case Study on Expert Statements from the Science Media Center Germany | Dirk Tunger et.al. | 2503.22594 | translate | read | null |
| 2025-03-28 | LLM-enabled Instance Model Generation | Fengjunjie Pan et.al. | 2503.22587 | translate | read | null |
| 2025-03-28 | Historical Ink: Exploring Large Language Models for Irony Detection in 19th-Century Spanish | Kevin Cohen et.al. | 2503.22585 | translate | read | link |
| 2025-03-28 | Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation | Sarubi Thillainathan et.al. | 2503.22582 | translate | read | null |
| 2025-03-27 | Video-R1: Reinforcing Video Reasoning in MLLMs | Kaituo Feng et.al. | 2503.21776 | translate | read | link |
| 2025-03-27 | LOCORE: Image Re-ranking with Long-Context Sequence Modeling | Zilin Xiao et.al. | 2503.21772 | translate | read | link |
| 2025-03-27 | MemInsight: Autonomous Memory Augmentation for LLM Agents | Rana Salama et.al. | 2503.21760 | translate | read | null |
| 2025-03-27 | Fwd2Bot: LVLM Visual Token Compression with Double Forward Bottleneck | Adrian Bulat et.al. | 2503.21757 | translate | read | null |
| 2025-03-27 | LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis | Shitian Zhao et.al. | 2503.21749 | translate | read | link |
| 2025-03-27 | CTRL-O: Language-Controllable Object-Centric Visual Representation Learning | Aniket Didolkar et.al. | 2503.21747 | translate | read | null |
| 2025-03-27 | GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics | Arsham Gholamzadeh Khoee et.al. | 2503.21735 | translate | read | null |
| 2025-03-27 | Effective Skill Unlearning through Intervention and Abstention | Yongce Li et.al. | 2503.21730 | translate | read | link |
| 2025-03-27 | Collab: Controlled Decoding using Mixture of Agents for LLM Alignment | Souradip Chakraborty et.al. | 2503.21720 | translate | read | null |
| 2025-03-27 | Enhancing Repository-Level Software Repair via Repository-Aware Knowledge Graphs | Boyang Yang et.al. | 2503.21710 | translate | read | null |
| 2025-03-26 | Mobile-MMLU: A Mobile Intelligence Language Understanding Benchmark | Sondos Mahmoud Bsharat et.al. | 2503.20786 | translate | read | link |
| 2025-03-26 | Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields | Shijie Zhou et.al. | 2503.20776 | translate | read | null |
| 2025-03-26 | MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams | Yanpeng Sun et.al. | 2503.20745 | translate | read | null |
| 2025-03-26 | Dynamic Motion Blending for Versatile Motion Editing | Nan Jiang et.al. | 2503.20724 | translate | read | null |
| 2025-03-26 | From Annotation to Adaptation: Metrics, Synthetic Data, and Aspect Extraction for Aspect-Based Sentiment Analysis with Large Language Models | Nikita Neveditsin et.al. | 2503.20715 | translate | read | null |
| 2025-03-27 | Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy | Yinan Sun et.al. | 2503.20673 | translate | read | null |
| 2025-03-26 | TAMA: A Human-AI Collaborative Thematic Analysis Framework Using Multi-Agent LLMs for Clinical Interviews | Huimin Xu et.al. | 2503.20666 | translate | read | null |
| 2025-03-26 | Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging | Han Wu et.al. | 2503.20641 | translate | read | link |
| 2025-03-26 | Collaborative Storytelling and LLM: A Linguistic Analysis of Automatically-Generated Role-Playing Game Sessions | Alessandro Maisto et.al. | 2503.20623 | translate | read | null |
| 2025-03-26 | What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond | Wenchao Gu et.al. | 2503.20589 | translate | read | null |
| 2025-03-25 | CoLLM: A Large Language Model for Composed Image Retrieval | Chuong Huynh et.al. | 2503.19910 | translate | read | link |
| 2025-03-25 | A Multi-Agent Framework Integrating Large Language Models and Generative AI for Accelerated Metamaterial Design | Jie Tian et.al. | 2503.19889 | translate | read | null |
| 2025-03-25 | CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation | Nengbo Wang et.al. | 2503.19878 | translate | read | null |
| 2025-03-25 | SLA-Awareness for AI-assisted coding | Kishanthan Thangarajah et.al. | 2503.19876 | translate | read | null |
| 2025-03-25 | Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking | Xiaoyu Tian et.al. | 2503.19855 | translate | read | link |
| 2025-03-25 | Towards Online Multi-Modal Social Interaction Understanding | Xinpeng Li et.al. | 2503.19851 | translate | read | null |
| 2025-03-25 | FALCONEye: Finding Answers and Localizing Content in ONE-hour-long videos with multi-modal LLMs | Carlos Plou et.al. | 2503.19850 | translate | read | null |
| 2025-03-25 | A Comparative Analysis of Word Segmentation, Part-of-Speech Tagging, and Named Entity Recognition for Historical Chinese Sources, 1900-1950 | Zhao Fang et.al. | 2503.19844 | translate | read | null |
| 2025-03-25 | SemEval-2025 Task 9: The Food Hazard Detection Challenge | Korbinian Randl et.al. | 2503.19800 | translate | read | null |
| 2025-03-25 | PAVE: Patching and Adapting Video Large Language Models | Zhuoming Liu et.al. | 2503.19794 | translate | read | link |
| 2025-03-24 | SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding | Mingze Xu et.al. | 2503.18943 | translate | read | null |
| 2025-03-24 | Video-T1: Test-Time Scaling for Video Generation | Fangfu Liu et.al. | 2503.18942 | translate | read | link |
| 2025-03-24 | Exploring Training and Inference Scaling Laws in Generative Retrieval | Hongru Cai et.al. | 2503.18941 | translate | read | null |
| 2025-03-24 | Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training | Brian R. Bartoldson et.al. | 2503.18929 | translate | read | link |
| 2025-03-24 | FFN Fusion: Rethinking Sequential Computation in Large Language Models | Akhiad Bercovich et.al. | 2503.18908 | translate | read | null |
| 2025-03-24 | xKV: Cross-Layer SVD for KV-Cache Compression | Chi-Chih Chang et.al. | 2503.18893 | translate | read | link |
| 2025-03-24 | AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration | Zhexuan Wang et.al. | 2503.18891 | translate | read | null |
| 2025-03-24 | Toward building next-generation Geocoding systems: a systematic review | Zhengcong Yin et.al. | 2503.18888 | translate | read | null |
| 2025-03-24 | I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders | Andrey Galichin et.al. | 2503.18878 | translate | read | link |
| 2025-03-24 | Reimagining Memory Access for LLM Inference: Compression-Aware Memory Controller Design | Rui Xie et.al. | 2503.18869 | translate | read | null |
| 2025-03-21 | Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique | Yansi Li et.al. | 2503.17363 | translate | read | null |
| 2025-03-21 | OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement | Yihe Deng et.al. | 2503.17352 | translate | read | link |
| 2025-03-21 | Capturing Individual Human Preferences with Reward Features | André Barreto et.al. | 2503.17338 | translate | read | null |
| 2025-03-21 | Efficient Intent-Based Filtering for Multi-Party Conversations Using Knowledge Distillation from LLMs | Reem Gody et.al. | 2503.17336 | translate | read | null |
| 2025-03-21 | CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities | Yuxuan Zhu et.al. | 2503.17332 | translate | read | link |
| 2025-03-21 | LLM+MAP: Bimanual Robot Task Planning using Large Language Models and Planning Domain Definition Language | Kun Chu et.al. | 2503.17309 | translate | read | null |
| 2025-03-21 | Bugdar: AI-Augmented Secure Code Review for GitHub Pull Requests | John Naulty et.al. | 2503.17302 | translate | read | null |
| 2025-03-21 | CASE – Condition-Aware Sentence Embeddings for Conditional Semantic Textual Similarity Measurement | Gaifan Zhang et.al. | 2503.17279 | translate | read | null |
| 2025-03-21 | SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging | Aladin Djuhera et.al. | 2503.17239 | translate | read | null |
| 2025-03-21 | FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs | Albert Sawczyn et.al. | 2503.17229 | translate | read | null |
| 2025-03-20 | Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models | Yang Sui et.al. | 2503.16419 | translate | read | link |
| 2025-03-20 | The Emperor’s New Clothes in Benchmarking? A Rigorous Examination of Mitigation Strategies for LLM Benchmark Data Contamination | Yifan Sun et.al. | 2503.16402 | translate | read | null |
| 2025-03-20 | Exploring the Hidden Reasoning Process of Large Language Models by Misleading Them | Guanyu Chen et.al. | 2503.16401 | translate | read | null |
| 2025-03-20 | Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation | Yijia Luo et.al. | 2503.16385 | translate | read | link |
| 2025-03-20 | LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images | Leyang Wang et.al. | 2503.16376 | translate | read | null |
| 2025-03-20 | CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners | Yunzhi Yao et.al. | 2503.16356 | translate | read | link |
| 2025-03-20 | LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates | Ying Shen et.al. | 2503.16334 | translate | read | null |
| 2025-03-20 | OmniGeo: Towards a Multimodal Large Language Models for Geospatial Artificial Intelligence | Long Yuan et.al. | 2503.16326 | translate | read | null |
| 2025-03-20 | Bridging Technology and Humanities: Evaluating the Impact of Large Language Models on Social Sciences Research with DeepSeek-R1 | Peiran Gu et.al. | 2503.16304 | translate | read | null |
| 2025-03-20 | Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens | Shuqi Lu et.al. | 2503.16278 | translate | read | link |
| 2025-03-19 | SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | Yifei Zhou et.al. | 2503.15478 | translate | read | link |
| 2025-03-19 | Cube: A Roblox View of 3D Intelligence | Foundation AI Team et.al. | 2503.15475 | translate | read | link |
| 2025-03-19 | From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment | Jia-Nan Li et.al. | 2503.15463 | translate | read | null |
| 2025-03-19 | Visual Position Prompt for MLLM based Visual Grounding | Wei Tang et.al. | 2503.15426 | translate | read | link |
| 2025-03-19 | Probing the topology of the space of tokens with structured prompts | Michael Robinson et.al. | 2503.15421 | translate | read | null |
| 2025-03-19 | EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models | Yinan Liang et.al. | 2503.15369 | translate | read | null |
| 2025-03-19 | SemEval-2025 Task 1: AdMIRe – Advancing Multimodal Idiomaticity Representation | Thomas Pickard et.al. | 2503.15358 | translate | read | null |
| 2025-03-19 | SPILL: Domain-Adaptive Intent Clustering based on Selection and Pooling with Large Language Models | I-Fan Lin et.al. | 2503.15351 | translate | read | null |
| 2025-03-19 | TruthLens: A Training-Free Paradigm for DeepFake Detection | Ritabrata Chakraborty et.al. | 2503.15342 | translate | read | null |
| 2025-03-19 | Uncertainty-Guided Chain-of-Thought for Code Generation with LLMs | Yuqi Zhu et.al. | 2503.15341 | translate | read | null |
| 2025-03-18 | Aligning Multimodal LLM with Human Preference: A Survey | Tao Yu et.al. | 2503.14504 | translate | read | null |
| 2025-03-18 | Engineering Scientific Assistants using Interactive Structured Induction of Programs | Shraddha Surana et.al. | 2503.14488 | translate | read | null |
| 2025-03-18 | Gricean Norms as a Basis for Effective Collaboration | Fardin Saad et.al. | 2503.14484 | translate | read | null |
| 2025-03-18 | Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM | Xinyu Fang et.al. | 2503.14478 | translate | read | link |
| 2025-03-18 | EnvBench: A Benchmark for Automated Environment Setup | Aleksandra Eliseeva et.al. | 2503.14443 | translate | read | link |
| 2025-03-18 | LLM-FE: Automated Feature Engineering for Tabular Data with LLMs as Evolutionary Optimizers | Nikhil Abhyankar et.al. | 2503.14434 | translate | read | link |
| 2025-03-18 | PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play | Wei Fang et.al. | 2503.14432 | translate | read | null |
| 2025-03-18 | Unifying Text Semantics and Graph Structures for Temporal Text-attributed Graphs with Large Language Models | Siwei Zhang et.al. | 2503.14411 | translate | read | null |
| 2025-03-18 | Large Language Models for Virtual Human Gesture Selection | Parisa Ghanad Torshizi et.al. | 2503.14408 | translate | read | null |
| 2025-03-18 | From “Hallucination” to “Suture”: Insights from Language Philosophy to Enhance Large Language Models | Qiantong Wang et.al. | 2503.14392 | translate | read | null |
| 2025-03-17 | MetaScale: Test-Time Scaling with Evolving Meta-Thoughts | Qin Liu et.al. | 2503.13447 | translate | read | null |
| 2025-03-17 | Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance | Noah Y. Siegel et.al. | 2503.13445 | translate | read | null |
| 2025-03-17 | VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning | Ye Liu et.al. | 2503.13444 | translate | read | null |
| 2025-03-17 | xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference | Maximilian Beck et.al. | 2503.13427 | translate | read | null |
| 2025-03-17 | A Comprehensive Survey on Multi-Agent Cooperative Decision-Making: Scenarios, Approaches, Challenges and Perspectives | Weiqiang Jin et.al. | 2503.13415 | translate | read | null |
| 2025-03-17 | DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective | Dengyun Peng et.al. | 2503.13413 | translate | read | null |
| 2025-03-17 | Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis | Alexander Ku et.al. | 2503.13401 | translate | read | null |
| 2025-03-17 | MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research | James Burgess et.al. | 2503.13399 | translate | read | null |
| 2025-03-17 | Scale Efficient Training for Large Datasets | Qing Zhou et.al. | 2503.13385 | translate | read | null |
| 2025-03-17 | Cream of the Crop: Harvesting Rich, Scalable and Transferable Multi-Modal Data for Instruction Fine-Tuning | Mengyao Lyu et.al. | 2503.13383 | translate | read | null |
| 2025-03-14 | ASMA-Tune: Unlocking LLMs’ Assembly Code Comprehension via Structural-Semantic Instruction Tuning | Xinyi Wang et.al. | 2503.11617 | translate | read | null |
| 2025-03-14 | Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs using Semantic Space | Zhiliang Chen et.al. | 2503.11586 | translate | read | null |
| 2025-03-14 | Synthesizing Access Control Policies using Large Language Models | Adarsh Vatsa et.al. | 2503.11573 | translate | read | null |
| 2025-03-14 | Implicit Bias-Like Patterns in Reasoning Models | Messi H. J. Lee et.al. | 2503.11572 | translate | read | null |
| 2025-03-14 | VERIFY: A Benchmark of Visual Explanation and Reasoning for Investigating Multimodal Reasoning Fidelity | Jing Bi et.al. | 2503.11557 | translate | read | null |
| 2025-03-14 | Potential of large language model-powered nudges for promoting daily water and energy conservation | Zonghan Li et.al. | 2503.11531 | translate | read | null |
| 2025-03-14 | HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models | Ziqin Zhou et.al. | 2503.11513 | translate | read | null |
| 2025-03-14 | V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning | Zixu Cheng et.al. | 2503.11495 | translate | read | null |
| 2025-03-14 | A Review of DeepSeek Models’ Key Innovative Techniques | Chengen Wang et.al. | 2503.11486 | translate | read | null |
| 2025-03-14 | T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation | Seyed Mohammad Hadi Hosseini et.al. | 2503.11481 | translate | read | null |
| 2025-03-13 | GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing | Rongyao Fang et.al. | 2503.10639 | translate | read | link |
| 2025-03-13 | HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model | Jiaming Liu et.al. | 2503.10631 | translate | read | null |
| 2025-03-13 | UniGoal: Towards Universal Zero-shot Goal-oriented Navigation | Hang Yin et.al. | 2503.10630 | translate | read | null |
| 2025-03-13 | DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding | Ayesha Ishaq et.al. | 2503.10621 | translate | read | link |
| 2025-03-13 | From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM | Kshitij Ambilduke et.al. | 2503.10620 | translate | read | null |
| 2025-03-13 | Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search | Andy Zhou et.al. | 2503.10619 | translate | read | null |
| 2025-03-13 | Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models | Andy Zhou et.al. | 2503.10617 | translate | read | null |
| 2025-03-13 | R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization | Yi Yang et.al. | 2503.10615 | translate | read | link |
| 2025-03-13 | CoSTA$\ast$: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing | Advait Gupta et.al. | 2503.10613 | translate | read | link |
| 2025-03-13 | TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention | Jinhao Duan et.al. | 2503.10602 | translate | read | link |
| 2025-03-12 | MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System | Jihao Zhao et.al. | 2503.09600 | translate | read | null |
| 2025-03-12 | How to Protect Yourself from 5G Radiation? Investigating LLM Responses to Implicit Misinformation | Ruohao Guo et.al. | 2503.09598 | translate | read | null |
| 2025-03-12 | SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment | Katrin Renz et.al. | 2503.09594 | translate | read | null |
| 2025-03-12 | BIMBA: Selective-Scan Compression for Long-Range Video Question Answering | Md Mohaiminul Islam et.al. | 2503.09590 | translate | read | null |
| 2025-03-12 | Cost-Optimal Grouped-Query Attention for Long-Context LLMs | Yingfa Chen et.al. | 2503.09579 | translate | read | link |
| 2025-03-12 | Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks | Lutfi Eren Erdogan et.al. | 2503.09572 | translate | read | null |
| 2025-03-12 | Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models | Qiguang Chen et.al. | 2503.09567 | translate | read | null |
| 2025-03-12 | Large Language Models for Multi-Facility Location Mechanism Design | Nguyen Thach et.al. | 2503.09533 | translate | read | null |
| 2025-03-12 | Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | Bowen Jin et.al. | 2503.09516 | translate | read | null |
| 2025-03-12 | ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning | Ziyu Wan et.al. | 2503.09501 | translate | read | null |
| 2025-03-11 | Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs | Ariba Khan et.al. | 2503.08688 | translate | read | null |
| 2025-03-11 | OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models | Jialv Zou et.al. | 2503.08686 | translate | read | null |
| 2025-03-11 | Self-Taught Self-Correction for Small Language Models | Viktor Moskvoretskii et.al. | 2503.08681 | translate | read | null |
| 2025-03-11 | Exploring the Word Sense Disambiguation Capabilities of Large Language Models | Pierpaolo Basile et.al. | 2503.08662 | translate | read | null |
| 2025-03-11 | LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization | Xianfeng Wu et.al. | 2503.08619 | translate | read | null |
| 2025-03-11 | EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments | Dongping Li et.al. | 2503.08604 | translate | read | null |
| 2025-03-11 | NSF-SciFy: Mining the NSF Awards Database for Scientific Claims | Delip Rao et.al. | 2503.08600 | translate | read | null |
| 2025-03-11 | HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding | Shehreen Azad et.al. | 2503.08585 | translate | read | null |
| 2025-03-11 | RAG-Adapter: A Plug-and-Play RAG-enhanced Framework for Long Video Understanding | Xichen Tan et.al. | 2503.08576 | translate | read | null |
| 2025-03-11 | DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process | Minjun Zhu et.al. | 2503.08569 | translate | read | null |
| 2025-03-10 | Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru | Dunant Cusipuma et.al. | 2503.07587 | translate | read | null |
| 2025-03-10 | Talking to GDELT Through Knowledge Graphs | Audun Myers et.al. | 2503.07584 | translate | read | null |
| 2025-03-10 | AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning | Yangzhe Kong et.al. | 2503.07557 | translate | read | null |
| 2025-03-10 | Junior Software Developers’ Perspectives on Adopting LLMs for Software Engineering: a Systematic Literature Review | Samuel Ferino et.al. | 2503.07556 | translate | read | null |
| 2025-03-10 | KSOD: Knowledge Supplement for LLMs On Demand | Haoran Li et.al. | 2503.07550 | translate | read | null |
| 2025-03-10 | Bi-Directional Mental Model Reconciliation for Human-Robot Interaction with Large Language Models | Nina Moorman et.al. | 2503.07547 | translate | read | null |
| 2025-03-10 | Queueing, Predictions, and LLMs: Challenges and Open Problems | Michael Mitzenmacher et.al. | 2503.07545 | translate | read | null |
| 2025-03-10 | XIFBench: Evaluating Large Language Models on Multilingual Instruction Following | Zhenyu Li et.al. | 2503.07539 | translate | read | null |
| 2025-03-10 | TokenButler: Token Importance is Predictable | Yash Akhauri et.al. | 2503.07518 | translate | read | null |
| 2025-03-10 | Language Models Fail to Introspect About Their Knowledge of Language | Siyuan Song et.al. | 2503.07513 | translate | read | null |
| 2025-03-10 | LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition? | Bangyan Li et.al. | 2503.07487 | translate | read | null |
| 2025-03-10 | GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models | Ryugo Morita et.al. | 2503.07463 | translate | read | null |
| 2025-03-10 | MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning | Xiangru Tang et.al. | 2503.07459 | translate | read | null |
| 2025-03-10 | LLMs syntactically adapt their language use to their conversational partner | Florian Kandra et.al. | 2503.07457 | translate | read | null |
| 2025-03-10 | From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development – An Opinion Paper | Sargam Yadav et.al. | 2503.07450 | translate | read | null |
| 2025-03-10 | From Text to Visuals: Using LLMs to Generate Math Diagrams with Vector Graphics | Jaewook Lee et.al. | 2503.07429 | translate | read | null |
| 2025-03-10 | RePO: ReLU-based Preference Optimization | Junkang Wu et.al. | 2503.07426 | translate | read | null |
| 2025-03-10 | REF-VLM: Triplet-Based Referring Paradigm for Unified Visual Decoding | Yan Tai et.al. | 2503.07413 | translate | read | link |
| 2025-03-10 | Revisiting Noise in Natural Language Processing for Computational Social Science | Nadav Borenstein et.al. | 2503.07395 | translate | read | null |
| 2025-03-10 | Process-Supervised LLM Recommenders via Flow-guided Tuning | Chongming Gao et.al. | 2503.07377 | translate | read | null |
| 2025-03-07 | Understanding the Limits of Lifelong Knowledge Editing in LLMs | Lukas Thede et.al. | 2503.05683 | translate | read | null |
| 2025-03-07 | A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval | Yu Zhang et.al. | 2503.05659 | translate | read | null |
| 2025-03-07 | Learning LLM Preference over Intra-Dialogue Pairs: A Framework for Utterance-level Understandings | Xuanqing Liu et.al. | 2503.05620 | translate | read | null |
| 2025-03-07 | A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models | Dong Shu et.al. | 2503.05613 | translate | read | null |
| 2025-03-07 | R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning | Huatong Song et.al. | 2503.05592 | translate | read | null |
| 2025-03-07 | Evaluating open-source Large Language Models for automated fact-checking | Nicolo’ Fontana et.al. | 2503.05565 | translate | read | null |
| 2025-03-07 | Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance | Bryan Etzine et.al. | 2503.05551 | translate | read | null |
| 2025-03-07 | Leveraging Approximate Caching for Faster Retrieval-Augmented Generation | Shai Bergman et.al. | 2503.05530 | translate | read | null |
| 2025-03-07 | PoSSUM: A Protocol for Surveying Social-media Users with Multimodal LLMs | Roberto Cerina et.al. | 2503.05529 | translate | read | null |
| 2025-03-07 | Cognitive Bias Detection Using Advanced Prompt Engineering | Frederic Lemieux et.al. | 2503.05516 | translate | read | null |
| 2025-03-06 | L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling | Zhuo Chen et.al. | 2503.04725 | translate | read | null |
| 2025-03-06 | Shifting Long-Context LLMs Research from Input to Output | Yuhao Wu et.al. | 2503.04723 | translate | read | null |
| 2025-03-06 | Enough Coin Flips Can Make LLMs Act Bayesian | Ritwik Gupta et.al. | 2503.04722 | translate | read | null |
| 2025-03-06 | Predictable Scale: Part I – Optimal Hyperparameter Scaling Law in Large Language Model Pretraining | Houyi Li et.al. | 2503.04715 | translate | read | null |
| 2025-03-06 | Universality of Layer-Level Entropy-Weighted Quantization Beyond Model Architecture and Size | Alireza Behtash et.al. | 2503.04704 | translate | read | null |
| 2025-03-06 | UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets | Wenyu Wang et.al. | 2503.04693 | translate | read | null |
| 2025-03-06 | Quantifying the Reasoning Abilities of LLMs on Real-world Clinical Cases | Pengcheng Qiu et.al. | 2503.04691 | translate | read | null |
| 2025-03-06 | LLM-guided Plan and Retrieval: A Strategic Alignment for Interpretable User Satisfaction Estimation in Dialogue | Sangyeop Kim et.al. | 2503.04675 | translate | read | null |
| 2025-03-06 | RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining | Tengfei Zhang et.al. | 2503.04653 | translate | read | null |
| 2025-03-06 | Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment | Wen Yang et.al. | 2503.04647 | translate | read | null |
| 2025-03-05 | The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems | Richard Ren et.al. | 2503.03750 | translate | read | null |
| 2025-03-05 | Process-based Self-Rewarding Language Models | Shimao Zhang et.al. | 2503.03746 | translate | read | null |
| 2025-03-05 | Towards Understanding Distilled Reasoning Models: A Representational Approach | David D. Baek et.al. | 2503.03730 | translate | read | null |
| 2025-03-05 | Improving LLM Safety Alignment with Dual-Objective Optimization | Xuandong Zhao et.al. | 2503.03710 | translate | read | null |
| 2025-03-05 | Effective LLM Knowledge Learning via Model Generalization | Mingkang Zhu et.al. | 2503.03705 | translate | read | null |
| 2025-03-05 | A Practical Memory Injection Attack against LLM Agents | Shen Dong et.al. | 2503.03704 | translate | read | null |
| 2025-03-05 | Developing and Utilizing a Large-Scale Cantonese Dataset for Multi-Tasking in Large Language Models | Jiyue Jiang et.al. | 2503.03702 | translate | read | null |
| 2025-03-05 | Addressing Overprescribing Challenges: Fine-Tuning Large Language Models for Medication Recommendation Tasks | Zihao Zhao et.al. | 2503.03687 | translate | read | null |
| 2025-03-05 | Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models | Bar Karov et.al. | 2503.03669 | translate | read | null |
| 2025-03-05 | Analogical Reasoning Inside Large Language Models: Concept Vectors and the Limits of Abstraction | Gustaw Opiełka et.al. | 2503.03666 | translate | read | null |
| 2025-03-04 | Wikipedia in the Era of LLMs: Evolution and Risks | Siming Huang et.al. | 2503.02879 | translate | read | null |
| 2025-03-04 | The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models | Ke Ji et.al. | 2503.02875 | translate | read | null |
| 2025-03-04 | Prompting Generative AI with Interaction-Augmented Instructions | Leixian Shen et.al. | 2503.02874 | translate | read | null |
| 2025-03-04 | FairSense-AI: Responsible AI Meets Sustainability | Shaina Raza et.al. | 2503.02865 | translate | read | null |
| 2025-03-04 | Calibrating LLM Confidence with Semantic Steering: A Multi-Prompt Aggregation Framework | Ziang Zhou et.al. | 2503.02863 | translate | read | null |
| 2025-03-04 | Privacy and Accuracy-Aware AI/ML Model Deduplication | Hong Guan et.al. | 2503.02862 | translate | read | null |
| 2025-03-04 | Shakespearean Sparks: The Dance of Hallucination and Creativity in LLMs’ Decoding Layers | Zicong He et.al. | 2503.02851 | translate | read | null |
| 2025-03-04 | Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs | Yuzhe Gu et.al. | 2503.02846 | translate | read | null |
| 2025-03-04 | AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation | Songming Zhang et.al. | 2503.02832 | translate | read | null |
| 2025-03-04 | Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression | Nathan Godey et.al. | 2503.02812 | translate | read | null |
| 2025-03-03 | ECLeKTic: a Novel Challenge Set for Evaluation of Cross-Lingual Knowledge Transfer | Omer Goldman et.al. | 2502.21228 | translate | read | null |