LLM - 2025-05
LLM - 2025-05
| Publish Date | Title | Authors | Translate | Read | Code | |
|---|---|---|---|---|---|---|
| 2025-05-30 | MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning | Yiqing Liang et.al. | 2505.24871 | translate | read | link |
| 2025-05-30 | SiLVR: A Simple Language-based Video Reasoning Framework | Ce Zhang et.al. | 2505.24869 | translate | read | link |
| 2025-05-30 | ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | Mingjie Liu et.al. | 2505.24864 | translate | read | null |
| 2025-05-30 | MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning | Jingyan Shen et.al. | 2505.24846 | translate | read | null |
| 2025-05-30 | Chameleon: A Flexible Data-mixing Framework for Language Model Pretraining and Finetuning | Wanyun Xie et.al. | 2505.24844 | translate | read | null |
| 2025-05-30 | Vision LLMs Are Bad at Hierarchical Visual Understanding, and LLMs Are the Bottleneck | Yuwen Tan et.al. | 2505.24840 | translate | read | null |
| 2025-05-30 | VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software | Brandon Man et.al. | 2505.24838 | translate | read | link |
| 2025-05-30 | Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | Juraj Vladika et.al. | 2505.24830 | translate | read | null |
| 2025-05-30 | LegalEval-Q: A New Benchmark for The Quality Evaluation of LLM-Generated Legal Text | Li yunhan et.al. | 2505.24826 | translate | read | null |
| 2025-05-30 | PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | Yinggan Xu et.al. | 2505.24823 | translate | read | null |
| 2025-05-29 | Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought | Yunze Man et.al. | 2505.23766 | translate | read | null |
| 2025-05-29 | From Chat Logs to Collective Insights: Aggregative Question Answering | Wentao Zhang et.al. | 2505.23765 | translate | read | null |
| 2025-05-29 | MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence | Sihan Yang et.al. | 2505.23764 | translate | read | null |
| 2025-05-29 | Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch | Aneeshan Sain et.al. | 2505.23763 | translate | read | null |
| 2025-05-29 | Puzzled by Puzzles: When Vision-Language Models Can’t Take a Hint | Heekyung Lee et.al. | 2505.23759 | translate | read | link |
| 2025-05-29 | DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning | Ziyin Zhang et.al. | 2505.23754 | translate | read | link |
| 2025-05-29 | ThinkGeo: Evaluating Tool-Augmented Agents for Remote Sensing Tasks | Akashah Shabbir et.al. | 2505.23752 | translate | read | link |
| 2025-05-29 | Distortion of AI Alignment: Does Preference Optimization Optimize for Preferences? | Paul Gölz et.al. | 2505.23749 | translate | read | null |
| 2025-05-29 | Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence | Diankun Wu et.al. | 2505.23747 | translate | read | link |
| 2025-05-29 | Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time | Mohamad Chehade et.al. | 2505.23729 | translate | read | null |
| 2025-05-28 | Zero-Shot Vision Encoder Grafting via LLM Surrogates | Kaiyu Yue et.al. | 2505.22664 | translate | read | link |
| 2025-05-28 | AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models | Feng Luo et.al. | 2505.22662 | translate | read | null |
| 2025-05-28 | GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning | Qingchen Yu et.al. | 2505.22661 | translate | read | link |
| 2025-05-28 | 3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model | Wenbo Hu et.al. | 2505.22657 | translate | read | null |
| 2025-05-28 | Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents | Michael Kirchhof et.al. | 2505.22655 | translate | read | null |
| 2025-05-28 | The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason | Ang Lv et.al. | 2505.22653 | translate | read | link |
| 2025-05-28 | Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese | Hanjia Lyu et.al. | 2505.22645 | translate | read | link |
| 2025-05-28 | Learning Composable Chains-of-Thought | Fangcong Yin et.al. | 2505.22635 | translate | read | null |
| 2025-05-28 | Spatial Knowledge Graph-Guided Multimodal Synthesis | Yida Xue et.al. | 2505.22633 | translate | read | null |
| 2025-05-28 | Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs | Ziling Cheng et.al. | 2505.22630 | translate | read | null |
| 2025-05-27 | Silence is Not Consensus: Disrupting Agreement Bias in Multi-Agent LLMs via Catfish Agent for Clinical Decision Making | Yihan Wang et.al. | 2505.21503 | translate | read | null |
| 2025-05-27 | Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment | Xiaojun Jia et.al. | 2505.21494 | translate | read | null |
| 2025-05-27 | Reinforcing General Reasoning without Verifiers | Xiangxin Zhou et.al. | 2505.21493 | translate | read | null |
| 2025-05-27 | Robust Hypothesis Generation: LLM-Automated Language Bias for Inductive Logic Programming | Yang Yang et.al. | 2505.21486 | translate | read | null |
| 2025-05-27 | Are Language Models Consequentialist or Deontological Moral Reasoners? | Keenan Samway et.al. | 2505.21479 | translate | read | null |
| 2025-05-27 | Policy Optimized Text-to-Image Pipeline Design | Uri Gadot et.al. | 2505.21478 | translate | read | null |
| 2025-05-27 | Scaling External Knowledge Input Beyond Context Windows of LLMs via Multi-Agent Collaboration | Zijun Liu et.al. | 2505.21471 | translate | read | link |
| 2025-05-27 | Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance | Shintaro Ozaki et.al. | 2505.21458 | translate | read | null |
| 2025-05-27 | Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO | Muzhi Zhu et.al. | 2505.21457 | translate | read | null |
| 2025-05-27 | Can Large Reasoning Models Self-Train? | Sheikh Shafayat et.al. | 2505.21444 | translate | read | null |
| 2025-05-26 | Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs | Hanting Chen et.al. | 2505.20155 | translate | read | null |
| 2025-05-26 | UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter-Efficient Fine-Tuning of Large Models | Xueyan Zhang et.al. | 2505.20154 | translate | read | null |
| 2025-05-26 | MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents | Ziming Wei et.al. | 2505.20148 | translate | read | null |
| 2025-05-26 | FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities | Jin Wang et.al. | 2505.20147 | translate | read | null |
| 2025-05-26 | StructEval: Benchmarking LLMs’ Capabilities to Generate Structural Outputs | Jialin Yang et.al. | 2505.20139 | translate | read | null |
| 2025-05-26 | Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers | Zhengliang Shi et.al. | 2505.20128 | translate | read | null |
| 2025-05-26 | Agentic AI Process Observability: Discovering Behavioral Variability | Fabiana Fournier et.al. | 2505.20127 | translate | read | null |
| 2025-05-26 | TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent | Dominik Meier et.al. | 2505.20118 | translate | read | null |
| 2025-05-26 | Named Entity Recognition in Historical Italian: The Case of Giacomo Leopardi’s Zibaldone | Cristian Santini et.al. | 2505.20113 | translate | read | null |
| 2025-05-26 | ResSVD: Residual Compensated SVD for Large Language Model Compression | Haolei Bai et.al. | 2505.20112 | translate | read | null |
| 2025-05-26 | Language-Agnostic Suicidal Risk Detection Using Large Language Models | June-Woo Kim et.al. | 2505.20109 | translate | read | null |
| 2025-05-26 | Adaptive Deep Reasoning: Triggering Deep Thinking When Needed | Yunhao Wang et.al. | 2505.20101 | translate | read | null |
| 2025-05-23 | Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs | Wafa Alghallabi et.al. | 2505.18152 | translate | read | null |
| 2025-05-23 | First Finish Search: Efficient Test-Time Scaling in Large Language Models | Aradhye Agarwal et.al. | 2505.18149 | translate | read | null |
| 2025-05-23 | Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find | Owen Bianchi et.al. | 2505.18148 | translate | read | null |
| 2025-05-23 | Gaming Tool Preferences in Agentic LLMs | Kazem Faghih et.al. | 2505.18135 | translate | read | link |
| 2025-05-23 | Reward Model Overoptimisation in Iterated RLHF | Lorenz Wolf et.al. | 2505.18126 | translate | read | null |
| 2025-05-23 | UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification | Poojah Ganesan et.al. | 2505.18122 | translate | read | null |
| 2025-05-23 | ProgRM: Build Better GUI Agents with Progress Rewards | Danyang Zhang et.al. | 2505.18121 | translate | read | null |
| 2025-05-23 | Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models | Jiongran Wu et.al. | 2505.18120 | translate | read | null |
| 2025-05-23 | Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM | Zinuo Li et.al. | 2505.18110 | translate | read | null |
| 2025-05-23 | ManuSearch: Democratizing Deep Search in Large Language Models with a Transparent and Open Multi-Agent Framework | Lisheng Huang et.al. | 2505.18105 | translate | read | null |
| 2025-05-22 | CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms | Shilin Yan et.al. | 2505.17020 | translate | read | link |
| 2025-05-22 | Let Androids Dream of Electric Sheep: A Human-like Image Implication Understanding and Reasoning Framework | Chenhao Zhang et.al. | 2505.17019 | translate | read | link |
| 2025-05-22 | SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward | Kaixuan Fan et.al. | 2505.17018 | translate | read | link |
| 2025-05-22 | Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | Chengzhuo Tong et.al. | 2505.17017 | translate | read | link |
| 2025-05-22 | Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models | Runsen Xu et.al. | 2505.17015 | translate | read | link |
| 2025-05-22 | SpatialScore: Towards Unified Evaluation for Multimodal Spatial Understanding | Haoning Wu et.al. | 2505.17012 | translate | read | link |
| 2025-05-22 | R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | Huatong Song et.al. | 2505.17005 | translate | read | link |
| 2025-05-22 | Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? | Jin Jiang et.al. | 2505.16998 | translate | read | link |
| 2025-05-22 | DecoupledESC: Enhancing Emotional Support Generation via Strategy-Response Decoupled Preference Optimization | Chao Zhang et.al. | 2505.16995 | translate | read | null |
| 2025-05-22 | Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | Runpeng Yu et.al. | 2505.16990 | translate | read | link |
| 2025-05-21 | The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation | Patrick Kahardipraja et.al. | 2505.15807 | translate | read | null |
| 2025-05-21 | Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Hwan Chang et.al. | 2505.15805 | translate | read | null |
| 2025-05-21 | STAR-R1: Spacial TrAnsformation Reasoning by Reinforcing Multimodal LLMs | Zongzhao Li et.al. | 2505.15804 | translate | read | null |
| 2025-05-21 | VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models | Yuchen Yan et.al. | 2505.15801 | translate | read | null |
| 2025-05-21 | Reverse Engineering Human Preferences with Reinforcement Learning | Lisa Alazraki et.al. | 2505.15795 | translate | read | null |
| 2025-05-21 | HCRMP: A LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving | Zhiwen Chen et.al. | 2505.15793 | translate | read | null |
| 2025-05-21 | Large Language Models as Computable Approximations to Solomonoff Induction | Jun Wan et.al. | 2505.15784 | translate | read | null |
| 2025-05-21 | ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning | Changtai Zhu et.al. | 2505.15776 | translate | read | null |
| 2025-05-21 | Beyond Hard and Soft: Hybrid Context Compression for Balancing Local and Global Information Retention | Huanxuan Liao et.al. | 2505.15774 | translate | read | null |
| 2025-05-21 | MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | Cheng Yifan et.al. | 2505.15772 | translate | read | null |
| 2025-05-20 | Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning | Haolei Xu et.al. | 2505.14684 | translate | read | null |
| 2025-05-20 | UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation | Rui Tian et.al. | 2505.14682 | translate | read | null |
| 2025-05-20 | UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models | Xiaojie Gu et.al. | 2505.14679 | translate | read | null |
| 2025-05-20 | Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning | Jiaer Xia et.al. | 2505.14677 | translate | read | null |
| 2025-05-20 | Reward Reasoning Model | Jiaxin Guo et.al. | 2505.14674 | translate | read | null |
| 2025-05-20 | Quartet: Native FP4 Training Can Be Optimal for Large Language Models | Roberto L. Castro et.al. | 2505.14669 | translate | read | null |
| 2025-05-20 | ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions | Bufang Yang et.al. | 2505.14668 | translate | read | null |
| 2025-05-20 | Beyond Words: Multimodal LLM Knows When to Speak | Zikai Liao et.al. | 2505.14654 | translate | read | null |
| 2025-05-20 | General-Reasoner: Advancing LLM Reasoning Across All Domains | Xueguang Ma et.al. | 2505.14652 | translate | read | null |
| 2025-05-20 | Think Only When You Need with Large Hybrid-Reasoning Models | Lingjie Jiang et.al. | 2505.14631 | translate | read | null |
| 2025-05-19 | CIE: Controlling Language Model Text Generations Using Continuous Signals | Vinay Samuel et.al. | 2505.13448 | translate | read | link |
| 2025-05-19 | Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards | Xiaoyuan Liu et.al. | 2505.13445 | translate | read | null |
| 2025-05-19 | Optimizing Anytime Reasoning via Budget Relative Policy Optimization | Penghui Qi et.al. | 2505.13438 | translate | read | link |
| 2025-05-19 | SMOTExT: SMOTE meets Large Language Models | Mateusz Bystroński et.al. | 2505.13434 | translate | read | null |
| 2025-05-19 | Fine-tuning Quantized Neural Networks with Zeroth-order Optimization | Sifeng Shang et.al. | 2505.13430 | translate | read | null |
| 2025-05-19 | Understanding Complexity in VideoQA via Visual Program Generation | Cristobal Eyzaguirre et.al. | 2505.13429 | translate | read | null |
| 2025-05-19 | MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision | Lingxiao Du et.al. | 2505.13427 | translate | read | link |
| 2025-05-19 | Learnware of Language Models: Specialized Small Language Models Can Do Big | Zhi-Hao Tan et.al. | 2505.13425 | translate | read | null |
| 2025-05-19 | Make Still Further Progress: Chain of Thoughts for Tabular Data Leaderboard | Si-Yang Liu et.al. | 2505.13421 | translate | read | null |
| 2025-05-19 | FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning | Zhuozhao Hu et.al. | 2505.13419 | translate | read | link |
| 2025-05-16 | Modeling cognitive processes of natural reading with transformer-based Language Models | Bruno Bianchi et.al. | 2505.11485 | translate | read | null |
| 2025-05-16 | msf-CNN: Patch-based Multi-Stage Fusion with Convolutional Neural Networks for TinyML | Zhaolan Huang et.al. | 2505.11483 | translate | read | null |
| 2025-05-16 | Improving Assembly Code Performance with Large Language Models via Reinforcement Learning | Anjiang Wei et.al. | 2505.11480 | translate | read | null |
| 2025-05-16 | HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages | Zhilin Wang et.al. | 2505.11475 | translate | read | null |
| 2025-05-16 | Disentangling Reasoning and Knowledge in Medical Large Language Models | Rahul Thapa et.al. | 2505.11462 | translate | read | null |
| 2025-05-16 | ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks | Zhixiong Zhuang et.al. | 2505.11459 | translate | read | null |
| 2025-05-16 | HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | Shaina Raza et.al. | 2505.11454 | translate | read | null |
| 2025-05-16 | LLMs unlock new paths to monetizing exploits | Nicholas Carlini et.al. | 2505.11449 | translate | read | null |
| 2025-05-16 | Is Compression Really Linear with Code Intelligence? | Xianzhen Luo et.al. | 2505.11441 | translate | read | null |
| 2025-05-16 | GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art | Chenkai Zhang et.al. | 2505.11436 | translate | read | null |
| 2025-05-15 | End-to-End Vision Tokenizer Tuning | Wenxuan Wang et.al. | 2505.10562 | translate | read | null |
| 2025-05-15 | Neural Thermodynamic Laws for Large Language Model Training | Ziming Liu et.al. | 2505.10559 | translate | read | null |
| 2025-05-15 | MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning | Ke Wang et.al. | 2505.10557 | translate | read | link |
| 2025-05-15 | Does Feasibility Matter? Understanding the Impact of Feasibility on Synthetic Training Data | Yiwen Liu et.al. | 2505.10551 | translate | read | link |
| 2025-05-15 | Towards a Deeper Understanding of Reasoning Capabilities in Large Language Models | Annie Wong et.al. | 2505.10543 | translate | read | link |
| 2025-05-15 | Exploring Implicit Visual Misunderstandings in Multimodal Large Language Models through Attention Analysis | Pengfei Wang et.al. | 2505.10541 | translate | read | link |
| 2025-05-15 | S3C2 Summit 2024-09: Industry Secure Software Supply Chain Summit | Imranur Rahman et.al. | 2505.10538 | translate | read | null |
| 2025-05-15 | RouteNator: A Router-Based Multi-Modal Architecture for Generating Synthetic Training Data for Function Calling LLMs | Vibha Belavadi et.al. | 2505.10495 | translate | read | null |
| 2025-05-15 | Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective | Yutao Mou et.al. | 2505.10494 | translate | read | link |
| 2025-05-15 | CL-RAG: Bridging the Gap in Retrieval-Augmented Generation with Curriculum Learning | Shaohan Wang et.al. | 2505.10493 | translate | read | null |
| 2025-05-14 | Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors | Nicolas Dupuis et.al. | 2505.09610 | translate | read | null |
| 2025-05-14 | Adversarial Suffix Filtering: a Defense Pipeline for LLMs | David Khachaturov et.al. | 2505.09602 | translate | read | null |
| 2025-05-14 | How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference | Nidhal Jegham et.al. | 2505.09598 | translate | read | null |
| 2025-05-14 | WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models | Abdullah Mushtaq et.al. | 2505.09595 | translate | read | null |
| 2025-05-14 | Variational Visual Question Answering | Tobias Jan Wieczorek et.al. | 2505.09591 | translate | read | null |
| 2025-05-14 | Beyond Likes: How Normative Feedback Complements Engagement Signals on Social Media | Yuchen Wu et.al. | 2505.09583 | translate | read | null |
| 2025-05-14 | Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach | Shannon Lodoen et.al. | 2505.09576 | translate | read | null |
| 2025-05-14 | MIGRATION-BENCH: Repository-Level Code Migration Benchmark from Java 8 | Linbo Liu et.al. | 2505.09569 | translate | read | null |
| 2025-05-14 | PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning | Zongqian Li et.al. | 2505.09519 | translate | read | null |
| 2025-05-14 | Layered Unlearning for Adversarial Relearning | Timothy Qian et.al. | 2505.09500 | translate | read | link |
| 2025-05-13 | CodePDE: An Inference Framework for LLM-driven PDE Solver Generation | Shanda Li et.al. | 2505.08783 | translate | read | null |
| 2025-05-13 | HealthBench: Evaluating Large Language Models Towards Improved Human Health | Rahul K. Arora et.al. | 2505.08775 | translate | read | link |
| 2025-05-14 | Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology | Yatai Ji et.al. | 2505.08765 | translate | read | null |
| 2025-05-13 | AC-Reason: Towards Theory-Guided Actual Causality Reasoning with Large Language Models | Yanxi Zhang et.al. | 2505.08750 | translate | read | null |
| 2025-05-13 | DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models | Xiaoyang Chen et.al. | 2505.08744 | translate | read | link |
| 2025-05-13 | Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies | Xiaoliang Luo et.al. | 2505.08739 | translate | read | null |
| 2025-05-13 | NurValues: Real-World Nursing Values Evaluation for Large Language Models in Clinical Context | Ben Yao et.al. | 2505.08734 | translate | read | null |
| 2025-05-13 | Securing RAG: A Risk Assessment and Mitigation Framework | Lukas Ammann et.al. | 2505.08728 | translate | read | null |
| 2025-05-13 | PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts | Yang Su et.al. | 2505.08719 | translate | read | null |
| 2025-05-13 | LLM-based Prompt Ensemble for Reliable Medical Entity Recognition from EHRs | K M Sajjadul Islam et.al. | 2505.08704 | translate | read | null |
| 2025-05-12 | A Comparative Analysis of Static Word Embeddings for Hungarian | Máté Gedeon et.al. | 2505.07809 | translate | read | null |
| 2025-05-12 | Learning Dynamics in Continual Pre-Training for Large Language Models | Xingjin Wang et.al. | 2505.07796 | translate | read | null |
| 2025-05-12 | Domain Regeneration: How well do LLMs match syntactic properties of text domains? | Da Ju et.al. | 2505.07784 | translate | read | null |
| 2025-05-12 | Relative Overfitting and Accept-Reject Framework | Yanxin Liu et.al. | 2505.07783 | translate | read | null |
| 2025-05-12 | MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | Rushi Qiang et.al. | 2505.07782 | translate | read | null |
| 2025-05-12 | Must Read: A Systematic Survey of Computational Persuasion | Nimet Beyza Bozdag et.al. | 2505.07775 | translate | read | null |
| 2025-05-12 | Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving | Xinji Mai et.al. | 2505.07773 | translate | read | link |
| 2025-05-12 | Enhancing Code Generation via Bidirectional Comment-Level Mutual Grounding | Yifeng Di et.al. | 2505.07768 | translate | read | null |
| 2025-05-12 | Assessing the Chemical Intelligence of Large Language Models | Nicholas T. Runcie et.al. | 2505.07735 | translate | read | null |
| 2025-05-12 | Spoken Language Understanding on Unseen Tasks With In-Context Learning | Neeraj Agrawal et.al. | 2505.07731 | translate | read | null |
| 2025-05-09 | From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling | Vahid Rahimzadeh et.al. | 2505.06184 | translate | read | null |
| 2025-05-09 | A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows | Linjiang Cao et.al. | 2505.06178 | translate | read | null |
| 2025-05-09 | MonetGPT: Solving Puzzles Enhances MLLMs’ Image Retouching Skills | Niladri Shekhar Dutt et.al. | 2505.06176 | translate | read | null |
| 2025-05-09 | Turbo-ICL: In-Context Learning-Based Turbo Equalization | Zihang Song et.al. | 2505.06175 | translate | read | null |
| 2025-05-09 | A Scaling Law for Token Efficiency in LLM Fine-Tuning Under Fixed Compute Budgets | Ryan Lagasse et.al. | 2505.06150 | translate | read | null |
| 2025-05-09 | Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study | Faeze Ghorbanpour et.al. | 2505.06149 | translate | read | null |
| 2025-05-09 | LLMs Get Lost In Multi-Turn Conversation | Philippe Laban et.al. | 2505.06120 | translate | read | link |
| 2025-05-09 | Multimodal Sentiment Analysis on CMU-MOSEI Dataset using Transformer-based Models | Jugal Gajjar et.al. | 2505.06110 | translate | read | null |
| 2025-05-09 | LLMs Outperform Experts on Challenging Biology Benchmarks | Lennart Justen et.al. | 2505.06108 | translate | read | null |
| 2025-05-09 | Free and Fair Hardware: A Pathway to Copyright Infringement-Free Verilog Generation using LLMs | Sam Bush et.al. | 2505.06096 | translate | read | null |
| 2025-05-08 | Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation | Chao Liao et.al. | 2505.05472 | translate | read | null |
| 2025-05-08 | Flow-GRPO: Training Flow Matching Models via Online RL | Jie Liu et.al. | 2505.05470 | translate | read | link |
| 2025-05-08 | Generating Physically Stable and Buildable LEGO Designs from Text | Ava Pun et.al. | 2505.05469 | translate | read | link |
| 2025-05-08 | StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant | Haibo Wang et.al. | 2505.05467 | translate | read | null |
| 2025-05-08 | ComPO: Preference Alignment via Comparison Oracles | Peter Chen et.al. | 2505.05465 | translate | read | null |
| 2025-05-08 | Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging | Shiqi Chen et.al. | 2505.05464 | translate | read | link |
| 2025-05-08 | UKElectionNarratives: A Dataset of Misleading Narratives Surrounding Recent UK General Elections | Fatima Haouari et.al. | 2505.05459 | translate | read | null |
| 2025-05-08 | SITE: towards Spatial Intelligence Thorough Evaluation | Wenqi Wang et.al. | 2505.05456 | translate | read | null |
| 2025-05-08 | Conversational Process Model Redesign | Nataliia Klievtsova et.al. | 2505.05453 | translate | read | null |
| 2025-05-08 | clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations | Chalamalasetti Kranti et.al. | 2505.05445 | translate | read | null |
| 2025-05-07 | EchoInk-R1: Exploring Audio-Visual Reasoning in Multimodal LLMs via Reinforcement Learning | Zhenghao Xing et.al. | 2505.04623 | translate | read | null |
| 2025-05-07 | On Path to Multimodal Generalist: General-Level and General-Bench | Hao Fei et.al. | 2505.04620 | translate | read | link |
| 2025-05-07 | OmniGIRL: A Multilingual and Multimodal Benchmark for GitHub Issue Resolution | Lianghong Guo et.al. | 2505.04606 | translate | read | null |
| 2025-05-08 | MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection | Zhihao Zhang et.al. | 2505.04594 | translate | read | null |
| 2025-05-07 | ZeroSearch: Incentivize the Search Capability of LLMs without Searching | Hao Sun et.al. | 2505.04588 | translate | read | link |
| 2025-05-07 | SlideItRight: Using AI to Find Relevant Slides and Provide Feedback for Open-Ended Questions | Chloe Qianhui Zhao et.al. | 2505.04584 | translate | read | null |
| 2025-05-07 | Fight Fire with Fire: Defending Against Malicious RL Fine-Tuning via Reward Neutralization | Wenjun Cao et.al. | 2505.04578 | translate | read | null |
| 2025-05-07 | Comparative Analysis of Carbon Footprint in Manual vs. LLM-Assisted Code Development | Kuen Sum Cheung et.al. | 2505.04521 | translate | read | null |
| 2025-05-07 | Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs | Yehui Tang et.al. | 2505.04519 | translate | read | null |
| 2025-05-07 | CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation | Jiahao Li et.al. | 2505.04481 | translate | read | null |
| 2025-05-06 | VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model | Zuwei Long et.al. | 2505.03739 | translate | read | link |
| 2025-05-06 | Graph Drawing for LLMs: An Empirical Evaluation | Walter Didimo et.al. | 2505.03678 | translate | read | null |
| 2025-05-06 | Binding threshold units with artificial oscillatory neurons | Vladimir Fanaskov et.al. | 2505.03648 | translate | read | null |
| 2025-05-06 | PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing | Yiping Xie et.al. | 2505.03621 | translate | read | null |
| 2025-05-06 | A Unifying Bias-aware Multidisciplinary Framework for Investigating Socio-Technical Issues | Sacha Hasan et.al. | 2505.03593 | translate | read | null |
| 2025-05-06 | BCause: Human-AI collaboration to improve hybrid mapping and ideation in argumentation-grounded deliberation | Lucas Anastasiou et.al. | 2505.03584 | translate | read | null |
| 2025-05-06 | DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes | Sergey Linok et.al. | 2505.03581 | translate | read | link |
| 2025-05-06 | LlamaFirewall: An open source guardrail system for building secure AI agents | Sahana Chennabasappa et.al. | 2505.03574 | translate | read | null |
| 2025-05-06 | Say It Another Way: A Framework for User-Grounded Paraphrasing | Cléa Chataigner et.al. | 2505.03563 | translate | read | null |
| 2025-05-06 | A Comprehensive Survey of Large AI Models for Future Communications: Foundations, Applications and Challenges | Feibo Jiang et.al. | 2505.03556 | translate | read | null |
| 2025-05-05 | Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation | Lu Ling et.al. | 2505.02836 | translate | read | null |
| 2025-05-05 | R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning | Yi-Fan Zhang et.al. | 2505.02835 | translate | read | link |
| 2025-05-05 | ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations | Dmitriy Shopkhoev et.al. | 2505.02819 | translate | read | link |
| 2025-05-05 | Towards Quantifying the Hessian Structure of Neural Networks | Zhaorui Dong et.al. | 2505.02809 | translate | read | null |
| 2025-05-05 | Generating HomeAssistant Automations Using an LLM-based Chatbot | Mathyas Giudici et.al. | 2505.02802 | translate | read | null |
| 2025-05-05 | HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models | Zheng Lin et.al. | 2505.02795 | translate | read | null |
| 2025-05-05 | Beyond the Monitor: Mixed Reality Visualization and AI for Enhanced Digital Pathology Workflow | Jai Prakash Veerla et.al. | 2505.02780 | translate | read | null |
| 2025-05-05 | Giving Simulated Cells a Voice: Evolving Prompt-to-Intervention Models for Cellular Control | Nam H. Le et.al. | 2505.02766 | translate | read | null |
| 2025-05-05 | Bye-bye, Bluebook? Automating Legal Procedure with Large Language Models | Matthew Dahl et.al. | 2505.02763 | translate | read | null |
| 2025-05-05 | Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation | Pons Gerard et.al. | 2505.02737 | translate | read | null |
| 2025-05-02 | Helping Big Language Models Protect Themselves: An Enhanced Filtering and Summarization System | Sheikh Samit Muhaimin et.al. | 2505.01315 | translate | read | null |
| 2025-05-02 | Enhancing SPARQL Query Rewriting for Complex Ontology Alignments | Anicet Lepetit Ondo et.al. | 2505.01309 | translate | read | null |
| 2025-05-02 | Document Retrieval Augmented Fine-Tuning (DRAFT) for safety-critical software assessments | Regan Bolton et.al. | 2505.01307 | translate | read | null |
| 2025-05-02 | FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing | Gaoxiang Cong et.al. | 2505.01263 | translate | read | null |
| 2025-05-02 | Digital Pathway Curation (DPC): a comparative pipeline to assess the reproducibility, consensus and accuracy across Gemini, PubMed, and scientific reviewers in biomedical research | Flavio Lichtenstein et.al. | 2505.01259 | translate | read | null |
| 2025-05-02 | CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning | Tsai-Ning Wang et.al. | 2505.01199 | translate | read | null |
| 2025-05-02 | LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures | Francisco Aguilera-Martínez et.al. | 2505.01177 | translate | read | null |
| 2025-05-02 | Methodological Foundations for AI-Driven Survey Question Generation | Ted K. Mburu et.al. | 2505.01150 | translate | read | null |
| 2025-05-02 | Retrieval-Augmented Generation in Biomedicine: A Survey of Technologies, Datasets, and Clinical Applications | Jiawei He et.al. | 2505.01146 | translate | read | null |
| 2025-05-02 | MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning | Murtadha Ahmed et.al. | 2505.01110 | translate | read | null |
| 2025-05-01 | T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT | Dongzhi Jiang et.al. | 2505.00703 | translate | read | link |
| 2025-05-01 | Steering Large Language Models with Register Analysis for Arbitrary Style Transfer | Xinchen Yang et.al. | 2505.00679 | translate | read | null |
| 2025-05-01 | Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions | Yiming Du et.al. | 2505.00675 | translate | read | link |
| 2025-05-01 | DeepCritic: Deliberate Critique with Large Language Models | Wenkai Yang et.al. | 2505.00662 | translate | read | link |
| 2025-05-01 | On the generalization of language models from in-context learning and finetuning: a controlled study | Andrew K. Lampinen et.al. | 2505.00661 | translate | read | null |
| 2025-05-01 | Large Language Models Understanding: an Inherent Ambiguity Barrier | Daniel N. Nissani et.al. | 2505.00654 | translate | read | null |
| 2025-05-01 | Open-Source LLM-Driven Federated Transformer for Predictive IoV Management | Yazan Otoum et.al. | 2505.00651 | translate | read | null |
| 2025-05-01 | Investigating Task Arithmetic for Zero-Shot Information Retrieval | Marco Braga et.al. | 2505.00649 | translate | read | null |
| 2025-05-01 | The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them) | Zihao Wang et.al. | 2505.00626 | translate | read | null |
| 2025-05-01 | FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation | Chaitali Bhattacharyya et.al. | 2505.00624 | translate | read | null |
(<a href=../LLM.md>back to LLM</a>)