LLM - 2025-06
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-06-30 | Calligrapher: Freestyle Text Image Customization | Yue Ma et.al. | 2506.24123 | translate | read | null |
| 2025-06-30 | Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime | Yuqing Wang et.al. | 2506.24120 | translate | read | null |
| 2025-06-30 | DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World | Xiangtai Li et.al. | 2506.24102 | translate | read | null |
| 2025-06-30 | Logit-Gap Steering: Efficient Short-Suffix Jailbreaks for Aligned Large Language Models | Tung-Ling Li et.al. | 2506.24056 | translate | read | null |
| 2025-06-30 | Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC | Xinming Wei et.al. | 2506.24045 | translate | read | null |
| 2025-06-30 | A Survey on Vision-Language-Action Models for Autonomous Driving | Sicong Jiang et.al. | 2506.24044 | translate | read | null |
| 2025-06-30 | EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations | Hyunjong Kim et.al. | 2506.24016 | translate | read | null |
| 2025-06-30 | Large Language Models Don’t Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective | Anselm R. Strohmaier et.al. | 2506.24006 | translate | read | null |
| 2025-06-30 | Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning | Seungjun Yi et.al. | 2506.23998 | translate | read | null |
| 2025-06-27 | The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements | Bingchen Zhao et.al. | 2506.22419 | translate | read | null |
| 2025-06-27 | HyperCLOVA X THINK Technical Report | NAVER Cloud HyperCLOVA X Team et.al. | 2506.22403 | translate | read | null |
| 2025-06-27 | Refining Czech GEC: Insights from a Multi-Experiment Approach | Petr Pechman et.al. | 2506.22402 | translate | read | null |
| 2025-06-27 | QuickSilver – Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization | Danush Khanna et.al. | 2506.22396 | translate | read | null |
| 2025-06-27 | What Makes ChatGPT Effective for Software Issue Resolution? An Empirical Study of Developer-ChatGPT Conversations in GitHub | Ramtin Ehsani et.al. | 2506.22390 | translate | read | null |
| 2025-06-27 | Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment | Yue Zhang et.al. | 2506.22385 | translate | read | null |
| 2025-06-27 | Probabilistic Optimality for Inference-time Scaling | Youkang Wang et.al. | 2506.22376 | translate | read | null |
| 2025-06-27 | Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement | Maryam Mousavian et.al. | 2506.22372 | translate | read | null |
| 2025-06-27 | Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny | Carolina Carreira et.al. | 2506.22370 | translate | read | null |
| 2025-06-27 | Concept-Level AI for Telecom: Moving Beyond Large Language Models | Viswanath Kumarskandpriya et.al. | 2506.22359 | translate | read | null |
| 2025-06-26 | Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test | Ziyue Li et.al. | 2506.21551 | translate | read | null |
| 2025-06-26 | mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale | Xiaona Zhou et.al. | 2506.21550 | translate | read | null |
| 2025-06-26 | PsyLite Technical Report | Fangjun Ding et.al. | 2506.21536 | translate | read | null |
| 2025-06-26 | Exploring the Design Space of 3D MLLMs for CT Report Generation | Mohammed Baharoon et.al. | 2506.21535 | translate | read | null |
| 2025-06-26 | “What’s Up, Doc?”: Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets | Akshay Paruchuri et.al. | 2506.21532 | translate | read | null |
| 2025-06-26 | Potemkin Understanding in Large Language Models | Marina Mancoridis et.al. | 2506.21521 | translate | read | null |
| 2025-06-26 | Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration | Jiahe Chen et.al. | 2506.21509 | translate | read | null |
| 2025-06-26 | Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge | Boyu Gou et.al. | 2506.21506 | translate | read | null |
| 2025-06-26 | Bridging Offline and Online Reinforcement Learning for LLMs | Jack Lanchantin et.al. | 2506.21495 | translate | read | null |
| 2025-06-26 | Efficient and Reuseable Cloud Configuration Search Using Discovery Spaces | Michael Johnston et.al. | 2506.21467 | translate | read | null |
| 2025-06-25 | The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind | Andrei Lupu et.al. | 2506.20664 | translate | read | null |
| 2025-06-25 | Memento: Note-Taking for Your Future Self | Chao Wan et.al. | 2506.20642 | translate | read | null |
| 2025-06-25 | Towards Community-Driven Agents for Machine Learning Engineering | Sijie Li et.al. | 2506.20640 | translate | read | null |
| 2025-06-25 | DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation | Shansan Gong et.al. | 2506.20639 | translate | read | null |
| 2025-06-25 | AI Assistants to Enhance and Exploit the PETSc Knowledge Base | Barry Smith et.al. | 2506.20608 | translate | read | null |
| 2025-06-25 | Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm | Baixiang Huang et.al. | 2506.20606 | translate | read | null |
| 2025-06-25 | Video Perception Models for 3D Scene Synthesis | Rui Huang et.al. | 2506.20601 | translate | read | null |
| 2025-06-25 | HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction | Zhonghao Shi et.al. | 2506.20566 | translate | read | null |
| 2025-06-25 | Large Language Model-Driven Code Compliance Checking in Building Information Modeling | Soumya Madireddy et.al. | 2506.20551 | translate | read | null |
| 2025-06-25 | When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs | Ammar Khairi et.al. | 2506.20544 | translate | read | null |
| 2025-06-24 | ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing | Long Xing et.al. | 2506.19848 | translate | read | null |
| 2025-06-24 | JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning | Ai Han et.al. | 2506.19846 | translate | read | null |
| 2025-06-24 | MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration | Yucheng Zhou et.al. | 2506.19835 | translate | read | null |
| 2025-06-24 | Curating art exhibitions using machine learning | Eurico Covas et.al. | 2506.19813 | translate | read | null |
| 2025-06-24 | KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality | Baochang Ren et.al. | 2506.19807 | translate | read | null |
| 2025-06-24 | LLM-Based Social Simulations Require a Boundary | Zengqing Wu et.al. | 2506.19806 | translate | read | null |
| 2025-06-24 | KnowML: Improving Generalization of ML-NIDS with Attack Knowledge Graphs | Xin Fan Guo et.al. | 2506.19802 | translate | read | null |
| 2025-06-24 | Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study | Yuqi Zhu et.al. | 2506.19794 | translate | read | null |
| 2025-06-24 | SAGE: Strategy-Adaptive Generation Engine for Query Rewriting | Teng Wang et.al. | 2506.19783 | translate | read | null |
| 2025-06-24 | SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning | Yuqian Fu et.al. | 2506.19767 | translate | read | null |
| 2025-06-23 | jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval | Michael Günther et.al. | 2506.18902 | translate | read | null |
| 2025-06-23 | Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations | Jiaming Han et.al. | 2506.18898 | translate | read | null |
| 2025-06-23 | ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs | Jiaru Zou et.al. | 2506.18896 | translate | read | null |
| 2025-06-23 | Universal Video Temporal Grounding with Generative Multi-modal Large Language Models | Zeqian Li et.al. | 2506.18883 | translate | read | null |
| 2025-06-23 | CommVQ: Commutative Vector Quantization for KV Cache Compression | Junyan Li et.al. | 2506.18879 | translate | read | null |
| 2025-06-23 | OmniGen2: Exploration to Advanced Multimodal Generation | Chenyuan Wu et.al. | 2506.18871 | translate | read | null |
| 2025-06-23 | TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting | Zhongbin Guo et.al. | 2506.18862 | translate | read | null |
| 2025-06-23 | LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning | Yuhao Wu et.al. | 2506.18841 | translate | read | null |
| 2025-06-23 | STU-PID: Steering Token Usage via PID Controller for Efficient Large Language Model Reasoning | Aryasomayajula Ram Bharadwaj et.al. | 2506.18831 | translate | read | null |
| 2025-06-23 | Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories | Islem Bouzenia et.al. | 2506.18824 | translate | read | null |
| 2025-06-20 | VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Zhangyang Qi et.al. | 2506.17221 | translate | read | null |
| 2025-06-20 | No Free Lunch: Rethinking Internal Feedback for LLM Reasoning | Yanzhi Zhang et.al. | 2506.17219 | translate | read | null |
| 2025-06-20 | Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency | Kathleen C. Fraser et.al. | 2506.17209 | translate | read | null |
| 2025-06-20 | Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems | Matias Martinez et.al. | 2506.17208 | translate | read | null |
| 2025-06-20 | Confidence Scoring for LLM-Generated SQL in Supply Chain Data Extraction | Jiekai Ma et.al. | 2506.17203 | translate | read | null |
| 2025-06-20 | Detecting LLM-Generated Short Answers and Effects on Learner Performance | Shambhavi Bhushan et.al. | 2506.17196 | translate | read | null |
| 2025-06-20 | The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making | Abinitha Gourabathina et.al. | 2506.17163 | translate | read | null |
| 2025-06-20 | Do We Need Large VLMs for Spotting Soccer Actions? | Ritabrata Chakraborty et.al. | 2506.17144 | translate | read | null |
| 2025-06-20 | Large Language Model Unlearning for Source Code | Xue Jiang et.al. | 2506.17125 | translate | read | null |
| 2025-06-20 | When Can Model-Free Reinforcement Learning be Enough for Thinking? | Josiah P. Hanna et.al. | 2506.17124 | translate | read | null |
| 2025-06-18 | PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning | Yuhui Shi et.al. | 2506.15683 | translate | read | null |
| 2025-06-18 | GenRecal: Generation after Recalibration from Large to Small Vision-Language Models | Byung-Kwan Lee et.al. | 2506.15681 | translate | read | null |
| 2025-06-18 | SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence | Yao Zhang et.al. | 2506.15672 | translate | read | null |
| 2025-06-18 | CC-LEARN: Cohort-based Consistency Learning | Xiao Ye et.al. | 2506.15662 | translate | read | null |
| 2025-06-18 | PhishDebate: An LLM-Based Multi-Agent Framework for Phishing Website Detection | Wenhao Li et.al. | 2506.15656 | translate | read | null |
| 2025-06-18 | deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses | Georgios Androutsopoulos et.al. | 2506.15648 | translate | read | null |
| 2025-06-18 | Demystifying the Visual Quality Paradox in Multimodal Large Language Models | Shuo Xing et.al. | 2506.15645 | translate | read | null |
| 2025-06-18 | Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability | Yusuke Sakai et.al. | 2506.15629 | translate | read | null |
| 2025-06-18 | The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games | Lyle Goodyear et.al. | 2506.15624 | translate | read | null |
| 2025-06-18 | The Compositional Architecture of Regret in Large Language Models | Xiangxiang Cui et.al. | 2506.15617 | translate | read | null |
| 2025-06-17 | A Variational Framework for Improving Naturalness in Generative Spoken Language Models | Li-Wei Chen et.al. | 2506.14767 | translate | read | link |
| 2025-06-17 | ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM | Yujun Wang et.al. | 2506.14766 | translate | read | null |
| 2025-06-17 | Large Language Models – the Future of Fundamental Physics? | Caroline Heneka et.al. | 2506.14757 | translate | read | null |
| 2025-06-17 | Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | Ring Team et.al. | 2506.14731 | translate | read | null |
| 2025-06-17 | AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes | Jiahao Qiu et.al. | 2506.14728 | translate | read | link |
| 2025-06-17 | HARMONY: A Scalable Distributed Vector Database for High-Throughput Approximate Nearest Neighbor Search | Qian Xu et.al. | 2506.14707 | translate | read | null |
| 2025-06-17 | Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data | Anton Changalidis et.al. | 2506.14704 | translate | read | null |
| 2025-06-17 | Unified Software Engineering agent as AI Software Engineer | Leonhard Applis et.al. | 2506.14683 | translate | read | null |
| 2025-06-17 | AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models | Ads Dawson et.al. | 2506.14682 | translate | read | null |
| 2025-06-17 | Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality | Yuto Harada et.al. | 2506.14681 | translate | read | null |
| 2025-06-16 | Steering LLM Thinking with Budget Guidance | Junyan Li et.al. | 2506.13752 | translate | read | link |
| 2025-06-16 | Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability | Shova Kuikel et.al. | 2506.13746 | translate | read | link |
| 2025-06-16 | Instruction Following by Boosting Attention of Large Language Models | Vitoria Guardieiro et.al. | 2506.13734 | translate | read | null |
| 2025-06-16 | Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs | Sayed Mohammad Vakilzadeh Hatefi et.al. | 2506.13727 | translate | read | null |
| 2025-06-16 | Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models | Arjun Krishna et.al. | 2506.13726 | translate | read | null |
| 2025-06-16 | TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning | Junru Zhang et.al. | 2506.13705 | translate | read | link |
| 2025-06-16 | Balancing Knowledge Delivery and Emotional Comfort in Healthcare Conversational Systems | Shang-Chi Tsai et.al. | 2506.13692 | translate | read | null |
| 2025-06-16 | What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers | Pulkit Gopalani et.al. | 2506.13688 | translate | read | link |
| 2025-06-16 | An LLM’s Apology: Outsourcing Awkwardness in the Age of AI | Twm Stone et.al. | 2506.13685 | translate | read | null |
| 2025-06-16 | Prefix-Tuning+: Modernizing Prefix-Tuning through Attention Independent Prefix Data | Haonan Wang et.al. | 2506.13674 | translate | read | null |
| 2025-06-13 | code_transformed: The Influence of Large Language Models on Code | Yuliang Xu et.al. | 2506.12014 | translate | read | null |
| 2025-06-13 | Tracing LLM Reasoning Processes with Strategic Games: A Framework for Planning, Revision, and Resource-Constrained Decision Making | Xiaopeng Yuan et.al. | 2506.12012 | translate | read | null |
| 2025-06-13 | VGR: Visual Grounded Reasoning | Jiacong Wang et.al. | 2506.11991 | translate | read | null |
| 2025-06-13 | How Visual Representations Map to Language Feature Space in Multimodal LLMs | Constantin Venhoff et.al. | 2506.11976 | translate | read | null |
| 2025-06-13 | Improving Large Language Model Safety with Contrastive Representation Learning | Samuel Simko et.al. | 2506.11938 | translate | read | null |
| 2025-06-13 | Temporal Dynamics of Emotions in Italian Online Soccer Fandoms | Salvatore Citraro et.al. | 2506.11934 | translate | read | null |
| 2025-06-13 | LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming? | Zihan Zheng et.al. | 2506.11928 | translate | read | link |
| 2025-06-13 | Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache | Xiaoran Liu et.al. | 2506.11886 | translate | read | null |
| 2025-06-13 | Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment | Alejandro Peña et.al. | 2506.11880 | translate | read | null |
| 2025-06-13 | A Short Survey on Formalising Software Requirements using Large Language Models | Arshad Beg et.al. | 2506.11874 | translate | read | null |
| 2025-06-12 | AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Yixin Ou et.al. | 2506.10974 | translate | read | null |
| 2025-06-12 | Farseer: A Refined Scaling Law in Large Language Models | Houyi Li et.al. | 2506.10972 | translate | read | link |
| 2025-06-12 | Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | Qizhe Zhang et.al. | 2506.10967 | translate | read | null |
| 2025-06-12 | ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark | Kangwei Liu et.al. | 2506.10960 | translate | read | link |
| 2025-06-12 | SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks | Lianghong Guo et.al. | 2506.10954 | translate | read | link |
| 2025-06-12 | Build the web for agents, not agents for the web | Xing Han Lù et.al. | 2506.10953 | translate | read | null |
| 2025-06-12 | Execution Guided Line-by-Line Code Generation | Boaz Lavon et.al. | 2506.10948 | translate | read | null |
| 2025-06-12 | GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models | Evelyn Ma et.al. | 2506.10946 | translate | read | null |
| 2025-06-12 | Self-Adapting Language Models | Adam Zweiger et.al. | 2506.10943 | translate | read | null |
| 2025-06-12 | Building a Media Ecosystem Observatory from Scratch: Infrastructure, Methodology, and Insights | Zeynep Pehlivan et.al. | 2506.10942 | translate | read | null |
| 2025-06-11 | Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling | Tim Z. Xiao et.al. | 2506.09998 | translate | read | null |
| 2025-06-11 | From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring | Yang Li et.al. | 2506.09996 | translate | read | null |
| 2025-06-11 | Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages | Amel Muminovic et.al. | 2506.09992 | translate | read | link |
| 2025-06-11 | Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation | Xinyu Yang et.al. | 2506.09991 | translate | read | null |
| 2025-06-11 | V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Mido Assran et.al. | 2506.09985 | translate | read | link |
| 2025-06-11 | Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs | Hiroshi Matsuda et.al. | 2506.09983 | translate | read | null |
| 2025-06-11 | SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance | Wentao Ge et.al. | 2506.09968 | translate | read | null |
| 2025-06-11 | Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing | Junfei Wu et.al. | 2506.09965 | translate | read | link |
| 2025-06-11 | Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy | Sushant Gautam et.al. | 2506.09958 | translate | read | link |
| 2025-06-11 | LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge | Sahar Abdelnabi et.al. | 2506.09956 | translate | read | null |
| 2025-06-09 | GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior | Penghao Wu et.al. | 2506.08012 | translate | read | link |
| 2025-06-09 | Play to Generalize: Learning to Reason Through Game Play | Yunfei Xie et.al. | 2506.08011 | translate | read | link |
| 2025-06-09 | Reinforcement Pre-Training | Qingxiu Dong et.al. | 2506.08007 | translate | read | null |
| 2025-06-09 | Reparameterized LLM Training via Orthogonal Equivalence Transformation | Zeju Qiu et.al. | 2506.08001 | translate | read | link |
| 2025-06-09 | Supporting Construction Worker Well-Being with a Multi-Agent Conversational AI System | Fan Yang et.al. | 2506.07997 | translate | read | null |
| 2025-06-09 | $τ^2$-Bench: Evaluating Conversational Agents in a Dual-Control Environment | Victor Barres et.al. | 2506.07982 | translate | read | link |
| 2025-06-09 | HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization | Hongzheng Chen et.al. | 2506.07972 | translate | read | link |
| 2025-06-09 | CyberV: Cybernetics for Test-time Scaling in Video Understanding | Jiahao Meng et.al. | 2506.07971 | translate | read | link |
| 2025-06-09 | SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence | Ziyang Gong et.al. | 2506.07966 | translate | read | link |
| 2025-06-09 | Reinforcing Multimodal Understanding and Generation with Dual Self-rewards | Jixiang Hong et.al. | 2506.07963 | translate | read | null |
| 2025-06-06 | Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias | Yuanzhe Hu et.al. | 2506.06280 | translate | read | null |
| 2025-06-06 | CoMemo: LVLMs Need Image Context with Image Memory | Shi Liu et.al. | 2506.06279 | translate | read | link |
| 2025-06-06 | AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization | Mukur Gupta et.al. | 2506.06273 | translate | read | null |
| 2025-06-06 | Cartridges: Lightweight and general-purpose long context representations via self-study | Sabri Eyuboglu et.al. | 2506.06266 | translate | read | link |
| 2025-06-06 | PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time | Weizhi Zhang et.al. | 2506.06254 | translate | read | null |
| 2025-06-06 | DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation | Jingyu Xiao et.al. | 2506.06251 | translate | read | link |
| 2025-06-06 | Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models | Zahra Babaiee et.al. | 2506.06242 | translate | read | null |
| 2025-06-06 | Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge | Yi Sui et.al. | 2506.06240 | translate | read | null |
| 2025-06-06 | CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports | Peter Pirkelbauer et.al. | 2506.06227 | translate | read | null |
| 2025-06-06 | PROVSYN: Synthesizing Provenance Graphs for Data Augmentation in Intrusion Detection Systems | Yi Huang et.al. | 2506.06226 | translate | read | null |
| 2025-06-05 | Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets | Lei Hsiung et.al. | 2506.05346 | translate | read | null |
| 2025-06-05 | SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs | Jiahui Wang et.al. | 2506.05344 | translate | read | link |
| 2025-06-05 | Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning | Xingjian Ran et.al. | 2506.05341 | translate | read | null |
| 2025-06-05 | VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Ghazi Shazan Ahmad et.al. | 2506.05336 | translate | read | link |
| 2025-06-05 | Search Arena: Analyzing Search-Augmented LLMs | Mihran Miroyan et.al. | 2506.05334 | translate | read | link |
| 2025-06-05 | MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Xinyan Chen et.al. | 2506.05331 | translate | read | link |
| 2025-06-05 | Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay | Yifan Sun et.al. | 2506.05316 | translate | read | null |
| 2025-06-05 | Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models | Taha Entesari et.al. | 2506.05314 | translate | read | null |
| 2025-06-05 | ProRefine: Inference-time Prompt Refinement with Textual Feedback | Deepak Pandita et.al. | 2506.05305 | translate | read | null |
| 2025-06-05 | Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Weifeng Lin et.al. | 2506.05302 | translate | read | null |
| 2025-06-04 | Language-Image Alignment with Fixed Text Encoders | Jingfeng Yang et.al. | 2506.04209 | translate | read | link |
| 2025-06-04 | Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning | Shuang Chen et.al. | 2506.04207 | translate | read | link |
| 2025-06-04 | EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation | Jinghan Jia et.al. | 2506.04205 | translate | read | null |
| 2025-06-04 | Cascadia: A Cascade Serving System for Large Language Models | Youhe Jiang et.al. | 2506.04203 | translate | read | null |
| 2025-06-04 | TracLLM: A Generic Framework for Attributing Long Context LLMs | Yanting Wang et.al. | 2506.04202 | translate | read | link |
| 2025-06-04 | R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning | Qingfei Zhao et.al. | 2506.04185 | translate | read | link |
| 2025-06-04 | SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models | Yuhao Wu et.al. | 2506.04180 | translate | read | link |
| 2025-06-04 | SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling | Anhao Zhao et.al. | 2506.04179 | translate | read | null |
| 2025-06-04 | Does Prompt Design Impact Quality of Data Imputation by LLMs? | Shreenidhi Srinivasan et.al. | 2506.04172 | translate | read | null |
| 2025-06-04 | VISCA: Inferring Component Abstractions for Automated End-to-End Testing | Parsa Alian et.al. | 2506.04161 | translate | read | null |
| 2025-06-03 | Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM | Pralaypati Ta et.al. | 2506.03145 | translate | read | null |
| 2025-06-03 | Not All Tokens Are Meant to Be Forgotten | Xiangyu Zhou et.al. | 2506.03142 | translate | read | null |
| 2025-06-03 | SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation | Siqi Chen et.al. | 2506.03139 | translate | read | link |
| 2025-06-03 | Native-Resolution Image Synthesis | Zidong Wang et.al. | 2506.03131 | translate | read | link |
| 2025-06-03 | AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation | Lu Qiu et.al. | 2506.03126 | translate | read | link |
| 2025-06-03 | AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation | Prashanth Vijayaraghavan et.al. | 2506.03122 | translate | read | null |
| 2025-06-03 | Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback | Xiaoying Zhang et.al. | 2506.03106 | translate | read | link |
| 2025-06-03 | TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models | Chetwin Low et.al. | 2506.03099 | translate | read | link |
| 2025-06-03 | EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models | Mingzhe Li et.al. | 2506.03067 | translate | read | null |
| 2025-06-03 | Facts Do Care About Your Language: Assessing Answer Quality of Multilingual LLMs | Yuval Kansal et.al. | 2506.03051 | translate | read | null |
(<a href=../LLM.md>back to LLM</a>)