LLM - 2025-06 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF	Translate	Read	Code
2025-06-30	Calligrapher: Freestyle Text Image Customization	Yue Ma et.al.	2506.24123	translate	read	null
2025-06-30	Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime	Yuqing Wang et.al.	2506.24120	translate	read	null
2025-06-30	DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World	Xiangtai Li et.al.	2506.24102	translate	read	null
2025-06-30	Logit-Gap Steering: Efficient Short-Suffix Jailbreaks for Aligned Large Language Models	Tung-Ling Li et.al.	2506.24056	translate	read	null
2025-06-30	Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC	Xinming Wei et.al.	2506.24045	translate	read	null
2025-06-30	A Survey on Vision-Language-Action Models for Autonomous Driving	Sicong Jiang et.al.	2506.24044	translate	read	null
2025-06-30	EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations	Hyunjong Kim et.al.	2506.24016	translate	read	null
2025-06-30	Large Language Models Don’t Make Sense of Word Problems. A Scoping Review from a Mathematics Education Perspective	Anselm R. Strohmaier et.al.	2506.24006	translate	read	null
2025-06-30	Auto-TA: Towards Scalable Automated Thematic Analysis (TA) via Multi-Agent Large Language Models with Reinforcement Learning	Seungjun Yi et.al.	2506.23998	translate	read	null
2025-06-27	The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements	Bingchen Zhao et.al.	2506.22419	translate	read	null
2025-06-27	HyperCLOVA X THINK Technical Report	NAVER Cloud HyperCLOVA X Team et.al.	2506.22403	translate	read	null
2025-06-27	Refining Czech GEC: Insights from a Multi-Experiment Approach	Petr Pechman et.al.	2506.22402	translate	read	null
2025-06-27	QuickSilver – Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization	Danush Khanna et.al.	2506.22396	translate	read	null
2025-06-27	What Makes ChatGPT Effective for Software Issue Resolution? An Empirical Study of Developer-ChatGPT Conversations in GitHub	Ramtin Ehsani et.al.	2506.22390	translate	read	null
2025-06-27	Can Video Large Multimodal Models Think Like Doubters-or Double-Down: A Study on Defeasible Video Entailment	Yue Zhang et.al.	2506.22385	translate	read	null
2025-06-27	Probabilistic Optimality for Inference-time Scaling	Youkang Wang et.al.	2506.22376	translate	read	null
2025-06-27	Towards Fair Rankings: Leveraging LLMs for Gender Bias Detection and Measurement	Maryam Mousavian et.al.	2506.22372	translate	read	null
2025-06-27	Can Large Language Models Help Students Prove Software Correctness? An Experimental Study with Dafny	Carolina Carreira et.al.	2506.22370	translate	read	null
2025-06-27	Concept-Level AI for Telecom: Moving Beyond Large Language Models	Viswanath Kumarskandpriya et.al.	2506.22359	translate	read	null
2025-06-26	Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test	Ziyue Li et.al.	2506.21551	translate	read	null
2025-06-26	mTSBench: Benchmarking Multivariate Time Series Anomaly Detection and Model Selection at Scale	Xiaona Zhou et.al.	2506.21550	translate	read	null
2025-06-26	PsyLite Technical Report	Fangjun Ding et.al.	2506.21536	translate	read	null
2025-06-26	Exploring the Design Space of 3D MLLMs for CT Report Generation	Mohammed Baharoon et.al.	2506.21535	translate	read	null
2025-06-26	“What’s Up, Doc?”: Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets	Akshay Paruchuri et.al.	2506.21532	translate	read	null
2025-06-26	Potemkin Understanding in Large Language Models	Marina Mancoridis et.al.	2506.21521	translate	read	null
2025-06-26	Mitigating Hallucination of Large Vision-Language Models via Dynamic Logits Calibration	Jiahe Chen et.al.	2506.21509	translate	read	null
2025-06-26	Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge	Boyu Gou et.al.	2506.21506	translate	read	null
2025-06-26	Bridging Offline and Online Reinforcement Learning for LLMs	Jack Lanchantin et.al.	2506.21495	translate	read	null
2025-06-26	Efficient and Reuseable Cloud Configuration Search Using Discovery Spaces	Michael Johnston et.al.	2506.21467	translate	read	null
2025-06-25	The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind	Andrei Lupu et.al.	2506.20664	translate	read	null
2025-06-25	Memento: Note-Taking for Your Future Self	Chao Wan et.al.	2506.20642	translate	read	null
2025-06-25	Towards Community-Driven Agents for Machine Learning Engineering	Sijie Li et.al.	2506.20640	translate	read	null
2025-06-25	DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation	Shansan Gong et.al.	2506.20639	translate	read	null
2025-06-25	AI Assistants to Enhance and Exploit the PETSc Knowledge Base	Barry Smith et.al.	2506.20608	translate	read	null
2025-06-25	Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm	Baixiang Huang et.al.	2506.20606	translate	read	null
2025-06-25	Video Perception Models for 3D Scene Synthesis	Rui Huang et.al.	2506.20601	translate	read	null
2025-06-25	HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction	Zhonghao Shi et.al.	2506.20566	translate	read	null
2025-06-25	Large Language Model-Driven Code Compliance Checking in Building Information Modeling	Soumya Madireddy et.al.	2506.20551	translate	read	null
2025-06-25	When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs	Ammar Khairi et.al.	2506.20544	translate	read	null
2025-06-24	ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing	Long Xing et.al.	2506.19848	translate	read	null
2025-06-24	JoyAgents-R1: Joint Evolution Dynamics for Versatile Multi-LLM Agents with Reinforcement Learning	Ai Han et.al.	2506.19846	translate	read	null
2025-06-24	MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration	Yucheng Zhou et.al.	2506.19835	translate	read	null
2025-06-24	Curating art exhibitions using machine learning	Eurico Covas et.al.	2506.19813	translate	read	null
2025-06-24	KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality	Baochang Ren et.al.	2506.19807	translate	read	null
2025-06-24	LLM-Based Social Simulations Require a Boundary	Zengqing Wu et.al.	2506.19806	translate	read	null
2025-06-24	KnowML: Improving Generalization of ML-NIDS with Attack Knowledge Graphs	Xin Fan Guo et.al.	2506.19802	translate	read	null
2025-06-24	Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study	Yuqi Zhu et.al.	2506.19794	translate	read	null
2025-06-24	SAGE: Strategy-Adaptive Generation Engine for Query Rewriting	Teng Wang et.al.	2506.19783	translate	read	null
2025-06-24	SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning	Yuqian Fu et.al.	2506.19767	translate	read	null
2025-06-23	jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval	Michael Günther et.al.	2506.18902	translate	read	null
2025-06-23	Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations	Jiaming Han et.al.	2506.18898	translate	read	null
2025-06-23	ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs	Jiaru Zou et.al.	2506.18896	translate	read	null
2025-06-23	Universal Video Temporal Grounding with Generative Multi-modal Large Language Models	Zeqian Li et.al.	2506.18883	translate	read	null
2025-06-23	CommVQ: Commutative Vector Quantization for KV Cache Compression	Junyan Li et.al.	2506.18879	translate	read	null
2025-06-23	OmniGen2: Exploration to Advanced Multimodal Generation	Chenyuan Wu et.al.	2506.18871	translate	read	null
2025-06-23	TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting	Zhongbin Guo et.al.	2506.18862	translate	read	null
2025-06-23	LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning	Yuhao Wu et.al.	2506.18841	translate	read	null
2025-06-23	STU-PID: Steering Token Usage via PID Controller for Efficient Large Language Model Reasoning	Aryasomayajula Ram Bharadwaj et.al.	2506.18831	translate	read	null
2025-06-23	Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories	Islem Bouzenia et.al.	2506.18824	translate	read	null
2025-06-20	VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning	Zhangyang Qi et.al.	2506.17221	translate	read	null
2025-06-20	No Free Lunch: Rethinking Internal Feedback for LLM Reasoning	Yanzhi Zhang et.al.	2506.17219	translate	read	null
2025-06-20	Fine-Tuning Lowers Safety and Disrupts Evaluation Consistency	Kathleen C. Fraser et.al.	2506.17209	translate	read	null
2025-06-20	Dissecting the SWE-Bench Leaderboards: Profiling Submitters and Architectures of LLM- and Agent-Based Repair Systems	Matias Martinez et.al.	2506.17208	translate	read	null
2025-06-20	Confidence Scoring for LLM-Generated SQL in Supply Chain Data Extraction	Jiekai Ma et.al.	2506.17203	translate	read	null
2025-06-20	Detecting LLM-Generated Short Answers and Effects on Learner Performance	Shambhavi Bhushan et.al.	2506.17196	translate	read	null
2025-06-20	The MedPerturb Dataset: What Non-Content Perturbations Reveal About Human and Clinical LLM Decision Making	Abinitha Gourabathina et.al.	2506.17163	translate	read	null
2025-06-20	Do We Need Large VLMs for Spotting Soccer Actions?	Ritabrata Chakraborty et.al.	2506.17144	translate	read	null
2025-06-20	Large Language Model Unlearning for Source Code	Xue Jiang et.al.	2506.17125	translate	read	null
2025-06-20	When Can Model-Free Reinforcement Learning be Enough for Thinking?	Josiah P. Hanna et.al.	2506.17124	translate	read	null
2025-06-18	PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning	Yuhui Shi et.al.	2506.15683	translate	read	null
2025-06-18	GenRecal: Generation after Recalibration from Large to Small Vision-Language Models	Byung-Kwan Lee et.al.	2506.15681	translate	read	null
2025-06-18	SwarmAgentic: Towards Fully Automated Agentic System Generation via Swarm Intelligence	Yao Zhang et.al.	2506.15672	translate	read	null
2025-06-18	CC-LEARN: Cohort-based Consistency Learning	Xiao Ye et.al.	2506.15662	translate	read	null
2025-06-18	PhishDebate: An LLM-Based Multi-Agent Framework for Phishing Website Detection	Wenhao Li et.al.	2506.15656	translate	read	null
2025-06-18	deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses	Georgios Androutsopoulos et.al.	2506.15648	translate	read	null
2025-06-18	Demystifying the Visual Quality Paradox in Multimodal Large Language Models	Shuo Xing et.al.	2506.15645	translate	read	null
2025-06-18	Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability	Yusuke Sakai et.al.	2506.15629	translate	read	null
2025-06-18	The Effect of State Representation on LLM Agent Behavior in Dynamic Routing Games	Lyle Goodyear et.al.	2506.15624	translate	read	null
2025-06-18	The Compositional Architecture of Regret in Large Language Models	Xiangxiang Cui et.al.	2506.15617	translate	read	null
2025-06-17	A Variational Framework for Improving Naturalness in Generative Spoken Language Models	Li-Wei Chen et.al.	2506.14767	translate	read	link
2025-06-17	ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM	Yujun Wang et.al.	2506.14766	translate	read	null
2025-06-17	Large Language Models – the Future of Fundamental Physics?	Caroline Heneka et.al.	2506.14757	translate	read	null
2025-06-17	Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs	Ring Team et.al.	2506.14731	translate	read	null
2025-06-17	AgentDistill: Training-Free Agent Distillation with Generalizable MCP Boxes	Jiahao Qiu et.al.	2506.14728	translate	read	link
2025-06-17	HARMONY: A Scalable Distributed Vector Database for High-Throughput Approximate Nearest Neighbor Search	Qian Xu et.al.	2506.14707	translate	read	null
2025-06-17	Capacity Matters: a Proof-of-Concept for Transformer Memorization on Real-World Data	Anton Changalidis et.al.	2506.14704	translate	read	null
2025-06-17	Unified Software Engineering agent as AI Software Engineer	Leonhard Applis et.al.	2506.14683	translate	read	null
2025-06-17	AIRTBench: Measuring Autonomous AI Red Teaming Capabilities in Language Models	Ads Dawson et.al.	2506.14682	translate	read	null
2025-06-17	Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality	Yuto Harada et.al.	2506.14681	translate	read	null
2025-06-16	Steering LLM Thinking with Budget Guidance	Junyan Li et.al.	2506.13752	translate	read	link
2025-06-16	Evaluating Large Language Models for Phishing Detection, Self-Consistency, Faithfulness, and Explainability	Shova Kuikel et.al.	2506.13746	translate	read	link
2025-06-16	Instruction Following by Boosting Attention of Large Language Models	Vitoria Guardieiro et.al.	2506.13734	translate	read	null
2025-06-16	Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs	Sayed Mohammad Vakilzadeh Hatefi et.al.	2506.13727	translate	read	null
2025-06-16	Weakest Link in the Chain: Security Vulnerabilities in Advanced Reasoning Models	Arjun Krishna et.al.	2506.13726	translate	read	null
2025-06-16	TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning	Junru Zhang et.al.	2506.13705	translate	read	link
2025-06-16	Balancing Knowledge Delivery and Emotional Comfort in Healthcare Conversational Systems	Shang-Chi Tsai et.al.	2506.13692	translate	read	null
2025-06-16	What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers	Pulkit Gopalani et.al.	2506.13688	translate	read	link
2025-06-16	An LLM’s Apology: Outsourcing Awkwardness in the Age of AI	Twm Stone et.al.	2506.13685	translate	read	null
2025-06-16	Prefix-Tuning+: Modernizing Prefix-Tuning through Attention Independent Prefix Data	Haonan Wang et.al.	2506.13674	translate	read	null
2025-06-13	code_transformed: The Influence of Large Language Models on Code	Yuliang Xu et.al.	2506.12014	translate	read	null
2025-06-13	Tracing LLM Reasoning Processes with Strategic Games: A Framework for Planning, Revision, and Resource-Constrained Decision Making	Xiaopeng Yuan et.al.	2506.12012	translate	read	null
2025-06-13	VGR: Visual Grounded Reasoning	Jiacong Wang et.al.	2506.11991	translate	read	null
2025-06-13	How Visual Representations Map to Language Feature Space in Multimodal LLMs	Constantin Venhoff et.al.	2506.11976	translate	read	null
2025-06-13	Improving Large Language Model Safety with Contrastive Representation Learning	Samuel Simko et.al.	2506.11938	translate	read	null
2025-06-13	Temporal Dynamics of Emotions in Italian Online Soccer Fandoms	Salvatore Citraro et.al.	2506.11934	translate	read	null
2025-06-13	LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?	Zihan Zheng et.al.	2506.11928	translate	read	link
2025-06-13	Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache	Xiaoran Liu et.al.	2506.11886	translate	read	null
2025-06-13	Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment	Alejandro Peña et.al.	2506.11880	translate	read	null
2025-06-13	A Short Survey on Formalising Software Requirements using Large Language Models	Arshad Beg et.al.	2506.11874	translate	read	null
2025-06-12	AutoMind: Adaptive Knowledgeable Agent for Automated Data Science	Yixin Ou et.al.	2506.10974	translate	read	null
2025-06-12	Farseer: A Refined Scaling Law in Large Language Models	Houyi Li et.al.	2506.10972	translate	read	link
2025-06-12	Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs	Qizhe Zhang et.al.	2506.10967	translate	read	null
2025-06-12	ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark	Kangwei Liu et.al.	2506.10960	translate	read	link
2025-06-12	SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks	Lianghong Guo et.al.	2506.10954	translate	read	link
2025-06-12	Build the web for agents, not agents for the web	Xing Han Lù et.al.	2506.10953	translate	read	null
2025-06-12	Execution Guided Line-by-Line Code Generation	Boaz Lavon et.al.	2506.10948	translate	read	null
2025-06-12	GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models	Evelyn Ma et.al.	2506.10946	translate	read	null
2025-06-12	Self-Adapting Language Models	Adam Zweiger et.al.	2506.10943	translate	read	null
2025-06-12	Building a Media Ecosystem Observatory from Scratch: Infrastructure, Methodology, and Insights	Zeynep Pehlivan et.al.	2506.10942	translate	read	null
2025-06-11	Flipping Against All Odds: Reducing LLM Coin Flip Bias via Verbalized Rejection Sampling	Tim Z. Xiao et.al.	2506.09998	translate	read	null
2025-06-11	From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring	Yang Li et.al.	2506.09996	translate	read	null
2025-06-11	Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages	Amel Muminovic et.al.	2506.09992	translate	read	link
2025-06-11	Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation	Xinyu Yang et.al.	2506.09991	translate	read	null
2025-06-11	V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning	Mido Assran et.al.	2506.09985	translate	read	link
2025-06-11	Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs	Hiroshi Matsuda et.al.	2506.09983	translate	read	null
2025-06-11	SRLAgent: Enhancing Self-Regulated Learning Skills through Gamification and LLM Assistance	Wentao Ge et.al.	2506.09968	translate	read	null
2025-06-11	Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing	Junfei Wu et.al.	2506.09965	translate	read	link
2025-06-11	Kvasir-VQA-x1: A Multimodal Dataset for Medical Reasoning and Robust MedVQA in Gastrointestinal Endoscopy	Sushant Gautam et.al.	2506.09958	translate	read	link
2025-06-11	LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge	Sahar Abdelnabi et.al.	2506.09956	translate	read	null
2025-06-09	GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior	Penghao Wu et.al.	2506.08012	translate	read	link
2025-06-09	Play to Generalize: Learning to Reason Through Game Play	Yunfei Xie et.al.	2506.08011	translate	read	link
2025-06-09	Reinforcement Pre-Training	Qingxiu Dong et.al.	2506.08007	translate	read	null
2025-06-09	Reparameterized LLM Training via Orthogonal Equivalence Transformation	Zeju Qiu et.al.	2506.08001	translate	read	link
2025-06-09	Supporting Construction Worker Well-Being with a Multi-Agent Conversational AI System	Fan Yang et.al.	2506.07997	translate	read	null
2025-06-09	$τ^2$-Bench: Evaluating Conversational Agents in a Dual-Control Environment	Victor Barres et.al.	2506.07982	translate	read	link
2025-06-09	HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization	Hongzheng Chen et.al.	2506.07972	translate	read	link
2025-06-09	CyberV: Cybernetics for Test-time Scaling in Video Understanding	Jiahao Meng et.al.	2506.07971	translate	read	link
2025-06-09	SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence	Ziyang Gong et.al.	2506.07966	translate	read	link
2025-06-09	Reinforcing Multimodal Understanding and Generation with Dual Self-rewards	Jixiang Hong et.al.	2506.07963	translate	read	null
2025-06-06	Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias	Yuanzhe Hu et.al.	2506.06280	translate	read	null
2025-06-06	CoMemo: LVLMs Need Image Context with Image Memory	Shi Liu et.al.	2506.06279	translate	read	link
2025-06-06	AdvSumm: Adversarial Training for Bias Mitigation in Text Summarization	Mukur Gupta et.al.	2506.06273	translate	read	null
2025-06-06	Cartridges: Lightweight and general-purpose long context representations via self-study	Sabri Eyuboglu et.al.	2506.06266	translate	read	link
2025-06-06	PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time	Weizhi Zhang et.al.	2506.06254	translate	read	null
2025-06-06	DesignBench: A Comprehensive Benchmark for MLLM-based Front-end Code Generation	Jingyu Xiao et.al.	2506.06251	translate	read	link
2025-06-06	Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models	Zahra Babaiee et.al.	2506.06242	translate	read	null
2025-06-06	Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge	Yi Sui et.al.	2506.06240	translate	read	null
2025-06-06	CompilerGPT: Leveraging Large Language Models for Analyzing and Acting on Compiler Optimization Reports	Peter Pirkelbauer et.al.	2506.06227	translate	read	null
2025-06-06	PROVSYN: Synthesizing Provenance Graphs for Data Augmentation in Intrusion Detection Systems	Yi Huang et.al.	2506.06226	translate	read	null
2025-06-05	Why LLM Safety Guardrails Collapse After Fine-tuning: A Similarity Analysis Between Alignment and Fine-tuning Datasets	Lei Hsiung et.al.	2506.05346	translate	read	null
2025-06-05	SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs	Jiahui Wang et.al.	2506.05344	translate	read	link
2025-06-05	Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning	Xingjian Ran et.al.	2506.05341	translate	read	null
2025-06-05	VideoMolmo: Spatio-Temporal Grounding Meets Pointing	Ghazi Shazan Ahmad et.al.	2506.05336	translate	read	link
2025-06-05	Search Arena: Analyzing Search-Augmented LLMs	Mihran Miroyan et.al.	2506.05334	translate	read	link
2025-06-05	MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning	Xinyan Chen et.al.	2506.05331	translate	read	link
2025-06-05	Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay	Yifan Sun et.al.	2506.05316	translate	read	null
2025-06-05	Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models	Taha Entesari et.al.	2506.05314	translate	read	null
2025-06-05	ProRefine: Inference-time Prompt Refinement with Textual Feedback	Deepak Pandita et.al.	2506.05305	translate	read	null
2025-06-05	Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos	Weifeng Lin et.al.	2506.05302	translate	read	null
2025-06-04	Language-Image Alignment with Fixed Text Encoders	Jingfeng Yang et.al.	2506.04209	translate	read	link
2025-06-04	Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning	Shuang Chen et.al.	2506.04207	translate	read	link
2025-06-04	EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation	Jinghan Jia et.al.	2506.04205	translate	read	null
2025-06-04	Cascadia: A Cascade Serving System for Large Language Models	Youhe Jiang et.al.	2506.04203	translate	read	null
2025-06-04	TracLLM: A Generic Framework for Attributing Long Context LLMs	Yanting Wang et.al.	2506.04202	translate	read	link
2025-06-04	R-Search: Empowering LLM Reasoning with Search via Multi-Reward Reinforcement Learning	Qingfei Zhao et.al.	2506.04185	translate	read	link
2025-06-04	SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models	Yuhao Wu et.al.	2506.04180	translate	read	link
2025-06-04	SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling	Anhao Zhao et.al.	2506.04179	translate	read	null
2025-06-04	Does Prompt Design Impact Quality of Data Imputation by LLMs?	Shreenidhi Srinivasan et.al.	2506.04172	translate	read	null
2025-06-04	VISCA: Inferring Component Abstractions for Automated End-to-End Testing	Parsa Alian et.al.	2506.04161	translate	read	null
2025-06-03	Entity-Augmented Neuroscience Knowledge Retrieval Using Ontology and Semantic Understanding Capability of LLM	Pralaypati Ta et.al.	2506.03145	translate	read	null
2025-06-03	Not All Tokens Are Meant to Be Forgotten	Xiangyu Zhou et.al.	2506.03142	translate	read	null
2025-06-03	SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation	Siqi Chen et.al.	2506.03139	translate	read	link
2025-06-03	Native-Resolution Image Synthesis	Zidong Wang et.al.	2506.03131	translate	read	link
2025-06-03	AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation	Lu Qiu et.al.	2506.03126	translate	read	link
2025-06-03	AUTOCIRCUIT-RL: Reinforcement Learning-Driven LLM for Automated Circuit Topology Generation	Prashanth Vijayaraghavan et.al.	2506.03122	translate	read	null
2025-06-03	Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback	Xiaoying Zhang et.al.	2506.03106	translate	read	link
2025-06-03	TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models	Chetwin Low et.al.	2506.03099	translate	read	link
2025-06-03	EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models	Mingzhe Li et.al.	2506.03067	translate	read	null
2025-06-03	Facts Do Care About Your Language: Assessing Answer Quality of Multilingual LLMs	Yuval Kansal et.al.	2506.03051	translate	read	null
