LLM - 2024-10

Publish Date Title Authors PDF Translate Read Code
2024-10-31 P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation Mohamed Elgaar et.al. 2410.24201 translate read null
2024-10-31 Constraint Back-translation Improves Complex Instruction Following of Large Language Models Yunjia Qi et.al. 2410.24175 translate read link
2024-10-31 Thought Space Explorer: Navigating and Expanding Thought Space for Large Language Model Reasoning Jinghan Zhang et.al. 2410.24155 translate read null
2024-10-31 Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning Jiaqi Liu et.al. 2410.24152 translate read null
2024-10-31 Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age Nouar AlDahoul et.al. 2410.24148 translate read null
2024-10-31 Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing Akash Dhruv et.al. 2410.24119 translate read link
2024-10-31 Repository-Level Compositional Code Translation and Validation Ali Reza Ibrahimzada et.al. 2410.24117 translate read null
2024-10-31 Nearest Neighbor Normalization Improves Multimodal Retrieval Neil Chowdhury et.al. 2410.24114 translate read link
2024-10-30 EMMA: End-to-End Multimodal Model for Autonomous Driving Jyh-Jing Hwang et.al. 2410.23262 translate read null
2024-10-30 Evaluating Cultural and Social Awareness of LLM Web Agents Haoyi Qiu et.al. 2410.23252 translate read null
2024-10-30 Carrot and Stick: Eliciting Comparison Data and Beyond Yiling Chen et.al. 2410.23243 translate read null
2024-10-30 A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment Matteo G. Mecattaf et.al. 2410.23242 translate read null
2024-10-30 EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning Peide Huang et.al. 2410.23234 translate read null
2024-10-31 Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval Sheryl Hsu et.al. 2410.23214 translate read null
2024-10-30 Reliability of Topic Modeling Kayla Schroeder et.al. 2410.23186 translate read null
2024-10-30 ProTransformer: Robustify Transformers via Plug-and-Play Paradigm Zhichao Hou et.al. 2410.23182 translate read null
2024-10-30 ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning Millennium Bismay et.al. 2410.23180 translate read link
2024-10-30 SciPIP: An LLM-based Scientific Paper Idea Proposer Wenxiao Wang et.al. 2410.23166 translate read link
2024-10-29 Enhancing Code Annotation Reliability: Generative AI’s Role in Comment Quality Assessment Models Seetharam Killivalavan et.al. 2410.22323 translate read null
2024-10-29 Online Detecting LLM-Generated Texts via Sequential Hypothesis Testing by Betting Can Chen et.al. 2410.22318 translate read link
2024-10-29 Natural Language Inference Improves Compositionality in Vision-Language Models Paola Cascante-Bonilla et.al. 2410.22315 translate read null
2024-10-29 GPT-4o reads the mind in the eyes James W. A. Strachan et.al. 2410.22309 translate read null
2024-10-29 SVIP: Towards Verifiable Inference of Open-source Large Language Models Yifan Sun et.al. 2410.22307 translate read null
2024-10-29 Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning Yihe Deng et.al. 2410.22304 translate read null
2024-10-29 LLMs are Highly-Constrained Biophysical Sequence Optimizers Angelica Chen et.al. 2410.22296 translate read null
2024-10-29 Fine-Tuning LLMs for Code Mutation: A New Era of Cyber Threats Mohammad Setak et.al. 2410.22293 translate read null
2024-10-29 Embedding-based classifiers can detect prompt injection attacks Md. Ahsan Ayub et.al. 2410.22284 translate read link
2024-10-29 Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models Renzhe Yu et.al. 2410.22282 translate read null
2024-10-28 Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics Yaniv Nikankin et.al. 2410.21272 translate read link
2024-10-28 LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior Hanyu Wang et.al. 2410.21264 translate read link
2024-10-28 AutoBench-V: Can Large Vision-Language Models Benchmark Themselves? Han Bao et.al. 2410.21259 translate read link
2024-10-28 LongReward: Improving Long-context Large Language Models with AI Feedback Jiajie Zhang et.al. 2410.21252 translate read link
2024-10-28 Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback Nour Jedidi et.al. 2410.21242 translate read null
2024-10-28 Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce Zhantao Yang et.al. 2410.21237 translate read null
2024-10-28 Flaming-hot Initiation with Regular Execution Sampling for Large Language Models Weizhe Chen et.al. 2410.21236 translate read null
2024-10-28 LoRA vs Full Fine-tuning: An Illusion of Equivalence Reece Shuttleworth et.al. 2410.21228 translate read null
2024-10-28 Lifting the Veil on the Large Language Model Supply Chain: Composition, Risks, and Mitigations Kaifeng Huang et.al. 2410.21218 translate read null
2024-10-28 BongLLaMA: LLaMA for Bangla Language Abdullah Khan Zehady et.al. 2410.21200 translate read null
2024-10-25 The Potential and Value of AI Chatbot in Personalized Cognitive Training Zilong Wang et.al. 2410.19733 translate read null
2024-10-25 Counting Ability of Large Language Models and Impact of Tokenization Xiang Zhang et.al. 2410.19730 translate read link
2024-10-25 FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning Nicole Cho et.al. 2410.19727 translate read null
2024-10-25 2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision Shilong Li et.al. 2410.19720 translate read null
2024-10-25 TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning Xiangyu Zeng et.al. 2410.19702 translate read link
2024-10-25 IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation Kaixian Qu et.al. 2410.19697 translate read null
2024-10-25 Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs Yifei Zhang et.al. 2410.19694 translate read null
2024-10-25 APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs Huaxiaoyue Wang et.al. 2410.19656 translate read null
2024-10-25 Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina Yuan Gao et.al. 2410.19599 translate read null
2024-10-25 Diverse Sign Language Translation Xin Shen et.al. 2410.19586 translate read null
2024-10-24 Unbounded: A Generative Infinite Game of Character Life Simulation Jialu Li et.al. 2410.18975 translate read null
2024-10-24 Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms Zhangheng Li et.al. 2410.18967 translate read link
2024-10-24 Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions Yujuan Fu et.al. 2410.18966 translate read null
2024-10-24 OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning Xiaoqiang Wang et.al. 2410.18963 translate read link
2024-10-24 Bridge-Coder: Unlocking LLMs’ Potential to Overcome Language Gaps in Low-Resource Code Jipeng Zhang et.al. 2410.18957 translate read null
2024-10-24 BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning Yujuan Velvin Fu et.al. 2410.18955 translate read null
2024-10-24 Dynamic Vocabulary Pruning in Early-Exit LLMs Jort Vincenti et.al. 2410.18952 translate read link
2024-10-24 SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models Zonghao Ying et.al. 2410.18927 translate read null
2024-10-24 From Blind Solvers to Logical Thinkers: Benchmarking LLMs’ Logical Integrity on Faulty Mathematical Problems A M Muntasir Rahman et.al. 2410.18921 translate read null
2024-10-24 A Survey on Speech Large Language Models Jing Peng et.al. 2410.18908 translate read null
2024-10-23 TP-Eval: Tap Multimodal LLMs’ Potential in Evaluation by Customizing Prompts Yuxuan Xie et.al. 2410.18071 translate read null
2024-10-23 LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering Qingfei Zhao et.al. 2410.18050 translate read link
2024-10-23 Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases Anna Glazkova et.al. 2410.18040 translate read null
2024-10-23 MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning Jingfan Zhang et.al. 2410.18035 translate read null
2024-10-23 GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration Xin Li et.al. 2410.18032 translate read link
2024-10-23 MiniFed : Integrating LLM-based Agentic-Workflow for Simulating FOMC Meeting Sungil Seok et.al. 2410.18012 translate read null
2024-10-23 Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation Suho Kang et.al. 2410.18001 translate read link
2024-10-23 Zeitenwenden: Detecting changes in the German political discourse Kai-Robin Lange et.al. 2410.17960 translate read null
2024-10-23 ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference Xin He et.al. 2410.17954 translate read null
2024-10-23 SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains Ran Xu et.al. 2410.17952 translate read null
2024-10-22 Altogether: Image Captioning via Re-aligning Alt-text Hu Xu et.al. 2410.17251 translate read null
2024-10-22 Large Language Models Empowered Personalized Web Agents Hongru Cai et.al. 2410.17236 translate read null
2024-10-22 Automated Spinal MRI Labelling from Reports Using a Large Language Model Robin Y. Park et.al. 2410.17235 translate read link
2024-10-22 Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy Benedict Aaron Tjandra et.al. 2410.17234 translate read null
2024-10-22 Few-shot In-Context Preference Learning Using Large Language Models Chao Yu et.al. 2410.17233 translate read null
2024-10-22 Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods Tsachi Blau et.al. 2410.17222 translate read null
2024-10-22 Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling Azmine Toushik Wasi et.al. 2410.17210 translate read link
2024-10-22 VoiceBench: Benchmarking LLM-Based Voice Assistants Yiming Chen et.al. 2410.17196 translate read link
2024-10-22 Language Model Non-myopic Generation for Reasoning and Planning Chang Ma et.al. 2410.17195 translate read null
2024-10-22 From Attention to Activation: Unravelling the Enigmas of Large Language Models Prannay Kaul et.al. 2410.17174 translate read null
2024-10-21 Reflection-Bench: probing AI intelligence with reflection Lingyu Li et.al. 2410.16270 translate read link
2024-10-21 Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance Zhangwei Gao et.al. 2410.16261 translate read link
2024-10-21 Elucidating the design space of language models for image generation Xuantong Liu et.al. 2410.16257 translate read null
2024-10-21 CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution Maosong Cao et.al. 2410.16256 translate read link
2024-10-21 Can Knowledge Editing Really Correct Hallucinations? Baixiang Huang et.al. 2410.16251 translate read link
2024-10-21 Analyzing Context Contributions in LLM-based Machine Translation Emmanouil Zaranis et.al. 2410.16246 translate read null
2024-10-21 IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems Yihuan Mao et.al. 2410.16237 translate read null
2024-10-21 LLaVA-KD: A Framework of Distilling Multimodal Large Language Models Yuxuan Cai et.al. 2410.16236 translate read null
2024-10-21 ToW: Thoughts of Words Improve Reasoning in Large Language Models Zhikun Xu et.al. 2410.16235 translate read null
2024-10-21 Building A Coding Assistant via the Retrieval-Augmented Language Model Xinze Li et.al. 2410.16229 translate read null
2024-10-18 Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts German Gritsai et.al. 2410.14677 translate read null
2024-10-18 SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment Qin Liu et.al. 2410.14676 translate read null
2024-10-18 Enhancing Large Language Models’ Situated Faithfulness to External Contexts Yukun Huang et.al. 2410.14675 translate read link
2024-10-18 NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Baiqi Li et.al. 2410.14669 translate read null
2024-10-18 MiCEval: Unveiling Multimodal Chain of Thought’s Quality via Image Description and Reasoning Steps Xiongtao Zhou et.al. 2410.14668 translate read link
2024-10-18 A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning Shengjie Sun et.al. 2410.14660 translate read null
2024-10-18 EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search Oliver Sieberling et.al. 2410.14649 translate read null
2024-10-18 Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs Runchu Tian et.al. 2410.14641 translate read link
2024-10-18 GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings Raghuveer Thirukovalluru et.al. 2410.14635 translate read null
2024-10-18 You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools Daniel Baumartz et.al. 2410.14626 translate read null
2024-10-17 Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Lijie Fan et.al. 2410.13863 translate read null
2024-10-17 PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Rongyao Fang et.al. 2410.13861 translate read link
2024-10-17 $γ-$ MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models Yaxin Luo et.al. 2410.13859 translate read null
2024-10-17 How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs Guhao Feng et.al. 2410.13857 translate read null
2024-10-17 Can MLLMs Understand the Deep Implication Behind Chinese Images? Chenhao Zhang et.al. 2410.13854 translate read link
2024-10-17 Retrospective Learning from Interactions Zizhao Chen et.al. 2410.13852 translate read null
2024-10-17 SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction Xuan Zhang et.al. 2410.13846 translate read link
2024-10-17 Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs Tianyu Guo et.al. 2410.13835 translate read null
2024-10-17 AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents Ke Yang et.al. 2410.13825 translate read null
2024-10-17 Harnessing Webpage UIs for Text-Rich Visual Understanding Junpeng Liu et.al. 2410.13824 translate read null
2024-10-16 Context is Key(NMF): Modelling Topical Information Dynamics in Chinese Diaspora Media Ross Deans Kristensen-McLachlan et.al. 2410.12791 translate read null
2024-10-16 Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception Jihao Zhao et.al. 2410.12788 translate read null
2024-10-16 In-Context Learning Enables Robot Action Prediction in LLMs Yida Yin et.al. 2410.12782 translate read null
2024-10-16 Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information Yingya Li et.al. 2410.12774 translate read null
2024-10-16 StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples Ajay Patel et.al. 2410.12757 translate read null
2024-10-16 Comparative Analysis of Extrinsic Factors for NER in French Grace Yang et.al. 2410.12750 translate read null
2024-10-16 CREAM: Consistency Regularized Self-Rewarding Language Models Zhaoyang Wang et.al. 2410.12735 translate read null
2024-10-16 FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression Zhenheng Tang et.al. 2410.12707 translate read null
2024-10-16 WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Genta Indra Winata et.al. 2410.12705 translate read null
2024-10-16 Sarcasm Detection in a Less-Resourced Language Lazar Đoković et.al. 2410.12704 translate read null
2024-10-15 GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation Fei Tang et.al. 2410.11841 translate read null
2024-10-15 MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding Yue Cao et.al. 2410.11829 translate read link
2024-10-15 SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing Zhiyuan Zhang et.al. 2410.11815 translate read null
2024-10-15 NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models Han Han et.al. 2410.11805 translate read null
2024-10-15 FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting Zhe Li et.al. 2410.11802 translate read null
2024-10-15 Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability Tsz Ting Chung et.al. 2410.11786 translate read null
2024-10-15 G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks Guibin Zhang et.al. 2410.11782 translate read null
2024-10-15 Language Models Encode Numbers Using Digit Representations in Base 10 Amit Arnold Levy et.al. 2410.11781 translate read null
2024-10-15 MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation Chenxi Wang et.al. 2410.11779 translate read link
2024-10-15 Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models Kai Yao et.al. 2410.11772 translate read link
2024-10-14 DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads Guangxuan Xiao et.al. 2410.10819 translate read link
2024-10-14 TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models Mu Cai et.al. 2410.10818 translate read null
2024-10-14 Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Ziyue Li et.al. 2410.10814 translate read null
2024-10-14 LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory Di Wu et.al. 2410.10813 translate read link
2024-10-14 Local and Global Decoding in Text Generation Daniel Gareev et.al. 2410.10810 translate read link
2024-10-14 Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning Aakanksha et.al. 2410.10801 translate read null
2024-10-14 Towards Foundation Models for 3D Vision: How Close Are We? Yiming Zuo et.al. 2410.10799 translate read null
2024-10-14 MMAR: Towards Lossless Multi-Modal Auto-Regressive Prababilistic Modeling Jian Yang et.al. 2410.10798 translate read null
2024-10-14 Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance Sachin Goyal et.al. 2410.10796 translate read link
2024-10-14 LiveXiv – A Multi-Modal Live Benchmark Based on Arxiv Papers Content Nimrod Shabtay et.al. 2410.10783 translate read link
2024-10-11 MiRAGeNews: Multimodal Realistic AI-Generated News Detection Runsheng Huang et.al. 2410.09045 translate read null
2024-10-11 AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation Zijun Wang et.al. 2410.09040 translate read link
2024-10-11 Semi-Supervised Learning of Noisy Mixture of Experts Models Oh-Ran Kwon et.al. 2410.09039 translate read null
2024-10-11 SimpleStrat: Diversifying Language Model Generation with Stratification Justin Wong et.al. 2410.09038 translate read null
2024-10-11 Mentor-KD: Making Small Language Models Better Multi-step Reasoners Hojae Lee et.al. 2410.09037 translate read link
2024-10-11 PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents Xiangyu Yin et.al. 2410.09034 translate read null
2024-10-11 The Impact of Visual Information in Chinese Characters: Evaluating Large Models’ Ability to Recognize and Utilize Radicals Xiaofeng Wu et.al. 2410.09013 translate read null
2024-10-11 Software Engineering and Foundation Models: Insights from Industry Blogs Using a Jury of Foundation Models Hao Li et.al. 2410.09012 translate read null
2024-10-11 SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights Ling Yang et.al. 2410.09008 translate read link
2024-10-11 From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts Zhuohao Jerry Zhang et.al. 2410.09006 translate read null
2024-10-10 Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision Shengcao Cao et.al. 2410.08209 translate read null
2024-10-10 Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training Gen Luo et.al. 2410.08202 translate read null
2024-10-10 From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions Changle Qu et.al. 2410.08197 translate read link
2024-10-10 MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Zimu Lu et.al. 2410.08196 translate read link
2024-10-10 GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment Yuancheng Xu et.al. 2410.08193 translate read null
2024-10-10 Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models Qingni Wang et.al. 2410.08174 translate read null
2024-10-10 On the Evaluation of Generative Robotic Simulations Feng Chen et.al. 2410.08172 translate read null
2024-10-10 Agent S: An Open Agentic Framework that Uses Computers Like a Human Saaket Agashe et.al. 2410.08164 translate read link
2024-10-10 Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning Amrith Setlur et.al. 2410.08146 translate read null
2024-10-10 Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs Xiaoyuan Liu et.al. 2410.08145 translate read null
2024-10-09 Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models Fei Wang et.al. 2410.07176 translate read null
2024-10-09 Do better language models have crisper vision? Jona Ruthardt et.al. 2410.07173 translate read null
2024-10-09 Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate Qidong Huang et.al. 2410.07167 translate read link
2024-10-09 Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making Manling Li et.al. 2410.07166 translate read link
2024-10-09 Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning Chongyu Fan et.al. 2410.07163 translate read null
2024-10-09 Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis Bohan Zeng et.al. 2410.07155 translate read link
2024-10-09 Mental Disorders Detection in the Era of Large Language Models Gleb Kuzmin et.al. 2410.07129 translate read null
2024-10-09 Personalized Visual Instruction Tuning Renjie Pi et.al. 2410.07113 translate read null
2024-10-09 I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy Gian Maria Campedelli et.al. 2410.07109 translate read null
2024-10-09 Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context Sangwon Yu et.al. 2410.07103 translate read null
2024-10-07 Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models Fei Wang et.al. 2410.05269 translate read null
2024-10-07 PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs Mengzhao Chen et.al. 2410.05265 translate read link
2024-10-07 TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles Qingchen Yu et.al. 2410.05262 translate read link
2024-10-07 Differential Transformer Tianzhu Ye et.al. 2410.05258 translate read null
2024-10-07 GLEE: A Unified Framework and Benchmark for Language-based Economic Environments Eilam Shapira et.al. 2410.05254 translate read link
2024-10-07 Causal Micro-Narratives Mourad Heddaya et.al. 2410.05252 translate read null
2024-10-07 LoTLIP: Improving Language-Image Pre-training for Long Text Understanding Wei Wu et.al. 2410.05249 translate read null
2024-10-07 SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe Yuxin Xiao et.al. 2410.05248 translate read null
2024-10-07 Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents Boyu Gou et.al. 2410.05243 translate read null
2024-10-07 GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models Iman Mirzadeh et.al. 2410.05229 translate read null
2024-10-04 Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models Zhuochun Li et.al. 2410.03663 translate read null
2024-10-04 RAFT: Realistic Attacks to Fool Text Detectors James Wang et.al. 2410.03658 translate read null
2024-10-04 Aligning LLMs with Individual Preferences via Interaction Shujin Wu et.al. 2410.03642 translate read link
2024-10-04 Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation Jie Xiao et.al. 2410.03613 translate read null
2024-10-04 TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation Jonathan Cook et.al. 2410.03608 translate read null
2024-10-04 Efficiently Identifying Watermarked Segments in Mixed-Source Texts Xuandong Zhao et.al. 2410.03600 translate read null
2024-10-04 Understanding Reasoning in Chain-of-Thought from the Hopfieldian View Lijie Hu et.al. 2410.03595 translate read null
2024-10-04 Explicit, Implicit, and Scattered: Revisiting Event Extraction to Capture Complex Arguments Omar Sharif et.al. 2410.03594 translate read null
2024-10-04 Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models Xin Zou et.al. 2410.03577 translate read null
2024-10-04 Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) Abrar Rahman et.al. 2410.03568 translate read null
2024-10-03 FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models Zhipei Xu et.al. 2410.02761 translate read null
2024-10-03 Loong: Generating Minute-level Long Videos with Autoregressive Language Models Yuqing Wang et.al. 2410.02757 translate read null
2024-10-03 SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost Jifan Zhang et.al. 2410.02755 translate read null
2024-10-03 Training Language Models on Synthetic Edit Sequences Improves Code Synthesis Ulyana Piterbarg et.al. 2410.02749 translate read null
2024-10-03 CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation Han He et.al. 2410.02748 translate read null
2024-10-03 Contrastive Localized Language-Image Pre-Training Hong-You Chen et.al. 2410.02746 translate read null
2024-10-03 Neutral residues: revisiting adapters for model extension Franck Signe Talla et.al. 2410.02744 translate read null
2024-10-03 MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions Yekun Chai et.al. 2410.02743 translate read null
2024-10-03 Grounding Large Language Models In Embodied Environment With Imperfect World Models Haolan Liu et.al. 2410.02742 translate read null
2024-10-03 Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization Lei Xu et.al. 2410.02741 translate read null
2024-10-02 Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads Yuxiang Huang et.al. 2410.01805 translate read link
2024-10-02 Efficient $1$ -bit tensor approximations Alex W. Neal Riasanovsky et.al. 2410.01799 translate read null
2024-10-02 Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models Joseph Lee et.al. 2410.01795 translate read link
2024-10-02 When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1 R. Thomas McCoy et.al. 2410.01792 translate read null
2024-10-02 Investigating on RLHF methodology Alexey Kutalev et.al. 2410.01789 translate read null
2024-10-02 OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models Heng Yang et.al. 2410.01784 translate read link
2024-10-02 Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models Shayekh Bin Islam et.al. 2410.01782 translate read null
2024-10-02 Quantifying Generalization Complexity for Large Language Models Zhenting Qi et.al. 2410.01769 translate read null
2024-10-02 LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks Mengzhao Jia et.al. 2410.01744 translate read null
2024-10-02 VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models Kailai Feng et.al. 2410.01738 translate read link
2024-10-02 Linear Projections of Teacher Embeddings for Few-Class Distillation Noel Loo et.al. 2409.20449 translate read null
2024-10-01 Instance-adaptive Zero-shot Chain-of-Thought Prompting Xiaosong Yuan et.al. 2409.20441 translate read null

(<a href=../LLM.md>back to LLM</a>)