LLM - 2024-12
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-12-30 | Distributed Mixture-of-Agents for Edge Inference with Large Language Models | Purbesh Mitra et.al. | 2412.21200 | translate | read | link |
| 2024-12-31 | HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation | Zhaojian Yu et.al. | 2412.21199 | translate | read | link |
| 2024-12-30 | Facilitating large language model Russian adaptation with Learned Embedding Propagation | Mikhail Tikhomirov et.al. | 2412.21140 | translate | read | link |
| 2024-12-30 | ExpShield: Safeguarding Web Text from Unauthorized Crawling and Language Modeling Exploitation | Ruixuan Liu et.al. | 2412.21123 | translate | read | null |
| 2024-12-30 | Toward Intelligent and Secure Cloud: Large Language Model Empowered Proactive Defense | Yuyang Zhou et.al. | 2412.21051 | translate | read | link |
| 2024-12-30 | TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization | Chia-Yu Hung et.al. | 2412.21037 | translate | read | link |
| 2024-12-30 | GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models | Shangyu Xing et.al. | 2412.21036 | translate | read | null |
| 2024-12-30 | Automated Robustness Testing for LLM-based NLP Software | Mingxuan Xiao et.al. | 2412.21016 | translate | read | link |
| 2024-12-30 | MapQaTor: A System for Efficient Annotation of Map Query Datasets | Mahir Labib Dihan et.al. | 2412.21015 | translate | read | link |
| 2024-12-31 | Verbosity-Aware Rationale Reduction: Effective Reduction of Redundant Rationale via Principled Criteria | Joonwon Jang et.al. | 2412.21006 | translate | read | null |
| 2024-12-27 | Can AI Help with Your Personal Finances? | Oudom Hean et.al. | 2412.19784 | translate | read | null |
| 2024-12-27 | Machine Learning for Sentiment Analysis of Imported Food in Trinidad and Tobago | Cassandra Daniels et.al. | 2412.19781 | translate | read | null |
| 2024-12-27 | Fortran2CPP: Automating Fortran-to-C++ Migration using LLMs via Multi-Turn Dialogue and Dual-Agent Integration | Le Chen et.al. | 2412.19770 | translate | read | link |
| 2024-12-27 | Can Large Language Models Adapt to Other Agents In-Context? | Matthew Riemer et.al. | 2412.19726 | translate | read | null |
| 2024-12-27 | Text2Insight: Transform natural language text into insights seamlessly using multi-model architecture | Pradeep Sain et.al. | 2412.19718 | translate | read | null |
| 2024-12-27 | Toward Adaptive Reasoning in Large Language Models with Thought Rollback | Sijia Chen et.al. | 2412.19707 | translate | read | link |
| 2024-12-27 | A Large-scale Interpretable Multi-modality Benchmark for Facial Image Forgery Localization | Jingchun Lian et.al. | 2412.19685 | translate | read | null |
| 2024-12-27 | Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework | Jiang Liu et.al. | 2412.19684 | translate | read | null |
| 2024-12-27 | CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs | Siyu Wang et.al. | 2412.19663 | translate | read | link |
| 2024-12-27 | FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios | Kaiyi Pang et.al. | 2412.19652 | translate | read | null |
| 2024-12-24 | Decentralized Intelligence in GameFi: Embodied AI Agents and the Convergence of DeFi and Virtual Ecosystems | Fernando Jia et.al. | 2412.18601 | translate | read | link |
| 2024-12-24 | A Paragraph is All It Takes: Rich Robot Behaviors from Interacting, Trusted LLMs | OpenMind et.al. | 2412.18588 | translate | read | null |
| 2024-12-24 | Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control | Sergey Sedov et.al. | 2412.18582 | translate | read | null |
| 2024-12-24 | Zero-resource Speech Translation and Recognition with LLMs | Karel Mundnich et.al. | 2412.18566 | translate | read | null |
| 2024-12-24 | Distilling Fine-grained Sentiment Understanding from Large Language Models | Yice Zhang et.al. | 2412.18552 | translate | read | link |
| 2024-12-24 | Token-Budget-Aware LLM Reasoning | Tingxu Han et.al. | 2412.18547 | translate | read | link |
| 2024-12-24 | PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction | Xingjian Xu et.al. | 2412.18541 | translate | read | null |
| 2024-12-24 | Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation | Derong Xu Xinhang Li et.al. | 2412.18537 | translate | read | link |
| 2024-12-24 | Automated Code Review In Practice | Umut Cihan et.al. | 2412.18531 | translate | read | null |
| 2024-12-24 | Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving | Hao Pang et.al. | 2412.18511 | translate | read | null |
| 2024-12-23 | ChatGarment: Garment Estimation, Generation and Editing via Large Language Models | Siyuan Bian et.al. | 2412.17811 | translate | read | null |
| 2024-12-23 | Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective | Xinmiao Yu et.al. | 2412.17787 | translate | read | null |
| 2024-12-23 | ResearchTown: Simulator of Human Research Community | Haofei Yu et.al. | 2412.17767 | translate | read | link |
| 2024-12-23 | Survey of Large Multimodal Model Datasets, Application Categories and Taxonomy | Priyaranjan Pattnayak et.al. | 2412.17759 | translate | read | null |
| 2024-12-23 | ADC: Enhancing Function Calling Via Adversarial Datasets and Code Line-Level Feedback | Wei Zhang et.al. | 2412.17754 | translate | read | null |
| 2024-12-23 | Deliberation in Latent Space via Differentiable Cache Augmentation | Luyang Liu et.al. | 2412.17747 | translate | read | null |
| 2024-12-23 | YuLan-Mini: An Open Data-efficient Language Model | Yiwen Hu et.al. | 2412.17743 | translate | read | link |
| 2024-12-23 | Reasoning to Attend: Try to Understand How | Rui Qian et.al. | 2412.17741 | translate | read | link |
| 2024-12-23 | Knowledge Editing through Chain-of-Thought | Changyue Wang et.al. | 2412.17727 | translate | read | link |
| 2024-12-23 | Understanding the Logic of Direct Preference Alignment through Logic | Kyle Richardson et.al. | 2412.17696 | translate | read | null |
| 2024-12-20 | HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding | Chenxin Tao et.al. | 2412.16158 | translate | read | null |
| 2024-12-20 | Offline Reinforcement Learning for LLM Multi-Step Reasoning | Huaijie Wang et.al. | 2412.16145 | translate | read | link |
| 2024-12-20 | Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation | Seyedreza Mohseni et.al. | 2412.16135 | translate | read | link |
| 2024-12-20 | Data-Driven Mechanism Design: Jointly Eliciting Preferences and Information | Dirk Bergemann et.al. | 2412.16132 | translate | read | null |
| 2024-12-20 | PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics | Daniil Larionov et.al. | 2412.16120 | translate | read | null |
| 2024-12-20 | Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts | Muhammad Abdullah Sohail et.al. | 2412.16119 | translate | read | link |
| 2024-12-20 | PruneVid: Visual Token Pruning for Efficient Video Large Language Models | Xiaohu Huang et.al. | 2412.16117 | translate | read | link |
| 2024-12-20 | The Content Moderator’s Dilemma: Removal of Toxic Content and Distortions to Online Discourse | Mahyar Habibi et.al. | 2412.16114 | translate | read | null |
| 2024-12-20 | Logical Consistency of Large Language Models in Fact-checking | Bishwamittra Ghosh et.al. | 2412.16100 | translate | read | null |
| 2024-12-20 | The Evolution of LLM Adoption in Industry Data Curation Practices | Crystal Qian et.al. | 2412.16089 | translate | read | null |
| 2024-12-19 | UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency | Enis Simsar et.al. | 2412.15216 | translate | read | null |
| 2024-12-19 | Flowing from Words to Pixels: A Framework for Cross-Modality Evolution | Qihao Liu et.al. | 2412.15213 | translate | read | null |
| 2024-12-19 | OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving | Shuo Xing et.al. | 2412.15208 | translate | read | link |
| 2024-12-19 | AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving | Shuo Xing et.al. | 2412.15206 | translate | read | link |
| 2024-12-19 | MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark | Qihao Zhao et.al. | 2412.15194 | translate | read | link |
| 2024-12-19 | LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation | Weijia Shi et.al. | 2412.15188 | translate | read | null |
| 2024-12-19 | Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning | Simon Frieder et.al. | 2412.15184 | translate | read | null |
| 2024-12-19 | HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages | Aman Chaturvedi et.al. | 2412.15178 | translate | read | null |
| 2024-12-19 | Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying | Federico Castagna et.al. | 2412.15177 | translate | read | link |
| 2024-12-19 | Rethinking Uncertainty Estimation in Natural Language Generation | Lukas Aichberger et.al. | 2412.15176 | translate | read | null |
| 2024-12-18 | Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces | Jihan Yang et.al. | 2412.14171 | translate | read | link |
| 2024-12-18 | TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks | Frank F. Xu et.al. | 2412.14161 | translate | read | link |
| 2024-12-18 | Advanced Reasoning and Transformation Engine for Multi-Step Insight Synthesis in Data Analytics with Large Language Models | Atin Sakkeer Hussain et.al. | 2412.14146 | translate | read | null |
| 2024-12-18 | LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research | Tianyang Gu et.al. | 2412.14141 | translate | read | null |
| 2024-12-18 | Design choices made by LLM-based test generators prevent them from finding bugs | Noble Saji Mathews et.al. | 2412.14137 | translate | read | null |
| 2024-12-18 | Adversarial Hubness in Multi-Modal Retrieval | Tingwei Zhang et.al. | 2412.14113 | translate | read | link |
| 2024-12-18 | Alignment faking in large language models | Ryan Greenblatt et.al. | 2412.14093 | translate | read | link |
| 2024-12-18 | Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report | Markus Dablander et.al. | 2412.14085 | translate | read | null |
| 2024-12-18 | Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification | Kyle Thompson et.al. | 2412.14063 | translate | read | null |
| 2024-12-18 | Understanding and Evaluating Trust in Generative AI and Large Language Models for Spreadsheets | Simon Thorne et.al. | 2412.14062 | translate | read | null |
| 2024-12-17 | SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents | Sheng Yin et.al. | 2412.13178 | translate | read | link |
| 2024-12-17 | DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation | Miriam Wanner et.al. | 2412.13175 | translate | read | null |
| 2024-12-17 | Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study | Bolei Ma et.al. | 2412.13169 | translate | read | link |
| 2024-12-17 | C-FedRAG: A Confidential Federated Retrieval-Augmented Generation System | Parker Addison et.al. | 2412.13163 | translate | read | null |
| 2024-12-17 | BanglishRev: A Large-Scale Bangla-English and Code-mixed Dataset of Product Reviews in E-Commerce | Mohammad Nazmush Shamael et.al. | 2412.13161 | translate | read | null |
| 2024-12-17 | SWAN: Preprocessing SGD Enables Adam-Level Performance On LLM Training With Significant Memory Reduction | Chao Ma et.al. | 2412.13148 | translate | read | null |
| 2024-12-17 | Are Your LLMs Capable of Stable Reasoning? | Junnan Liu et.al. | 2412.13147 | translate | read | link |
| 2024-12-17 | AI PERSONA: Towards Life-long Personalization of LLMs | Tiannan Wang et.al. | 2412.13103 | translate | read | null |
| 2024-12-17 | AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark | Jianlyu Chen et.al. | 2412.13102 | translate | read | link |
| 2024-12-17 | Modality-Inconsistent Continual Learning of Multimodal Large Language Models | Weiguo Pian et.al. | 2412.13050 | translate | read | null |
| 2024-12-16 | SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator | Guoxuan Chen et.al. | 2412.12094 | translate | read | link |
| 2024-12-16 | Instruction-based Image Manipulation by Watching How Things Move | Mingdeng Cao et.al. | 2412.12087 | translate | read | null |
| 2024-12-16 | CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology | Yuxuan Sun et.al. | 2412.12077 | translate | read | null |
| 2024-12-16 | CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding | Guo Chen et.al. | 2412.12075 | translate | read | null |
| 2024-12-16 | Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats | Kuleen Sasse et.al. | 2412.12072 | translate | read | link |
| 2024-12-16 | How Private are Language Models in Abstractive Summarization? | Anthony Hughes et.al. | 2412.12040 | translate | read | null |
| 2024-12-16 | Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection | Ira Ceka et.al. | 2412.12039 | translate | read | null |
| 2024-12-16 | SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval | Yueqian Lin et.al. | 2412.12009 | translate | read | null |
| 2024-12-16 | Agentic AI-Driven Technical Troubleshooting for Enterprise Systems: A Novel Weighted Retrieval-Augmented Generation Paradigm | Rajat Khanda et.al. | 2412.12006 | translate | read | null |
| 2024-12-16 | The Open Source Advantage in Large Language Models (LLMs) | Jiya Manchanda et.al. | 2412.12004 | translate | read | null |
| 2024-12-13 | UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities | Muhammad Uzair Khattak et.al. | 2412.10372 | translate | read | link |
| 2024-12-13 | Robust image classification with multi-modal large language models | Francesco Villani et.al. | 2412.10353 | translate | read | null |
| 2024-12-13 | COMET: Benchmark for Comprehensive Biological Multi-omics Evaluation Tasks and Language Models | Yuchen Ren et.al. | 2412.10347 | translate | read | null |
| 2024-12-13 | Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining | Zhiqi Ge et.al. | 2412.10342 | translate | read | null |
| 2024-12-13 | AdvPrefix: An Objective for Nuanced LLM Jailbreaks | Sicheng Zhu et.al. | 2412.10321 | translate | read | null |
| 2024-12-13 | BrushEdit: All-In-One Image Inpainting and Editing | Yaowei Li et.al. | 2412.10316 | translate | read | link |
| 2024-12-13 | DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Zhiyu Wu et.al. | 2412.10302 | translate | read | link |
| 2024-12-13 | Buzz to Broadcast: Predicting Sports Viewership Using Social Media Engagement | Anakin Trotter et.al. | 2412.10298 | translate | read | link |
| 2024-12-13 | Still “Talking About Large Language Models”: Some Clarifications | Murray Shanahan et.al. | 2412.10291 | translate | read | null |
| 2024-12-13 | One world, one opinion? The superstar effect in LLM responses | Sofie Goethals et.al. | 2412.10281 | translate | read | null |
| 2024-12-12 | Doe-1: Closed-Loop Autonomous Driving with Large World Model | Wenzhao Zheng et.al. | 2412.09627 | translate | read | link |
| 2024-12-12 | EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM | Zhuofan Zong et.al. | 2412.09618 | translate | read | null |
| 2024-12-12 | Olympus: A Universal Task Router for Computer Vision Tasks | Yuanze Lin et.al. | 2412.09612 | translate | read | link |
| 2024-12-12 | SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding | Hao Li et.al. | 2412.09604 | translate | read | null |
| 2024-12-12 | Do Multimodal Large Language Models See Like Humans? | Jiaying Lin et.al. | 2412.09603 | translate | read | null |
| 2024-12-12 | InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions | Pan Zhang et.al. | 2412.09596 | translate | read | link |
| 2024-12-12 | OpenNER 1.0: Standardized Open-Access Named Entity Recognition Datasets in 50+ Languages | Chester Palen-Michel et.al. | 2412.09587 | translate | read | null |
| 2024-12-12 | DISHONEST: Dissecting misInformation Spread using Homogeneous sOcial NEtworks and Semantic Topic classification | Caleb Stam et.al. | 2412.09578 | translate | read | null |
| 2024-12-12 | DiverseAgentEntropy: Quantifying Black-Box LLM Uncertainty through Diverse Perspectives and Multi-Agent Interaction | Yu Feng et.al. | 2412.09572 | translate | read | null |
| 2024-12-12 | Does Representation Matter? Exploring Intermediate Layers in Large Language Models | Oscar Skean et.al. | 2412.09563 | translate | read | null |
| 2024-12-11 | Generative Semantic Communication: Architectures, Technologies, and Applications | Jinke Ren et.al. | 2412.08642 | translate | read | null |
| 2024-12-11 | Fast Prompt Alignment for Text-to-Image Generation | Khalil Mrini et.al. | 2412.08639 | translate | read | link |
| 2024-12-11 | Multimodal Latent Language Modeling with Next-Token Diffusion | Yutao Sun et.al. | 2412.08635 | translate | read | null |
| 2024-12-11 | Synthetic Vision: Training Vision-Language Models to Understand Physics | Vahid Balazadeh et.al. | 2412.08619 | translate | read | null |
| 2024-12-11 | Image Retrieval Methods in the Dissimilarity Space | Madhu Kiran et.al. | 2412.08618 | translate | read | null |
| 2024-12-11 | Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models | Jiahui Li et.al. | 2412.08615 | translate | read | link |
| 2024-12-11 | Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning | Fan Lu et.al. | 2412.08614 | translate | read | link |
| 2024-12-11 | Preference Discerning with LLM-Enhanced Generative Retrieval | Fabian Paischer et.al. | 2412.08604 | translate | read | null |
| 2024-12-11 | Empirical Measurements of AI Training Power Demand on a GPU-Accelerated Node | Imran Latif et.al. | 2412.08602 | translate | read | null |
| 2024-12-11 | Leveraging Graph-RAG and Prompt Engineering to Enhance LLM-Based Automated Requirement Traceability and Compliance Checks | Arsalan Masoudifard et.al. | 2412.08593 | translate | read | null |
| 2024-12-10 | BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities | Sahal Shaji Mullappilly et.al. | 2412.07769 | translate | read | null |
| 2024-12-10 | Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences | Alan Nawzad Amin et.al. | 2412.07763 | translate | read | link |
| 2024-12-10 | Zero-Shot ATC Coding with Large Language Models for Clinical Assessments | Zijian Chen et.al. | 2412.07743 | translate | read | null |
| 2024-12-10 | Image Retrieval with Intra-Sweep Representation Learning for Neck Ultrasound Scanning Guidance | Wanwen Chen et.al. | 2412.07741 | translate | read | null |
| 2024-12-10 | Granite Guardian | Inkit Padhi et.al. | 2412.07724 | translate | read | link |
| 2024-12-10 | DriveMM: All-in-One Large Multimodal Model for Autonomous Driving | Zhijian Huang et.al. | 2412.07689 | translate | read | link |
| 2024-12-10 | Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions | Anant Prakash Awasthi et.al. | 2412.07687 | translate | read | null |
| 2024-12-10 | TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation | Alfredo Garrachón Ruiz et.al. | 2412.07682 | translate | read | null |
| 2024-12-10 | Ask Humans or AI? Exploring Their Roles in Visualization Troubleshooting | Shuyu Shen et.al. | 2412.07673 | translate | read | null |
| 2024-12-10 | FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks | Bocheng Chen et.al. | 2412.07672 | translate | read | null |
| 2024-12-09 | Training Large Language Models to Reason in a Continuous Latent Space | Shibo Hao et.al. | 2412.06769 | translate | read | null |
| 2024-12-09 | Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code | Joy Krishan Das et.al. | 2412.06757 | translate | read | null |
| 2024-12-09 | Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models | Neel Jain et.al. | 2412.06748 | translate | read | null |
| 2024-12-09 | JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM | Takuro Fujii et.al. | 2412.06738 | translate | read | null |
| 2024-12-09 | AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark | Lan Li et.al. | 2412.06724 | translate | read | null |
| 2024-12-09 | DEEPER: Dense Electroencephalography Passage Retrieval | Niall McGuire et.al. | 2412.06695 | translate | read | null |
| 2024-12-09 | OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions | Yi-Kai Zhang et.al. | 2412.06693 | translate | read | null |
| 2024-12-09 | Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach | Weichao Xu et.al. | 2412.06684 | translate | read | null |
| 2024-12-09 | Toward LLM-Agent-Based Modeling of Transportation Systems: A Conceptual Framework | Tianming Liu et.al. | 2412.06681 | translate | read | null |
| 2024-12-09 | I Don’t Know: Explicit Modeling of Uncertainty with an [IDK] Token | Roi Cohen et.al. | 2412.06676 | translate | read | null |
| 2024-12-06 | Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling | Zhe Chen et.al. | 2412.05271 | translate | read | null |
| 2024-12-06 | APOLLO: SGD-like Memory, AdamW-level Performance | Hanqing Zhu et.al. | 2412.05270 | translate | read | link |
| 2024-12-06 | CompCap: Improving Multimodal Large Language Models with Composite Captions | Xiaohui Chen et.al. | 2412.05243 | translate | read | null |
| 2024-12-06 | MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale | Jarvis Guo et.al. | 2412.05237 | translate | read | link |
| 2024-12-06 | BEExformer: A Fast Inferencing Transformer Architecture via Binarization with Multiple Early Exits | Wazib Ansar et.al. | 2412.05225 | translate | read | null |
| 2024-12-06 | 100% Hallucination Elimination Using Acurai | Michael C. Wood et.al. | 2412.05223 | translate | read | null |
| 2024-12-06 | Evaluating and Aligning CodeLLMs on Human Preference | Jian Yang et.al. | 2412.05210 | translate | read | link |
| 2024-12-06 | A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges | Aditi Singh et.al. | 2412.05208 | translate | read | null |
| 2024-12-06 | Are Frontier Large Language Models Suitable for Q&A in Science Centres? | Jacob Watson et.al. | 2412.05200 | translate | read | null |
| 2024-12-06 | SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot | Jinlin Wu et.al. | 2412.05187 | translate | read | link |
| 2024-12-05 | p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay | Jun Zhang et.al. | 2412.04449 | translate | read | link |
| 2024-12-05 | EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios | Lu Qiu et.al. | 2412.04447 | translate | read | null |
| 2024-12-05 | Moto: Latent Motion Token as the Bridging Language for Robot Manipulation | Yi Chen et.al. | 2412.04445 | translate | read | link |
| 2024-12-05 | Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation | Yuying Ge et.al. | 2412.04432 | translate | read | link |
| 2024-12-05 | Grounding Descriptions in Images informs Zero-Shot Visual Recognition | Shaunak Halbe et.al. | 2412.04429 | translate | read | link |
| 2024-12-05 | Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion | Jiuhai Chen et.al. | 2412.04424 | translate | read | link |
| 2024-12-05 | Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation | Xuying Li et.al. | 2412.04415 | translate | read | null |
| 2024-12-05 | Retrieval-Augmented Machine Translation with Unstructured Knowledge | Jiaan Wang et.al. | 2412.04342 | translate | read | link |
| 2024-12-05 | Liquid: Language Models are Scalable Multi-modal Generators | Junfeng Wu et.al. | 2412.04332 | translate | read | link |
| 2024-12-05 | The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation | Fredrik Carlsson et.al. | 2412.04318 | translate | read | null |
| 2024-12-04 | From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents | Xinyi Mou et.al. | 2412.03563 | translate | read | link |
| 2024-12-04 | SPICE: Smart Projection Interface for Cooking Enhancement | Vera Prohaska et.al. | 2412.03551 | translate | read | null |
| 2024-12-04 | Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models | Natalie Mackraz et.al. | 2412.03537 | translate | read | null |
| 2024-12-04 | A Review on Scientific Knowledge Extraction using Large Language Models in Biomedical Sciences | Gabriel Lino Garcia et.al. | 2412.03531 | translate | read | null |
| 2024-12-04 | FANAL – Financial Activity News Alerting Language Modeling Framework | Urjitkumar Patel et.al. | 2412.03527 | translate | read | null |
| 2024-12-04 | You’re (Not) My Type – Can LLMs Generate Feedback of Specific Types for Introductory Programming Tasks? | Dominic Lohr et.al. | 2412.03516 | translate | read | null |
| 2024-12-04 | Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective | Neta Shaul et.al. | 2412.03487 | translate | read | null |
| 2024-12-04 | Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning | Neale Ratzlaff et.al. | 2412.03467 | translate | read | null |
| 2024-12-04 | From Words to Workflows: Automating Business Processes | Laura Minkova et.al. | 2412.03446 | translate | read | null |
| 2024-12-04 | RedStone: Curating General, Code, Math, and QA Data for Large Language Models | Yaoyao Chang et.al. | 2412.03398 | translate | read | null |
| 2024-12-03 | T-REG: Preference Optimization with Token-Level Reward Regularization | Wenxuan Zhou et.al. | 2412.02685 | translate | read | link |
| 2024-12-03 | Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models | Yuda Song et.al. | 2412.02674 | translate | read | null |
| 2024-12-03 | LLM-Enhanced Path Planning: Safe and Efficient Autonomous Navigation with Instructional Inputs | Pranav Doma et.al. | 2412.02655 | translate | read | null |
| 2024-12-03 | Time-Reversal Provides Unsupervised Feedback to LLMs | Yerram Varun et.al. | 2412.02626 | translate | read | null |
| 2024-12-03 | Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback | Hiroki Furuta et.al. | 2412.02617 | translate | read | null |
| 2024-12-03 | AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? | Kaixiong Gong et.al. | 2412.02611 | translate | read | link |
| 2024-12-03 | Interpretable Company Similarity with Sparse Autoencoders | Marco Molinari et.al. | 2412.02605 | translate | read | null |
| 2024-12-03 | CEGI: Measuring the trade-off between efficiency and carbon emissions for SLMs and VLMs | Abhas Kumar et.al. | 2412.02602 | translate | read | null |
| 2024-12-03 | PrefixLLM: LLM-aided Prefix Circuit Design | Weihua Xiao et.al. | 2412.02594 | translate | read | null |
| 2024-12-03 | OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation | Junyuan Zhang et.al. | 2412.02592 | translate | read | link |
| 2024-12-02 | T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs | Shukang Yin et.al. | 2411.19951 | translate | read | link |
| 2024-12-02 | Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM’s Reasoning Capability | Zicheng Lin et.al. | 2411.19943 | translate | read | link |
| 2024-12-02 | LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states | Luis Ibanez-Lissen et.al. | 2411.19876 | translate | read | null |