Image Generation - 2024-06

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-06-28 | Wavelets Are All You Need for Autoregressive Image Generation | Wael Mattar et.al. | 2406.19997 | translate | read | null |
| 2024-06-28 | Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs | Sangwon Jeong et.al. | 2406.19987 | translate | read | null |
| 2024-06-28 | Kolmogorov-Smirnov GAN | Maciej Falkiewicz et.al. | 2406.19948 | translate | read | null |
| 2024-06-28 | MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance | Yuang Zhang et.al. | 2406.19680 | translate | read | link |
| 2024-06-28 | PopAlign: Population-Level Alignment for Fair Text-to-Image Generation | Shufan Li et.al. | 2406.19668 | translate | read | link |
| 2024-06-28 | Network Bending of Diffusion Models for Audio-Visual Generation | Luke Dzwonczyk et.al. | 2406.19589 | translate | read | null |
| 2024-06-27 | Understanding Modality Preferences in Search Clarification | Leila Tavakoli et.al. | 2406.19546 | translate | read | null |
| 2024-06-27 | Using diffusion model as constraint: Empower Image Restoration Network Training with Diffusion Model | Jiangtong Tan et.al. | 2406.19030 | translate | read | null |
| 2024-06-27 | Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis | Vu Minh Hieu Phan et.al. | 2406.18967 | translate | read | link |
| 2024-06-28 | AnyControl: Create Your Artwork with Versatile Control on Text-to-Image Generation | Yanan Sun et.al. | 2406.18958 | translate | read | link |
| 2024-06-27 | CLIP3D-AD: Extending CLIP for 3D Few-Shot Anomaly Detection with Multi-View Images Generation | Zuo Zuo et.al. | 2406.18941 | translate | read | null |
| 2024-06-26 | MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data | William Berman et.al. | 2406.18790 | translate | read | null |
| 2024-06-28 | CSI4Free: GAN-Augmented mmWave CSI for Improved Pose Classification | Nabeel Nisar Bhat et.al. | 2406.18684 | translate | read | null |
| 2024-06-26 | MultiDiff: Consistent Novel View Synthesis from a Single Image | Norman Müller et.al. | 2406.18524 | translate | read | null |
| 2024-06-26 | DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance | Younghyun Kim et.al. | 2406.18459 | translate | read | link |
| 2024-06-26 | Generalized Deepfake Attribution | Sowdagar Mahammad Shahid et.al. | 2406.18278 | translate | read | null |
| 2024-06-26 | VDG: Vision-Only Dynamic Gaussian for Driving Simulation | Hao Li et.al. | 2406.18198 | translate | read | null |
| 2024-06-25 | Detection of Synthetic Face Images: Accuracy, Robustness, Generalization | Nela Petrzelkova et.al. | 2406.17547 | translate | read | null |
| 2024-06-25 | TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification | Joshua Niemeijer et.al. | 2406.17473 | translate | read | null |
| 2024-06-25 | A Matrix Product State Model for Simultaneous Classification and Generation | Alex Mossi et.al. | 2406.17441 | translate | read | null |
| 2024-06-25 | SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing | Ruihuang Li et.al. | 2406.17396 | translate | read | null |
| 2024-06-25 | Semantic Deep Hiding for Robust Unlearnable Examples | Ruohan Meng et.al. | 2406.17349 | translate | read | null |
| 2024-06-25 | Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Lei Chen et.al. | 2406.17343 | translate | read | link |
| 2024-06-25 | Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds | Hongliang Zeng et.al. | 2406.17342 | translate | read | null |
| 2024-06-25 | Expansive Synthesis: Generating Large-Scale Datasets from Minimal Samples | Vahid Jebraeeli et.al. | 2406.17238 | translate | read | null |
| 2024-06-24 | Integrating Generative AI with Network Digital Twins for Enhanced Network Operations | Kassi Muhammad et.al. | 2406.17112 | translate | read | null |
| 2024-06-24 | Fine-tuning Diffusion Models for Enhancing Face Quality in Text-to-image Generation | Zhenyi Liao et.al. | 2406.17100 | translate | read | link |
| 2024-06-24 | DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | Yuang Peng et.al. | 2406.16855 | translate | read | link |
| 2024-06-24 | Concentration Inequalities for $(f,Γ)$-GANs | Jeremiah Birrell et.al. | 2406.16834 | translate | read | null |
| 2024-06-24 | Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation | Katherine M. Collins et.al. | 2406.16807 | translate | read | null |
| 2024-06-24 | Repulsive Score Distillation for Diverse Sampling of Diffusion Models | Nicolas Zilberstein et.al. | 2406.16683 | translate | read | link |
| 2024-06-24 | EvalAlign: Evaluating Text-to-Image Models through Precision Alignment of Multimodal Large Models with Supervised Fine-Tuning to Human Annotations | Zhiyu Tan et.al. | 2406.16562 | translate | read | link |
| 2024-06-24 | Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization | Yuhang Ma et.al. | 2406.16537 | translate | read | link |
| 2024-06-24 | ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance | Shuwei Shi et.al. | 2406.16476 | translate | read | null |
| 2024-06-24 | Improving Generative Adversarial Networks for Video Super-Resolution | Daniel Wen et.al. | 2406.16359 | translate | read | null |
| 2024-06-24 | Prompt-Consistency Image Generation (PCIG): A Unified Framework Integrating LLMs, Knowledge Graphs, and Controllable Diffusion Models | Yichen Sun et.al. | 2406.16333 | translate | read | null |
| 2024-06-24 | Repairing Catastrophic-Neglect in Text-to-Image Diffusion Models via Attention-Guided Feature Enhancement | Zhiyuan Chang et.al. | 2406.16272 | translate | read | null |
| 2024-06-21 | Fingerprint Membership and Identity Inference Against Generative Adversarial Networks | Saverio Cavasin et.al. | 2406.15253 | translate | read | null |
| 2024-06-21 | Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors | Ali Naseh et.al. | 2406.15213 | translate | read | null |
| 2024-06-21 | Disability Representations: Finding Biases in Automatic Image Generation | Yannis Tevissen et.al. | 2406.14993 | translate | read | null |
| 2024-06-21 | Latent diffusion models for parameterization and data assimilation of facies-based geomodels | Guido Di Federico et.al. | 2406.14815 | translate | read | null |
| 2024-06-20 | Evaluating Numerical Reasoning in Text-to-Image Models | Ivana Kajić et.al. | 2406.14774 | translate | read | null |
| 2024-06-20 | Holistic Evaluation for Interleaved Text-and-Image Generation | Minqian Liu et.al. | 2406.14643 | translate | read | null |
| 2024-06-20 | Invertible Consistency Distillation for Text-Guided Image Editing in Around 7 Steps | Nikita Starodubcev et.al. | 2406.14539 | translate | read | link |
| 2024-06-20 | Fantastic Copyrighted Beasts and How (Not) to Generate Them | Luxi He et.al. | 2406.14526 | translate | read | link |
| 2024-06-20 | ForSE+: Simulating non-Gaussian CMB foregrounds at 3 arcminutes in a stochastic way based on a generative adversarial network | Jian Yao et.al. | 2406.14519 | translate | read | link |
| 2024-06-20 | Video Generation with Learned Action Prior | Meenakshi Sarkar et.al. | 2406.14436 | translate | read | null |
| 2024-06-20 | CollaFuse: Collaborative Diffusion Models | Simeon Allmendinger et.al. | 2406.14429 | translate | read | link |
| 2024-06-20 | In Tree Structure Should Sentence Be Generated | Yaguang Li et.al. | 2406.14189 | translate | read | link |
| 2024-06-20 | Urban-Focused Multi-Task Offline Reinforcement Learning with Contrastive Data Sharing | Xinbo Zhao et.al. | 2406.14054 | translate | read | null |
| 2024-06-20 | The Elusive Pursuit of Replicating PATE-GAN: Benchmarking, Auditing, Debugging | Georgi Ganev et.al. | 2406.13985 | translate | read | link |
| 2024-06-20 | Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models | Yuan Zhong et.al. | 2406.13942 | translate | read | null |
| 2024-06-19 | GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation | Baiqi Li et.al. | 2406.13743 | translate | read | link |
| 2024-06-19 | AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation | Xinyu Hou et.al. | 2406.12805 | translate | read | link |
| 2024-06-18 | Cyclic 2.5D Perceptual Loss for Cross-Modal 3D Image Synthesis: T1 MRI to Tau-PET | Symac Kim et.al. | 2406.12632 | translate | read | null |
| 2024-06-18 | Unmasking the Veil: An Investigation into Concept Ablation for Privacy and Copyright Protection in Images | Shivank Garg et.al. | 2406.12592 | translate | read | link |
| 2024-06-18 | Training Diffusion Models with Federated Learning | Matthijs de Goede et.al. | 2406.12575 | translate | read | null |
| 2024-06-18 | SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions | Yuexiong Ding et.al. | 2406.12395 | translate | read | null |
| 2024-06-17 | ARTIST: Improving the Generation of Text-rich Images by Disentanglement | Jianyi Zhang et.al. | 2406.12044 | translate | read | null |
| 2024-06-17 | Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models | Alireza Ganjdanesh et.al. | 2406.12042 | translate | read | link |
| 2024-06-17 | Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI | Robert Hönig et.al. | 2406.12027 | translate | read | null |
| 2024-06-17 | Decomposed evaluations of geographic disparities in text-to-image models | Abhishek Sureddy et.al. | 2406.11988 | translate | read | null |
| 2024-06-17 | Autoregressive Image Generation without Vector Quantization | Tianhong Li et.al. | 2406.11838 | translate | read | link |
| 2024-06-17 | Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% | Lei Zhu et.al. | 2406.11837 | translate | read | link |
| 2024-06-17 | Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models | Bingqi Ma et.al. | 2406.11831 | translate | read | null |
| 2024-06-17 | PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models | Fanqing Meng et.al. | 2406.11802 | translate | read | link |
| 2024-06-17 | Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes | Aghiles Kebaili et.al. | 2406.11659 | translate | read | null |
| 2024-06-17 | Style Transfer with Multi-iteration Preference Optimization | Shuai Liu et.al. | 2406.11581 | translate | read | null |
| 2024-06-17 | Quaternion Generative Adversarial Neural Networks and Applications to Color Image Inpainting | Duan Wang et.al. | 2406.11567 | translate | read | null |
| 2024-06-17 | GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation | Shihao Cai et.al. | 2406.11503 | translate | read | null |
| 2024-06-17 | P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models | Shuo Yang et.al. | 2406.11391 | translate | read | null |
| 2024-06-17 | Generative Visual Instruction Tuning | Jefferson Hernandez et.al. | 2406.11262 | translate | read | null |
| 2024-06-14 | Make It Count: Text-to-Image Generation with an Accurate Number of Objects | Lital Binyamin et.al. | 2406.10210 | translate | read | link |
| 2024-06-14 | Crafting Parts for Expressive Object Composition | Harsh Rangwani et.al. | 2406.10197 | translate | read | null |
| 2024-06-14 | Precipitation Nowcasting Using Physics Informed Discriminator Generative Models | Junzhe Yin et.al. | 2406.10108 | translate | read | null |
| 2024-06-14 | High-efficiency generation of vectorial holograms with metasurfaces | Tong Liu et.al. | 2406.10072 | translate | read | null |
| 2024-06-14 | BiVLC: Extending Vision-Language Compositionality Evaluation with Text-to-Image Retrieval | Imanol Miranda et.al. | 2406.09952 | translate | read | link |
| 2024-06-14 | ControlVAR: Exploring Controllable Visual Autoregressive Modeling | Xiang Li et.al. | 2406.09750 | translate | read | link |
| 2024-06-13 | You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes | Jabez Magomere et.al. | 2406.09496 | translate | read | link |
| 2024-06-13 | Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models | Qihao Liu et.al. | 2406.09416 | translate | read | link |
| 2024-06-13 | An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels | Duy-Kien Nguyen et.al. | 2406.09415 | translate | read | null |
| 2024-06-13 | Understanding Hallucinations in Diffusion Models through Mode Interpolation | Sumukh K Aithal et.al. | 2406.09358 | translate | read | link |
| 2024-06-13 | Advancing Graph Generation through Beta Diffusion | Yilin He et.al. | 2406.09357 | translate | read | null |
| 2024-06-13 | Investigate the Performance of Distribution Loading with Conditional Quantum Generative Adversarial Network Algorithm on Quantum Hardware with Error Suppression | Anh Pham et.al. | 2406.09341 | translate | read | null |
| 2024-06-13 | Less Cybersickness, Please: Demystifying and Detecting Stereoscopic Visual Inconsistencies in VR Apps | Shuqing Li et.al. | 2406.09313 | translate | read | null |
| 2024-06-13 | Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation | Yufan Zhou et.al. | 2406.09305 | translate | read | null |
| 2024-06-13 | StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning | Giuseppe Vecchio et.al. | 2406.09293 | translate | read | null |
| 2024-06-13 | EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts | Yucheng Han et.al. | 2406.09162 | translate | read | null |
| 2024-06-13 | Complex Image-Generative Diffusion Transformer for Audio Denoising | Junhui Li et.al. | 2406.09161 | translate | read | null |
| 2024-06-12 | ICE-G: Image Conditional Editing of 3D Gaussian Splats | Vishnu Jaganathan et.al. | 2406.08488 | translate | read | null |
| 2024-06-12 | Words Worth a Thousand Pictures: Measuring and Understanding Perceptual Variability in Text-to-Image Generation | Raphael Tang et.al. | 2406.08482 | translate | read | null |
| 2024-06-12 | What If We Recaption Billions of Web Images with LLaMA-3? | Xianhang Li et.al. | 2406.08478 | translate | read | null |
| 2024-06-12 | PAL: Pluralistic Alignment Framework for Learning from Heterogeneous Preferences | Daiwei Chen et.al. | 2406.08469 | translate | read | link |
| 2024-06-12 | Diffusion Soup: Model Merging for Text-to-Image Diffusion Models | Benjamin Biggs et.al. | 2406.08431 | translate | read | null |
| 2024-06-12 | VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks | Jiannan Wu et.al. | 2406.08394 | translate | read | link |
| 2024-06-12 | FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation | Xinzhi Mu et.al. | 2406.08392 | translate | read | null |
| 2024-06-12 | WMAdapter: Adding WaterMark Control to Latent Diffusion Models | Hai Ci et.al. | 2406.08337 | translate | read | null |
| 2024-06-12 | CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models | Hyungjin Chung et.al. | 2406.08070 | translate | read | null |
| 2024-06-12 | Small Scale Data-Free Knowledge Distillation | He Liu et.al. | 2406.07876 | translate | read | link |
| 2024-06-11 | Image and Video Tokenization with Binary Spherical Quantization | Yue Zhao et.al. | 2406.07548 | translate | read | link |
| 2024-06-11 | Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? | Xingyu Fu et.al. | 2406.07546 | translate | read | null |
| 2024-06-11 | Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance | Kuan Heng Lin et.al. | 2406.07540 | translate | read | null |
| 2024-06-11 | Neural Gaffer: Relighting Any Object via Diffusion | Haian Jin et.al. | 2406.07520 | translate | read | null |
| 2024-06-11 | Instant 3D Human Avatar Generation using Image Diffusion Models | Nikos Kolotouros et.al. | 2406.07516 | translate | read | null |
| 2024-06-11 | Understanding Visual Concepts Across Models | Brandon Trabucco et.al. | 2406.07506 | translate | read | link |
| 2024-06-11 | Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions | Renjie Pi et.al. | 2406.07502 | translate | read | link |
| 2024-06-11 | SPIN: Spacecraft Imagery for Navigation | Javier Montalvo et.al. | 2406.07500 | translate | read | null |
| 2024-06-11 | Beware of Aliases – Signal Preservation is Crucial for Robust Image Restoration | Shashank Agnihotri et.al. | 2406.07435 | translate | read | null |
| 2024-06-11 | Is One GPU Enough? Pushing Image Generation at Higher-Resolutions with Foundation Models | Athanasios Tragakis et.al. | 2406.07251 | translate | read | link |
| 2024-06-10 | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | Peize Sun et.al. | 2406.06525 | translate | read | link |
| 2024-06-10 | Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer | Sigal Raab et.al. | 2406.06508 | translate | read | link |
| 2024-06-10 | Improving Deep Learning-based Automatic Cranial Defect Reconstruction by Heavy Data Augmentation: From Image Registration to Latent Diffusion Models | Marek Wodzinski et.al. | 2406.06372 | translate | read | null |
| 2024-06-10 | The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems | Philippe Gonzalez et.al. | 2406.06160 | translate | read | null |
| 2024-06-10 | ProcessPainter: Learn Painting Process from Sequence Data | Yiren Song et.al. | 2406.06062 | translate | read | null |
| 2024-06-09 | Are Large Language Models Actually Good at Text Style Transfer? | Sourabrata Mukherjee et.al. | 2406.05885 | translate | read | link |
| 2024-06-09 | OmniControlNet: Dual-stage Integration for Conditional Image Generation | Yilin Wang et.al. | 2406.05871 | translate | read | null |
| 2024-06-09 | GANSky – fast curved sky weak lensing simulations using Generative Adversarial Networks | Supranta S. Boruah et.al. | 2406.05867 | translate | read | null |
| 2024-06-09 | Unified Text-to-Image Generation and Retrieval | Leigang Qu et.al. | 2406.05814 | translate | read | null |
| 2024-06-09 | MLCM: Multistep Consistency Distillation of Latent Diffusion Model | Qingsong Xie et.al. | 2406.05768 | translate | read | link |
| 2024-06-07 | GANetic Loss for Generative Adversarial Networks with a Focus on Medical Applications | Shakhnaz Akhmedova et.al. | 2406.05023 | translate | read | link |
| 2024-06-07 | AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation | Lianyu Pang et.al. | 2406.05000 | translate | read | null |
| 2024-06-07 | CityCraft: A Real Crafter for 3D City Generation | Jie Deng et.al. | 2406.04983 | translate | read | null |
| 2024-06-07 | TEDi Policy: Temporally Entangled Diffusion for Robotic Control | Sigmund H. Høeg et.al. | 2406.04806 | translate | read | link |
| 2024-06-07 | PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction | Eduard Poesina et.al. | 2406.04746 | translate | read | link |
| 2024-06-07 | Activation Map-based Vector Quantization for 360-degree Image Semantic Communication | Yang Ma et.al. | 2406.04740 | translate | read | null |
| 2024-06-07 | GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models | Diptanu De et.al. | 2406.04654 | translate | read | null |
| 2024-06-07 | CLoG: Benchmarking Continual Learning of Image Generation Models | Haotian Zhang et.al. | 2406.04584 | translate | read | link |
| 2024-06-07 | SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style Transfer | Jie Zhao et.al. | 2406.04578 | translate | read | null |
| 2024-06-06 | Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance | Reyhane Askari Hemmat et.al. | 2406.04551 | translate | read | link |
| 2024-06-06 | Coherent Zero-Shot Visual Instruction Generation | Quynh Phung et.al. | 2406.04337 | translate | read | null |
| 2024-06-06 | BitsFusion: 1.99 bits Weight Quantization of Diffusion Model | Yang Sui et.al. | 2406.04333 | translate | read | link |
| 2024-06-06 | Diffusion-based image inpainting with internal learning | Nicolas Cherel et.al. | 2406.04206 | translate | read | null |
| 2024-06-06 | Machine Learning-Driven Microwave Imaging for Soil Moisture Estimation near Leaky Pipe | Mohammad Ramezaninia et.al. | 2406.04193 | translate | read | null |
| 2024-06-06 | Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis | Marianna Ohanyan et.al. | 2406.04032 | translate | read | null |
| 2024-06-06 | Quantum Implicit Neural Representations | Jiaming Zhao et.al. | 2406.03873 | translate | read | link |
| 2024-06-06 | Semantic Similarity Score for Measuring Visual Similarity at Semantic Level | Senran Fan et.al. | 2406.03865 | translate | read | null |
| 2024-06-06 | Malware Classification Based on Image Segmentation | Wanhu Nie et.al. | 2406.03831 | translate | read | null |
| 2024-06-07 | ReDistill: Residual Encoded Distillation for Peak Memory Reduction | Fang Chen et.al. | 2406.03744 | translate | read | null |
| 2024-06-05 | Style Mixture of Experts for Expressive Text-To-Speech Synthesis | Ahad Jawaid et.al. | 2406.03637 | translate | read | null |
| 2024-06-05 | LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback | Timon Ziegenbein et.al. | 2406.03363 | translate | read | link |
| 2024-06-05 | Tackling GenAI Copyright Issues: Originality Estimation and Genericization | Hiroaki Chiba-Okabe et.al. | 2406.03341 | translate | read | null |
| 2024-06-05 | Deep Generative Models for Proton Zero Degree Calorimeter Simulations in ALICE, CERN | Patryk Będkowski et.al. | 2406.03263 | translate | read | null |
| 2024-06-05 | Generative Diffusion Models for Fast Simulations of Particle Collisions at CERN | Mikołaj Kita et.al. | 2406.03233 | translate | read | null |
| 2024-06-05 | Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion | Hao Wen et.al. | 2406.03184 | translate | read | link |
| 2024-06-05 | Phy-Diff: Physics-guided Hourglass Diffusion Model for Diffusion MRI Synthesis | Juanhua Zhang et.al. | 2406.03002 | translate | read | null |
| 2024-06-05 | Adversarial Generation of Hierarchical Gaussians for 3D Generative Model | Sangeek Hyun et.al. | 2406.02968 | translate | read | link |
| 2024-06-05 | Dataset-Distillation Generative Model for Speech Emotion Recognition | Fabian Ritter-Gutierrez et.al. | 2406.02963 | translate | read | null |
| 2024-06-05 | Language-guided Detection and Mitigation of Unknown Dataset Bias | Zaiying Zhao et.al. | 2406.02889 | translate | read | null |
| 2024-06-05 | Inv-Adapter: ID Customization Generation via Image Inversion and Lightweight Adapter | Peng Xing et.al. | 2406.02881 | translate | read | null |
| 2024-06-04 | DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering | Zhongpai Gao et.al. | 2406.02518 | translate | read | null |
| 2024-06-04 | Guiding a Diffusion Model with a Bad Version of Itself | Tero Karras et.al. | 2406.02507 | translate | read | null |
| 2024-06-04 | Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation | Jiajun Wang et.al. | 2406.02485 | translate | read | null |
| 2024-06-04 | Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion | Colin Hansen et.al. | 2406.02477 | translate | read | null |
| 2024-06-04 | Generative Active Learning for Long-tailed Instance Segmentation | Muzhi Zhu et.al. | 2406.02435 | translate | read | link |
| 2024-06-04 | Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Clement Chadebec et.al. | 2406.02347 | translate | read | link |
| 2024-06-04 | I4VGen: Image as Stepping Stone for Text-to-Video Generation | Xiefan Guo et.al. | 2406.02230 | translate | read | link |
| 2024-06-04 | Analyzing the Feature Extractor Networks for Face Image Synthesis | Erdi Sarıtaş et.al. | 2406.02153 | translate | read | link |
| 2024-06-04 | FaceCom: Towards High-fidelity 3D Facial Shape Completion via Optimization and Inpainting Guidance | Yinglong Li et.al. | 2406.02074 | translate | read | link |
| 2024-06-04 | Overcoming Lower-Level Constraints in Bilevel Optimization: A Novel Approach with Regularized Gap Functions | Wei Yao et.al. | 2406.01992 | translate | read | link |

(<a href=../Image_Generation.md>back to Image Generation</a>)