Image Generation - 2025-05

Publish Date Title Authors PDF Translate Read Code
2025-05-30 ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL Yu Zhang et.al. 2505.24875 translate read link
2025-05-30 GenSpace: Benchmarking Spatially-Aware Image Generation Zehan Wang et.al. 2505.24870 translate read null
2025-05-30 Draw ALL Your Imagine: A Holistic Benchmark and Agent Framework for Complex Instruction-based Image Generation Yucheng Zhou et.al. 2505.24787 translate read link
2025-05-30 QGAN-based data augmentation for hybrid quantum-classical neural networks Run-Ze He et.al. 2505.24780 translate read null
2025-05-30 DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds Jiaxu Zhang et.al. 2505.24733 translate read null
2025-05-30 un$^2$CLIP: Improving CLIP’s Visual Detail Capturing Ability via Inverting unCLIP Yinqi Li et.al. 2505.24517 translate read link
2025-05-30 Graph Flow Matching: Enhancing Image Generation with Neighbor-Aware Flow Fields Md Shahriar Rahim Siddiqui et.al. 2505.24434 translate read null
2025-05-30 Category-aware EEG image generation based on wavelet transform and contrast semantic loss Enshang Zhang et.al. 2505.24301 translate read null
2025-05-30 Unleashing High-Quality Image Generation in Diffusion Sampling Using Second-Order Levenberg-Marquardt-Langevin Fangyikang Wang et.al. 2505.24222 translate read link
2025-05-29 LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Yusuf Dalva et.al. 2505.23758 translate read null
2025-05-29 How Animals Dance (When You’re Not Looking) Xiaojuan Wang et.al. 2505.23738 translate read null
2025-05-29 Inference-time Scaling of Diffusion Models through Classical Search Xiangcheng Zhang et.al. 2505.23614 translate read null
2025-05-29 Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model Qingyu Shi et.al. 2505.23606 translate read link
2025-05-29 PCA for Enhanced Cross-Dataset Generalizability in Breast Ultrasound Tumor Segmentation Christian Schmidt et.al. 2505.23587 translate read null
2025-05-29 R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation Kaijie Chen et.al. 2505.23493 translate read null
2025-05-29 VITON-DRR: Details Retention Virtual Try-on via Non-rigid Registration Ben Li et.al. 2505.23439 translate read link
2025-05-29 Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering Sixian Wang et.al. 2505.23343 translate read link
2025-05-29 Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis Hengyuan Cao et.al. 2505.23325 translate read link
2025-05-29 Score-based Generative Modeling for Conditional Independence Testing Yixin Ren et.al. 2505.23309 translate read link
2025-05-28 SPIRAL: Semantic-Aware Progressive LiDAR Scene Generation Dekai Zhu et.al. 2505.22643 translate read null
2025-05-28 Principled Out-of-Distribution Generalization via Simplicity Jiawei Ge et.al. 2505.22622 translate read null
2025-05-28 ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models Dmitrii Sorokin et.al. 2505.22569 translate read link
2025-05-28 TabularQGAN: A Quantum Generative Model for Tabular Data Pallavi Bhardwaj et.al. 2505.22533 translate read null
2025-05-28 PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models Junwen Chen et.al. 2505.22523 translate read null
2025-05-28 ProCrop: Learning Aesthetic Image Cropping from Professional Compositions Ke Zhang et.al. 2505.22490 translate read null
2025-05-28 Self-Reflective Reinforcement Learning for Diffusion-based Image Reasoning Generation Jiadong Pan et.al. 2505.22407 translate read null
2025-05-28 PacTure: Efficient PBR Texture Generation on Packed Views with Visual Autoregressive Models Fan Fei et.al. 2505.22394 translate read null
2025-05-28 Identity-Preserving Text-to-Image Generation via Dual-Level Feature Decoupling and Expert-Guided Fusion Kewen Chen et.al. 2505.22360 translate read null
2025-05-28 Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers Weilun Feng et.al. 2505.22167 translate read null
2025-05-27 Policy Optimized Text-to-Image Pipeline Design Uri Gadot et.al. 2505.21478 translate read null
2025-05-27 DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction Yiheng Liu et.al. 2505.21473 translate read link
2025-05-27 Creativity in LLM-based Multi-Agent Systems: A Survey Yi-Cheng Lin et.al. 2505.21116 translate read null
2025-05-27 Facial Attribute Based Text Guided Face Anonymization Mustafa İzzet Muştu et.al. 2505.21002 translate read null
2025-05-27 OrienText: Surface Oriented Textual Image Generation Shubham Singh Paliwal et.al. 2505.20958 translate read null
2025-05-27 Unveiling Impact of Frequency Components on Membership Inference Attacks for Diffusion Models Puwei Lian et.al. 2505.20955 translate read null
2025-05-27 Create Anything Anywhere: Layout-Controllable Personalized Diffusion Model for Multiple Subjects Wei Li et.al. 2505.20909 translate read null
2025-05-27 Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech Nam-Gyu Kim et.al. 2505.20868 translate read null
2025-05-27 Not All Thats Rare Is Lost: Causal Paths to Rare Concept Synthesis Bo-Kai Ruan et.al. 2505.20808 translate read null
2025-05-27 Unpaired Image-to-Image Translation for Segmentation and Signal Unmixing Nikola Andrejic et.al. 2505.20746 translate read null
2025-05-26 FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities Jin Wang et.al. 2505.20147 translate read null
2025-05-26 Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion Zheqi Lv et.al. 2505.20053 translate read link
2025-05-26 StyleAR: Customizing Multimodal Autoregressive Model for Style-Aligned Text-to-Image Generation Yi Wu et.al. 2505.19874 translate read null
2025-05-26 Applications and Effect Evaluation of Generative Adversarial Networks in Semi-Supervised Learning Jiyu Hu et.al. 2505.19522 translate read null
2025-05-26 Structure Disruption: Subverting Malicious Diffusion-Based Inpainting via Self-Attention Query Perturbation Yuhao He et.al. 2505.19425 translate read null
2025-05-26 MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models Hang Hua et.al. 2505.19415 translate read null
2025-05-25 TextDiffuser-RL: Efficient and Robust Text Layout Optimization for High-Fidelity Text-to-Image Synthesis Kazi Mahathir Rahman et.al. 2505.19291 translate read null
2025-05-25 DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving Chen Shi et.al. 2505.19239 translate read null
2025-05-25 RAISE: Realness Assessment for Image Synthesis and Evaluation Aniruddha Mukherjee et.al. 2505.19233 translate read null
2025-05-25 MedITok: A Unified Tokenizer for Medical Image Synthesis and Interpretation Chenglong Ma et.al. 2505.19225 translate read link
2025-05-23 F-ANcGAN: An Attention-Enhanced Cycle Consistent Generative Adversarial Architecture for Synthetic Image Generation of Nanoparticles Varun Ajith et.al. 2505.18106 translate read null
2025-05-23 RestoreVAR: Visual Autoregressive Generation for All-in-One Image Restoration Sudarshan Rajagopalan et.al. 2505.18047 translate read null
2025-05-23 R-Genie: Reasoning-Guided Generative Image Editing Dong Zhang et.al. 2505.17768 translate read null
2025-05-23 FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving Shuang Zeng et.al. 2505.17685 translate read null
2025-05-23 Audio-to-Audio Emotion Conversion With Pitch And Duration Style Transfer Soumya Dutta et.al. 2505.17655 translate read null
2025-05-23 MMMG: a Comprehensive and Reliable Evaluation Suite for Multitask Multimodal Generation Jihan Yao et.al. 2505.17613 translate read link
2025-05-23 Deeper Diffusion Models Amplify Bias Shahin Hakemi et.al. 2505.17560 translate read null
2025-05-23 Graph Style Transfer for Counterfactual Explainability Bardh Prenkaj et.al. 2505.17542 translate read null
2025-05-23 RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning Mingrui Wu et.al. 2505.17540 translate read link
2025-05-23 Co-Reinforcement Learning for Unified Multimodal Understanding and Generation Jingjing Jiang et.al. 2505.17534 translate read null
2025-05-22 GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Chengqi Duan et.al. 2505.17022 translate read link
2025-05-22 Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO Chengzhuo Tong et.al. 2505.17017 translate read link
2025-05-22 Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On Siqi Wan et.al. 2505.16977 translate read link
2025-05-22 Creatively Upscaling Images with Global-Regional Priors Yurui Qian et.al. 2505.16976 translate read null
2025-05-22 Power-Law Decay Loss for Large Language Model Finetuning: Focusing on Information Sparsity to Enhance Generation Quality Jintian Shao et.al. 2505.16900 translate read null
2025-05-22 Conditional Panoramic Image Generation via Masked Autoregressive Modeling Chaoyang Wang et.al. 2505.16862 translate read null
2025-05-22 Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation Hongji Yang et.al. 2505.16763 translate read null
2025-05-22 Synthesis of Ventilator Dyssynchrony Waveforms using a Hybrid Generative Model and a Lung Model Sagar Deep Deb et.al. 2505.16462 translate read null
2025-05-22 UBGAN: Enhancing Coded Speech with Blind and Guided Bandwidth Extension Kishan Gupta et.al. 2505.16404 translate read null
2025-05-22 Style Transfer with Diffusion Models for Synthetic-to-Real Domain Adaptation Estelle Chigot et.al. 2505.16360 translate read link
2025-05-21 MMaDA: Multimodal Large Diffusion Language Models Ling Yang et.al. 2505.15809 translate read link
2025-05-21 IA-T2I: Internet-Augmented Text-to-Image Generation Chuanhao Li et.al. 2505.15779 translate read null
2025-05-21 FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion Kazuaki Mishima et.al. 2505.15313 translate read null
2025-05-21 BadSR: Stealthy Label Backdoor Attacks on Image Super-Resolution Ji Guo et.al. 2505.15308 translate read null
2025-05-21 Scaling Diffusion Transformers Efficiently via $\mu$P Chenyu Zheng et.al. 2505.15270 translate read link
2025-05-21 Contrastive Learning-Enhanced Trajectory Matching for Small-Scale Dataset Distillation Wenmin Li et.al. 2505.15267 translate read null
2025-05-21 GT^2-GS: Geometry-aware Texture Transfer for Gaussian Splatting Wenjie Liu et.al. 2505.15208 translate read null
2025-05-21 Harnessing Caption Detailness for Data-Efficient Text-to-Image Generation Xinran Wang et.al. 2505.15172 translate read null
2025-05-20 TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis Yu Zhang et.al. 2505.14910 translate read link
2025-05-20 UniGen: Enhanced Training & Test-Time Strategies for Unified Multimodal Understanding and Generation Rui Tian et.al. 2505.14682 translate read null
2025-05-20 Training-Free Watermarking for Autoregressive Image Generation Yu Tong et.al. 2505.14673 translate read link
2025-05-20 SparC: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling Zhihao Li et.al. 2505.14521 translate read null
2025-05-20 Latent Flow Transformer Yen-Chen Wu et.al. 2505.14513 translate read link
2025-05-20 VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank Tianhe Wu et.al. 2505.14460 translate read link
2025-05-20 Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives Xingxing Weng et.al. 2505.14361 translate read null
2025-05-20 Handloom Design Generation Using Generative Networks Rajat Kanti Bhattacharjee et.al. 2505.14330 translate read null
2025-05-20 Towards Generating Realistic Underwater Images Abdul-Kazeem Shamba et.al. 2505.14296 translate read null
2025-05-20 EVA: Red-Teaming GUI Agents via Evolving Indirect Prompt Injection Yijie Lu et.al. 2505.14289 translate read null
2025-05-20 Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization Yuanyuan Chang et.al. 2505.14254 translate read link
2025-05-19 VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation Huawei Lin et.al. 2505.13439 translate read link
2025-05-20 Swin DiT: Diffusion Transformer using Pseudo Shifted Windows Jiafu Wu et.al. 2505.13219 translate read null
2025-05-19 Diffusion Models with Double Guidance: Generate with aggregated datasets Yanfeng Yang et.al. 2505.13213 translate read null
2025-05-19 A Physics-Inspired Optimizer: Velocity Regularized Adam Pranav Vaidhyanathan et.al. 2505.13196 translate read null
2025-05-19 Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model Jonas Brenig et.al. 2505.13152 translate read link
2025-05-19 Accelerate TarFlow Sampling with GS-Jacobi Iteration Ben Liu et.al. 2505.12849 translate read link
2025-05-19 A Comprehensive Benchmarking Platform for Deep Generative Models in Molecular Design Adarsh Singh et.al. 2505.12848 translate read null
2025-05-19 A Study on the Refining Handwritten Font by Mixing Font Styles Avinash Kumar et.al. 2505.12834 translate read link
2025-05-19 SynDec: A Synthesize-then-Decode Approach for Arbitrary Textual Style Transfer via Large Language Models Han Sun et.al. 2505.12821 translate read null
2025-05-19 FRAbench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities Shibo Hong et.al. 2505.12795 translate read link
2025-05-16 PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment Dingbang Huang et.al. 2505.11468 translate read null
2025-05-16 GOUHFI: a novel contrast- and resolution-agnostic segmentation tool for Ultra-High Field MRI Marc-Antoine Fortin et.al. 2505.11445 translate read link
2025-05-16 Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior Chin-Yun Yu et.al. 2505.11315 translate read null
2025-05-16 DRAGON: A Large-Scale Dataset of Realistic Images Generated by Diffusion Models Giulia Bertazzini et.al. 2505.11257 translate read null
2025-05-16 Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models Fu-Yun Wang et.al. 2505.11245 translate read link
2025-05-16 CompAlign: Improving Compositional Text-to-Image Generation with a Complex Benchmark and Fine-Grained Feedback Yixin Wan et.al. 2505.11178 translate read null
2025-05-16 One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework Feiran Li et.al. 2505.11131 translate read link
2025-05-16 Deepfake Forensic Analysis: Source Dataset Attribution and Legal Implications of Synthetic Media Manipulation Massimiliano Cassia et.al. 2505.11110 translate read null
2025-05-16 HSRMamba: Efficient Wavelet Stripe State Space Model for Hyperspectral Image Super-Resolution Baisong Li et.al. 2505.11062 translate read null
2025-05-16 DDAE++: Enhancing Diffusion Models Towards Unified Generative and Discriminative Learning Weilai Xiang et.al. 2505.10999 translate read null
2025-05-15 End-to-End Vision Tokenizer Tuning Wenxuan Wang et.al. 2505.10562 translate read null
2025-05-15 CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs Raman Dutt et.al. 2505.10496 translate read link
2025-05-15 SOS: A Shuffle Order Strategy for Data Augmentation in Industrial Human Activity Recognition Anh Tuan Ha et.al. 2505.10312 translate read null
2025-05-15 Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis Bingda Tang et.al. 2505.10046 translate read link
2025-05-15 CartoAgent: a multimodal large language model-powered multi-agent cartographic framework for map style transfer and evaluation Chenglong Wang et.al. 2505.09936 translate read null
2025-05-14 EnerVerse-AC: Envisioning Embodied Environments with Action Condition Yuxin Jiang et.al. 2505.09723 translate read link
2025-05-14 Don’t Forget your Inverse DDIM for Image Editing Guillermo Gomez-Trenado et.al. 2505.09571 translate read null
2025-05-14 BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Jiuhai Chen et.al. 2505.09568 translate read link
2025-05-14 Train a Multi-Task Diffusion Policy on RLBench-18 in One Day with One GPU Yutong Hu et.al. 2505.09430 translate read link
2025-05-14 Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis Bingxin Ke et.al. 2505.09358 translate read link
2025-05-14 Q-space Guided Collaborative Attention Translation Network for Flexible Diffusion-Weighted Images Synthesis Pengli Zhu et.al. 2505.09323 translate read null
2025-05-14 An Initial Exploration of Default Images in Text-to-Image Generation Hannu Simonen et.al. 2505.09166 translate read null
2025-05-14 DPN-GAN: Inducing Periodic Activations in Generative Adversarial Networks for High-Fidelity Audio Synthesis Zeeshan Ahmad et.al. 2505.09091 translate read null
2025-05-13 SPAST: Arbitrary Style Transfer with Style Priors via Pre-trained Large-scale Model Zhanjie Zhang et.al. 2505.08695 translate read null
2025-05-13 Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models Donghoon Kim et.al. 2505.08622 translate read null
2025-05-13 DFA-CON: A Contrastive Learning Approach for Detecting Copyright Infringement in DeepFake Art Haroon Wahab et.al. 2505.08552 translate read null
2025-05-13 Skeleton-Guided Diffusion Model for Accurate Foot X-ray Synthesis in Hallux Valgus Diagnosis Midi Wan et.al. 2505.08247 translate read link
2025-05-13 Identifying Memorization of Diffusion Models through p-Laplace Analysis Jonathan Brokman et.al. 2505.08246 translate read null
2025-05-13 Unsupervised Raindrop Removal from a Single Image using Conditional Diffusion Models Lhuqita Fazry et.al. 2505.08190 translate read null
2025-05-12 Image-Guided Microstructure Optimization using Diffusion Models: Validated with Li-Mn-rich Cathode Precursors Geunho Choi et.al. 2505.07906 translate read null
2025-05-12 Synthesizing Diverse Network Flow Datasets with Scalable Dynamic Multigraph Generation Arya Grayeli et.al. 2505.07777 translate read null
2025-05-12 Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning Bohan Wang et.al. 2505.07538 translate read null
2025-05-12 Addressing degeneracies in latent interpolation for diffusion models Erik Landolsi et.al. 2505.07481 translate read null
2025-05-12 GAN-based synthetic FDG PET images from T1 brain MRI can serve to improve performance of deep unsupervised anomaly detection models Daria Zotova et.al. 2505.07364 translate read null
2025-05-12 Metrics that matter: Evaluating image quality metrics for medical image generation Yash Deo et.al. 2505.07175 translate read link
2025-05-11 Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation Md. Naimur Asif Borno et.al. 2505.06995 translate read null
2025-05-10 Learning Graph Representation of Agent Diffuser Youcef Djenouri et.al. 2505.06761 translate read link
2025-05-10 HCMA: Hierarchical Cross-model Alignment for Grounded Text-to-Image Generation Hang Wang et.al. 2505.06512 translate read link
2025-05-10 PC-SRGAN: Physically Consistent Super-Resolution Generative Adversarial Network for General Transient Simulations Md Rakibul Hasan et.al. 2505.06502 translate read null
2025-05-10 Climate in a Bottle: Towards a Generative Foundation Model for the Kilometer-Scale Global Atmosphere Noah D. Brenowitz et.al. 2505.06474 translate read null
2025-05-09 Photovoltaic Defect Image Generator with Boundary Alignment Smoothing Constraint for Domain Shift Mitigation Dongying Li et.al. 2505.06117 translate read null
2025-05-09 Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation Kunpeng Qiu et.al. 2505.06068 translate read link
2025-05-09 Discovery of the Polar Ring Galaxies with deep learning D. V. Dobrycheva et.al. 2505.05890 translate read null
2025-05-09 Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition Zhiyuan Chen et.al. 2505.05829 translate read null
2025-05-08 InstanceGen: Image Generation with Instance-level Instructions Etai Sella et.al. 2505.05678 translate read null
2025-05-08 Semantic Style Transfer for Enhancing Animal Facial Landmark Detection Anadil Hussein et.al. 2505.05640 translate read null
2025-05-08 A Preliminary Study for GPT-4o on Image Restoration Hao Yang et.al. 2505.05621 translate read link
2025-05-08 Prompt to Polyp: Clinically-Aware Medical Image Synthesis with Diffusion Models Mikhail Chaichuk et.al. 2505.05573 translate read link
2025-05-08 OXSeg: Multidimensional attention UNet-based lip segmentation using semi-supervised lip contours Hanie Moghaddasi et.al. 2505.05531 translate read null
2025-05-08 Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation Chao Liao et.al. 2505.05472 translate read null
2025-05-08 Does CLIP perceive art the same way we do? Andrea Asperti et.al. 2505.05229 translate read null
2025-05-08 Normalize Everything: A Preconditioned Magnitude-Preserving Architecture for Diffusion-Based Speech Enhancement Julius Richter et.al. 2505.05216 translate read null
2025-05-09 FlexSpeech: Towards Stable, Controllable and Expressive Text-to-Speech Linhan Ma et.al. 2505.05159 translate read null
2025-05-08 PIDiff: Image Customization for Personalized Identities with Diffusion Models Jinyu Gu et.al. 2505.05081 translate read null
2025-05-08 ViCTr: Vital Consistency Transfer for Pathology Aware Image Synthesis Onkar Susladkar et.al. 2505.04963 translate read null
2025-05-07 CRAFT: Cultural Russian-Oriented Dataset Adaptation for Focused Text-to-Image Generation Viacheslav Vasilev et.al. 2505.04851 translate read null
2025-05-07 Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers Divyansh Srivastava et.al. 2505.04718 translate read null
2025-05-08 Defining and Quantifying Creative Behavior in Popular Image Generators Aditi Ramaswamy et.al. 2505.04497 translate read null
2025-05-07 Efficient Flow Matching using Latent Variables Anirban Samaddar et.al. 2505.04486 translate read null
2025-05-07 RLMiniStyler: Light-weight RL Style Agent for Arbitrary Sequential Neural Style Generation Jing Hu et.al. 2505.04424 translate read link
2025-05-07 CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion Yanyu Li et.al. 2505.04347 translate read null
2025-05-07 A Large Language Model for Feasible and Diverse Population Synthesis Sung Yoo Lim et.al. 2505.04196 translate read null
2025-05-07 Unmasking the Canvas: A Dynamic Benchmark for Image Generation Jailbreaking and LLM Content Safety Variath Madhupal Gautham Nair et.al. 2505.04146 translate read null
2025-05-07 RFNNS: Robust Fixed Neural Network Steganography with Popular Deep Generative Models Yu Cheng et.al. 2505.04116 translate read null
2025-05-08 MAISY: Motion-Aware Image SYnthesis for Medical Image Motion Correction Andrew Zhang et.al. 2505.04105 translate read null
2025-05-06 nuGAN: Generative Adversarial Emulator for Cosmic Web with Neutrinos Neerav Kaushal et.al. 2505.03936 translate read null
2025-05-06 CaRaFFusion: Improving 2D Semantic Segmentation with Camera-Radar Point Cloud Fusion and Zero-Shot Image Inpainting Huawei Sun et.al. 2505.03679 translate read null
2025-05-06 Distribution-Conditional Generation: From Class Distribution to Creative Generation Fu Feng et.al. 2505.03667 translate read null
2025-05-06 Revolutionizing Brain Tumor Imaging: Generating Synthetic 3D FA Maps from T1-Weighted MRI using CycleGAN Models Xin Du et.al. 2505.03662 translate read null
2025-05-06 Real-Time Person Image Synthesis Using a Flow Matching Model Jiwoo Jeong et.al. 2505.03562 translate read null
2025-05-06 Safer Prompts: Reducing IP Risk in Visual Generative AI Lena Reissinger et.al. 2505.03338 translate read null
2025-05-06 Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Yibin Wang et.al. 2505.03318 translate read link
2025-05-06 Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation Jincheng Zhang et.al. 2505.03314 translate read link
2025-05-05 Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models Kuofeng Gao et.al. 2505.02824 translate read null
2025-05-06 MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation Mingcheng Li et.al. 2505.02648 translate read null
2025-05-05 Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Xinjie Zhang et.al. 2505.02567 translate read link
2025-05-05 Text to Image Generation and Editing: A Survey Pengfei Yang et.al. 2505.02527 translate read null
2025-05-05 Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction Biao Gong et.al. 2505.02471 translate read link
2025-05-04 Enhancing AI Face Realism: Cost-Efficient Quality Improvement in Distilled Diffusion Models with a Fully Synthetic Dataset Jakub Wąsala et.al. 2505.02255 translate read null
2025-05-04 Improving Physical Object State Representation in Text-to-Image Generative Systems Tianle Chen et.al. 2505.02236 translate read link
2025-05-04 Robust AI-Generated Face Detection with Imbalanced Data Yamini Sri Krubha et.al. 2505.02182 translate read link
2025-05-06 Regression is all you need for medical image translation Sebastian Rassmann et.al. 2505.02048 translate read null
2025-05-03 Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling Javier E. Santos et.al. 2505.01917 translate read null
2025-05-02 Deep Learning-Enabled System Diagnosis in Microgrids: A Feature-Feedback GAN Approach Swetha Rani Kasimalla et.al. 2505.01366 translate read null
2025-05-02 Improving Editability in Image Generation with Layer-wise Memory Daneul Kim et.al. 2505.01079 translate read link
2025-05-01 Data-Driven Optical To Thermal Inference in Pool Boiling Using Generative Adversarial Networks Qianxi Fu et.al. 2505.00823 translate read null
2025-05-01 T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Dongzhi Jiang et.al. 2505.00703 translate read link
2025-05-01 Steering Large Language Models with Register Analysis for Arbitrary Style Transfer Xinchen Yang et.al. 2505.00679 translate read null
2025-05-01 JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers Kwon Byung-Ki et.al. 2505.00482 translate read link
2025-05-01 Stealth Signals: Multi-Discriminator GANs for Covert Communications Against Diverse Wardens Afan Ali et.al. 2505.00399 translate read null
2025-05-01 GAN-based Generator of Adversarial Attack on Intelligent End-to-End Autoencoder-based Communication System Jianyuan Chen et.al. 2505.00395 translate read null
2025-05-01 Denoising weak lensing mass maps with diffusion model: systematic comparison with generative adversarial network Shohei D. Aoyama et.al. 2505.00345 translate read null
