Image Generation - 2025-06
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-06-30 | Calligrapher: Freestyle Text Image Customization | Yue Ma et.al. | 2506.24123 | translate | read | link |
| 2025-06-30 | Navigating with Annealing Guidance Scale in Diffusion Space | Shai Yehezkel et.al. | 2506.24108 | translate | read | null |
| 2025-06-30 | Imagine for Me: Creative Conceptual Blending of Real Images and Text via Blended Attention | Wonwoong Cho et.al. | 2506.24085 | translate | read | null |
| 2025-06-30 | World4Omni: A Zero-Shot Framework from Image Generation World Model to Robotic Manipulation | Haonan Chen et.al. | 2506.23919 | translate | read | null |
| 2025-06-30 | Radioactive Watermarks in Diffusion and Autoregressive Image Generative Models | Michel Meintz et.al. | 2506.23731 | translate | read | null |
| 2025-06-30 | A Unified Framework for Stealthy Adversarial Generation via Latent Optimization and Transferability Enhancement | Gaozheng Pei et.al. | 2506.23676 | translate | read | null |
| 2025-06-30 | Modelling effective electrical resistance in particle reinforced composites using Generative Adversarial Network | Vinit Vijay Deshpande et.al. | 2506.23655 | translate | read | null |
| 2025-06-30 | VAP-Diffusion: Enriching Descriptions with MLLMs for Enhanced Medical Image Generation | Peng Huang et.al. | 2506.23641 | translate | read | null |
| 2025-06-30 | Blending Concepts with Text-to-Image Diffusion Models | Lorenzo Olearo et.al. | 2506.23630 | translate | read | null |
| 2025-06-30 | Pyramidal Patchification Flow for Visual Generation | Hui Li et.al. | 2506.23543 | translate | read | null |
| 2025-06-27 | Low-Rank Implicit Neural Representation via Schatten-p Quasi-Norm and Jacobian Regularization | Zhengyun Cheng et.al. | 2506.22134 | translate | read | null |
| 2025-06-27 | Advancing Facial Stylization through Semantic Preservation Constraint and Pseudo-Paired Supervision | Zhanyi Lu et.al. | 2506.22022 | translate | read | null |
| 2025-06-27 | CERBERUS: Crack Evaluation & Recognition Benchmark for Engineering Reliability & Urban Stability | Justin Reinman et.al. | 2506.21909 | translate | read | null |
| 2025-06-27 | On the Feasibility of Poisoning Text-to-Image AI Models via Adversarial Mislabeling | Stanley Wu et.al. | 2506.21874 | translate | read | null |
| 2025-06-27 | PrefPaint: Enhancing Image Inpainting through Expert Human Feedback | Duy-Bao Bui et.al. | 2506.21834 | translate | read | null |
| 2025-06-27 | TaleForge: Interactive Multimodal System for Personalized Story Creation | Minh-Loi Nguyen et.al. | 2506.21832 | translate | read | null |
| 2025-06-26 | BASS. XLIV. Morphological preferences of local hard X-ray selected AGN | Miguel Parra Tello et.al. | 2506.21800 | translate | read | null |
| 2025-06-26 | Exploring Image Generation via Mutually Exclusive Probability Spaces and Local Correlation Hypothesis | Chenqiu Zhao et.al. | 2506.21731 | translate | read | null |
| 2025-06-26 | $\textrm{ODE}_t \left(\textrm{ODE}_l \right)$ : Shortcutting the Time and Length in Diffusion and Flow Models for Faster Sampling | Denis Gudovskiy et.al. | 2506.21714 | translate | read | null |
| 2025-06-26 | TanDiT: Tangent-Plane Diffusion Transformer for High-Quality 360° Panorama Generation | Hakan Çapuk et.al. | 2506.21681 | translate | read | null |
| 2025-06-26 | XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation | Bowen Chen et.al. | 2506.21416 | translate | read | link |
| 2025-06-26 | High-quality metalens enables minimally invasive CFB endoscopy | Ruixiang Song et.al. | 2506.21379 | translate | read | null |
| 2025-06-26 | GenFlow: Interactive Modular System for Image Generation | Duc-Hung Nguyen et.al. | 2506.21369 | translate | read | null |
| 2025-06-26 | BitMark for Infinity: Watermarking Bitwise Autoregressive Image Generative Models | Louis Kerner et.al. | 2506.21209 | translate | read | null |
| 2025-06-26 | Generative Adversarial Evasion and Out-of-Distribution Detection for UAV Cyber-Attacks | Deepak Kumar Panda et.al. | 2506.21142 | translate | read | null |
| 2025-06-26 | Improving Diffusion-Based Image Editing Faithfulness via Guidance and Scheduling | Hansam Cho et.al. | 2506.21045 | translate | read | null |
| 2025-06-26 | Instella-T2I: Pushing the Limits of 1D Discrete Latent Space Image Generation | Ze Wang et.al. | 2506.21022 | translate | read | null |
| 2025-06-26 | HybridQ: Hybrid Classical-Quantum Generative Adversarial Network for Skin Disease Image Generation | Qingyue Jiao et.al. | 2506.21015 | translate | read | null |
| 2025-06-26 | Distilling Normalizing Flows | Steven Walton et.al. | 2506.21003 | translate | read | null |
| 2025-06-26 | Rethink Sparse Signals for Pose-guided Text-to-image Generation | Wenjie Xuan et.al. | 2506.20983 | translate | read | null |
| 2025-06-25 | Video Perception Models for 3D Scene Synthesis | Rui Huang et.al. | 2506.20601 | translate | read | null |
| 2025-06-25 | Pay Less Attention to Deceptive Artifacts: Robust Detection of Compressed Deepfakes on Online Social Networks | Manyi Li et.al. | 2506.20548 | translate | read | null |
| 2025-06-25 | HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling | Tobias Vontobel et.al. | 2506.20452 | translate | read | link |
| 2025-06-25 | Med-Art: Diffusion Transformer for 2D Medical Text-to-Image Generation | Changlu Guo et.al. | 2506.20449 | translate | read | null |
| 2025-06-25 | Time-series surrogates from energy consumers generated by machine learning approaches for long-term forecasting scenarios | Ben Gerhards et.al. | 2506.20253 | translate | read | null |
| 2025-06-25 | FedBKD: Distilled Federated Learning to Embrace Gerneralization and Personalization on Non-IID Data | Yushan Zhao et.al. | 2506.20245 | translate | read | null |
| 2025-06-25 | EAR: Erasing Concepts from Unified Autoregressive Models | Haipeng Fan et.al. | 2506.20151 | translate | read | link |
| 2025-06-24 | Who Does What in Deep Learning? Multidimensional Game-Theoretic Attribution of Function of Neural Units | Shrey Dixit et.al. | 2506.19732 | translate | read | null |
| 2025-06-24 | Varif.ai to Vary and Verify User-Driven Diversity in Scalable Image Generation | M. Michelessa et.al. | 2506.19644 | translate | read | null |
| 2025-06-24 | Stylized Structural Patterns for Improved Neural Network Pre-training | Farnood Salehi et.al. | 2506.19465 | translate | read | null |
| 2025-06-24 | Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation | Zhifeng Wang et.al. | 2506.19455 | translate | read | null |
| 2025-06-24 | Enhancing Galaxy Classification with U-Net Variational Autoencoders for Image Denoising | Sergey Mirzoyan et.al. | 2506.19434 | translate | read | null |
| 2025-06-24 | SoK: Can Synthetic Images Replace Real Data? A Survey of Utility and Privacy of Synthetic Image Generation | Yunsung Chung et.al. | 2506.19360 | translate | read | null |
| 2025-06-24 | Style Transfer: A Decade Survey | Tianshan Zhang et.al. | 2506.19278 | translate | read | null |
| 2025-06-23 | Diffusion Transformer-to-Mamba Distillation for High-Resolution Image Generation | Yuan Yao et.al. | 2506.18999 | translate | read | null |
| 2025-06-23 | OmniGen2: Exploration to Advanced Multimodal Generation | Chenyuan Wu et.al. | 2506.18871 | translate | read | link |
| 2025-06-23 | TAMMs: Temporal-Aware Multimodal Model for Satellite Image Change Understanding and Forecasting | Zhongbin Guo et.al. | 2506.18862 | translate | read | null |
| 2025-06-23 | VisualChef: Generating Visual Aids in Cooking via Mask Inpainting | Oleh Kuzyk et.al. | 2506.18569 | translate | read | null |
| 2025-06-23 | Efficient Beam Selection for ISAC in Cell-Free Massive MIMO via Digital Twin-Assisted Deep Reinforcement Learning | Jiexin Zhang et.al. | 2506.18560 | translate | read | null |
| 2025-06-23 | Auto-Regressively Generating Multi-View Consistent Images | JiaKui Hu et.al. | 2506.18527 | translate | read | link |
| 2025-06-23 | ShowFlow: From Robust Single Concept to Condition-Free Multi-Concept Generation | Trong-Vu Hoang et.al. | 2506.18493 | translate | read | null |
| 2025-06-23 | GANs vs. Diffusion Models for virtual staining with the HER2match dataset | Pascal Klöckner et.al. | 2506.18484 | translate | read | null |
| 2025-06-23 | Frequency-Domain Fusion Transformer for Image Inpainting | Sijin He et.al. | 2506.18437 | translate | read | null |
| 2025-06-23 | Transforming H&E images into IHC: A Variance-Penalized GAN for Precision Oncology | Sara Rehmat et.al. | 2506.18371 | translate | read | null |
| 2025-06-23 | Geometry-Aware Preference Learning for 3D Texture Generation | AmirHossein Zamani et.al. | 2506.18331 | translate | read | link |
| 2025-06-20 | Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Zeyuan Yang et.al. | 2506.17218 | translate | read | link |
| 2025-06-20 | DreamCube: 3D Panorama Generation via Multi-plane Synchronization | Yukun Huang et.al. | 2506.17206 | translate | read | link |
| 2025-06-20 | Deep generative models as the probability transformation functions | Vitalii Bondar et.al. | 2506.17171 | translate | read | null |
| 2025-06-20 | Proportional Sensitivity in Generative Adversarial Network (GAN)-Augmented Brain Tumor Classification Using Convolutional Neural Network | Mahin Montasir Afif et.al. | 2506.17165 | translate | read | null |
| 2025-06-20 | The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation | Giulia Bertazzini et.al. | 2506.17016 | translate | read | null |
| 2025-06-20 | AI’s Blind Spots: Geographic Knowledge and Diversity Deficit in Generated Urban Scenario | Ciro Beneduce et.al. | 2506.16898 | translate | read | null |
| 2025-06-20 | ITO-Master: Inference-Time Optimization for Audio Effects Modeling of Music Mastering Processors | Junghyun Koo et.al. | 2506.16889 | translate | read | link |
| 2025-06-20 | Reward-Agnostic Prompt Optimization for Text-to-Image Diffusion Models | Semin Kim et.al. | 2506.16853 | translate | read | null |
| 2025-06-20 | Beyond Blur: A Fluid Perspective on Generative Diffusion Models | Grzegorz Gruszczynski et.al. | 2506.16827 | translate | read | null |
| 2025-06-20 | FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation | Fan Yang et.al. | 2506.16806 | translate | read | null |
| 2025-06-18 | Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model | Anirud Aggarwal et.al. | 2506.15682 | translate | read | link |
| 2025-06-18 | Control and Realism: Best of Both Worlds in Layout-to-Image without Training | Bonan Li et.al. | 2506.15563 | translate | read | null |
| 2025-06-18 | When Model Knowledge meets Diffusion Model: Diffusion-assisted Data-free Image Synthesis with Alignment of Domain and Class | Yujin Kim et.al. | 2506.15381 | translate | read | null |
| 2025-06-18 | Sampling 3D Molecular Conformers with Diffusion Transformers | J. Thorben Frank et.al. | 2506.15378 | translate | read | link |
| 2025-06-18 | ALMASOP. A Rotating Feature Rich in Complex Organic Molecules in a Protostellar Core | Shih-Ying Hsu et.al. | 2506.15140 | translate | read | null |
| 2025-06-18 | CWGAN-GP Augmented CAE for Jamming Detection in 5G-NR in Non-IID Datasets | Samhita Kuili et.al. | 2506.15075 | translate | read | null |
| 2025-06-18 | GalaxyGenius: A Mock Galaxy Image Generator for Various Telescopes from Hydrodynamical Simulations | Xingchen Zhou et.al. | 2506.15060 | translate | read | null |
| 2025-06-18 | Break Stylistic Sophon: Are We Really Meant to Confine the Imagination in Style Transfer? | Gary Song Yan et.al. | 2506.15033 | translate | read | null |
| 2025-06-17 | Frequency-Calibrated Membership Inference Attacks on Medical Image Diffusion Models | Xinkai Zhao et.al. | 2506.14919 | translate | read | null |
| 2025-06-17 | DETONATE: A Benchmark for Text-to-Image Alignment and Kernelized Direct Preference Optimization | Renjith Prasad et.al. | 2506.14903 | translate | read | null |
| 2025-06-17 | Cost-Aware Routing for Efficient Text-To-Image Generation | Qinchan et.al. | 2506.14753 | translate | read | null |
| 2025-06-17 | Align Your Flow: Scaling Continuous-Time Flow Map Distillation | Amirmojtaba Sabour et.al. | 2506.14603 | translate | read | null |
| 2025-06-17 | Risk Estimation of Knee Osteoarthritis Progression via Predictive Multi-task Modelling from Efficient Diffusion Model using X-ray Images | David Butler et.al. | 2506.14560 | translate | read | null |
| 2025-06-17 | Decoupled Classifier-Free Guidance for Counterfactual Diffusion Models | Tian Xia et.al. | 2506.14399 | translate | read | null |
| 2025-06-17 | DiffusionBlocks: Blockwise Training for Generative Models via Score-Based Diffusion | Makoto Shing et.al. | 2506.14202 | translate | read | null |
| 2025-06-17 | VideoMAR: Autoregressive Video Generatio with Continuous Tokens | Hu Yu et.al. | 2506.14168 | translate | read | null |
| 2025-06-16 | Fake it till You Make it: Reward Modeling as Discriminative Prediction | Runtao Liu et.al. | 2506.13846 | translate | read | null |
| 2025-06-16 | Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis | Martina Pastorino et.al. | 2506.13484 | translate | read | null |
| 2025-06-16 | SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer | Zerui Gong et.al. | 2506.13465 | translate | read | link |
| 2025-06-16 | Overcoming Occlusions in the Wild: A Multi-Task Age Head Approach to Age Estimation | Waqar Tanveer et.al. | 2506.13445 | translate | read | null |
| 2025-06-16 | PRO: Projection Domain Synthesis for CT Imaging | Kang Chen et.al. | 2506.13443 | translate | read | null |
| 2025-06-16 | Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Image Concepts | Solène Debuysère et.al. | 2506.13307 | translate | read | null |
| 2025-06-16 | Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention | Jeonghoon Park et.al. | 2506.13298 | translate | read | link |
| 2025-06-14 | Towards Seamless Borders: A Method for Mitigating Inconsistencies in Image Inpainting and Outpainting | Xingzhong Hou et.al. | 2506.12530 | translate | read | null |
| 2025-06-14 | Retrieval Augmented Comic Image Generation | Yunhao Shui et.al. | 2506.12517 | translate | read | null |
| 2025-06-14 | Fine-Grained HDR Image Quality Assessment From Noticeably Distorted to Very High Fidelity | Mohsen Jenadeleh et.al. | 2506.12505 | translate | read | null |
| 2025-06-14 | Stacked Intelligent Metasurfaces for Multi-Modal Semantic Communications | Guojun Huang et.al. | 2506.12368 | translate | read | null |
| 2025-06-13 | Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation | Min-Seop Kwak et.al. | 2506.11924 | translate | read | link |
| 2025-06-13 | Exploring the Effectiveness of Deep Features from Domain-Specific Foundation Models in Retinal Image Synthesis | Zuzanna Skorniewska et.al. | 2506.11753 | translate | read | null |
| 2025-06-13 | A Watermark for Auto-Regressive Image Generation Models | Yihan Wu et.al. | 2506.11371 | translate | read | null |
| 2025-06-12 | TARDIS STRIDE: A Spatio-Temporal Road Image Dataset for Exploration and Autonomy | Héctor Carrión et.al. | 2506.11302 | translate | read | link |
| 2025-06-13 | MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning | Yuxuan Luo et.al. | 2506.10963 | translate | read | link |
| 2025-06-12 | The Role of Generative AI in Facilitating Social Interactions: A Scoping Review | T. T. J. E. Arets et.al. | 2506.10927 | translate | read | null |
| 2025-06-12 | Symmetrical Flow Matching: Unified Image Generation, Segmentation, and Classification with Score-Based Generative Models | Francisco Caetano et.al. | 2506.10634 | translate | read | link |
| 2025-06-12 | Anatomy-Grounded Weakly Supervised Prompt Tuning for Chest X-ray Latent Diffusion Models | Konstantinos Vilouras et.al. | 2506.10633 | translate | read | null |
| 2025-06-12 | High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model | Eshan Ramesh et.al. | 2506.10605 | translate | read | null |
| 2025-06-12 | Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning | Chun-Mei Feng et.al. | 2506.10575 | translate | read | null |
| 2025-06-12 | Unitary Scrambling and Collapse: A Quantum Diffusion Framework for Generative Modeling | Yihua Li et.al. | 2506.10571 | translate | read | null |
| 2025-06-12 | Edit360: 2D Image Edits to 3D Assets from Any Angle | Junchao Huang et.al. | 2506.10507 | translate | read | null |
| 2025-06-12 | Generative Algorithms for Wildfire Progression Reconstruction from Multi-Modal Satellite Active Fire Measurements and Terrain Height | Bryan Shaddy et.al. | 2506.10404 | translate | read | null |
| 2025-06-12 | Pisces: An Auto-regressive Foundation Model for Image Understanding and Generation | Zhiyang Xu et.al. | 2506.10395 | translate | read | null |
| 2025-06-11 | Canonical Latent Representations in Conditional Diffusion Models | Yitao Xu et.al. | 2506.09955 | translate | read | null |
| 2025-06-11 | HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations | Marco Federici et.al. | 2506.09932 | translate | read | null |
| 2025-06-11 | Only-Style: Stylistic Consistency in Image Generation without Content Leakage | Tilemachos Aravanis et.al. | 2506.09916 | translate | read | null |
| 2025-06-11 | Wasserstein Distances on Quantum Structures: an Overview | Emily Beatty et.al. | 2506.09794 | translate | read | null |
| 2025-06-11 | ELBO-T2IAlign: A Generic ELBO-Based Method for Calibrating Pixel-level Text-Image Alignment in Diffusion Models | Qin Zhou et.al. | 2506.09740 | translate | read | null |
| 2025-06-11 | DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning | Dongxu Liu et.al. | 2506.09644 | translate | read | null |
| 2025-06-11 | Consistent Story Generation with Asymmetry Zigzag Sampling | Mingxiao LI et.al. | 2506.09612 | translate | read | link |
| 2025-06-11 | Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression | Dingcheng Zhen et.al. | 2506.09482 | translate | read | link |
| 2025-06-11 | Noise Conditional Variational Score Distillation | Xinyu Peng et.al. | 2506.09416 | translate | read | null |
| 2025-06-11 | SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing | Hongguang Zhu et.al. | 2506.09363 | translate | read | null |
| 2025-06-09 | StableMTL: Repurposing Latent Diffusion Models for Multi-Task Learning from Partially Annotated Synthetic Datasets | Anh-Quan Cao et.al. | 2506.08013 | translate | read | null |
| 2025-06-09 | MADFormer: Mixed Autoregressive and Diffusion Transformers for Continuous Image Generation | Junhao Chen et.al. | 2506.07999 | translate | read | null |
| 2025-06-09 | OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation | Jingjing Chang et.al. | 2506.07977 | translate | read | null |
| 2025-06-09 | Diffuse Everything: Multimodal Diffusion Models on Arbitrary State Spaces | Kevin Rojas et.al. | 2506.07903 | translate | read | null |
| 2025-06-09 | Diffusion Counterfactual Generation with Semantic Abduction | Rajat Rasal et.al. | 2506.07883 | translate | read | null |
| 2025-06-09 | VIVAT: Virtuous Improving VAE Training through Artifact Mitigation | Lev Novitskiy et.al. | 2506.07863 | translate | read | null |
| 2025-06-09 | Evaluating Robustness in Latent Diffusion Models via Embedding Level Augmentation | Boris Martirosyan et.al. | 2506.07706 | translate | read | null |
| 2025-06-09 | Explore the vulnerability of black-box models via diffusion models | Jiacheng Shi et.al. | 2506.07590 | translate | read | null |
| 2025-06-09 | Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries | Haoxiang Wang et.al. | 2506.07555 | translate | read | null |
| 2025-06-09 | APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs | Bowen Liu et.al. | 2506.07542 | translate | read | null |
| 2025-06-06 | STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis | Jiatao Gu et.al. | 2506.06276 | translate | read | null |
| 2025-06-06 | GenIR: Generative Visual Feedback for Mental Image Retrieval | Diji Yang et.al. | 2506.06220 | translate | read | null |
| 2025-06-06 | Feedback Guidance of Diffusion Models | Koulischer Felix et.al. | 2506.06085 | translate | read | null |
| 2025-06-06 | Optimization-Free Universal Watermark Forgery with Regenerative Diffusion Models | Chaoyi Zhu et.al. | 2506.06018 | translate | read | null |
| 2025-06-06 | Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection | Yu Li et.al. | 2506.05872 | translate | read | null |
| 2025-06-06 | Microstructural Studies Using Generative Adversarial Network (GAN): a Case Study | Owais Ahmad et.al. | 2506.05860 | translate | read | null |
| 2025-06-06 | Peer-Ranked Precision: Creating a Foundational Dataset for Fine-Tuning Vision Models from DataSeeds’ Annotated Imagery | Sajjad Abdoli et.al. | 2506.05673 | translate | read | link |
| 2025-06-05 | UniRes: Universal Image Restoration for Complex Degradations | Mo Zhou et.al. | 2506.05599 | translate | read | null |
| 2025-06-05 | On Fitting Flow Models with Large Sinkhorn Couplings | Michal Klein et.al. | 2506.05526 | translate | read | null |
| 2025-06-05 | FocusDiff: Advancing Fine-Grained Text-Image Alignment for Autoregressive Visual Generation through RL | Kaihang Pan et.al. | 2506.05501 | translate | read | null |
| 2025-06-05 | ContentV: Efficient Training of Video Generation Models with Limited Compute | Wenfeng Lin et.al. | 2506.05343 | translate | read | null |
| 2025-06-05 | AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model | Pingyu Wu et.al. | 2506.05289 | translate | read | null |
| 2025-06-05 | Aligning Latent Spaces with Flow Priors | Yizhuo Li et.al. | 2506.05240 | translate | read | link |
| 2025-06-05 | PixCell: A generative foundation model for digital histopathology images | Srikar Yellapragada et.al. | 2506.05127 | translate | read | link |
| 2025-06-05 | Membership Inference Attacks on Sequence Models | Lorenzo Rossi et.al. | 2506.05126 | translate | read | null |
| 2025-06-05 | DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models | Revant Teotia et.al. | 2506.05108 | translate | read | null |
| 2025-06-05 | CzechLynx: A Dataset for Individual Identification and Pose Estimation of the Eurasian Lynx | Lukas Picek et.al. | 2506.04931 | translate | read | null |
| 2025-06-05 | Invisible Backdoor Triggers in Image Editing Model via Deep Watermarking | Yu-Feng Chen et.al. | 2506.04879 | translate | read | link |
| 2025-06-05 | Geological Field Restoration through the Lens of Image Inpainting | Vladislav Trifonov et.al. | 2506.04869 | translate | read | null |
| 2025-06-05 | Improving AI-generated music with user-guided training | Vishwa Mohan Singh et.al. | 2506.04852 | translate | read | null |
| 2025-06-04 | Image Editing As Programs with Diffusion Models | Yujia Hu et.al. | 2506.04158 | translate | read | link |
| 2025-06-04 | Towards Better Disentanglement in Non-Autoregressive Zero-Shot Expressive Voice Conversion | Seymanur Akti et.al. | 2506.04013 | translate | read | null |
| 2025-06-04 | RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors | Hicham Eddoubi et.al. | 2506.03988 | translate | read | link |
| 2025-06-04 | Improving Post-Processing for Quantitative Precipitation Forecasting Using Deep Learning: Learning Precipitation Physics from High-Resolution Observations | ChangJae Lee et.al. | 2506.03842 | translate | read | null |
| 2025-06-04 | Advancements in Artificial Intelligence Applications for Cardiovascular Disease Research | Yuanlin Mo et.al. | 2506.03698 | translate | read | null |
| 2025-06-04 | EmoArt: A Multidimensional Dataset for Emotion-Aware Artistic Generation | Cheng Zhang et.al. | 2506.03652 | translate | read | null |
| 2025-06-04 | ControlThinker: Unveiling Latent Semantics for Controllable Image Generation through Visual Reasoning | Feng Han et.al. | 2506.03596 | translate | read | link |
| 2025-06-03 | Robustness in Both Domains: CLIP Needs a Robust Text Encoder | Elias Abad Rocamora et.al. | 2506.03355 | translate | read | null |
| 2025-06-04 | UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | Bin Lin et.al. | 2506.03147 | translate | read | link |
| 2025-06-03 | SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation | Siqi Chen et.al. | 2506.03139 | translate | read | link |
| 2025-06-03 | Native-Resolution Image Synthesis | Zidong Wang et.al. | 2506.03131 | translate | read | link |
| 2025-06-03 | EDITOR: Effective and Interpretable Prompt Inversion for Text-to-Image Diffusion Models | Mingzhe Li et.al. | 2506.03067 | translate | read | null |
| 2025-06-03 | Rethinking Machine Unlearning in Image Generation Models | Renyang Liu et.al. | 2506.02761 | translate | read | null |
| 2025-06-03 | Solving Inverse Problems with FLAIR | Julius Erbach et.al. | 2506.02680 | translate | read | null |
| 2025-06-03 | ControlMambaIR: Conditional Controls with State-Space Model for Image Restoration | Cheng Yang et.al. | 2506.02633 | translate | read | null |
| 2025-06-03 | Synthetic Iris Image Databases and Identity Leakage: Risks and Mitigation Strategies | Ada Sawilska et.al. | 2506.02626 | translate | read | null |
| 2025-06-03 | Hyperspectral Image Generation with Unmixing Guided Diffusion Model | Shiyu Shen et.al. | 2506.02601 | translate | read | null |
| 2025-06-03 | DCI: Dual-Conditional Inversion for Boosting Diffusion-Based Image Editing | Zixiang Li et.al. | 2506.02560 | translate | read | null |
| 2025-06-03 | Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning | Stepan Shabalin et.al. | 2505.24360 | translate | read | null |