Image Generation - 2025-04
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-04-30 | 3D Stylization via Large Reconstruction Model | Ipek Oztas et.al. | 2504.21836 | translate | read | null |
| 2025-04-30 | Why Compress What You Can Generate? When GPT-4o Generation Ushers in Image Compression Fields | Yixin Gao et.al. | 2504.21814 | translate | read | null |
| 2025-04-30 | Latent Feature-Guided Conditional Diffusion for High-Fidelity Generative Image Semantic Communication | Zehao Chen et.al. | 2504.21577 | translate | read | null |
| 2025-04-30 | Wasserstein-Aitchison GAN for angular measures of multivariate extremes | Stéphane Lhaut et.al. | 2504.21438 | translate | read | null |
| 2025-04-30 | Sparse-to-Sparse Training of Diffusion Models | Inês Cardoso Oliveira et.al. | 2504.21380 | translate | read | null |
| 2025-04-30 | Nexus-Gen: A Unified Model for Image Understanding, Generation, and Editing | Hong Zhang et.al. | 2504.21356 | translate | read | link |
| 2025-04-30 | Text-Conditioned Diffusion Model for High-Fidelity Korean Font Generation | Abdul Sami et.al. | 2504.21325 | translate | read | null |
| 2025-04-30 | AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images | Yunhao Li et.al. | 2504.21308 | translate | read | null |
| 2025-04-30 | Can We Achieve Efficient Diffusion without Self-Attention? Distilling Self-Attention into Convolutions | ZiYi Dong et.al. | 2504.21292 | translate | read | null |
| 2025-04-29 | Artificial Intelligence for Personalized Prediction of Alzheimer’s Disease Progression: A Survey of Methods, Data Challenges, and Future Directions | Gulsah Hancerliogullari Koksalmis et.al. | 2504.21189 | translate | read | null |
| 2025-04-29 | YoChameleon: Personalized Vision and Language Generation | Thao Nguyen et.al. | 2504.20998 | translate | read | link |
| 2025-04-29 | Efficient Listener: Dyadic Facial Motion Synthesis via Action Diffusion | Zesheng Wang et.al. | 2504.20685 | translate | read | null |
| 2025-04-29 | DiffusionRIR: Room Impulse Response Interpolation using Diffusion Models | Sagi Della Torre et.al. | 2504.20625 | translate | read | null |
| 2025-04-29 | Digital Shielding for Cross-Domain Wi-Fi Signal Adaptation using Relativistic Average Generative Adversarial Network | Danilo Avola et.al. | 2504.20568 | translate | read | null |
| 2025-04-29 | Generate more than one child in your co-evolutionary semi-supervised learning GAN | Francisco Sedeño et.al. | 2504.20560 | translate | read | null |
| 2025-04-29 | PixelHacker: Image Inpainting with Structural and Semantic Consistency | Ziyang Xu et.al. | 2504.20438 | translate | read | link |
| 2025-04-29 | Inception: Jailbreak the Memory Mechanism of Text-to-Image Generation Systems | Shiqian Zhao et.al. | 2504.20376 | translate | read | null |
| 2025-04-29 | A Picture is Worth a Thousand Prompts? Efficacy of Iterative Human-Driven Prompt Refinement in Image Regeneration Tasks | Khoi Trinh et.al. | 2504.20340 | translate | read | null |
| 2025-04-28 | Physics-Informed Diffusion Models for SAR Ship Wake Generation from Text Prompts | Kamirul Kamirul et.al. | 2504.20241 | translate | read | null |
| 2025-04-28 | CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition | Quynh Phung et.al. | 2504.19894 | translate | read | null |
| 2025-04-28 | DeeCLIP: A Robust and Generalizable Transformer-Based Framework for Detecting AI-Generated Images | Mamadou Keita et.al. | 2504.19876 | translate | read | link |
| 2025-04-28 | RepText: Rendering Visual Text via Replicating | Haofan Wang et.al. | 2504.19724 | translate | read | link |
| 2025-04-28 | Transformation & Translation Occupancy Grid Mapping: 2-Dimensional Deep Learning Refined SLAM | Leon Davies et.al. | 2504.19654 | translate | read | null |
| 2025-04-28 | GAN-SLAM: Real-Time GAN Aided Floor Plan Creation Through SLAM | Leon Davies et.al. | 2504.19653 | translate | read | null |
| 2025-04-28 | Image Generation Method Based on Heat Diffusion Models | Pengfei Zhang et.al. | 2504.19600 | translate | read | null |
| 2025-04-28 | WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution | Pietro Bongini et.al. | 2504.19595 | translate | read | null |
| 2025-04-28 | GenPTW: In-Generation Image Watermarking for Provenance Tracing and Tamper Localization | Zhenliang Gan et.al. | 2504.19567 | translate | read | null |
| 2025-04-28 | Masked Language Prompting for Generative Data Augmentation in Few-shot Fashion Style Recognition | Yuki Hirakawa et.al. | 2504.19455 | translate | read | null |
| 2025-04-27 | Flow Along the K-Amplitude for Generative Modeling | Weitao Du et.al. | 2504.19353 | translate | read | null |
| 2025-04-25 | Intelligent Attacks and Defense Methods in Federated Learning-enabled Energy-Efficient Wireless Networks | Han Zhang et.al. | 2504.18519 | translate | read | null |
| 2025-04-25 | HepatoGEN: Generating Hepatobiliary Phase MRI with Perceptual and Adversarial Models | Jens Hooge et.al. | 2504.18405 | translate | read | null |
| 2025-04-25 | COCO-Inpaint: A Benchmark for Image Inpainting Detection and Manipulation Localization | Haozhen Yan et.al. | 2504.18361 | translate | read | null |
| 2025-04-25 | TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation | Shintaro Ozaki et.al. | 2504.18269 | translate | read | null |
| 2025-04-25 | Optimizing Multi-Round Enhanced Training in Diffusion Models for Improved Preference Understanding | Kun Li et.al. | 2504.18204 | translate | read | null |
| 2025-04-25 | Diffusion-Driven Universal Model Inversion Attack for Face Recognition | Hanrui Wang et.al. | 2504.18015 | translate | read | null |
| 2025-04-24 | CANet: ChronoAdaptive Network for Enhanced Long-Term Time Series Forecasting under Non-Stationarity | Mert Sonmezer et.al. | 2504.17913 | translate | read | null |
| 2025-04-24 | Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models | Xu Ma et.al. | 2504.17789 | translate | read | null |
| 2025-04-24 | Generative Fields: Uncovering Hierarchical Feature Control for StyleGAN via Inverted Receptive Fields | Zhuo He et.al. | 2504.17712 | translate | read | null |
| 2025-04-24 | STCL:Curriculum learning Strategies for deep learning image steganography models | Fengchun Liu et.al. | 2504.17609 | translate | read | null |
| 2025-04-24 | A Machine Learning Approach for Denoising and Upsampling HRTFs | Xuyi Hu et.al. | 2504.17586 | translate | read | null |
| 2025-04-24 | Text-to-Image Alignment in Denoising-Based Models through Step Selection | Paul Grimal et.al. | 2504.17525 | translate | read | null |
| 2025-04-24 | ESDiff: Encoding Strategy-inspired Diffusion Model with Few-shot Learning for Color Image Inpainting | Junyan Zhang et.al. | 2504.17524 | translate | read | null |
| 2025-04-24 | RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation | Aviv Slobodkin et.al. | 2504.17502 | translate | read | null |
| 2025-04-24 | StereoMamba: Real-time and Robust Intraoperative Stereo Disparity Estimation via Long-range Spatial Dependencies | Xu Wang et.al. | 2504.17401 | translate | read | null |
| 2025-04-24 | DRC: Enhancing Personalized Image Generation via Disentangled Representation Composition | Yiyan Xu et.al. | 2504.17349 | translate | read | null |
| 2025-04-24 | Physics-based super-resolved simulation of 3D elastic wave propagation adopting scalable Diffusion Transformer | Hugo Gabrielidis et.al. | 2504.17308 | translate | read | null |
| 2025-04-23 | High-Quality Cloud-Free Optical Image Synthesis Using Multi-Temporal SAR and Contaminated Optical Data | Chenxi Duan et.al. | 2504.16870 | translate | read | null |
| 2025-04-23 | A Comprehensive Survey of Synthetic Tabular Data Generation | Ruxue Shi et.al. | 2504.16506 | translate | read | null |
| 2025-04-23 | CLPSTNet: A Progressive Multi-Scale Convolutional Steganography Model Integrating Curriculum Learning | Fengchun Liu et.al. | 2504.16364 | translate | read | null |
| 2025-04-22 | Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching | Junn Yong Loo et.al. | 2504.16262 | translate | read | null |
| 2025-04-22 | Survey of Video Diffusion Models: Foundations, Implementations, and Applications | Yimu Wang et.al. | 2504.16081 | translate | read | null |
| 2025-04-22 | From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning | Le Zhuo et.al. | 2504.16080 | translate | read | link |
| 2025-04-22 | Boosting Generative Image Modeling via Joint Image-Feature Synthesis | Theodoros Kouzelis et.al. | 2504.16064 | translate | read | link |
| 2025-04-22 | FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation | Zebin Yao et.al. | 2504.15958 | translate | read | link |
| 2025-04-22 | New Recipe for Semi-supervised Community Detection: Clique Annealing under Crystallization Kinetics | Ling Cheng et.al. | 2504.15927 | translate | read | null |
| 2025-04-22 | DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers | Xuyang Zhong et.al. | 2504.15827 | translate | read | null |
| 2025-04-22 | Satellite to GroundScape – Large-scale Consistent Ground View Generation from Satellite Views | Ningli Xu et.al. | 2504.15786 | translate | read | null |
| 2025-04-21 | Application of Deep Generative Models for Anomaly Detection in Complex Financial Transactions | Tengda Tang et.al. | 2504.15491 | translate | read | null |
| 2025-04-21 | Emergence and Evolution of Interpretable Concepts in Diffusion Models | Berk Tinaz et.al. | 2504.15473 | translate | read | null |
| 2025-04-21 | StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians | Cailin Zhuang et.al. | 2504.15281 | translate | read | null |
| 2025-04-22 | LACE: Controlled Image Prompting and Iterative Refinement with GenAI for Professional Visual Art Creators | Yenkai Huang et.al. | 2504.15189 | translate | read | null |
| 2025-04-21 | Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration | Junyuan Deng et.al. | 2504.15159 | translate | read | null |
| 2025-04-21 | GIFDL: Generated Image Fluctuation Distortion Learning for Enhancing Steganographic Security | Xiangkun Wang et.al. | 2504.15139 | translate | read | null |
| 2025-04-21 | Fast-Slow Co-advancing Optimizer: Toward Harmonious Adversarial Training of GAN | Lin Wang et.al. | 2504.15099 | translate | read | null |
| 2025-04-22 | VistaDepth: Frequency Modulation With Bias Reweighting For Enhanced Long-Range Depth Estimation | Mingxia Zhan et.al. | 2504.15095 | translate | read | null |
| 2025-04-21 | TWIG: Two-Step Image Generation using Segmentation Masks in Diffusion Models | Mazharul Islam Rakib et.al. | 2504.14933 | translate | read | null |
| 2025-04-21 | Twin Co-Adaptive Dialogue for Progressive Image Generation | Jianhui Wang et.al. | 2504.14868 | translate | read | null |
| 2025-04-21 | LACE: Exploring Turn-Taking and Parallel Interaction Modes in Human-AI Co-Creation for Iterative Image Generation | YenKai Huang et.al. | 2504.14827 | translate | read | null |
| 2025-04-21 | What Lurks Within? Concept Auditing for Shared Diffusion Models at Scale | Xiaoyong Yuan et.al. | 2504.14815 | translate | read | null |
| 2025-04-18 | Collective Learning Mechanism based Optimal Transport Generative Adversarial Network for Non-parallel Voice Conversion | Sandipan Dhar et.al. | 2504.13791 | translate | read | null |
| 2025-04-18 | MLEP: Multi-granularity Local Entropy Patterns for Universal AI-generated Image Detection | Lin Yuan et.al. | 2504.13726 | translate | read | null |
| 2025-04-18 | SupResDiffGAN a new approach for the Super-Resolution task | Dawid Kopeć et.al. | 2504.13622 | translate | read | null |
| 2025-04-18 | U-Shape Mamba: State Space Model for faster diffusion | Alex Ergasti et.al. | 2504.13499 | translate | read | link |
| 2025-04-18 | Early Timestep Zero-Shot Candidate Selection for Instruction-Guided Image Editing | Joowon Kim et.al. | 2504.13490 | translate | read | null |
| 2025-04-18 | POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation | Evans Xu Han et.al. | 2504.13392 | translate | read | null |
| 2025-04-17 | SMPL-GPTexture: Dual-View 3D Human Texture Estimation using Text-to-Image Generation Models | Mingxiao Tu et.al. | 2504.13378 | translate | read | null |
| 2025-04-17 | Personalized Text-to-Image Generation with Auto-Regressive Models | Kaiyue Sun et.al. | 2504.13162 | translate | read | link |
| 2025-04-17 | Science-T2I: Addressing Scientific Illusions in Image Synthesis | Jialuo Li et.al. | 2504.13129 | translate | read | null |
| 2025-04-17 | Hadamard product in deep learning: Introduction, Advances and Challenges | Grigorios G Chrysos et.al. | 2504.13112 | translate | read | null |
| 2025-04-17 | HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation | Wenqi Dong et.al. | 2504.13072 | translate | read | null |
| 2025-04-17 | ArtistAuditor: Auditing Artist Style Pirate in Text-to-Image Generation Models | Linkang Du et.al. | 2504.13061 | translate | read | link |
| 2025-04-17 | RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins | Yao Mu et.al. | 2504.13059 | translate | read | null |
| 2025-04-17 | High-Fidelity Image Inpainting with Multimodal Guided GAN Inversion | Libo Zhang et.al. | 2504.12844 | translate | read | null |
| 2025-04-17 | Supporting Urban Low-Altitude Economy: Channel Gain Map Inference Based on 3D Conditional GAN | Yonghao Wang et.al. | 2504.12794 | translate | read | null |
| 2025-04-17 | Privacy Protection Against Personalized Text-to-Image Synthesis via Cross-image Consistency Constraints | Guanyu Wang et.al. | 2504.12747 | translate | read | null |
| 2025-04-17 | SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Qianqian Sun et.al. | 2504.12704 | translate | read | link |
| 2025-04-16 | Beyond Reconstruction: A Physics Based Neural Deferred Shader for Photo-realistic Rendering | Zhuo He et.al. | 2504.12273 | translate | read | null |
| 2025-04-16 | SIDME: Self-supervised Image Demoiréing via Masked Encoder-Decoder Reconstruction | Xia Wang et.al. | 2504.12245 | translate | read | null |
| 2025-04-16 | Cobra: Efficient Line Art COlorization with BRoAder References | Junhao Zhuang et.al. | 2504.12240 | translate | read | link |
| 2025-04-16 | Anti-Aesthetics: Protecting Facial Privacy against Customized Text-to-Image Synthesis | Songping Wang et.al. | 2504.12129 | translate | read | null |
| 2025-04-16 | Instruction-augmented Multimodal Alignment for Image-Text and Element Matching | Xinli Yue et.al. | 2504.12018 | translate | read | null |
| 2025-04-16 | Novel-view X-ray Projection Synthesis through Geometry-Integrated Deep Learning | Daiqi Liu et.al. | 2504.11953 | translate | read | null |
| 2025-04-16 | Mind2Matter: Creating 3D Models from EEG Signals | Xia Deng et.al. | 2504.11936 | translate | read | null |
| 2025-04-16 | Synthetic Data for Blood Vessel Network Extraction | Joël Mathys et.al. | 2504.11858 | translate | read | null |
| 2025-04-16 | ACE: Attentional Concept Erasure in Diffusion Models | Finn Carter et.al. | 2504.11850 | translate | read | null |
| 2025-04-16 | HyperKING: Quantum-Classical Generative Adversarial Networks for Hyperspectral Image Restoration | Chia-Hsiang Lin et.al. | 2504.11782 | translate | read | null |
| 2025-04-15 | Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception | Ziqi Pang et.al. | 2504.11457 | translate | read | link |
| 2025-04-15 | ADT: Tuning Diffusion Models with Adversarial Supervision | Dazhong Shen et.al. | 2504.11423 | translate | read | null |
| 2025-04-15 | Omni$^2$: Unifying Omnidirectional Image Generation and Editing in an Omni Model | Liu Yang et.al. | 2504.11379 | translate | read | link |
| 2025-04-15 | Seedream 3.0 Technical Report | Yu Gao et.al. | 2504.11346 | translate | read | null |
| 2025-04-15 | Using LLMs as prompt modifier to avoid biases in AI image generators | René Peinl et.al. | 2504.11104 | translate | read | null |
| 2025-04-15 | UKDM: Underwater keypoint detection and matching using underwater image enhancement techniques | Pedro Diaz-Garcia et.al. | 2504.11063 | translate | read | null |
| 2025-04-15 | AnimeDL-2M: Million-Scale AI-Generated Anime Image Detection and Localization in Diffusion Era | Chenyang Zhu et.al. | 2504.11015 | translate | read | null |
| 2025-04-15 | Bringing together invertible UNets with invertible attention modules for memory-efficient diffusion models | Karan Jain et.al. | 2504.10883 | translate | read | null |
| 2025-04-15 | IlluSign: Illustrating Sign Language Videos by Leveraging the Attention Mechanism | Janna Bruner et.al. | 2504.10822 | translate | read | null |
| 2025-04-15 | Generative and Explainable AI for High-Dimensional Channel Estimation | Nghia Thinh Nguyen et.al. | 2504.10775 | translate | read | link |
| 2025-04-14 | Art3D: Training-Free 3D Generation from Flat-Colored Illustration | Xiaoyan Cong et.al. | 2504.10466 | translate | read | null |
| 2025-04-14 | Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing | Taihang Hu et.al. | 2504.10434 | translate | read | null |
| 2025-04-14 | InstructEngine: Instruction-driven Text-to-Image Alignment | Xingyu Lu et.al. | 2504.10329 | translate | read | null |
| 2025-04-14 | Trade-offs in Privacy-Preserving Eye Tracking through Iris Obfuscation: A Benchmarking Study | Mengdi Wang et.al. | 2504.10267 | translate | read | link |
| 2025-04-14 | VibrantLeaves: A principled parametric image generator for training deep restoration models | Raphael Achddou et.al. | 2504.10201 | translate | read | null |
| 2025-04-14 | GeoUni: A Unified Model for Generating Geometry Diagrams, Problems and Problem Solutions | Jo-Ku Cheng et.al. | 2504.10146 | translate | read | link |
| 2025-04-14 | Masked Autoencoder Self Pre-Training for Defect Detection in Microelectronics | Nikolai Röhrich et.al. | 2504.10021 | translate | read | null |
| 2025-04-14 | Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes | Huijie Liu et.al. | 2504.09948 | translate | read | null |
| 2025-04-14 | EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise | Chao Liu et.al. | 2504.09789 | translate | read | null |
| 2025-04-13 | Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training | Lexington Whalen et.al. | 2504.09606 | translate | read | null |
| 2025-04-11 | GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation | Tianwei Xiong et.al. | 2504.08736 | translate | read | link |
| 2025-04-11 | End-to-End Demonstration of Quantum Generative Adversarial Networks for Steel Microstructure Image Augmentation on a Trapped-Ion Quantum Computer | Samwel Sekwao et.al. | 2504.08728 | translate | read | null |
| 2025-04-11 | Generating Fine Details of Entity Interactions | Xinyi Gu et.al. | 2504.08714 | translate | read | null |
| 2025-04-11 | Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging | Gabriele Lozupone et.al. | 2504.08635 | translate | read | link |
| 2025-04-11 | Discriminator-Free Direct Preference Optimization for Video Diffusion | Haoran Cheng et.al. | 2504.08542 | translate | read | null |
| 2025-04-11 | On the Design of Diffusion-based Neural Speech Codecs | Pietro Foti et.al. | 2504.08470 | translate | read | null |
| 2025-04-11 | Muon-Accelerated Attention Distillation for Real-Time Edge Synthesis via Optimized Latent Diffusion | Weiye Chen et.al. | 2504.08451 | translate | read | null |
| 2025-04-11 | MixDiT: Accelerating Image Diffusion Transformer Inference with Mixed-Precision MX Quantization | Daeun Kim et.al. | 2504.08398 | translate | read | null |
| 2025-04-11 | LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs | Jiarui Wang et.al. | 2504.08358 | translate | read | link |
| 2025-04-11 | Geometric Consistency Refinement for Single Image Novel View Synthesis via Test-Time Adaptation of Diffusion Models | Josef Bengtson et.al. | 2504.08348 | translate | read | null |
| 2025-04-10 | PixelFlow: Pixel-Space Generative Models with Flow | Shoufa Chen et.al. | 2504.07963 | translate | read | link |
| 2025-04-10 | VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning | Zhong-Yu Li et.al. | 2504.07960 | translate | read | link |
| 2025-04-10 | DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows | Mashrur M. Morshed et.al. | 2504.07894 | translate | read | null |
| 2025-04-10 | Towards Sustainable Creativity Support: An Exploratory Study on Prompt Based Image Generation | Daniel Hove Paludan et.al. | 2504.07879 | translate | read | null |
| 2025-04-10 | Conformalized Generative Bayesian Imaging: An Uncertainty Quantification Framework for Computational Imaging | Canberk Ekmekci et.al. | 2504.07696 | translate | read | null |
| 2025-04-10 | FlexIP: Dynamic Control of Preservation and Personality for Customized Image Generation | Linyan Huang et.al. | 2504.07405 | translate | read | null |
| 2025-04-10 | ID-Booth: Identity-consistent Face Generation with Diffusion Models | Darian Tomašević et.al. | 2504.07392 | translate | read | link |
| 2025-04-10 | Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction | Qingchao Jiang et.al. | 2504.07382 | translate | read | link |
| 2025-04-09 | OmniCaptioner: One Captioner to Rule Them All | Yiting Lu et.al. | 2504.07089 | translate | read | link |
| 2025-04-09 | A Unified Agentic Framework for Evaluating Conditional Image Generation | Jifang Wang et.al. | 2504.07046 | translate | read | link |
| 2025-04-09 | MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs | Jiawei Mao et.al. | 2504.06897 | translate | read | null |
| 2025-04-09 | EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation | Diljeet Jagpal et.al. | 2504.06861 | translate | read | null |
| 2025-04-09 | DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation | Wangbo Zhao et.al. | 2504.06803 | translate | read | null |
| 2025-04-09 | A Meaningful Perturbation Metric for Evaluating Explainability Methods | Danielle Cohen et.al. | 2504.06800 | translate | read | null |
| 2025-04-10 | Compass Control: Multi Object Orientation Control for Text-to-Image Generation | Rishubh Parihar et.al. | 2504.06752 | translate | read | null |
| 2025-04-09 | Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception | Ruotian Peng et.al. | 2504.06666 | translate | read | link |
| 2025-04-09 | Collision avoidance from monocular vision trained with novel view synthesis | Valentin Tordjman–Levavasseur et.al. | 2504.06651 | translate | read | null |
| 2025-04-09 | PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering | Yifan Gao et.al. | 2504.06632 | translate | read | null |
| 2025-04-08 | Transfer between Modalities with MetaQueries | Xichen Pan et.al. | 2504.06256 | translate | read | null |
| 2025-04-08 | HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance | Jiazi Bu et.al. | 2504.06232 | translate | read | link |
| 2025-04-08 | A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model | Jihun Park et.al. | 2504.06144 | translate | read | null |
| 2025-04-08 | Explainable AI for building energy retrofitting under data scarcity | Panagiota Rempi et.al. | 2504.06055 | translate | read | null |
| 2025-04-08 | An Empirical Study of GPT-4o Image Generation Capabilities | Sixiang Chen et.al. | 2504.05979 | translate | read | link |
| 2025-04-08 | CKGAN: Training Generative Adversarial Networks Using Characteristic Kernel Integral Probability Metrics | Kuntian Zhang et.al. | 2504.05945 | translate | read | null |
| 2025-04-08 | Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking | Junxi Chen et.al. | 2504.05838 | translate | read | null |
| 2025-04-08 | Parasite: A Steganography-based Backdoor Attack Framework for Diffusion Models | Jiahao Chen et.al. | 2504.05815 | translate | read | null |
| 2025-04-08 | Storybooth: Training-free Multi-Subject Consistency for Improved Visual Storytelling | Jaskirat Singh et.al. | 2504.05800 | translate | read | null |
| 2025-04-07 | Generative Adversarial Networks with Limited Data: A Survey and Benchmarking | Omar De Mitri et.al. | 2504.05456 | translate | read | null |
| 2025-04-07 | Gaussian Mixture Flow Matching Models | Hansheng Chen et.al. | 2504.05304 | translate | read | link |
| 2025-04-07 | Fine tuning generative adversarial networks with universal force fields: application to two-dimensional topological insulators | Alexander C. Tyner et.al. | 2504.04940 | translate | read | null |
| 2025-04-07 | Imagining the Far East: Exploring Perceived Biases in AI-Generated Images of East Asian Women | Xingyu Lan et.al. | 2504.04865 | translate | read | null |
| 2025-04-07 | AnyArtisticGlyph: Multilingual Controllable Artistic Glyph Generation | Xiongbo Lu et.al. | 2504.04743 | translate | read | link |
| 2025-04-07 | Bridging Knowledge Gap Between Image Inpainting and Large-Area Visible Watermark Removal | Yicheng Leng et.al. | 2504.04687 | translate | read | null |
| 2025-04-06 | Your Image Generator Is Your New Private Dataset | Nicolo Resmini et.al. | 2504.04582 | translate | read | null |
| 2025-04-06 | Attributed Synthetic Data Generation for Zero-shot Domain-specific Image Classification | Shijian Wang et.al. | 2504.04510 | translate | read | null |
| 2025-04-06 | Thermoxels: a voxel-based method to generate simulation-ready 3D thermal models | Etienne Chassaing et.al. | 2504.04448 | translate | read | null |
| 2025-04-06 | FluentLip: A Phonemes-Based Two-stage Approach for Audio-Driven Lip Synthesis with Optical Flow Consistency | Shiyan Liu et.al. | 2504.04427 | translate | read | null |
| 2025-04-06 | UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding | Yang Jiao et.al. | 2504.04423 | translate | read | link |
| 2025-04-04 | MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models | Wulin Xie et.al. | 2504.03641 | translate | read | link |
| 2025-04-04 | Dynamic Importance in Diffusion U-Net for Enhanced Image Synthesis | Xi Wang et.al. | 2504.03471 | translate | read | null |
| 2025-04-04 | FLAIRBrainSeg: Fine-grained brain segmentation using FLAIR MRI only | Edern Le Bot et.al. | 2504.03376 | translate | read | null |
| 2025-04-04 | QIRL: Boosting Visual Question Answering via Optimized Question-Image Relation Learning | Quanxing Xu et.al. | 2504.03337 | translate | read | null |
| 2025-04-03 | VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning | Xianwei Zhuang et.al. | 2504.02949 | translate | read | link |
| 2025-04-03 | Bias in Large Language Models Across Clinical Applications: A Systematic Review | Thanathip Suenghataiphorn et.al. | 2504.02917 | translate | read | null |
| 2025-04-03 | F-ViTA: Foundation Model Guided Visible to Thermal Translation | Jay N. Paranjape et.al. | 2504.02801 | translate | read | link |
| 2025-04-03 | GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation | Zhiyuan Yan et.al. | 2504.02782 | translate | read | link |
| 2025-04-03 | RoSMM: A Robust and Secure Multi-Modal Watermarking Framework for Diffusion Models | ZhongLi Fang et.al. | 2504.02640 | translate | read | null |
| 2025-04-03 | Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation | Jiwoo Chung et.al. | 2504.02612 | translate | read | link |
| 2025-04-03 | AC-LoRA: Auto Component LoRA for Personalized Artistic Style Image Generation | Zhipu Cui et.al. | 2504.02231 | translate | read | null |
| 2025-04-02 | Foreground Focus: Enhancing Coherence and Fidelity in Camouflaged Image Generation | Pei-Chi Chen et.al. | 2504.02180 | translate | read | null |
| 2025-04-02 | Neural Style Transfer for Synthesising a Dataset of Ancient Egyptian Hieroglyphs | Lewis Matheson Creed et.al. | 2504.02163 | translate | read | null |
| 2025-04-02 | Less-to-More Generalization: Unlocking More Controllability by In-Context Generation | Shaojin Wu et.al. | 2504.02160 | translate | read | link |
| 2025-04-03 | ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement | Runhui Huang et.al. | 2504.01934 | translate | read | link |
| 2025-04-02 | FineLIP: Extending CLIP’s Reach via Fine-Grained Alignment with Longer Text Inputs | Mothilal Asokan et.al. | 2504.01916 | translate | read | link |
| 2025-04-02 | A$^\text{T}$A: Adaptive Transformation Agent for Text-Guided Subject-Position Variable Background Inpainting | Yizhe Tang et.al. | 2504.01603 | translate | read | null |
| 2025-04-02 | Instance Migration Diffusion for Nuclear Instance Segmentation in Pathology | Lirui Qi et.al. | 2504.01577 | translate | read | null |
| 2025-04-02 | Pro-DG: Procedural Diffusion Guidance for Architectural Facade Generation | Aleksander Plocharski et.al. | 2504.01571 | translate | read | null |
| 2025-04-02 | Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis | Zixuan Wang et.al. | 2504.01515 | translate | read | null |
| 2025-04-02 | High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model | Yiyang Shen et.al. | 2504.01512 | translate | read | null |
| 2025-04-02 | From Easy to Hard: Building a Shortcut for Differentially Private Image Synthesis | Kecen Li et.al. | 2504.01395 | translate | read | link |
| 2025-04-01 | rPPG-SysDiaGAN: Systolic-Diastolic Feature Localization in rPPG Using Generative Adversarial Network with Multi-Domain Discriminator | Banafsheh Adami et.al. | 2504.01220 | translate | read | null |
| 2025-04-01 | Prompting Forgetting: Unlearning in GANs via Textual Guidance | Piyush Nagasubramaniam et.al. | 2504.01218 | translate | read | null |
| 2025-04-01 | Time-Series Forecasting via Topological Information Supervised Framework with Efficient Topological Feature Learning | ZiXin Lin et.al. | 2503.23757 | translate | read | null |