Image Generation - 2025-01
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code | |
|---|---|---|---|---|---|---|---|
| 2025-01-31 | Application of Generative Adversarial Network (GAN) for Synthetic Training Data Creation to improve performance of ANN Classifier for extracting Built-Up pixels from Landsat Satellite Imagery | Amritendu Mukherjee et.al. | 2501.19283 | translate | read | null | |
| 2025-01-31 | Ambient Denoising Diffusion Generative Adversarial Networks for Establishing Stochastic Object Models from Noisy Image Data | Xichen Xu et.al. | 2501.19094 | translate | read | null | |
| 2025-01-31 | Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations | Dahye Kim et.al. | 2501.19066 | translate | read | link | |
| 2025-01-31 | BCAT: A Block Causal Transformer for PDE Foundation Models for Fluid Dynamics | Yuxuan Liu et.al. | 2501.18972 | translate | read | null | |
| 2025-01-31 | Distorting Embedding Space for Safety: A Defense Mechanism for Adversarially Robust Diffusion Models | Jaesin Ahn et.al. | 2501.18877 | translate | read | link | |
| 2025-01-31 | REG: Rectified Gradient Guidance for Conditional Diffusion Models | Zhengqi Gao et.al. | 2501.18865 | translate | read | null | |
| 2025-01-30 | High-Accuracy ECG Image Interpretation using Parameter-Efficient LoRA Fine-Tuning with Multimodal LLaMA 3.2 | Nandakishor M et.al. | 2501.18670 | translate | read | null | |
| 2025-01-30 | Diffusion Autoencoders are Scalable Image Tokenizers | Yinbo Chen et.al. | 2501.18593 | translate | read | null | |
| 2025-01-30 | CGAN-Based Framework for Meson Mass and Width Prediction | S. Rostami et.al. | 2501.18562 | translate | read | null | |
| 2025-01-30 | SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Enze Xie et.al. | 2501.18427 | translate | read | null | |
| 2025-01-30 | MatIR: A Hybrid Mamba-Transformer Image Restoration Model | Juan Wen et.al. | 2501.18401 | translate | read | null | |
| 2025-01-30 | Simulation of microstructures and machine learning | Katja Schladitz et.al. | 2501.18313 | translate | read | null | |
| 2025-01-30 | LLMs can see and hear without any training | Kumar Ashutosh et.al. | 2501.18096 | translate | read | link | |
| 2025-01-29 | Generative AI for Vision: A Comprehensive Study of Frameworks and Applications | Fouad Bousetouane et.al. | 2501.18033 | translate | read | null | |
| 2025-01-29 | Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | Xiaokang Chen et.al. | 2501.17811 | translate | read | link | |
| 2025-01-29 | A Framework for Generating Realistic Synthetic Tabular Data in a Randomized Controlled Trial Setting | Niki Z. Petrakos et.al. | 2501.17719 | translate | read | null | |
| 2025-01-29 | Segmentation-Aware Generative Reinforcement Network (GRN) for Tissue Layer Segmentation in 3-D Ultrasound Images for Chronic Low-back Pain (cLBP) Assessment | Zixue Zeng et.al. | 2501.17690 | translate | read | null | |
| 2025-01-29 | Trustworthy image-to-image translation: evaluating uncertainty calibration in unpaired training scenarios | Ciaran Bench et.al. | 2501.17570 | translate | read | null | |
| 2025-01-28 | Text-to-Image Generation for Vocabulary Learning Using the Keyword Method | Nuwan T. Attygalle et.al. | 2501.17099 | translate | read | null | |
| 2025-01-28 | MAUCell: An Adaptive Multi-Attention Framework for Video Frame Prediction | Shreyam Gupta et.al. | 2501.16997 | translate | read | null | |
| 2025-01-28 | DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation | Chenguo Lin et.al. | 2501.16764 | translate | read | link | |
| 2025-01-29 | Polyp-Gen: Realistic and Diverse Polyp Image Generation for Endoscopic Dataset Expansion | Shengyuan Liu et.al. | 2501.16679 | translate | read | link | |
| 2025-01-28 | Variational Schrödinger Momentum Diffusion | Kevin Rojas et.al. | 2501.16675 | translate | read | null | |
| 2025-01-27 | LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation | Farzad Farhadzadeh et.al. | 2501.16559 | translate | read | null | |
| 2025-01-27 | Towards Robust Stability Prediction in Smart Grids: GAN-based Approach under Data Constraints and Adversarial Challenges | Emad Efatinasab et.al. | 2501.16490 | translate | read | null | |
| 2025-01-27 | RelightVid: Temporal-Consistent Diffusion Model for Video Relighting | Ye Fang et.al. | 2501.16330 | translate | read | link | |
| 2025-01-27 | MetaDecorator: Generating Immersive Virtual Tours through Multimodality | Shuang Xie et.al. | 2501.16164 | translate | read | null | |
| 2025-01-27 | Generating Spatial Synthetic Populations Using Wasserstein Generative Adversarial Network: A Case Study with EU-SILC Data for Helsinki and Thessaloniki | Vanja Falck et.al. | 2501.16080 | translate | read | null | |
| 2025-01-27 | Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation | Adil Kaan Akan et.al. | 2501.15878 | translate | read | null | |
| 2025-01-27 | Can Location Embeddings Enhance Super-Resolution of Satellite Imagery? | Daniel Panangian et.al. | 2501.15847 | translate | read | null | |
| 2025-01-27 | Autonomous Horizon-based Asteroid Navigation With Observability-constrained Maneuvers | Aditya Arjun Anibha et.al. | 2501.15806 | translate | read | null | |
| 2025-01-27 | Do Existing Testing Tools Really Uncover Gender Bias in Text-to-Image Models? | Yunbo Lyu et.al. | 2501.15775 | translate | read | null | |
| 2025-01-26 | Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting | Yuxin Zhang et.al. | 2501.15641 | translate | read | link | |
| 2025-01-26 | Comparative clinical evaluation of “memory-efficient” synthetic 3D generative adversarial networks (GAN) head-to-head to state of art: results on computed tomography of the chest | Mahshid Shiri et.al. | 2501.15572 | translate | read | null | |
| 2025-01-26 | SQ-DM: Accelerating Diffusion Models with Aggressive Quantization and Temporal Sparsity | Zichen Fan et.al. | 2501.15448 | translate | read | null | |
| 2025-01-24 | Towards Scalable Topological Regularizers | Hiu-Tung Wong et.al. | 2501.14641 | translate | read | null | |
| 2025-01-24 | Training-Free Style and Content Transfer by Leveraging U-Net Skip Connections in Stable Diffusion 2.* | Ludovica Schaerf et.al. | 2501.14524 | translate | read | null | |
| 2025-01-24 | PAID: A Framework of Product-Centric Advertising Image Design | Hongyu Chen et.al. | 2501.14316 | translate | read | null | |
| 2025-01-24 | CDI: Blind Image Restoration Fidelity Evaluation based on Consistency with Degraded Image | Xiaojun Tang et.al. | 2501.14264 | translate | read | null | |
| 2025-01-24 | VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking | Runyi Hu et.al. | 2501.14195 | translate | read | link | |
| 2025-01-24 | Fully Guided Neural Schrödinger bridge for Brain MR image synthesis | Hanyeol Yang et.al. | 2501.14171 | translate | read | null | |
| 2025-01-23 | LLM-guided Instance-level Image Manipulation with Diffusion U-Net Cross-Attention Maps | Andrey Palaev et.al. | 2501.14046 | translate | read | link | |
| 2025-01-23 | Can We Generate Images with CoT? Let’s Verify and Reinforce Image Generation Step by Step | Ziyu Guo et.al. | 2501.13926 | translate | read | link | |
| 2025-01-23 | IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models | Jiayi Lei et.al. | 2501.13920 | translate | read | link | |
| 2025-01-23 | Binary Diffusion Probabilistic Model | Vitaliy Kinakh et.al. | 2501.13915 | translate | read | null | |
| 2025-01-23 | Generating Realistic Forehead-Creases for User Verification via Conditioned Piecewise Polynomial Curves | Abhishek Tandon et.al. | 2501.13889 | translate | read | link | |
| 2025-01-23 | Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference | Shuqi Dai et.al. | 2501.13870 | translate | read | null | |
| 2025-01-23 | The Lock Generative Adversarial Network for Medical Waveform Anomaly Detection | Wenjie Xu et.al. | 2501.13858 | translate | read | null | |
| 2025-01-23 | Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing | Hao Zhang et.al. | 2501.13831 | translate | read | null | |
| 2025-01-23 | PhotoGAN: Generative Adversarial Neural Network Acceleration with Silicon Photonics | Tharini Suresh et.al. | 2501.13828 | translate | read | null | |
| 2025-01-23 | A Mutual Information Perspective on Multiple Latent Variable Generative Models for Positive View Generation | Dario Serez et.al. | 2501.13718 | translate | read | link | |
| 2025-01-23 | One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt | Tao Liu et.al. | 2501.13554 | translate | read | link | |
| 2025-01-22 | Accelerate High-Quality Diffusion Models with Inner Loop Feedback | Matthew Gwilliam et.al. | 2501.13107 | translate | read | null | |
| 2025-01-22 | Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation | Akshay Krishnan et.al. | 2501.13087 | translate | read | null | |
| 2025-01-22 | On the Use of WGANs for Super Resolution in Dark-Matter Simulations | John Brennan et.al. | 2501.13056 | translate | read | null | |
| 2025-01-22 | LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation | Jiahao Wang et.al. | 2501.12976 | translate | read | null | |
| 2025-01-22 | PreciseCam: Precise Camera Control for Text-to-Image Generation | Edurne Bernal-Berdun et.al. | 2501.12910 | translate | read | null | |
| 2025-01-22 | Certified Guidance for Planning with Deep Generative Models | Francesco Giacomarra et.al. | 2501.12815 | translate | read | null | |
| 2025-01-22 | T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation | Lijun Li et.al. | 2501.12612 | translate | read | link | |
| 2025-01-21 | Bidirectional Brain Image Translation using Transfer Learning from Generic Pre-trained Models | Fatima Haimour et.al. | 2501.12488 | translate | read | null | |
| 2025-01-22 | GPS as a Control Signal for Image Generation | Chao Feng et.al. | 2501.12390 | translate | read | null | |
| 2025-01-21 | Parallel Sequence Modeling via Generalized Spatial Propagation Network | Hongjun Wang et.al. | 2501.12381 | translate | read | null | |
| 2025-01-21 | Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists | Thomas F. Eisenmann et.al. | 2501.12374 | translate | read | link | |
| 2025-01-21 | VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model | Xianwei Zhuang et.al. | 2501.12327 | translate | read | link | |
| 2025-01-21 | Regressor-Guided Image Editing Regulates Emotional Response to Reduce Online Engagement | Christoph Gebhardt et.al. | 2501.12289 | translate | read | null | |
| 2025-01-21 | VipDiff: Towards Coherent and Diverse Video Inpainting via Training-free Denoising Diffusion Models | Chaohao Xie et.al. | 2501.12267 | translate | read | null | |
| 2025-01-21 | ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions | Shiyue Zhang et.al. | 2501.12173 | translate | read | link | |
| 2025-01-20 | Are generative models fair? A study of racial bias in dermatological image generation | Miguel López-Pérez et.al. | 2501.11752 | translate | read | null | |
| 2025-01-20 | StAyaL \| Multilingual Style Transfer | Karishma Thakrar et.al. | 2501.11639 | translate | read | null | |
| 2025-01-20 | StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer | Ruojun Xu et.al. | 2501.11319 | translate | read | null | |
| 2025-01-17 | Credit Risk Identification in Supply Chains Using Generative Adversarial Networks | Zizhou Zhang et.al. | 2501.10348 | translate | read | null | |
| 2025-01-17 | DiffVSR: Enhancing Real-World Video Super-Resolution with Diffusion Models for Advanced Visual Quality and Temporal Consistency | Xiaohui Li et.al. | 2501.10110 | translate | read | null | |
| 2025-01-17 | HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution | Shengkui Zhao et.al. | 2501.10045 | translate | read | link | |
| 2025-01-17 | Spatiotemporal Prediction of Secondary Crashes by Rebalancing Dynamic and Static Data with Generative Adversarial Networks | Junlan Chen et.al. | 2501.10041 | translate | read | null | |
| 2025-01-17 | Physics-informed DeepCT: Sinogram Wavelet Decomposition Meets Masked Diffusion | Zekun Zhou et.al. | 2501.09935 | translate | read | null | |
| 2025-01-17 | IE-Bench: Advancing the Measurement of Text-Driven Image Editing for Human Perception Alignment | Shangkun Sun et.al. | 2501.09927 | translate | read | null | |
| 2025-01-16 | PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery | Shristi Das Biswas et.al. | 2501.09826 | translate | read | link | |
| 2025-01-16 | Learnings from Scaling Visual Tokenizers for Reconstruction and Generation | Philippe Hansen-Estruch et.al. | 2501.09755 | translate | read | null | |
| 2025-01-16 | Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps | Nanye Ma et.al. | 2501.09732 | translate | read | null | |
| 2025-01-16 | AnyStory: Towards Unified Single and Multiple Subject Personalization in Text-to-Image Generation | Junjie He et.al. | 2501.09503 | translate | read | link | |
| 2025-01-16 | Mining the time axis with TRON. I. Millisecond pulsars in Omega Centauri, Terzan 5 and 47 Tucanae detected through MeerKAT interferometric imaging | Oleg M. Smirnov et.al. | 2501.09488 | translate | read | null | |
| 2025-01-16 | Dynamic Neural Style Transfer for Artistic Image Generation using VGG19 | Kapil Kashyap et.al. | 2501.09420 | translate | read | null | |
| 2025-01-16 | SVIA: A Street View Image Anonymization Framework for Self-Driving Applications | Dongyu Liu et.al. | 2501.09393 | translate | read | null | |
| 2025-01-16 | Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse | Guangyuan Liu et.al. | 2501.09391 | translate | read | null | |
| 2025-01-16 | SEAL: Entangled White-box Watermarks on Low-Rank Adaptation | Giyeong Oh et.al. | 2501.09284 | translate | read | link | |
| 2025-01-15 | Grounding Text-To-Image Diffusion Models For Controlled High-Quality Image Generation | Ahmad Süleyman et.al. | 2501.09194 | translate | read | null | |
| 2025-01-15 | Generative diffusion model with inverse renormalization group flows | Kanta Masuki et.al. | 2501.09064 | translate | read | link | |
| 2025-01-15 | How Do Generative Models Draw a Software Engineer? A Case Study on Stable Diffusion Bias | Tosin Fadahunsi et.al. | 2501.09014 | translate | read | link | |
| 2025-01-15 | Multimodal LLMs Can Reason about Aesthetics in Zero-Shot | Ruixiang Jiang et.al. | 2501.09012 | translate | read | link | |
| 2025-01-15 | VECT-GAN: A variationally encoded generative model for overcoming data scarcity in pharmaceutical science | Youssef Abdalla et.al. | 2501.08995 | translate | read | link | |
| 2025-01-15 | Enhanced Multi-Scale Cross-Attention for Person Image Generation | Hao Tang et.al. | 2501.08900 | translate | read | null | |
| 2025-01-15 | XMusic: Towards a Generalized and Controllable Symbolic Music Generation Framework | Sida Tian et.al. | 2501.08809 | translate | read | null | |
| 2025-01-15 | Investigating Parameter-Efficiency of Hybrid QuGANs Based on Geometric Properties of Generated Sea Route Graphs | Tobias Rohe et.al. | 2501.08678 | translate | read | null | |
| 2025-01-15 | StereoGen: High-quality Stereo Image Generation from a Single Image | Xianqi Wang et.al. | 2501.08654 | translate | read | link | |
| 2025-01-15 | Joint Learning of Depth and Appearance for Portrait Image Animation | Xinya Ji et.al. | 2501.08649 | translate | read | null | |
| 2025-01-15 | Watermarking in Diffusion Model: Gaussian Shading with Exact Diffusion Inversion via Coupled Transformations (EDICT) | Krishna Panthi et.al. | 2501.08604 | translate | read | null | |
| 2025-01-15 | Stability and convergence of relaxed scalar auxiliary variable schemes for Cahn-Hilliard systems with bounded mass source | Kei Fong Lam et.al. | 2501.08543 | translate | read | null | |
| 2025-01-14 | D$^2$-DPM: Dual Denoising for Quantized Diffusion Probabilistic Models | Qian Zeng et.al. | 2501.08180 | translate | read | link | |
| 2025-01-14 | Benchmarking Multimodal Models for Fine-Grained Image Analysis: A Comparative Study Across Diverse Visual Features | Evgenii Evstafev et.al. | 2501.08170 | translate | read | null | |
| 2025-01-14 | Prediction Interval Construction Method for Electricity Prices | Xin Lu et.al. | 2501.07827 | translate | read | null | |
| 2025-01-14 | On the Statistical Capacity of Deep Generative Models | Edric Tam et.al. | 2501.07763 | translate | read | link | |
| 2025-01-13 | Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens | Dongwon Kim et.al. | 2501.07730 | translate | read | null | |
| 2025-01-13 | Pedestrian Trajectory Prediction Based on Social Interactions Learning With Random Weights | Jiajia Xie et.al. | 2501.07711 | translate | read | null | |
| 2025-01-13 | OCORD: Open-Campus Object Removal Dataset | Shuo Zhang et.al. | 2501.07397 | translate | read | null | |
| 2025-01-13 | Boosting Text-To-Image Generation via Multilingual Prompting in Large Multimodal Models | Yongyu Mu et.al. | 2501.07086 | translate | read | link | |
| 2025-01-13 | Enhancing Image Generation Fidelity via Progressive Prompts | Zhen Xiong et.al. | 2501.07070 | translate | read | link | |
| 2025-01-13 | SFC-GAN: A Generative Adversarial Network for Brain Functional and Structural Connectome Translation | Yee-Fan Tan et.al. | 2501.07055 | translate | read | null | |
| 2025-01-13 | Detection of AI Deepfake and Fraud in Online Payments Using GAN-Based Models | Zong Ke et.al. | 2501.07033 | translate | read | null | |
| 2025-01-12 | Super-Resolution of 3D Micro-CT Images Using Generative Adversarial Networks: Enhancing Resolution and Segmentation Accuracy | Evgeny Ugolkov et.al. | 2501.06939 | translate | read | link | |
| 2025-01-12 | Defect Detection Network In PCB Circuit Devices Based on GAN Enhanced YOLOv11 | Jiayi Huang et.al. | 2501.06879 | translate | read | null | |
| 2025-01-12 | SuperNeRF-GAN: A Universal 3D-Consistent Super-Resolution Framework for Efficient and Enhanced 3D-Aware Image Synthesis | Peng Zheng et.al. | 2501.06770 | translate | read | null | |
| 2025-01-12 | ODPG: Outfitting Diffusion with Pose Guided Condition | Seohyun Lee et.al. | 2501.06769 | translate | read | null | |
| 2025-01-12 | Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models | Michael Toker et.al. | 2501.06751 | translate | read | null | |
| 2025-01-10 | Poetry in Pixels: Prompt Tuning for Poem Image Generation via Diffusion Models | Sofia Jamil et.al. | 2501.05839 | translate | read | link | |
| 2025-01-10 | EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model | Yi He et.al. | 2501.05710 | translate | read | null | |
| 2025-01-10 | HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection | Anant Mehta et.al. | 2501.05631 | translate | read | link | |
| 2025-01-09 | Towards Probabilistic Inference of Human Motor Intentions by Assistive Mobile Robots Controlled via a Brain-Computer Interface | Xiaoshan Zhou et.al. | 2501.05610 | translate | read | null | |
| 2025-01-09 | Consistent Flow Distillation for Text-to-3D Generation | Runjie Yan et.al. | 2501.05445 | translate | read | null | |
| 2025-01-09 | Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation | Xuyi Meng et.al. | 2501.05427 | translate | read | null | |
| 2025-01-09 | Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation | Darius Petermann et.al. | 2501.05413 | translate | read | null | |
| 2025-01-09 | CROPS: Model-Agnostic Training-Free Framework for Safe Image Synthesis with Latent Diffusion Models | Junha Park et.al. | 2501.05359 | translate | read | null | |
| 2025-01-09 | Patch-GAN Transfer Learning with Reconstructive Models for Cloud Removal | Wanli Ma et.al. | 2501.05265 | translate | read | null | |
| 2025-01-09 | 3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering | Dewei Zhou et.al. | 2501.05131 | translate | read | link | |
| 2025-01-08 | LayerMix: Enhanced Data Augmentation through Fractal Integration for Robust Deep Learning | Hafiz Mughees Ahmad et.al. | 2501.04861 | translate | read | null | |
| 2025-01-08 | EditAR: Unified Conditional Generation with Autoregressive Models | Jiteng Mu et.al. | 2501.04699 | translate | read | null | |
| 2025-01-08 | Optimal Trading of a Charging-Station Company in Auction Markets for Electricity | Farnaz Sohrabi et.al. | 2501.04647 | translate | read | null | |
| 2025-01-08 | Accelerated Discovery of Vanadium Oxide Compositions: A WGAN-VAE Framework for Materials Design | Danial Ebrahimzadeh et.al. | 2501.04604 | translate | read | null | |
| 2025-01-08 | Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time | Uri Berger et.al. | 2501.04513 | translate | read | null | |
| 2025-01-08 | On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis | Yekun Ke et.al. | 2501.04377 | translate | read | null | |
| 2025-01-08 | Circuit Complexity Bounds for Visual Autoregressive Model | Yekun Ke et.al. | 2501.04299 | translate | read | null | |
| 2025-01-07 | HistoryPalette: Supporting Exploration and Reuse of Past Alternatives in Image Generation and Editing | Karim Benharrak et.al. | 2501.04163 | translate | read | null | |
| 2025-01-07 | BiasGuard: Guardrailing Fairness in Machine Learning Production Systems | Nurit Cohen-Inger et.al. | 2501.04142 | translate | read | link | |
| 2025-01-07 | ZDySS – Zero-Shot Dynamic Scene Stylization using Gaussian Splatting | Abhishek Saroha et.al. | 2501.03875 | translate | read | null | |
| 2025-01-07 | A Value Mapping Virtual Staining Framework for Large-scale Histological Imaging | Junjia Wang et.al. | 2501.03592 | translate | read | null | |
| 2025-01-07 | Evaluating Image Caption via Cycle-consistent Text-to-Image Generation | Tianyu Cui et.al. | 2501.03567 | translate | read | null | |
| 2025-01-07 | PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models | Lingzhi Yuan et.al. | 2501.03544 | translate | read | null | |
| 2025-01-07 | Textualize Visual Prompt for Image Editing via Diffusion Bridge | Pengcheng Xu et.al. | 2501.03495 | translate | read | null | |
| 2025-01-07 | SceneBooth: Diffusion-based Framework for Subject-preserved Text-to-Image Generation | Shang Chai et.al. | 2501.03490 | translate | read | null | |
| 2025-01-07 | Physics-Constrained Generative Artificial Intelligence for Rapid Takeoff Trajectory Design | Samuel Sisk et.al. | 2501.03445 | translate | read | null | |
| 2025-01-06 | License Plate Images Generation with Diffusion Models | Mariia Shpir et.al. | 2501.03374 | translate | read | null | |
| 2025-01-06 | InpDiffusion: Image Inpainting Localization via Conditional Diffusion Models | Kai Wang et.al. | 2501.02816 | translate | read | null | |
| 2025-01-06 | DarkFarseer: Inductive Spatio-temporal Kriging via Hidden Style Enhancement and Sparsity-Noise Mitigation | Zhuoxuan Liang et.al. | 2501.02808 | translate | read | null | |
| 2025-01-06 | Enhancing Robot Route Optimization in Smart Logistics with Transformer and GNN Integration | Hao Luo et.al. | 2501.02749 | translate | read | null | |
| 2025-01-06 | Artificial Intelligence in Creative Industries: Advances Prior to 2025 | Nantheera Anantrasirichai et.al. | 2501.02725 | translate | read | null | |
| 2025-01-05 | Vision-Driven Prompt Optimization for Large Language Models in Multimodal Generative Tasks | Leo Franklin et.al. | 2501.02527 | translate | read | null | |
| 2025-01-05 | Face-MakeUp: Multimodal Facial Prompts for Text-to-Image Generation | Dawei Dai et.al. | 2501.02523 | translate | read | link | |
| 2025-01-05 | ACE++: Instruction-Based Image Creation and Editing via Context-Aware Content Filling | Chaojie Mao et.al. | 2501.02487 | translate | read | null | |
| 2025-01-05 | MedSegDiffNCA: Diffusion Models With Neural Cellular Automata for Skin Lesion Segmentation | Avni Mittal et.al. | 2501.02447 | translate | read | null | |
| 2025-01-04 | CorrFill: Enhancing Faithfulness in Reference-based Inpainting with Correspondence Guidance in Diffusion Models | Kuan-Hung Liu et.al. | 2501.02355 | translate | read | link | |
| 2025-01-04 | Design and Benchmarking of A Multi-Modality Sensor for Robotic Manipulation with GAN-Based Cross-Modality Interpretation | Dandan Zhang et.al. | 2501.02303 | translate | read | null | |
| 2025-01-03 | Creating Artificial Students that Never Existed: Leveraging Large Language Models and CTGANs for Synthetic Data Generation | Mohammad Khalil et.al. | 2501.01793 | translate | read | null | |
| 2025-01-03 | Controlling your Attributes in Voice | Xuyuan Li et.al. | 2501.01674 | translate | read | null | |
| 2025-01-03 | Multivariate Time Series Anomaly Detection using DiffGAN Model | Guangqiang Wu et.al. | 2501.01591 | translate | read | null | |
| 2025-01-02 | Object-level Visual Prompts for Compositional Image Generation | Gaurav Parmar et.al. | 2501.01424 | translate | read | null | |
| 2025-01-02 | On Unifying Video Generation and Camera Pose Estimation | Chun-Hao Paul Huang et.al. | 2501.01409 | translate | read | null | |
| 2025-01-02 | ProjectedEx: Enhancing Generation in Explainable AI for Prostate Cancer | Xuyin Qi et.al. | 2501.01392 | translate | read | link | |
| 2025-01-02 | Test-time Controllable Image Generation by Explicit Spatial Constraint Enforcement | Z. Zhang et.al. | 2501.01368 | translate | read | null | |
| 2025-01-02 | LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge | Kyoungkook Kang et.al. | 2501.01197 | translate | read | null | |
| 2025-01-02 | HarmonyIQA: Pioneering Benchmark and Model for Image Harmonization Quality Assessment | Zitong Xu et.al. | 2501.01116 | translate | read | null | |
| 2025-01-02 | MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification | Jimin Park et.al. | 2501.01110 | translate | read | link | |
| 2025-01-02 | EliGen: Entity-Level Controlled Image Generation with Regional Attention | Hong Zhang et.al. | 2501.01097 | translate | read | null | |
| 2025-01-02 | State-of-the-art AI-based Learning Approaches for Deepfake Generation and Detection, Analyzing Opportunities, Threading through Pros, Cons, and Future Prospects | Harshika Goyal et.al. | 2501.01029 | translate | read | null | |
| 2025-01-01 | OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes | Sepehr Dehdashtian et.al. | 2501.00962 | translate | read | null | |
| 2025-01-02 | Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation | Yuanbo Yang et.al. | 2412.21117 | translate | read | null | |