Image Generation - 2024-08
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-08-30 | Image-Perfect Imperfections: Safety, Bias, and Authenticity in the Shadow of Text-To-Image Model Evolution | Yixin Wu et.al. | 2408.17285 | translate | read | null |
| 2024-08-30 | VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers | Juncan Deng et.al. | 2408.17131 | translate | read | null |
| 2024-08-30 | FissionVAE: Federated Non-IID Image Generation with Latent Space and Decoder Decomposition | Chen Hu et.al. | 2408.17090 | translate | read | link |
| 2024-08-30 | Text-to-Image Generation Via Energy-Based CLIP | Roy Ganz et.al. | 2408.17046 | translate | read | null |
| 2024-08-30 | AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding | Yonghui Wang et.al. | 2408.16986 | translate | read | link |
| 2024-08-30 | Contrastive Learning with Synthetic Positives | Dewen Zeng et.al. | 2408.16965 | translate | read | link |
| 2024-08-29 | GameIR: A Large-Scale Synthesized Ground-Truth Dataset for Image Restoration over Gaming Content | Lebin Zhou et.al. | 2408.16866 | translate | read | null |
| 2024-08-29 | STEREO: Towards Adversarially Robust Concept Erasing from Text-to-Image Generation Models | Koushik Srivatsan et.al. | 2408.16807 | translate | read | link |
| 2024-08-29 | CSGO: Content-Style Composition in Text-to-Image Generation | Peng Xing et.al. | 2408.16766 | translate | read | link |
| 2024-08-29 | GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models | Moreno D’Incà et.al. | 2408.16700 | translate | read | link |
| 2024-08-29 | RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model | Zhuan Shi et.al. | 2408.16634 | translate | read | null |
| 2024-08-29 | GRPose: Learning Graph Relations for Human Image Generation with Pose Priors | Xiangchen Yin et.al. | 2408.16540 | translate | read | null |
| 2024-08-29 | Spiking Diffusion Models | Jiahang Cao et.al. | 2408.16467 | translate | read | link |
| 2024-08-29 | ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding | Minghang Zheng et.al. | 2408.16314 | translate | read | link |
| 2024-08-29 | Improving Diffusion-based Data Augmentation with Inversion Spherical Interpolation | Yanghao Wang et.al. | 2408.16266 | translate | read | null |
| 2024-08-29 | Enhancing Conditional Image Generation with Explainable Latent Space Manipulation | Kshitij Pathania et.al. | 2408.16232 | translate | read | link |
| 2024-08-29 | Anchor-Controlled Generative Adversarial Network for High-Fidelity Electromagnetic and Structurally Diverse Metasurface Design | Yunhui Zeng et.al. | 2408.16231 | translate | read | null |
| 2024-08-28 | Simulating realistic short tandem repeat capillary electrophoretic signal using a generative adversarial network | Duncan Taylor et.al. | 2408.16169 | translate | read | null |
| 2024-08-28 | CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization | Feize Wu et.al. | 2408.15914 | translate | read | null |
| 2024-08-28 | Disentangled Diffusion Autoencoder for Harmonization of Multi-site Neuroimaging Data | Ayodeji Ijishakin et.al. | 2408.15890 | translate | read | null |
| 2024-08-28 | Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas | Fabio Quattrini et.al. | 2408.15660 | translate | read | link |
| 2024-08-28 | GANs Conditioning Methods: A Survey | Anis Bourou et.al. | 2408.15640 | translate | read | null |
| 2024-08-28 | Dissipation-driven quantum generative adversarial networks | He Wang et.al. | 2408.15597 | translate | read | null |
| 2024-08-28 | Hand1000: Generating Realistic Hands from Text with Only 1,000 Images | Haozhuo Zhang et.al. | 2408.15461 | translate | read | null |
| 2024-08-28 | Avoiding Generative Model Writer’s Block With Embedding Nudging | Ali Zand et.al. | 2408.15450 | translate | read | null |
| 2024-08-27 | Histo-Diffusion: A Diffusion Super-Resolution Method for Digital Pathology with Comprehensive Quality Assessment | Xuan Xu et.al. | 2408.15218 | translate | read | null |
| 2024-08-27 | Automatic 8-tissue Segmentation for 6-month Infant Brains | Yilan Dong et.al. | 2408.15198 | translate | read | null |
| 2024-08-27 | T-FAKE: Synthesizing Thermal Images for Facial Landmarking | Philipp Flotho et.al. | 2408.15127 | translate | read | link |
| 2024-08-28 | User-level Social Multimedia Traffic Anomaly Detection with Meta-Learning | Tongtong Feng et.al. | 2408.14884 | translate | read | null |
| 2024-08-27 | Alfie: Democratising RGBA Image Generation With No $$$ | Fabio Quattrini et.al. | 2408.14826 | translate | read | link |
| 2024-08-27 | Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation | Abdelrahman Eldesokey et.al. | 2408.14819 | translate | read | null |
| 2024-08-27 | MaskCycleGAN-based Whisper to Normal Speech Conversion | K. Rohith Gupta et.al. | 2408.14797 | translate | read | null |
| 2024-08-27 | CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis | Weijia Li et.al. | 2408.14765 | translate | read | null |
| 2024-08-27 | Sequential-Scanning Dual-Energy CT Imaging Using High Temporal Resolution Image Reconstruction and Error-Compensated Material Basis Image Generation | Qiaoxin Li et.al. | 2408.14754 | translate | read | null |
| 2024-08-27 | Learning Differentially Private Diffusion Models via Stochastic Adversarial Distillation | Bochao Liu et.al. | 2408.14738 | translate | read | null |
| 2024-08-26 | GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy | Peiyan Li et.al. | 2408.14368 | translate | read | null |
| 2024-08-26 | ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty | Xindi Wu et.al. | 2408.14339 | translate | read | null |
| 2024-08-26 | Efficient Active Flow Control Strategy for Confined Square Cylinder Wake Using Deep Learning-Based Surrogate Model and Reinforcement Learning | Meng Zhang et.al. | 2408.14232 | translate | read | null |
| 2024-08-26 | Foodfusion: A Novel Approach for Food Image Composition via Diffusion Models | Chaohua Shi et.al. | 2408.14135 | translate | read | null |
| 2024-08-26 | Rate-Distortion-Perception Controllable Joint Source-Channel Coding for High-Fidelity Generative Communications | Kailin Tan et.al. | 2408.14127 | translate | read | null |
| 2024-08-25 | Bridging the Gap between Real-world and Synthetic Images for Testing Autonomous Driving Systems | Mohammad Hossein Amini et.al. | 2408.13950 | translate | read | null |
| 2024-08-25 | RT-Attack: Jailbreaking Text-to-Image Models via Random Token | Sensen Gao et.al. | 2408.13896 | translate | read | null |
| 2024-08-25 | Prior Learning in Introspective VAEs | Ioannis Athanasiadis et.al. | 2408.13805 | translate | read | null |
| 2024-08-25 | SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting | Wenrui Li et.al. | 2408.13711 | translate | read | link |
| 2024-08-27 | Prompt-Softbox-Prompt: A free-text Embedding Control for Image Editing | Yitong Yang et.al. | 2408.13623 | translate | read | null |
| 2024-08-23 | Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation | Bonan Li et.al. | 2408.13149 | translate | read | null |
| 2024-08-23 | G3FA: Geometry-guided GAN for Face Animation | Alireza Javanmardi et.al. | 2408.13049 | translate | read | null |
| 2024-08-23 | EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation | Cong Wang et.al. | 2408.13005 | translate | read | null |
| 2024-08-23 | What Do You Want? User-centric Prompt Generation for Text-to-image Synthesis via Multi-turn Guidance | Yilun Liu et.al. | 2408.12910 | translate | read | link |
| 2024-08-22 | Unlocking Intrinsic Fairness in Stable Diffusion | Eunji Kim et.al. | 2408.12692 | translate | read | null |
| 2024-08-22 | Enhancing Transferability of Adversarial Attacks with GE-AdvGAN+: A Comprehensive Framework for Gradient Editing | Zhibo Jin et.al. | 2408.12673 | translate | read | null |
| 2024-08-22 | Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | Jinheng Xie et.al. | 2408.12528 | translate | read | link |
| 2024-08-22 | CODE: Confident Ordinary Differential Editing | Bastien van Delft et.al. | 2408.12418 | translate | read | link |
| 2024-08-22 | Dynamic Product Image Generation and Recommendation at Scale for Personalized E-commerce | Ádám Tibor Czapp et.al. | 2408.12392 | translate | read | null |
| 2024-08-22 | Scalable Autoregressive Image Generation with Mamba | Haopeng Li et.al. | 2408.12245 | translate | read | link |
| 2024-08-22 | MedDiT: A Knowledge-Controlled Diffusion Transformer Framework for Dynamic Medical Image Generation in Virtual Simulated Patient | Yanzeng Li et.al. | 2408.12236 | translate | read | null |
| 2024-08-22 | BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking | Hanzheng Wang et.al. | 2408.12232 | translate | read | null |
| 2024-08-22 | DimeRec: A Unified Framework for Enhanced Sequential Recommendation via Generative Diffusion Models | Wuchao Li et.al. | 2408.12153 | translate | read | null |
| 2024-08-22 | Query-Efficient Video Adversarial Attack with Stylized Logo | Duoxun Tang et.al. | 2408.12099 | translate | read | null |
| 2024-08-22 | High-Quality Data Augmentation for Low-Resource NMT: Combining a Translation Memory, a GAN Generator, and Filtering | Hengjie Liu et.al. | 2408.12079 | translate | read | null |
| 2024-08-21 | Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization | Tianyi Lin et.al. | 2408.11974 | translate | read | null |
| 2024-08-21 | Pixel Is Not A Barrier: An Effective Evasion Attack for Pixel-Domain Diffusion Models | Chun-Yen Shih et.al. | 2408.11810 | translate | read | link |
| 2024-08-21 | Approaching Deep Learning through the Spectral Dynamics of Weights | David Yunis et.al. | 2408.11804 | translate | read | link |
| 2024-08-21 | JieHua Paintings Style Feature Extracting Model using Stable Diffusion with ControlNet | Yujia Gu et.al. | 2408.11744 | translate | read | null |
| 2024-08-21 | Iterative Object Count Optimization for Text-to-image Diffusion Models | Oz Zafar et.al. | 2408.11721 | translate | read | null |
| 2024-08-21 | FRAP: Faithful and Realistic Text-to-Image Generation with Adaptive Prompt Weighting | Liyao Jiang et.al. | 2408.11706 | translate | read | link |
| 2024-08-21 | Latent Feature and Attention Dual Erasure Attack against Multi-View Diffusion Models for 3D Assets Protection | Jingwei Sun et.al. | 2408.11408 | translate | read | null |
| 2024-08-21 | Gender Bias Evaluation in Text-to-image Generation: A Survey | Yankun Wu et.al. | 2408.11358 | translate | read | null |
| 2024-08-21 | UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation | Xiangyu Zhao et.al. | 2408.11305 | translate | read | link |
| 2024-08-20 | Compress Guidance in Conditional Diffusion Sampling | Anh-Dung Dinh et.al. | 2408.11194 | translate | read | null |
| 2024-08-20 | MS$^3$D: A RG Flow-Based Regularization for GAN Training with Limited Data | Jian Wang et.al. | 2408.11135 | translate | read | null |
| 2024-08-20 | MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning | Haoning Wu et.al. | 2408.11001 | translate | read | link |
| 2024-08-20 | A Grey-box Attack against Latent Diffusion Model-based Image Editing by Posterior Collapse | Zhongliang Guo et.al. | 2408.10901 | translate | read | link |
| 2024-08-20 | Generating Multi-frame Ultrawide-field Fluorescein Angiography from Ultrawide-field Color Imaging Improves Diabetic Retinopathy Stratification | Ruoyu Chen et.al. | 2408.10636 | translate | read | null |
| 2024-08-20 | TextMastero: Mastering High-Quality Scene Text Editing in Diverse Languages and Styles | Tong Wang et.al. | 2408.10623 | translate | read | null |
| 2024-08-20 | MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration | Yanbo Ding et.al. | 2408.10605 | translate | read | link |
| 2024-08-20 | Prompt-Agnostic Adversarial Perturbation for Customized Diffusion Models | Cong Wan et.al. | 2408.10571 | translate | read | null |
| 2024-08-21 | FAGStyle: Feature Augmentation on Geodesic Surface for Zero-shot Text-guided Diffusion Image Style Transfer | Yuexing Han et.al. | 2408.10533 | translate | read | null |
| 2024-08-19 | The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks | Niyar R Barman et.al. | 2408.10446 | translate | read | null |
| 2024-08-19 | Fashion Image-to-Image Translation for Complementary Item Retrieval | Matteo Attimonelli et.al. | 2408.09847 | translate | read | null |
| 2024-08-19 | Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation | Yunxin Li et.al. | 2408.09787 | translate | read | link |
| 2024-08-19 | TraDiffusion: Trajectory-Based Training-Free Image Generation | Mingrui Wu et.al. | 2408.09739 | translate | read | link |
| 2024-08-19 | Diff2CT: Diffusion Learning to Reconstruct Spine CT from Biplanar X-Rays | Zhi Qiao et.al. | 2408.09731 | translate | read | null |
| 2024-08-19 | GANPrompt: Enhancing Robustness in LLM-Based Recommendations with GAN-Enhanced Diversity Prompts | Xinyu Li et.al. | 2408.09671 | translate | read | null |
| 2024-08-18 | AnomalyFactory: Regard Anomaly Generation as Unsupervised Anomaly Localization | Ying Zhao et.al. | 2408.09533 | translate | read | null |
| 2024-08-18 | Deformation-aware GAN for Medical Image Synthesis with Substantially Misaligned Pairs | Bowen Xin et.al. | 2408.09432 | translate | read | null |
| 2024-08-18 | FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model | Ziyu Yao et.al. | 2408.09384 | translate | read | null |
| 2024-08-17 | Re-boosting Self-Collaboration Parallel Prompt GAN for Unsupervised Image Restoration | Xin Lin et.al. | 2408.09241 | translate | read | link |
| 2024-08-16 | Fire Dynamic Vision: Image Segmentation and Tracking for Multi-Scale Fire and Plume Behavior | Daryn Sagel et.al. | 2408.08984 | translate | read | null |
| 2024-08-16 | PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future | Guangyi Wang et.al. | 2408.08822 | translate | read | null |
| 2024-08-16 | Comparative Analysis of Generative Models: Enhancing Image Synthesis with VAEs, GANs, and Stable Diffusion | Sanchayan Vivekananthan et.al. | 2408.08751 | translate | read | null |
| 2024-08-16 | An End-to-End Model for Photo-Sharing Multi-modal Dialogue Generation | Peiming Guo et.al. | 2408.08650 | translate | read | null |
| 2024-08-16 | SketchRef: A Benchmark Dataset and Evaluation Metrics for Automated Sketch Synthesis | Xingyue Lin et.al. | 2408.08623 | translate | read | null |
| 2024-08-16 | Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness | Hefei Mei et.al. | 2408.08502 | translate | read | link |
| 2024-08-16 | TEXTOC: Text-driven Object-Centric Style Transfer | Jihun Park et.al. | 2408.08461 | translate | read | null |
| 2024-08-15 | JPEG-LM: LLMs as Image Generators with Canonical Codec Representations | Xiaochuang Han et.al. | 2408.08459 | translate | read | null |
| 2024-08-15 | Can Large Language Models Understand Symbolic Graphics Programs? | Zeju Qiu et.al. | 2408.08313 | translate | read | null |
| 2024-08-15 | Accelerated Image-Aware Generative Diffusion Modeling | Tanmay Asthana et.al. | 2408.08306 | translate | read | null |
| 2024-08-15 | Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding | Xiner Li et.al. | 2408.08252 | translate | read | link |
| 2024-08-15 | The Dawn of KAN in Image-to-Image (I2I) Translation: Integrating Kolmogorov-Arnold Networks with GANs for Unpaired I2I Translation | Arpan Mahara et.al. | 2408.08216 | translate | read | null |
| 2024-08-15 | Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images | Zhiyuan Li et.al. | 2408.08105 | translate | read | link |
| 2024-08-15 | Single-image coherent reconstruction of objects and humans | Sarthak Batra et.al. | 2408.08086 | translate | read | null |
| 2024-08-15 | Conditional Brownian Bridge Diffusion Model for VHR SAR to Optical Image Translation | Seon-Hoon Kim et.al. | 2408.07947 | translate | read | null |
| 2024-08-15 | A Novel Generative Artificial Intelligence Method for Interference Study on Multiplex Brightfield Immunohistochemistry Images | Satarupa Mukherjee et.al. | 2408.07860 | translate | read | null |
| 2024-08-14 | Boosting Unconstrained Face Recognition with Targeted Style Adversary | Mohammad Saeed Ebrahimi Saadabadi et.al. | 2408.07642 | translate | read | null |
| 2024-08-15 | MagicFace: Training-free Universal-Style Human Image Customized Synthesis | Yibin Wang et.al. | 2408.07433 | translate | read | null |
| 2024-08-14 | KIND: Knowledge Integration and Diversion in Diffusion Models | Yucheng Xie et.al. | 2408.07337 | translate | read | link |
| 2024-08-14 | GRIF-DM: Generation of Rich Impression Fonts using Diffusion Models | Lei Kang et.al. | 2408.07259 | translate | read | link |
| 2024-08-13 | SeLoRA: Self-Expanding Low-Rank Adaptation of Latent Diffusion Model for Medical Image Synthesis | Yuchen Mao et.al. | 2408.07196 | translate | read | null |
| 2024-08-13 | Generative Photomontage | Sean J. Liu et.al. | 2408.07116 | translate | read | null |
| 2024-08-14 | Content and Style Aware Audio-Driven Facial Animation | Qingju Liu et.al. | 2408.07005 | translate | read | null |
| 2024-08-13 | SpectralGaussians: Semantic, spectral 3D Gaussian splatting for multi-spectral scene representation, visualization and analysis | Saptarshi Neil Sinha et.al. | 2408.06975 | translate | read | null |
| 2024-08-13 | VNet: A GAN-based Multi-Tier Discriminator Network for Speech Synthesis Vocoders | Yubing Cao et.al. | 2408.06906 | translate | read | null |
| 2024-08-13 | Definition of multispectral camera system parameters to model the asteroid 2001 SN263 | Gabriela de Carvalho Assis Goulart et.al. | 2408.06886 | translate | read | null |
| 2024-08-13 | A Comprehensive Survey on Synthetic Infrared Image synthesis | Avinash Upadhyay et.al. | 2408.06868 | translate | read | null |
| 2024-08-13 | Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective | Ouxiang Li et.al. | 2408.06741 | translate | read | link |
| 2024-08-13 | DiffLoRA: Generating Personalized Low-Rank Adaptation Weights with Diffusion | Yujia Wu et.al. | 2408.06740 | translate | read | null |
| 2024-08-13 | DiffSG: A Generative Solver for Network Optimization with Diffusion Model | Ruihuai Liang et.al. | 2408.06701 | translate | read | null |
| 2024-08-13 | Hybrid SD: Edge-Cloud Collaborative Inference for Stable Diffusion Models | Chenqian Yan et.al. | 2408.06646 | translate | read | null |
| 2024-08-12 | Prompt Recovery for Image Generation Models: A Comparative Study of Discrete Optimizers | Joshua Nathaniel Williams et.al. | 2408.06502 | translate | read | null |
| 2024-08-12 | Open-Source Molecular Processing Pipeline for Generating Molecules | Shreyas V et.al. | 2408.06261 | translate | read | null |
| 2024-08-12 | Deep Learning System Boundary Testing through Latent Space Style Mixing | Amr Abdellatif et.al. | 2408.06258 | translate | read | null |
| 2024-08-12 | An Analysis for Image-to-Image Translation and Style Transfer | Xiaoming Yu et.al. | 2408.06000 | translate | read | null |
| 2024-08-12 | A Simple Early Exiting Framework for Accelerated Sampling in Diffusion Models | Taehong Moon et.al. | 2408.05927 | translate | read | link |
| 2024-08-11 | Egocentric Vision Language Planning | Zhirui Fang et.al. | 2408.05802 | translate | read | null |
| 2024-08-11 | SSL: A Self-similarity Loss for Improving Generative Image Super-resolution | Du Chen et.al. | 2408.05713 | translate | read | null |
| 2024-08-10 | Generative Adversarial Networks for Solving Hand-Eye Calibration without Data Correspondence | Ilkwon Hong et.al. | 2408.05613 | translate | read | null |
| 2024-08-10 | ZePo: Zero-Shot Portrait Stylization with Faster Sampling | Jin Liu et.al. | 2408.05492 | translate | read | link |
| 2024-08-10 | Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE | Yiying Yang et.al. | 2408.05477 | translate | read | null |
| 2024-08-10 | Artworks Reimagined: Exploring Human-AI Co-Creation through Body Prompting | Jonas Oppenlaender et.al. | 2408.05476 | translate | read | null |
| 2024-08-09 | Instruction Tuning-free Visual Token Complement for Multimodal LLMs | Dongsheng Wang et.al. | 2408.05019 | translate | read | null |
| 2024-08-09 | DAFT-GAN: Dual Affine Transformation Generative Adversarial Network for Text-Guided Image Inpainting | Jihoon Lee et.al. | 2408.04962 | translate | read | null |
| 2024-08-08 | Deep Learning-based Unsupervised Domain Adaptation via a Unified Model for Prostate Lesion Detection Using Multisite Bi-parametric MRI Datasets | Hao Li et.al. | 2408.04777 | translate | read | null |
| 2024-08-08 | Zero-Shot Uncertainty Quantification using Diffusion Probabilistic Models | Dule Shu et.al. | 2408.04718 | translate | read | null |
| 2024-08-08 | Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations | Julen Urain et.al. | 2408.04380 | translate | read | null |
| 2024-08-08 | InstantStyleGaussian: Efficient Art Style Transfer with 3D Gaussian Splatting | Xin-Yi Yu et.al. | 2408.04249 | translate | read | null |
| 2024-08-08 | Cross-View Meets Diffusion: Aerial Image Synthesis with Geometry and Text Guidance | Ahmad Arrabi et.al. | 2408.04224 | translate | read | link |
| 2024-08-08 | Artificial Intelligence based Approach for Identification and Mitigation of Cyber-Attacks in Wide-Area Control of Power Systems | Jishnudeep Kar et.al. | 2408.04189 | translate | read | null |
| 2024-08-07 | ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling | William Y. Zhu et.al. | 2408.04102 | translate | read | null |
| 2024-08-07 | Counterfactuals and Uncertainty-Based Explainable Paradigm for the Automated Detection and Segmentation of Renal Cysts in Computed Tomography Images: A Multi-Center Study | Zohaib Salahuddin et.al. | 2408.03789 | translate | read | null |
| 2024-08-07 | Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model | Guoqing Zhu et.al. | 2408.03748 | translate | read | link |
| 2024-08-07 | Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling | Zilyu Ye et.al. | 2408.03695 | translate | read | link |
| 2024-08-07 | Consumer Transactions Simulation through Generative Adversarial Networks | Sergiy Tkachuk et.al. | 2408.03655 | translate | read | null |
| 2024-08-07 | Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis | Zebin Yao et.al. | 2408.03632 | translate | read | link |
| 2024-08-07 | A comparative study of generative adversarial networks for image recognition algorithms based on deep learning and traditional methods | Yihao Zhong et.al. | 2408.03568 | translate | read | null |
| 2024-08-07 | Unlocking Exocentric Video-Language Data for Egocentric Video Representation Learning | Zi-Yi Dou et.al. | 2408.03567 | translate | read | null |
| 2024-08-07 | SLRQA: A Sparse Low-Rank Quaternion Model for Color Image Processing with Convergence Analysis | Zhanwang Deng et.al. | 2408.03563 | translate | read | null |
| 2024-08-07 | D2Styler: Advancing Arbitrary Style Transfer with Discrete Diffusion Methods | Onkar Susladkar et.al. | 2408.03558 | translate | read | link |
| 2024-08-06 | Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey | Vu Tuan Truong et.al. | 2408.03400 | translate | read | null |
| 2024-08-06 | IPAdapter-Instruct: Resolving Ambiguity in Image-based Conditioning using Instruct Prompts | Ciara Rowles et.al. | 2408.03209 | translate | read | null |
| 2024-08-06 | An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion | Xingguang Yan et.al. | 2408.03178 | translate | read | null |
| 2024-08-06 | Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models | Sho Ozaki et.al. | 2408.03156 | translate | read | null |
| 2024-08-06 | Multitask and Multimodal Neural Tuning for Large Models | Hao Sun et.al. | 2408.03001 | translate | read | null |
| 2024-08-06 | DreamLCM: Towards High-Quality Text-to-3D Generation via Latent Consistency Model | Yiming Zhong et.al. | 2408.02993 | translate | read | null |
| 2024-08-06 | A generative adversarial network for stellar core-collapse gravitational-waves | Tarin Eccleston et.al. | 2408.02895 | translate | read | null |
| 2024-08-05 | Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services | Shaopeng Fu et.al. | 2408.02814 | translate | read | null |
| 2024-08-05 | Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Dongyang Liu et.al. | 2408.02657 | translate | read | null |
| 2024-08-06 | ProCreate, Don’t Reproduce! Propulsive Energy Diffusion for Creative Generation | Jack Lu et.al. | 2408.02226 | translate | read | null |
| 2024-08-05 | Dense Feature Interaction Network for Image Inpainting Localization | Ye Yao et.al. | 2408.02191 | translate | read | null |
| 2024-08-04 | PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance | Aoming Liu et.al. | 2408.02157 | translate | read | null |
| 2024-08-04 | View-consistent Object Removal in Radiance Fields | Yiren Lu et.al. | 2408.02100 | translate | read | null |
| 2024-08-04 | LDFaceNet: Latent Diffusion-based Network for High-Fidelity Deepfake Generation | Dwij Mehta et.al. | 2408.02078 | translate | read | null |
| 2024-08-04 | Step Saver: Predicting Minimum Denoising Steps for Diffusion Model Image Generation | Jean Yu et.al. | 2408.02054 | translate | read | null |
| 2024-08-04 | Robustness of Watermarking on Text-to-Image Diffusion Models | Xiaodong Wu et.al. | 2408.02035 | translate | read | null |
| 2024-08-03 | Supervised Image Translation from Visible to Infrared Domain for Object Detection | Prahlad Anand et.al. | 2408.01843 | translate | read | null |
| 2024-08-03 | ST-SACLF: Style Transfer Informed Self-Attention Classifier for Bias-Aware Painting Classification | Mridula Vijendran et.al. | 2408.01827 | translate | read | null |
| 2024-08-02 | Out-Of-Distribution Detection for Audio-visual Generalized Zero-Shot Learning: A General Framework | Liuyuan Wen et.al. | 2408.01284 | translate | read | null |
| 2024-08-02 | VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling | Qian Zhang et.al. | 2408.01181 | translate | read | null |
| 2024-08-02 | PINNs for Medical Image Analysis: A Survey | Chayan Banerjee et.al. | 2408.01026 | translate | read | null |
| 2024-08-02 | EIUP: A Training-Free Approach to Erase Non-Compliant Concepts Conditioned on Implicit Unsafe Prompts | Die Chen et.al. | 2408.01014 | translate | read | null |
| 2024-08-02 | FBSDiff: Plug-and-Play Frequency Band Substitution of Diffusion Features for Highly Controllable Text-Driven Image Translation | Xiang Gao et.al. | 2408.00998 | translate | read | null |
| 2024-08-01 | Temporal Evolution of Knee Osteoarthritis: A Diffusion-based Morphing Model for X-ray Medical Image Synthesis | Zhe Wang et.al. | 2408.00891 | translate | read | null |
| 2024-08-01 | Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention | Susung Hong et.al. | 2408.00760 | translate | read | null |
| 2024-08-01 | Synthetic dual image generation for reduction of labeling efforts in semantic segmentation of micrographs with a customized metric function | Matias Oscar Volman Stern et.al. | 2408.00707 | translate | read | null |
| 2024-08-01 | Modeling stochastic eye tracking data: A comparison of quantum generative adversarial networks and Markov models | Shailendra Bhandari et.al. | 2408.00673 | translate | read | null |
| 2024-08-01 | Evaluation Metrics and Methods for Generative Models in the Wireless PHY Layer | Michael Baur et.al. | 2408.00634 | translate | read | null |
| 2024-08-01 | A new approach for encoding code and assisting code understanding | Mengdan Fan et.al. | 2408.00521 | translate | read | null |
| 2024-08-01 | Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion | Manuel Kansy et.al. | 2408.00458 | translate | read | null |
| 2024-08-01 | Towards Reliable Advertising Image Generation Using Human Feedback | Zhenbang Du et.al. | 2408.00418 | translate | read | null |
| 2024-08-01 | DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving | Xuemeng Yang et.al. | 2408.00415 | translate | read | null |
| 2024-08-01 | Deepfake Media Forensics: State of the Art and Challenges Ahead | Irene Amerini et.al. | 2408.00388 | translate | read | null |
| 2024-08-01 | On the Limitations and Prospects of Machine Unlearning for Generative AI | Shiji Zhou et.al. | 2408.00376 | translate | read | null |