Image Generation - 2024-05
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-05-31 | Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling | Jiatao Gu et.al. | 2405.21048 | translate | read | null |
| 2024-05-31 | You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet | Zhen Qin et.al. | 2405.21022 | translate | read | null |
| 2024-05-31 | Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging | Muhammad Muneeb Saad et.al. | 2405.20987 | translate | read | null |
| 2024-05-31 | Generative Adversarial Networks in Ultrasound Imaging: Extending Field of View Beyond Conventional Limits | Matej Gazda et.al. | 2405.20981 | translate | read | null |
| 2024-05-31 | Amortizing intractable inference in diffusion models for vision, language, and control | Siddarth Venkatraman et.al. | 2405.20971 | translate | read | link |
| 2024-05-31 | MegActor: Harness the Power of Raw Video for Vivid Portrait Animation | Shurong Yang et.al. | 2405.20851 | translate | read | link |
| 2024-05-31 | Multilingual Text Style Transfer: Datasets & Models for Indian Languages | Sourabrata Mukherjee et.al. | 2405.20805 | translate | read | null |
| 2024-05-31 | Information Theoretic Text-to-Image Alignment | Chao Wang et.al. | 2405.20759 | translate | read | null |
| 2024-05-31 | Diffusion Models Are Innate One-Step Generators | Bowen Zheng et.al. | 2405.20750 | translate | read | link |
| 2024-05-31 | GANcrop: A Contrastive Defense Against Backdoor Attacks in Federated Learning | Xiaoyun Gan et.al. | 2405.20727 | translate | read | null |
| 2024-05-30 | SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow | Chaoyang Wang et.al. | 2405.20282 | translate | read | link |
| 2024-05-30 | ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections | Massimo Bini et.al. | 2405.20271 | translate | read | link |
| 2024-05-30 | Boost Your Own Human Image Generation Model via Direct Preference Optimization with AI Feedback | Sanghyeon Na et.al. | 2405.20216 | translate | read | null |
| 2024-05-30 | RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection | Zhiyuan He et.al. | 2405.20112 | translate | read | null |
| 2024-05-30 | RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection | Fangyi Chen et.al. | 2405.19854 | translate | read | null |
| 2024-05-30 | Puff-Net: Efficient Style Transfer with Pure Content and Style Feature Fusion Network | Sizhe Zheng et.al. | 2405.19775 | translate | read | null |
| 2024-05-30 | MAE-GAN: A Novel Strategy for Simultaneous Super-resolution Reconstruction and Denoising of Post-stack Seismic Profile | Wenshuo Yu et.al. | 2405.19767 | translate | read | null |
| 2024-05-30 | Mitigating annotation shift in cancer classification using single image generative models | Marta Buetas Arcas et.al. | 2405.19754 | translate | read | link |
| 2024-05-30 | Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian | Wei Sun et.al. | 2405.19657 | translate | read | null |
| 2024-05-29 | Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models | Venkat Venkatasubramanian et.al. | 2405.19561 | translate | read | null |
| 2024-05-29 | ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning | Ruchika Chavhan et.al. | 2405.19237 | translate | read | link |
| 2024-05-29 | Going beyond compositional generalization, DDPMs can produce zero-shot interpolation | Justin Deschenaux et.al. | 2405.19201 | translate | read | link |
| 2024-05-29 | The ethical situation of DALL-E 2 | Eduard Hogea et.al. | 2405.19176 | translate | read | null |
| 2024-05-29 | Patch-enhanced Mask Encoder Prompt Image Generation | Shusong Xu et.al. | 2405.19085 | translate | read | null |
| 2024-05-29 | EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture | Jiaqi Xu et.al. | 2405.18991 | translate | read | link |
| 2024-05-29 | Topological Perspectives on Optimal Multimodal Embedding Spaces | Abdul Aziz A. B et.al. | 2405.18867 | translate | read | null |
| 2024-05-29 | Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching | Yasi Zhang et.al. | 2405.18816 | translate | read | null |
| 2024-05-29 | SketchTriplet: Self-Supervised Scenarized Sketch-Text-Image Triplet Generation | Zhenbei Wu et.al. | 2405.18801 | translate | read | null |
| 2024-05-29 | Inpaint Biases: A Pathway to Accurate and Unbiased Image Generation | Jiyoon Myung et.al. | 2405.18762 | translate | read | null |
| 2024-05-29 | SketchDeco: Decorating B&W Sketches with Colour | Chaitat Utintu et.al. | 2405.18716 | translate | read | null |
| 2024-05-28 | Phased Consistency Model | Fu-Yun Wang et.al. | 2405.18407 | translate | read | link |
| 2024-05-28 | Multi-modal Generation via Cross-Modal In-Context Learning | Amandeep Kumar et.al. | 2405.18304 | translate | read | link |
| 2024-05-28 | Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers? | Zebin You et.al. | 2405.18029 | translate | read | null |
| 2024-05-28 | Cycle-YOLO: A Efficient and Robust Framework for Pavement Damage Detection | Zhengji Li et.al. | 2405.17905 | translate | read | null |
| 2024-05-27 | RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance | Jiaojiao Fan et.al. | 2405.17661 | translate | read | null |
| 2024-05-27 | Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba | Jiahao Huang et.al. | 2405.17659 | translate | read | null |
| 2024-05-27 | EM-GANSim: Real-time and Accurate EM Simulation Using Conditional GANs for 3D Indoor Scenes | Ruichen Wang et.al. | 2405.17366 | translate | read | null |
| 2024-05-27 | Prompt Optimization with Human Feedback | Xiaoqiang Lin et.al. | 2405.17346 | translate | read | link |
| 2024-05-27 | From Text to Blueprint: Leveraging Text-to-Image Tools for Floor Plan Creation | Xiaoyu Li et.al. | 2405.17236 | translate | read | null |
| 2024-05-27 | MCGAN: Enhancing GAN Training with Regression-Based Generator Loss | Baoren Xiao et.al. | 2405.17191 | translate | read | null |
| 2024-05-27 | Training-free Editioning of Text-to-Image Models | Jinqi Wang et.al. | 2405.17069 | translate | read | null |
| 2024-05-27 | The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models | Saravanan Kandasamy et.al. | 2405.17068 | translate | read | null |
| 2024-05-27 | Glauber Generative Model: Discrete Diffusion Models via Binary Classification | Harshit Varma et.al. | 2405.17035 | translate | read | null |
| 2024-05-27 | A Correlation- and Mean-Aware Loss Function and Benchmarking Framework to Improve GAN-based Tabular Data Synthesis | Minh H. Vu et.al. | 2405.16971 | translate | read | null |
| 2024-05-27 | Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation | Liang Shi et.al. | 2405.16895 | translate | read | null |
| 2024-05-27 | Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks | Yunqi Zhang et.al. | 2405.16860 | translate | read | link |
| 2024-05-24 | Learning to Discretize Denoising Diffusion ODEs | Vinh Tong et.al. | 2405.15506 | translate | read | link |
| 2024-05-24 | A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence | Ali Kashefi et.al. | 2405.15406 | translate | read | null |
| 2024-05-24 | Stochastic SR for Gaussian microtextures | Emile Pierret et.al. | 2405.15399 | translate | read | null |
| 2024-05-24 | Challenges and Opportunities in 3D Content Generation | Ke Zhao et.al. | 2405.15335 | translate | read | null |
| 2024-05-24 | Towards Understanding the Working Mechanism of Text-to-Image Diffusion Model | Mingyang Yi et.al. | 2405.15330 | translate | read | null |
| 2024-05-24 | SG-Adapter: Enhancing Text-to-Image Generation with Scene Graph Guidance | Guibao Shen et.al. | 2405.15321 | translate | read | null |
| 2024-05-24 | Decaf: Data Distribution Decompose Attack against Federated Learning | Zhiyang Dai et.al. | 2405.15316 | translate | read | null |
| 2024-05-24 | Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient | Yongliang Wu et.al. | 2405.15304 | translate | read | null |
| 2024-05-24 | StyleMaster: Towards Flexible Stylized Image Generation with Diffusion Models | Chengming Xu et.al. | 2405.15287 | translate | read | null |
| 2024-05-24 | Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models | Yimeng Zhang et.al. | 2405.15234 | translate | read | link |
| 2024-05-23 | Improved Distribution Matching Distillation for Fast Image Synthesis | Tianwei Yin et.al. | 2405.14867 | translate | read | link |
| 2024-05-23 | Semantica: An Adaptable Image-Conditioned Diffusion Model | Manoj Kumar et.al. | 2405.14857 | translate | read | null |
| 2024-05-23 | TerDiT: Ternary Diffusion Models with Transformers | Xudong Lu et.al. | 2405.14854 | translate | read | link |
| 2024-05-23 | Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models | Katherine Xu et.al. | 2405.14828 | translate | read | null |
| 2024-05-24 | Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation | Hongxu Jiang et.al. | 2405.14802 | translate | read | null |
| 2024-05-23 | Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy | Shengfang Zhai et.al. | 2405.14800 | translate | read | link |
| 2024-05-23 | RetAssist: Facilitating Vocabulary Learners with Generative Images in Story Retelling Practices | Qiaoyi Chen et.al. | 2405.14794 | translate | read | null |
| 2024-05-23 | OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance | Shuheng Ge et.al. | 2405.14709 | translate | read | null |
| 2024-05-23 | Learning Multi-dimensional Human Preference for Text-to-Image Generation | Sixian Zhang et.al. | 2405.14705 | translate | read | null |
| 2024-05-23 | RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance | Zhicheng Sun et.al. | 2405.14677 | translate | read | link |
| 2024-05-21 | Personalized Residuals for Concept-Driven Text-to-Image Generation | Cusuh Ham et.al. | 2405.12978 | translate | read | null |
| 2024-05-21 | An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation | Zhiyu Tan et.al. | 2405.12914 | translate | read | link |
| 2024-05-21 | Spatial-aware Attention Generative Adversarial Network for Semi-supervised Anomaly Detection in Medical Image | Zerui Zhang et.al. | 2405.12872 | translate | read | null |
| 2024-05-21 | A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability | Li-Yang Tseng et.al. | 2405.12847 | translate | read | link |
| 2024-05-21 | Leveraging Neural Radiance Fields for Pose Estimation of an Unknown Space Object during Proximity Operations | Antoine Legrand et.al. | 2405.12728 | translate | read | null |
| 2024-05-21 | CustomText: Customized Textual Image Generation using Diffusion Models | Shubham Paliwal et.al. | 2405.12531 | translate | read | null |
| 2024-05-20 | Diffusion for World Modeling: Visual Details Matter in Atari | Eloi Alonso et.al. | 2405.12399 | translate | read | link |
| 2024-05-20 | Paired Conditional Generative Adversarial Network for Highly Accelerated Liver 4D MRI | Di Xu et.al. | 2405.12357 | translate | read | null |
| 2024-05-20 | EGAN: Evolutional GAN for Ransomware Evasion | Daniel Commey et.al. | 2405.12266 | translate | read | null |
| 2024-05-20 | Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices | Nathaniel Cohen et.al. | 2405.12211 | translate | read | link |
| 2024-05-20 | Diffusion Models for Generating Ballistic Spacecraft Trajectories | Tyler Presser et.al. | 2405.11738 | translate | read | null |
| 2024-05-19 | URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images | Zoey Chen et.al. | 2405.11656 | translate | read | null |
| 2024-05-19 | Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation | Sangyeop Yeo et.al. | 2405.11614 | translate | read | null |
| 2024-05-19 | A GAN-Based Data Poisoning Attack Against Federated Learning Systems and Its Countermeasure | Wei Sun et.al. | 2405.11440 | translate | read | null |
| 2024-05-18 | UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers | Duo Peng et.al. | 2405.11336 | translate | read | null |
| 2024-05-18 | On the Trajectory Regularity of ODE-based Diffusion Sampling | Defang Chen et.al. | 2405.11326 | translate | read | null |
| 2024-05-18 | Few-Shot API Attack Detection: Overcoming Data Scarcity with GAN-Inspired Learning | Udi Aharon et.al. | 2405.11258 | translate | read | null |
| 2024-05-18 | TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation | Chengcheng Feng et.al. | 2405.11236 | translate | read | null |
| 2024-05-17 | Improving face generation quality and prompt following with synthetic captions | Michail Tarasiou et.al. | 2405.10864 | translate | read | null |
| 2024-05-17 | Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image | Jianshun Zeng et.al. | 2405.10504 | translate | read | null |
| 2024-05-17 | Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers | Rya Sanovar et.al. | 2405.10480 | translate | read | null |
| 2024-05-16 | Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model | Zheng Gu et.al. | 2405.10316 | translate | read | null |
| 2024-05-16 | UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models | Sahel Sharifymoghaddam et.al. | 2405.10311 | translate | read | link |
| 2024-05-16 | VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing | Binghui Chen et.al. | 2405.09985 | translate | read | null |
| 2024-05-16 | KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment | Zhengxu Shi et.al. | 2405.09964 | translate | read | null |
| 2024-05-16 | Chameleon: Mixed-Modal Early-Fusion Foundation Models | Chameleon Team et.al. | 2405.09818 | translate | read | link |
| 2024-05-16 | MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis | Joseph Cho et.al. | 2405.09806 | translate | read | null |
| 2024-05-16 | An Autoencoder and Generative Adversarial Networks Approach for Multi-Omics Data Imbalanced Class Handling and Classification | Ibrahim Al-Hurani et.al. | 2405.09756 | translate | read | null |
| 2024-05-15 | Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer | Weifei Jin et.al. | 2405.09470 | translate | read | null |
| 2024-05-16 | Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images | Memoona Aziz et.al. | 2405.09426 | translate | read | null |
| 2024-05-15 | DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations | Nima Fathi et.al. | 2405.09288 | translate | read | link |
| 2024-05-15 | SOEDiff: Efficient Distillation for Small Object Editing | Qihe Pan et.al. | 2405.09114 | translate | read | null |
| 2024-05-15 | Deep Learning in Earthquake Engineering: A Comprehensive Review | Yazhou Xie et.al. | 2405.09021 | translate | read | null |
| 2024-05-14 | Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | Zhimin Li et.al. | 2405.08748 | translate | read | link |
| 2024-05-15 | Similarity Metrics for MR Image-To-Image Translation | Melanie Dohmen et.al. | 2405.08431 | translate | read | null |
| 2024-05-14 | Compositional Text-to-Image Generation with Dense Blob Representations | Weili Nie et.al. | 2405.08246 | translate | read | null |
| 2024-05-13 | RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations | Chengde Lin et.al. | 2405.08114 | translate | read | link |
| 2024-05-13 | CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models | Nick Stracke et.al. | 2405.07913 | translate | read | null |
| 2024-05-13 | SAR Image Synthesis with Diffusion Models | Denisa Qosja et.al. | 2405.07776 | translate | read | null |
| 2024-05-12 | Semantic Loss Functions for Neuro-Symbolic Structured Prediction | Kareem Ahmed et.al. | 2405.07387 | translate | read | null |
| 2024-05-12 | Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning | Jiarui Wang et.al. | 2405.07346 | translate | read | link |
| 2024-05-12 | PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification | Mohammad Shafiul Alam et.al. | 2405.07332 | translate | read | link |
| 2024-05-12 | Stable Signature is Unstable: Removing Image Watermark from Diffusion Models | Yuepeng Hu et.al. | 2405.07145 | translate | read | null |
| 2024-05-12 | MAxPrototyper: A Multi-Agent Generation System for Interactive User Interface Prototyping | Mingyue Yuan et.al. | 2405.07131 | translate | read | null |
| 2024-05-11 | Unsupervised Density Neural Representation for CT Metal Artifact Reduction | Qing Wu et.al. | 2405.07047 | translate | read | null |
| 2024-05-11 | Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior | Ce Wang et.al. | 2405.07044 | translate | read | link |
| 2024-05-11 | Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation | Shengyuan Liu et.al. | 2405.06948 | translate | read | null |
| 2024-05-10 | Controllable Image Generation With Composed Parallel Token Prediction | Jamie Stirling et.al. | 2405.06535 | translate | read | null |
| 2024-05-10 | SketchDream: Sketch-based Text-to-3D Generation and Editing | Feng-Lin Liu et.al. | 2405.06461 | translate | read | null |
| 2024-05-09 | Photonic quantum generative adversarial networks for classical data | Tigran Sedrakyan et.al. | 2405.06023 | translate | read | null |
| 2024-05-09 | Frame Interpolation with Consecutive Brownian Bridge Diffusion | Zonglin Lyu et.al. | 2405.05953 | translate | read | link |
| 2024-05-09 | Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models | Zhe Ma et.al. | 2405.05846 | translate | read | null |
| 2024-05-10 | MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation | Yuxiang Wei et.al. | 2405.05806 | translate | read | link |
| 2024-05-09 | Exploring Text-Guided Single Image Editing for Remote Sensing Images | Fangzhou Han et.al. | 2405.05769 | translate | read | null |
| 2024-05-09 | End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base | Shuling Li et.al. | 2405.05738 | translate | read | null |
| 2024-05-09 | VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis | Zhihan Ju et.al. | 2405.05667 | translate | read | null |
| 2024-05-09 | A Survey on Personalized Content Synthesis with Diffusion Models | Xulu Zhang et.al. | 2405.05538 | translate | read | null |
| 2024-05-09 | Characteristic Learning for Provable One Step Generation | Zhao Ding et.al. | 2405.05512 | translate | read | link |
| 2024-05-08 | Cross-Modality Translation with Generative Adversarial Networks to Unveil Alzheimer’s Disease Biomarkers | Reihaneh Hassanzadeh et.al. | 2405.05462 | translate | read | null |
| 2024-05-08 | DrawL: Understanding the Effects of Non-Mainstream Dialects in Prompted Image Generation | Joshua N. Williams et.al. | 2405.05382 | translate | read | null |
| 2024-05-08 | Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo | Nayantara Mudur et.al. | 2405.05255 | translate | read | link |
| 2024-05-08 | StyleMamba : State Space Model for Efficient Text-driven Image Style Transfer | Zijia Wang et.al. | 2405.05027 | translate | read | null |
| 2024-05-08 | Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI | Keqiang Fan et.al. | 2405.04974 | translate | read | null |
| 2024-05-08 | Improving Long Text Understanding with Knowledge Distilled from Summarization Model | Yan Liu et.al. | 2405.04955 | translate | read | null |
| 2024-05-08 | HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis | Zhihan Ju et.al. | 2405.04902 | translate | read | null |
| 2024-05-08 | FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation | Xuehai He et.al. | 2405.04834 | translate | read | null |
| 2024-05-07 | TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model | Yongming Zhang et.al. | 2405.04675 | translate | read | null |
| 2024-05-07 | ResNCT: A Deep Learning Model for the Synthesis of Nephrographic Phase Images in CT Urography | Syed Jamal Safdar Gardezi et.al. | 2405.04629 | translate | read | null |
| 2024-05-07 | SingIt! Singer Voice Transformation | Amit Eliav et.al. | 2405.04627 | translate | read | null |
| 2024-05-07 | Towards Geographic Inclusion in the Evaluation of Text-to-Image Models | Melissa Hall et.al. | 2405.04457 | translate | read | null |
| 2024-05-07 | Data augmentation experiments with style-based quantum generative adversarial networks on trapped-ion and superconducting-qubit technologies | Julien Baglio et.al. | 2405.04401 | translate | read | null |
| 2024-05-07 | Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation | Jihyun Kim et.al. | 2405.04356 | translate | read | null |
| 2024-05-07 | Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer | Zhuoyi Yang et.al. | 2405.04312 | translate | read | link |
| 2024-05-07 | Improving Offline Reinforcement Learning with Inaccurate Simulators | Yiwen Hou et.al. | 2405.04307 | translate | read | null |
| 2024-05-07 | Bayesian Simultaneous Localization and Multi-Lane Tracking Using Onboard Sensors and a SD Map | Yuxuan Xia et.al. | 2405.04290 | translate | read | null |
| 2024-05-07 | Bidirectional Adversarial Autoencoders for the design of Plasmonic Metasurfaces | Yuansan Liu et.al. | 2405.04056 | translate | read | link |
| 2024-05-07 | Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model | Joo Young Choi et.al. | 2405.03958 | translate | read | null |
| 2024-05-06 | Generated Contents Enrichment | Mahdi Naseri et.al. | 2405.03650 | translate | read | null |
| 2024-05-06 | CCDM: Continuous Conditional Diffusion Models for Image Generation | Xin Ding et.al. | 2405.03546 | translate | read | link |
| 2024-05-06 | GLIP: Electromagnetic Field Exposure Map Completion by Deep Generative Networks | Mohammed Mallik et.al. | 2405.03384 | translate | read | null |
| 2024-05-05 | AnoGAN for Tabular Data: A Novel Approach to Anomaly Detection | Aditya Singh et.al. | 2405.03075 | translate | read | null |
| 2024-05-05 | Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling | Jinmin Li et.al. | 2405.02941 | translate | read | null |
| 2024-05-05 | Data-Efficient Molecular Generation with Hierarchical Textual Inversion | Seojin Kim et.al. | 2405.02845 | translate | read | null |
| 2024-05-05 | SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion | Ziyun Qian et.al. | 2405.02844 | translate | read | null |
| 2024-05-05 | ImageInWords: Unlocking Hyper-Detailed Image Descriptions | Roopal Garg et.al. | 2405.02793 | translate | read | link |
| 2024-05-04 | U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers | Yuchuan Tian et.al. | 2405.02730 | translate | read | null |
| 2024-05-03 | Functional Imaging Constrained Diffusion for Brain PET Synthesis from Structural MRI | Minhui Yu et.al. | 2405.02504 | translate | read | null |
| 2024-05-03 | Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification | Siqi Yin et.al. | 2405.02155 | translate | read | null |
| 2024-05-03 | Reconstructing the mid-infrared spectra of galaxies using ultraviolet to submillimeter photometry and Deep Generative Networks | Agapi Rissaki et.al. | 2405.02153 | translate | read | null |
| 2024-05-03 | Three-Dimensional Amyloid-Beta PET Synthesis from Structural MRI with Conditional Generative Adversarial Networks | Fernando Vega et.al. | 2405.02109 | translate | read | null |
| 2024-05-03 | AI-generated art perceptions with GenFrame – an image-generating picture frame | Peter Kun et.al. | 2405.01901 | translate | read | null |
| 2024-05-03 | Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition | Yichun Tai et.al. | 2405.01872 | translate | read | null |
| 2024-05-03 | Report on the AAPM Grand Challenge on deep generative modeling for learning medical image statistics | Rucha Deshpande et.al. | 2405.01822 | translate | read | null |
| 2024-05-02 | Long Tail Image Generation Through Feature Space Augmentation and Iterated Learning | Rafael Elberg et.al. | 2405.01705 | translate | read | link |
| 2024-05-02 | Investigation on optimal microstructure of dual-phase steel with high strength and ductility by machine learning | Misato Suzuki et.al. | 2405.01689 | translate | read | null |
| 2024-05-02 | Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance | Kelvin C. K. Chan et.al. | 2405.01356 | translate | read | null |
| 2024-05-02 | Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration | Praveen Kumar Chandaliya et.al. | 2405.01273 | translate | read | null |
| 2024-05-02 | DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines | Ye Tian et.al. | 2405.01248 | translate | read | null |
| 2024-05-02 | On Mechanistic Knowledge Localization in Text-to-Image Generative Models | Samyadeep Basu et.al. | 2405.01008 | translate | read | null |
| 2024-05-01 | SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models | Burak Can Biner et.al. | 2405.00878 | translate | read | null |
| 2024-05-01 | Guided Conditional Diffusion Classifier (ConDiff) for Enhanced Prediction of Infection in Diabetic Foot Ulcers | Palawat Busaranuvong et.al. | 2405.00858 | translate | read | null |
| 2024-05-01 | RGB $\leftrightarrow$ X: Image decomposition and synthesis using material- and lighting-aware diffusion models | Zheng Zeng et.al. | 2405.00666 | translate | read | null |
| 2024-05-01 | UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement | Ruiquan Ge et.al. | 2405.00542 | translate | read | link |
| 2024-05-01 | Compressive Sensing Imaging Using Caustic Lens Mask Generated by Periodic Perturbation in a Ripple Tank | Doğan Tunca Arık et.al. | 2405.00407 | translate | read | null |
| 2024-05-01 | Beamforming Inferring by Conditional WGAN-GP for Holographic Antenna Arrays | Fenghao Zhu et.al. | 2405.00391 | translate | read | null |
| 2024-05-01 | Streamlining Image Editing with Layered Diffusion Brushes | Peyman Gholami et.al. | 2405.00313 | translate | read | null |
| 2024-05-01 | Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation | Zhenglin Li et.al. | 2404.19265 | translate | read | null |
| 2024-05-01 | FOTS: A Fast Optical Tactile Simulator for Sim2Real Learning of Tactile-motor Robot Manipulation Skills | Yongqiang Zhao et.al. | 2404.19217 | translate | read | null |
(<a href=../Image_Generation.md>back to Image Generation</a>)