Image Generation - 2024-10
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-10-31 | Generative modelling for mass-mapping with fast uncertainty quantification | Jessica J. Whitney et.al. | 2410.24197 | translate | read | null |
| 2024-10-31 | A Practical Style Transfer Pipeline for 3D Animation: Insights from Production R&D | Hideki Todo et.al. | 2410.24123 | translate | read | null |
| 2024-10-31 | DiffPAD: Denoising Diffusion-based Adversarial Patch Decontamination | Jia Fu et.al. | 2410.24006 | translate | read | null |
| 2024-10-31 | Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation | Yihang Zhou et.al. | 2410.23962 | translate | read | null |
| 2024-10-31 | EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching | Xinwang Chen et.al. | 2410.23788 | translate | read | link |
| 2024-10-31 | SceneComplete: Open-World 3D Scene Completion in Complex Real World Environments for Robot Manipulation | Aditya Agarwal et.al. | 2410.23643 | translate | read | null |
| 2024-10-31 | Language-guided Hierarchical Fine-grained Image Forgery Detection and Localization | Xiao Guo et.al. | 2410.23556 | translate | read | null |
| 2024-10-30 | MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts | Jie Zhu et.al. | 2410.23332 | translate | read | null |
| 2024-10-30 | RelationBooth: Towards Relation-Aware Customized Object Generation | Qingyu Shi et.al. | 2410.23280 | translate | read | null |
| 2024-10-30 | Multi-student Diffusion Distillation for Better One-step Generators | Yanke Song et.al. | 2410.23274 | translate | read | null |
| 2024-10-30 | Controllable Game Level Generation: Assessing the Effect of Negative Examples in GAN Models | Mahsa Bazzaz et.al. | 2410.23108 | translate | read | null |
| 2024-10-30 | Private Synthetic Text Generation with Diffusion Models | Sebastian Ochs et.al. | 2410.22971 | translate | read | null |
| 2024-10-30 | An Individual Identity-Driven Framework for Animal Re-Identification | Yihao Wu et.al. | 2410.22927 | translate | read | link |
| 2024-10-30 | Latent Diffusion, Implicit Amplification: Efficient Continuous-Scale Super-Resolution for Remote Sensing Images | Hanlin Wu et.al. | 2410.22830 | translate | read | null |
| 2024-10-30 | Diffusion Beats Autoregressive: An Evaluation of Compositional Generation in Text-to-Image Models | Arash Marioriyad et.al. | 2410.22775 | translate | read | null |
| 2024-10-30 | st-DTPM: Spatial-Temporal Guided Diffusion Transformer Probabilistic Model for Delayed Scan PET Image Prediction | Ran Hong et.al. | 2410.22732 | translate | read | null |
| 2024-10-30 | Identifying Drift, Diffusion, and Causal Structure from Temporal Snapshots | Vincent Guan et.al. | 2410.22729 | translate | read | null |
| 2024-10-30 | FlowDCN: Exploring DCN-like Architectures for Fast Image Generation with Arbitrary Resolution | Shuai Wang et.al. | 2410.22655 | translate | read | null |
| 2024-10-29 | Multimodal Semantic Communication for Generative Audio-Driven Video Conferencing | Haonan Tong et.al. | 2410.22112 | translate | read | null |
| 2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et.al. | 2410.21966 | translate | read | null |
| 2024-10-29 | Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images | Suhyun Ahn et.al. | 2410.21826 | translate | read | link |
| 2024-10-29 | HairDiffusion: Vivid Multi-Colored Hair Editing via Latent Diffusion | Yu Zeng et.al. | 2410.21789 | translate | read | null |
| 2024-10-29 | Exploring Local Memorization in Diffusion Models via Bright Ending Attention | Chen Chen et.al. | 2410.21665 | translate | read | null |
| 2024-10-29 | Fingerprints of Super Resolution Networks | Jeremy Vonderfecht et.al. | 2410.21653 | translate | read | null |
| 2024-10-29 | Adapting Diffusion Models for Improved Prompt Compliance and Controllable Image Synthesis | Deepak Sridhar et.al. | 2410.21638 | translate | read | null |
| 2024-10-28 | CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation | Claudius Krause et.al. | 2410.21611 | translate | read | null |
| 2024-10-30 | A Novel Score-CAM based Denoiser for Spectrographic Signature Extraction without Ground Truth | Noel Elias et.al. | 2410.21557 | translate | read | null |
| 2024-10-28 | Denoising Diffusion Planner: Learning Complex Paths from Low-Quality Demonstrations | Michiel Nikken et.al. | 2410.21497 | translate | read | null |
| 2024-10-28 | ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization | Christian J. Steinmetz et.al. | 2410.21233 | translate | read | null |
| 2024-10-28 | SeriesGAN: Time Series Generation via Adversarial and Autoregressive Learning | MohammadReza EskandariNasab et.al. | 2410.21203 | translate | read | link |
| 2024-10-28 | Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences | Zhihao Zhao et.al. | 2410.21130 | translate | read | null |
| 2024-10-28 | Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models | Wenda Li et.al. | 2410.21088 | translate | read | link |
| 2024-10-28 | Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Vladimir Arkhipkin et.al. | 2410.21061 | translate | read | null |
| 2024-10-28 | Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models | Piotr Przybyła et.al. | 2410.20940 | translate | read | null |
| 2024-10-28 | Markov spin models for image generation: explicit large deviations with respect to the number of pixels | Cecile Monthus et.al. | 2410.20906 | translate | read | null |
| 2024-10-28 | Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models | Weijian Luo et.al. | 2410.20898 | translate | read | null |
| 2024-10-28 | zGAN: An Outlier-focused Generative Adversarial Network For Realistic Synthetic Data Generation | Azizjon Azimi et.al. | 2410.20808 | translate | read | null |
| 2024-10-28 | Murine AI excels at cats and cheese: Structural differences between human and mouse neurons and their implementation in generative AIs | Rino Saiga et.al. | 2410.20735 | translate | read | null |
| 2024-10-25 | Microplastic Identification Using AI-Driven Image Segmentation and GAN-Generated Ecological Context | Alex Dils et.al. | 2410.19604 | translate | read | null |
| 2024-10-25 | Generative Diffusion Models for Sequential Recommendations | Sharare Zolghadr et.al. | 2410.19429 | translate | read | null |
| 2024-10-25 | Unified Cross-Modal Image Synthesis with Hierarchical Mixture of Product-of-Experts | Reuben Dorent et.al. | 2410.19378 | translate | read | null |
| 2024-10-25 | High Resolution Seismic Waveform Generation using Denoising Diffusion | Andreas Bergmeister et.al. | 2410.19343 | translate | read | null |
| 2024-10-25 | Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion | Emiel Hoogeboom et.al. | 2410.19324 | translate | read | null |
| 2024-10-24 | Generation of synthetic financial time series by diffusion models | Tomonori Takahashi et.al. | 2410.18897 | translate | read | null |
| 2024-10-24 | Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences | Weijian Luo et.al. | 2410.18881 | translate | read | null |
| 2024-10-24 | Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation | Xiaoyu Zhang et.al. | 2410.18830 | translate | read | null |
| 2024-10-24 | Towards Visual Text Design Transfer Across Languages | Yejin Choi et.al. | 2410.18823 | translate | read | null |
| 2024-10-24 | Ali-AUG: Innovative Approaches to Labeled Data Augmentation using One-Step Diffusion Model | Ali Hamza et.al. | 2410.18678 | translate | read | null |
| 2024-10-24 | FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation | Christopher T. H Teo et.al. | 2410.18615 | translate | read | null |
| 2024-10-24 | FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling | Zhengqiang Zhang et.al. | 2410.18410 | translate | read | link |
| 2024-10-23 | Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing | Dongliang Guo et.al. | 2410.18267 | translate | read | null |
| 2024-10-23 | FreeVS: Generative View Synthesis on Free Driving Trajectory | Qitai Wang et.al. | 2410.18079 | translate | read | null |
| 2024-10-23 | Scalable Ranked Preference Optimization for Text-to-Image Generation | Shyamgopal Karthik et.al. | 2410.18013 | translate | read | null |
| 2024-10-23 | A Wavelet Diffusion GAN for Image Super-Resolution | Lorenzo Aloisi et.al. | 2410.17966 | translate | read | null |
| 2024-10-23 | Medical Imaging Complexity and its Effects on GAN Performance | William Cagas et.al. | 2410.17959 | translate | read | null |
| 2024-10-23 | Variational MineGAN: A Data-efficient Knowledge Transfer Architecture for Generative AI-assisted Design of Nanophotonic Structures | Shahriar Tarvir Nushin et.al. | 2410.17889 | translate | read | null |
| 2024-10-23 | TAGE: Trustworthy Attribute Group Editing for Stable Few-shot Image Generation | Ruicheng Zhang et.al. | 2410.17855 | translate | read | null |
| 2024-10-23 | Longitudinal Causal Image Synthesis | Yujia Li et.al. | 2410.17691 | translate | read | null |
| 2024-10-23 | Deep Generative Models for 3D Medical Image Synthesis | Paul Friedrich et.al. | 2410.17664 | translate | read | null |
| 2024-10-23 | Testing Deep Learning Recommender Systems Models on Synthetic GAN-Generated Datasets | Jesús Bobadilla et.al. | 2410.17651 | translate | read | null |
| 2024-10-22 | Offline Evaluation of Set-Based Text-to-Image Generation | Negar Arabzadeh et.al. | 2410.17331 | translate | read | null |
| 2024-10-22 | Altogether: Image Captioning via Re-aligning Alt-text | Hu Xu et.al. | 2410.17251 | translate | read | null |
| 2024-10-22 | PGCS: Physical Law embedded Generative Cloud Synthesis in Remote Sensing Images | Liying Xu et.al. | 2410.16955 | translate | read | null |
| 2024-10-22 | IdenBAT: Disentangled Representation Learning for Identity-Preserved Brain Age Transformation | Junyeong Maeng et.al. | 2410.16945 | translate | read | link |
| 2024-10-22 | DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization | Haowei Zhu et.al. | 2410.16942 | translate | read | null |
| 2024-10-22 | Hierarchical Clustering for Conditional Diffusion in Image Generation | Jorge da Silva Goncalves et.al. | 2410.16910 | translate | read | link |
| 2024-10-22 | CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare | Nicholas I-Hsien Kuo et.al. | 2410.16872 | translate | read | null |
| 2024-10-22 | MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Meng Xu et.al. | 2410.16840 | translate | read | null |
| 2024-10-22 | Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection | Laurent Colbois et.al. | 2410.16802 | translate | read | link |
| 2024-10-22 | Progressive Compositionality In Text-to-Image Generative Models | Xu Han et.al. | 2410.16719 | translate | read | null |
| 2024-10-22 | Privacy-hardened and hallucination-resistant synthetic data generation with logic-solvers | Mark A. Burgess et.al. | 2410.16705 | translate | read | null |
| 2024-10-21 | MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors | Honghua Chen et.al. | 2410.16272 | translate | read | null |
| 2024-10-21 | Elucidating the design space of language models for image generation | Xuantong Liu et.al. | 2410.16257 | translate | read | null |
| 2024-10-21 | A Framework for Evaluating Predictive Models Using Synthetic Image Covariates and Longitudinal Data | Simon Deltadahl et.al. | 2410.16177 | translate | read | null |
| 2024-10-21 | Continuous Speech Synthesis using per-token Latent Diffusion | Arnon Turetzky et.al. | 2410.16048 | translate | read | null |
| 2024-10-20 | MedDiff-FM: A Diffusion-based Foundation Model for Versatile Medical Image Applications | Yongrui Yu et.al. | 2410.15432 | translate | read | null |
| 2024-10-20 | Synthetic Data Generation for Residential Load Patterns via Recurrent GAN and Ensemble Method | Xinyu Liang et.al. | 2410.15379 | translate | read | null |
| 2024-10-19 | Group Diffusion Transformers are Unsupervised Multitask Learners | Lianghua Huang et.al. | 2410.15027 | translate | read | null |
| 2024-10-19 | DiffuseST: Unleashing the Capability of the Diffusion Model for Style Transfer | Ying Hu et.al. | 2410.15007 | translate | read | null |
| 2024-10-19 | SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning | Zhewei Dai et.al. | 2410.14987 | translate | read | null |
| 2024-10-19 | Non-Invasive to Invasive: Enhancing FFA Synthesis from CFP with a Benchmark Dataset and a Novel Network | Hongqiu Wang et.al. | 2410.14965 | translate | read | null |
| 2024-10-18 | BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities | Shaozhe Hao et.al. | 2410.14672 | translate | read | link |
| 2024-10-18 | FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models | Rui Hu et.al. | 2410.14429 | translate | read | null |
| 2024-10-18 | HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation | Bo Cheng et.al. | 2410.14324 | translate | read | link |
| 2024-10-18 | HYPNOS: Highly Precise Foreground-focused Diffusion Finetuning for Inanimate Objects | Oliverio Theophilus Nathanael et.al. | 2410.14265 | translate | read | null |
| 2024-10-18 | Text-to-Image Representativity Fairness Evaluation Framework | Asma Yamani et.al. | 2410.14201 | translate | read | null |
| 2024-10-18 | Personalized Image Generation with Large Multimodal Models | Yiyan Xu et.al. | 2410.14170 | translate | read | null |
| 2024-10-18 | Assessing Open-world Forgetting in Generative Image Model Customization | Héctor Laria et.al. | 2410.14159 | translate | read | null |
| 2024-10-17 | Inference of morphology and dynamical state of nearby $Planck$-SZ galaxy clusters with Zernike polynomials | Valentina Capalbo et.al. | 2410.13929 | translate | read | null |
| 2024-10-17 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | Lijie Fan et.al. | 2410.13863 | translate | read | null |
| 2024-10-17 | PUMA: Empowering Unified MLLM with Multi-granular Visual Generation | Rongyao Fang et.al. | 2410.13861 | translate | read | link |
| 2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et.al. | 2410.13855 | translate | read | link |
| 2024-10-17 | Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning | Xiaodan Xing et.al. | 2410.13823 | translate | read | link |
| 2024-10-18 | Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Yijun Liang et.al. | 2410.13674 | translate | read | link |
| 2024-10-17 | An Active Learning Framework for Inclusive Generation by Large Language Models | Sabit Hassan et.al. | 2410.13641 | translate | read | null |
| 2024-10-17 | LoLDU: Low-Rank Adaptation via Lower-Diag-Upper Decomposition for Parameter-Efficient Fine-Tuning | Yiming Shi et.al. | 2410.13618 | translate | read | link |
| 2024-10-17 | GAN-Based Speech Enhancement for Low SNR Using Latent Feature Conditioning | Shrishti Saha Shetu et.al. | 2410.13599 | translate | read | null |
| 2024-10-17 | AI-based 3-Lead to 12-Lead ECG Reconstruction: Towards Smartphone-based Public Healthcare | Aditya Mallick et.al. | 2410.13528 | translate | read | null |
| 2024-10-17 | MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models | Donghao Zhou et.al. | 2410.13370 | translate | read | null |
| 2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | translate | read | link |
| 2024-10-16 | 3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation | Dewei Zhou et.al. | 2410.12669 | translate | read | null |
| 2024-10-16 | Evaluating Utility of Memory Efficient Medical Image Generation: A Study on Lung Nodule Segmentation | Kathrin Khadra et.al. | 2410.12542 | translate | read | null |
| 2024-10-16 | Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | Yongxin Zhu et.al. | 2410.12490 | translate | read | link |
| 2024-10-16 | Synthetic Augmentation for Anatomical Landmark Localization using DDPMs | Arnela Hadzic et.al. | 2410.12489 | translate | read | null |
| 2024-10-16 | Imagine2Servo: Intelligent Visual Servoing with Diffusion-Driven Goal Generation for Robotic Tasks | Pranjali Pathre et.al. | 2410.12432 | translate | read | null |
| 2024-10-16 | GAN Based Top-Down View Synthesis in Reinforcement Learning Environments | Usama Younus et.al. | 2410.12372 | translate | read | null |
| 2024-10-16 | FaceChain-FACT: Face Adapter with Decoupled Training for Identity-preserved Personalization | Cheng Yu et.al. | 2410.12312 | translate | read | null |
| 2024-10-16 | NSSI-Net: Multi-Concept Generative Adversarial Network for Non-Suicidal Self-Injury Detection Using High-Dimensional EEG Signals in a Semi-Supervised Learning Framework | Zhen Liang et.al. | 2410.12159 | translate | read | null |
| 2024-10-16 | Facing Identity: The Formation and Performance of Identity via Face-Based Artificial Intelligence Technologies | Wells Lucas Santo et.al. | 2410.12148 | translate | read | null |
| 2024-10-15 | On the Effectiveness of Dataset Alignment for Fake Image Detection | Anirudh Sundara Rajan et.al. | 2410.11835 | translate | read | null |
| 2024-10-15 | KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities | Hsin-Ping Huang et.al. | 2410.11824 | translate | read | null |
| 2024-10-15 | Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices | Zhiyuan Ma et.al. | 2410.11795 | translate | read | null |
| 2024-10-15 | Generative Image Steganography Based on Point Cloud | Zhong Yangjie et.al. | 2410.11673 | translate | read | null |
| 2024-10-15 | InvSeg: Test-Time Prompt Inversion for Semantic Segmentation | Jiayi Lin et.al. | 2410.11473 | translate | read | null |
| 2024-10-15 | A Simple Approach to Unifying Diffusion-based Conditional Generation | Xirui Li et.al. | 2410.11439 | translate | read | null |
| 2024-10-15 | Evolutionary Retrofitting | Mathurin Videau et.al. | 2410.11330 | translate | read | null |
| 2024-10-15 | Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling | Guiyu Zhang et.al. | 2410.11236 | translate | read | null |
| 2024-10-14 | When Does Perceptual Alignment Benefit Vision Representations? | Shobhita Sundaram et.al. | 2410.10817 | translate | read | null |
| 2024-10-14 | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Haotian Tang et.al. | 2410.10812 | translate | read | link |
| 2024-10-14 | MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling | Jian Yang et.al. | 2410.10798 | translate | read | null |
| 2024-10-14 | Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations | Litu Rout et.al. | 2410.10792 | translate | read | null |
| 2024-10-14 | Evaluating SQL Understanding in Large Language Models | Ananya Rahaman et.al. | 2410.10680 | translate | read | null |
| 2024-10-14 | SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers | Enze Xie et.al. | 2410.10629 | translate | read | null |
| 2024-10-14 | ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2410.10554 | translate | read | link |
| 2024-10-14 | Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling | Wenze Liu et.al. | 2410.10511 | translate | read | link |
| 2024-10-14 | Vision-guided and Mask-enhanced Adaptive Denoising for Prompt-based Image Editing | Kejie Wang et.al. | 2410.10496 | translate | read | null |
| 2024-10-14 | 4DStyleGaussian: Zero-shot 4D Style Transfer with Gaussian Splatting | Wanlin Liang et.al. | 2410.10412 | translate | read | null |
| 2024-10-11 | SceneCraft: Layout-Guided 3D Scene Generation | Xiuyu Yang et.al. | 2410.09049 | translate | read | link |
| 2024-10-11 | MiRAGeNews: Multimodal Realistic AI-Generated News Detection | Runsheng Huang et.al. | 2410.09045 | translate | read | link |
| 2024-10-11 | One-shot Generative Domain Adaptation in 3D GANs | Ziqiang Li et.al. | 2410.08824 | translate | read | link |
| 2024-10-11 | Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting | Purushothaman Natarajan et.al. | 2410.08612 | translate | read | link |
| 2024-10-11 | Text-To-Image with Generative Adversarial Networks | Mehrshad Momen-Tayefeh et.al. | 2410.08608 | translate | read | null |
| 2024-10-11 | Context-Aware Full Body Anonymization using Text-to-Image Diffusion Models | Pascal Zwick et.al. | 2410.08551 | translate | read | null |
| 2024-10-11 | Score Neural Operator: A Generative Model for Learning and Generalizing Across Multiple Probability Distributions | Xinyu Liao et.al. | 2410.08549 | translate | read | null |
| 2024-10-11 | Diffusion Models Need Visual Priors for Image Generation | Xiaoyu Yue et.al. | 2410.08531 | translate | read | null |
| 2024-10-10 | Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis | Jinbin Bai et.al. | 2410.08261 | translate | read | link |
| 2024-10-10 | DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models | Xiaoxiao He et.al. | 2410.08207 | translate | read | null |
| 2024-10-10 | Scaling Laws For Diffusion Transformers | Zhengyang Liang et.al. | 2410.08184 | translate | read | null |
| 2024-10-10 | DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation | Jiatao Gu et.al. | 2410.08159 | translate | read | null |
| 2024-10-10 | RayEmb: Arbitrary Landmark Detection in X-Ray Images Using Ray Embedding Subspace | Pragyan Shrestha et.al. | 2410.08152 | translate | read | link |
| 2024-10-10 | Generated Bias: Auditing Internal Bias Dynamics of Text-To-Image Generative Models | Abhishek Mandal et.al. | 2410.07884 | translate | read | null |
| 2024-10-10 | MinorityPrompt: Text to Minority Image Generation via Prompt Optimization | Soobin Um et.al. | 2410.07838 | translate | read | link |
| 2024-10-10 | MGMD-GAN: Generalization Improvement of Generative Adversarial Networks with Multiple Generator Multiple Discriminator Framework Against Membership Inference Attacks | Nirob Arefin et.al. | 2410.07803 | translate | read | null |
| 2024-10-10 | Synthesizing Multi-Class Surgical Datasets with Anatomy-Aware Diffusion Models | Danush Kumar Venkatesh et.al. | 2410.07753 | translate | read | link |
| 2024-10-10 | Relational Diffusion Distillation for Efficient Image Generation | Weilun Feng et.al. | 2410.07679 | translate | read | link |
| 2024-10-10 | FLIER: Few-shot Language Image Models Embedded with Latent Representations | Zhinuo Zhou et.al. | 2410.07648 | translate | read | null |
| 2024-10-09 | IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation | Xinchen Zhang et.al. | 2410.07171 | translate | read | link |
| 2024-10-09 | EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models | Rui Zhao et.al. | 2410.07133 | translate | read | link |
| 2024-10-09 | Personalized Visual Instruction Tuning | Renjie Pi et.al. | 2410.07113 | translate | read | link |
| 2024-10-09 | Boosting Few-Shot Detection with Large Language Models and Layout-to-Image Synthesis | Ahmed Abdullah et.al. | 2410.06841 | translate | read | null |
| 2024-10-09 | Decouple-Then-Merge: Towards Better Training for Diffusion Models | Qianli Ma et.al. | 2410.06664 | translate | read | link |
| 2024-10-09 | On the Solution of Linearized Inverse Scattering Problems in Near-Field Microwave Imaging by Operator Inversion and Matched Filtering | Matthias M. Saurer et.al. | 2410.06465 | translate | read | null |
| 2024-10-08 | Story-Adapter: A Training-free Iterative Framework for Long Story Visualization | Jiawei Mao et.al. | 2410.06244 | translate | read | link |
| 2024-10-08 | SD-$\pi$XL: Generating Low-Resolution Quantized Imagery via Score Distillation | Alexandre Binninger et.al. | 2410.06236 | translate | read | null |
| 2024-10-08 | Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach | Sha Guo et.al. | 2410.06149 | translate | read | null |
| 2024-10-08 | Estimating the Number of HTTP/3 Responses in QUIC Using Deep Learning | Barak Gahtan et.al. | 2410.06140 | translate | read | null |
| 2024-10-07 | Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer | Siyuan Hou et.al. | 2410.05151 | translate | read | null |
| 2024-10-07 | Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning | Ayano Hiranaka et.al. | 2410.05116 | translate | read | null |
| 2024-10-07 | Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization | Rohan Reddy Mekala et.al. | 2410.05114 | translate | read | null |
| 2024-10-07 | Bi-Directional MS Lesion Filling and Synthesis Using Denoising Diffusion Implicit Model-based Lesion Repainting | Jinwei Zhang et.al. | 2410.05027 | translate | read | null |
| 2024-10-07 | OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction | Leheng Li et.al. | 2410.04932 | translate | read | null |
| 2024-10-07 | PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing | Feng Tian et.al. | 2410.04844 | translate | read | null |
| 2024-10-07 | Transforming Color: A Novel Image Colorization Method | Hamza Shafiq et.al. | 2410.04799 | translate | read | null |
| 2024-10-07 | Double Oracle Neural Architecture Search for Game Theoretic Deep Learning Models | Aye Phyu Phyu Aung et.al. | 2410.04764 | translate | read | null |
| 2024-10-07 | Stochastic Runge-Kutta Methods: Provable Acceleration of Diffusion Models | Yuchen Wu et.al. | 2410.04760 | translate | read | null |
| 2024-10-06 | Video Summarization Techniques: A Comprehensive Review | Toqa Alaa et.al. | 2410.04449 | translate | read | null |
| 2024-10-04 | Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features | Benyuan Meng et.al. | 2410.03558 | translate | read | link |
| 2024-10-04 | Dynamic Diffusion Transformer | Wangbo Zhao et.al. | 2410.03456 | translate | read | link |
| 2024-10-04 | Images Speak Volumes: User-Centric Assessment of Image Generation for Accessible Communication | Miriam Anschütz et.al. | 2410.03430 | translate | read | null |
| 2024-10-04 | LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding | Doohyuk Jang et.al. | 2410.03355 | translate | read | null |
| 2024-10-04 | Learning test generators for cyber-physical systems | Jarkko Peltomäki et.al. | 2410.03202 | translate | read | null |
| 2024-10-04 | MultiVerse: Efficient and Expressive Zero-Shot Multi-Task Text-to-Speech | Taejun Bak et.al. | 2410.03192 | translate | read | null |
| 2024-10-04 | Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization | Zichen Miao et.al. | 2410.03190 | translate | read | null |
| 2024-10-04 | Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach | Yaofang Liu et.al. | 2410.03160 | translate | read | link |
| 2024-10-03 | Revealing the Unseen: Guiding Personalized Diffusion Models to Expose Training Data | Xiaoyu Wu et.al. | 2410.03039 | translate | read | null |
| 2024-10-03 | PixelShuffler: A Simple Image Translation Through Pixel Rearrangement | Omar Zamzam et.al. | 2410.03021 | translate | read | null |
| 2024-10-03 | SteerDiff: Steering towards Safe Text-to-Image Diffusion Models | Hongxiang Zhang et.al. | 2410.02710 | translate | read | null |
| 2024-10-03 | ControlAR: Controllable Image Generation with Autoregressive Models | Zongming Li et.al. | 2410.02705 | translate | read | link |
| 2024-10-03 | Grounded Answers for Multi-agent Decision-making Problem through Generative World Model | Zeyang Liu et.al. | 2410.02664 | translate | read | null |
| 2024-10-03 | Event-Customized Image Generation | Zhen Wang et.al. | 2410.02483 | translate | read | null |
| 2024-10-03 | Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation | Muzhi Zhu et.al. | 2410.02369 | translate | read | link |
| 2024-10-03 | SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Jintao Zhang et.al. | 2410.02367 | translate | read | link |
| 2024-10-03 | Plug-and-Play Controllable Generation for Discrete Masked Models | Wei Guo et.al. | 2410.02143 | translate | read | null |
| 2024-10-02 | EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing | Haotian Sun et.al. | 2410.02098 | translate | read | null |
| 2024-10-02 | DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation | Jing He et.al. | 2410.02067 | translate | read | null |
| 2024-10-02 | Normalizing Flow Based Metric for Image Generation | Pranav Jeevan et.al. | 2410.02004 | translate | read | link |
| 2024-10-02 | Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Yangming Li et.al. | 2410.01796 | translate | read | null |
| 2024-10-02 | ImageFolder: Autoregressive Image Generation with Folded Tokens | Xiang Li et.al. | 2410.01756 | translate | read | link |
| 2024-10-02 | ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation | Rinon Gal et.al. | 2410.01731 | translate | read | null |
| 2024-10-02 | Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding | Yao Teng et.al. | 2410.01699 | translate | read | link |
| 2024-10-02 | Data Extrapolation for Text-to-image Generation on Small Datasets | Senmao Ye et.al. | 2410.01638 | translate | read | link |
| 2024-10-02 | KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models | Pouyan Navard et.al. | 2410.01595 | translate | read | link |
| 2024-10-02 | Edge-preserving noise for diffusion models | Jente Vandersanden et.al. | 2410.01540 | translate | read | null |
| 2024-10-02 | Harnessing the Latent Diffusion Model for Training-Free Image Style Transfer | Kento Masui et.al. | 2410.01366 | translate | read | null |
| 2024-10-02 | Aggregation of Multi Diffusion Models for Enhancing Learned Representations | Conghan Yue et.al. | 2410.01262 | translate | read | link |
| 2024-10-02 | The SynCOM Flow Tracking Challenge | Valmir Moraes Filho et.al. | 2410.01233 | translate | read | null |
| 2024-10-01 | Enhancing GANs with Contrastive Learning-Based Multistage Progressive Finetuning SNN and RL-Based External Optimization | Osama Mustafa et.al. | 2409.20340 | translate | read | null |
(<a href=../Image_Generation.md>back to Image Generation</a>)