Image Generation - 2025-03

Publish Date Title Authors PDF Translate Read Code
2025-03-31 RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy Zhonghan Zhao et.al. 2503.24388 translate read null
2025-03-31 Consistent Subject Generation via Contrastive Instantiated Concepts Lee Hsin-Ying et.al. 2503.24387 translate read null
2025-03-31 ERUPT: Efficient Rendering with Unposed Patch Transformer Maxim V. Shugaev et.al. 2503.24374 translate read null
2025-03-31 Style Quantization for Data-Efficient GAN Training Jian Wang et.al. 2503.24282 translate read null
2025-03-31 FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics Yixuan Li et.al. 2503.24267 translate read null
2025-03-31 Beyond a Single Mode: GAN Ensembles for Diverse Medical Data Generation Lorenzo Tronchin et.al. 2503.24258 translate read link
2025-03-31 Threats and Opportunities in AI-generated Images for Armed Forces Raphael Meier et.al. 2503.24095 translate read null
2025-03-31 AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents Jiaxiang Chen et.al. 2503.23948 translate read null
2025-03-31 Semantic Packet Aggregation and Repeated Transmission for Text-to-Image Generation Seunghun Lee et.al. 2503.23734 translate read null
2025-03-28 Evaluation of Machine-generated Biomedical Images via A Tally-based Similarity Measure Frank J. Brooks et.al. 2503.22658 translate read null
2025-03-28 RELD: Regularization by Latent Diffusion Models for Image Restoration Pasquale Cascarano et.al. 2503.22563 translate read null
2025-03-28 Deterministic Medical Image Translation via High-fidelity Brownian Bridges Qisheng He et.al. 2503.22531 translate read null
2025-03-28 Meta-LoRA: Meta-Learning LoRA Components for Domain-Aware ID Personalization Barış Batuhan Topal et.al. 2503.22352 translate read null
2025-03-28 Semantix: An Energy Guided Sampler for Semantic Style Transfer Huiang He et.al. 2503.22344 translate read null
2025-03-28 ABC-GS: Alignment-Based Controllable Style Transfer for 3D Gaussian Splatting Wenjie Liu et.al. 2503.22218 translate read null
2025-03-28 Intrinsic Image Decomposition for Robust Self-supervised Monocular Depth Estimation on Reflective Surfaces Wonhyeok Choi et.al. 2503.22209 translate read null
2025-03-28 ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation Yunhong Min et.al. 2503.22194 translate read null
2025-03-28 Sell It Before You Make It: Revolutionizing E-Commerce with Personalized AI-Generated Items Jianghao Lin et.al. 2503.22182 translate read null
2025-03-28 An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval Min Cao et.al. 2503.22171 translate read null
2025-03-27 Optimal Stepsize for Diffusion Sampling Jianning Pei et.al. 2503.21774 translate read link
2025-03-27 Lumina-Image 2.0: A Unified and Efficient Image Generative Framework Qi Qin et.al. 2503.21758 translate read link
2025-03-27 A Unified Framework for Diffusion Bridge Problems: Flow Matching and Schrödinger Matching into One Minyoung Kim et.al. 2503.21756 translate read null
2025-03-27 LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis Shitian Zhao et.al. 2503.21749 translate read link
2025-03-27 CTRL-O: Language-Controllable Object-Centric Visual Representation Learning Aniket Didolkar et.al. 2503.21747 translate read null
2025-03-27 3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models Yuhan Zhang et.al. 2503.21745 translate read null
2025-03-27 Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance Jaywon Koo et.al. 2503.21721 translate read null
2025-03-27 Zero-Shot Visual Concept Blending Without Text Guidance Hiroya Makino et.al. 2503.21277 translate read null
2025-03-27 UGen: Unified Autoregressive Multimodal Model with Progressive Vocabulary Learning Hongxuan Tang et.al. 2503.21193 translate read null
2025-03-27 Model as a Game: On Numerical and Spatial Consistency for Generative Games Jingye Chen et.al. 2503.21172 translate read null
2025-03-26 High Quality Diffusion Distillation on a Single GPU with Relative and Absolute Position Matching Guoqiang Zhang et.al. 2503.20744 translate read null
2025-03-26 RecTable: Fast Modeling Tabular Data with Rectified Flow Masane Fuchi et.al. 2503.20731 translate read link
2025-03-26 BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Yuyang Peng et.al. 2503.20672 translate read link
2025-03-26 MMGen: Unified Multi-modal Image Generation and Understanding in One Go Jiepeng Wang et.al. 2503.20644 translate read null
2025-03-26 Pluggable Style Representation Learning for Multi-Style Transfer Hongda Liu et.al. 2503.20368 translate read link
2025-03-26 Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models Alex Jinpeng Wang et.al. 2503.20198 translate read null
2025-03-26 AvatarArtist: Open-Domain 4D Avatarization Hongyu Liu et.al. 2503.19906 translate read link
2025-03-25 Scaling Down Text Encoders of Text-to-Image Diffusion Models Lifu Wang et.al. 2503.19897 translate read null
2025-03-26 In the Blink of an Eye: Instant Game Map Editing using a Generative-AI Smart Brush Vitaly Gnatyuk et.al. 2503.19793 translate read null
2025-03-25 SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation Jingdan Kang et.al. 2503.19791 translate read null
2025-03-25 Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models Kartik Thakral et.al. 2503.19783 translate read null
2025-03-25 PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models Junhyuk So et.al. 2503.19731 translate read null
2025-03-25 VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models Suhas G Hegde et.al. 2503.19530 translate read null
2025-03-25 Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage Zhengwentai Sun et.al. 2503.19486 translate read null
2025-03-25 Interpretable Generative Models through Post-hoc Concept Bottlenecks Akshay Kulkarni et.al. 2503.19377 translate read link
2025-03-25 Efficient Adversarial Detection Frameworks for Vehicle-to-Microgrid Services in Edge Computing Ahmed Omara et.al. 2503.19318 translate read null
2025-03-24 Equivariant Image Modeling Ruixiao Dong et.al. 2503.18948 translate read link
2025-03-24 Training-free Diffusion Acceleration with Bottleneck Sampling Ye Tian et.al. 2503.18940 translate read link
2025-03-24 SKDU at De-Factify 4.0: Vision Transformer with Data Augmentation for AI-Generated Image Detection Shrikant Malviya et.al. 2503.18812 translate read null
2025-03-24 Self-Supervised Learning based on Transformed Image Reconstruction for Equivariance-Coherent Feature Representation Qin Wang et.al. 2503.18753 translate read null
2025-03-24 Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings Cong Liu et.al. 2503.18719 translate read null
2025-03-24 Adventurer: Exploration with BiGAN for Deep Reinforcement Learning Yongshuai Liu et.al. 2503.18612 translate read null
2025-03-24 Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning Hadi Mohammadi et.al. 2503.18569 translate read null
2025-03-24 Advancing Cross-Organ Domain Generalization with Test-Time Style Transfer and Diversity Enhancement Biwen Meng et.al. 2503.18567 translate read null
2025-03-24 Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models Bin Li et.al. 2503.18556 translate read link
2025-03-24 PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models Tadeusz Dziarmaga et.al. 2503.18462 translate read null
2025-03-21 Vision Transformer Based Semantic Communications for Next Generation Wireless Networks Muhammad Ahmed Mohsin et.al. 2503.17275 translate read null
2025-03-21 Leveraging Text-to-Image Generation for Handling Spurious Correlation Aryan Yazdan Parast et.al. 2503.17226 translate read null
2025-03-21 Generative adversarial framework to calibrate excursion set models for the 3D morphology of all-solid-state battery cathodes Orkun Furat et.al. 2503.17171 translate read null
2025-03-21 D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens Panpan Wang et.al. 2503.17155 translate read null
2025-03-21 HiFi-Stream: Streaming Speech Enhancement with Generative Adversarial Networks Ekaterina Dmitrieva et.al. 2503.17141 translate read null
2025-03-21 Halton Scheduler For Masked Generative Image Transformer Victor Besnier et.al. 2503.17076 translate read link
2025-03-21 Zero-Shot Styled Text Image Generation, but Make It Autoregressive Vittorio Pippi et.al. 2503.17074 translate read link
2025-03-21 DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech Yongkang Cheng et.al. 2503.17059 translate read null
2025-03-21 Multiple Ultrasound Image Generation based on Tuned Alignment of Amplitude Hologram over Spatially non-Uniform Ultrasound Source Keisuke Hasegawa et.al. 2503.16949 translate read null
2025-03-21 When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO Lingfan Zhang et.al. 2503.16921 translate read null
2025-03-20 Tokenize Image as a Set Zigang Geng et.al. 2503.16425 translate read link
2025-03-20 SynCity: Training-Free Generation of 3D Worlds Paul Engstler et.al. 2503.16420 translate read link
2025-03-20 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Liming Jiang et.al. 2503.16418 translate read link
2025-03-20 VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness SeungJu Cha et.al. 2503.16406 translate read null
2025-03-20 LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images Leyang Wang et.al. 2503.16376 translate read null
2025-03-20 Ultra-Resolution Adaptation with Ease Ruonan Yu et.al. 2503.16322 translate read link
2025-03-20 Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction Ziyao Guo et.al. 2503.16194 translate read link
2025-03-20 FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing Tianyi Wei et.al. 2503.16153 translate read null
2025-03-20 Multi-Prompt Style Interpolation for Fine-Grained Artistic Control Lei Chen et.al. 2503.16133 translate read null
2025-03-20 Controllable Segmentation-Based Text-Guided Style Editing Jingwen Li et.al. 2503.16129 translate read null
2025-03-19 FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers Ruichen Chen et.al. 2503.15465 translate read null
2025-03-19 Di$\mathtt{[M]}$O: Distilling Masked Diffusion Models into One-step Generator Yuanzhi Zhu et.al. 2503.15457 translate read null
2025-03-19 Visual Persona: Foundation Model for Full-Body Human Customization Jisu Nam et.al. 2503.15406 translate read null
2025-03-19 TruthLens: A Training-Free Paradigm for DeepFake Detection Ritabrata Chakraborty et.al. 2503.15342 translate read null
2025-03-19 TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models Teng-Fang Hsiao et.al. 2503.15283 translate read null
2025-03-19 LEGION: Learning to Ground and Explain for Synthetic Image Detection Hengrui Kang et.al. 2503.15264 translate read link
2025-03-19 Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization Feifei Li et.al. 2503.15197 translate read null
2025-03-19 Volumetric Reconstruction From Partial Views for Task-Oriented Grasping Fujian Yan et.al. 2503.15167 translate read null
2025-03-19 Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis Imanol G. Estepa et.al. 2503.15060 translate read null
2025-03-19 Texture-Aware StarGAN for CT data harmonisation Francesco Di Feola et.al. 2503.15058 translate read null
2025-03-18 Deeply Supervised Flow-Based Generative Models Inkyu Shin et.al. 2503.14494 translate read null
2025-03-18 DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Minglei Shi et.al. 2503.14487 translate read null
2025-03-18 ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing Yulin Pan et.al. 2503.14482 translate read null
2025-03-18 RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment Chao Wang et.al. 2503.14358 translate read null
2025-03-18 Free-Lunch Color-Texture Disentanglement for Stylized Image Generation Jiang Qin et.al. 2503.14275 translate read null
2025-03-18 DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection Jaewoo Song et.al. 2503.13985 translate read null
2025-03-18 SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model Xinqing Li et.al. 2503.13952 translate read null
2025-03-18 Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection Chunlei Li et.al. 2503.13828 translate read null
2025-03-18 VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences Anukriti Singh et.al. 2503.13817 translate read null
2025-03-17 Unified Autoregressive Visual Generation and Understanding with Continuous Tokens Lijie Fan et.al. 2503.13436 translate read null
2025-03-17 BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing Yaowei Li et.al. 2503.13434 translate read link
2025-03-17 RainScaleGAN: a Conditional Generative Adversarial Network for Rainfall Downscaling Marcello Iotti et.al. 2503.13316 translate read null
2025-03-17 MAME: Multidimensional Adaptive Metamer Exploration with Human Perceptual Feedback Mina Kamao et.al. 2503.13212 translate read null
2025-03-17 MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis Marvin Seyfarth et.al. 2503.13211 translate read null
2025-03-17 Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation Yihong Luo et.al. 2503.13070 translate read link
2025-03-17 FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks Tong Lei et.al. 2503.12936 translate read null
2025-03-17 DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models Dewei Zhou et.al. 2503.12885 translate read link
2025-03-17 Optimizing Ansatz Design in Quantum Generative Adversarial Networks Using Large Language Models Kento Ueda et.al. 2503.12884 translate read null
2025-03-17 High-Resolution Range-Doppler Imaging from One-Bit PMCW Radar via Generative Adversarial Networks Jingxian Wang et.al. 2503.12841 translate read null
2025-03-14 T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation Seyed Mohammad Hadi Hosseini et.al. 2503.11481 translate read link
2025-03-14 MTV-Inpaint: Multi-Task Long Video Inpainting Shiyuan Yang et.al. 2503.11412 translate read link
2025-03-14 Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking Ziyi Wang et.al. 2503.11324 translate read null
2025-03-14 Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards Zijing Hu et.al. 2503.11240 translate read link
2025-03-14 Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption Du Chen et.al. 2503.11221 translate read link
2025-03-14 Simulating Dual-Pixel Images From Ray Tracing For Depth Estimation Fengchen He et.al. 2503.11213 translate read null
2025-03-14 Provenance Detection for AI-Generated Images: Combining Perceptual Hashing, Homomorphic Encryption, and AI Detection Models Shree Singhi et.al. 2503.11195 translate read null
2025-03-14 Direction-Aware Diagonal Autoregressive Image Generation Yijia Xu et.al. 2503.11129 translate read null
2025-03-14 Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models Zhenguang Liu et.al. 2503.11071 translate read null
2025-03-14 Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization Kyle Sargent et.al. 2503.11056 translate read null
2025-03-13 GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Rongyao Fang et.al. 2503.10639 translate read link
2025-03-13 DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation Chen Chen et.al. 2503.10618 translate read null
2025-03-13 ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer Bolin Chen et.al. 2503.10614 translate read link
2025-03-13 Autoregressive Image Generation with Randomized Parallel Decoding Haopeng Li et.al. 2503.10568 translate read link
2025-03-13 RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation Yuwen Du et.al. 2503.10410 translate read link
2025-03-13 RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models Yijing Lin et.al. 2503.10406 translate read link
2025-03-13 ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation Zirun Guo et.al. 2503.10358 translate read null
2025-03-13 Do I look like a cat.n.01 to you? A Taxonomy Image Generation Benchmark Viktor Moskvoretskii et.al. 2503.10357 translate read null
2025-03-13 MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment Hao Zhou et.al. 2503.10287 translate read null
2025-03-13 PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models Runze He et.al. 2503.10127 translate read null
2025-03-12 FCaS: Fine-grained Cardiac Image Synthesis based on 3D Template Conditional Diffusion Model Jiahao Xia et.al. 2503.09560 translate read null
2025-03-12 DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction Junjie Zhou et.al. 2503.09491 translate read null
2025-03-12 PromptMap: An Alternative Interaction Style for AI-Based Image Generation Krzysztof Adamkiewicz et.al. 2503.09436 translate read null
2025-03-12 LHC Triggers using FPGA Image Recognition James Brooke et.al. 2503.09428 translate read null
2025-03-12 Revealing the Implicit Noise-based Imprint of Generative Models Xinghan Li et.al. 2503.09314 translate read null
2025-03-12 Revealing Unintentional Information Leakage in Low-Dimensional Facial Portrait Representations Kathleen Anderson et.al. 2503.09306 translate read null
2025-03-12 UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer Haoxuan Wang et.al. 2503.09277 translate read null
2025-03-12 NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers Yuhang Ma et.al. 2503.09242 translate read null
2025-03-12 Active Learning Inspired ControlNet Guidance for Augmenting Semantic Segmentation Datasets Hannah Kniesel et.al. 2503.09221 translate read null
2025-03-12 WonderVerse: Extendable 3D Scene Generation with Video Generative Models Hao Feng et.al. 2503.09160 translate read null
2025-03-11 GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing Yuanhao Wang et.al. 2503.08678 translate read null
2025-03-11 Generating Robot Constitutions & Benchmarks for Semantic Safety Pierre Sermanet et.al. 2503.08663 translate read link
2025-03-11 YuE: Scaling Open Foundation Models for Long-Form Music Generation Ruibin Yuan et.al. 2503.08638 translate read link
2025-03-11 LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization Xianfeng Wu et.al. 2503.08619 translate read link
2025-03-11 CellStyle: Improved Zero-Shot Cell Segmentation via Style Transfer Rüveyda Yilmaz et.al. 2503.08603 translate read null
2025-03-11 DISTINGUISH Workflow: A New Paradigm of Dynamic Well Placement Using Generative Machine Learning Sergey Alyaev et.al. 2503.08509 translate read null
2025-03-11 Generalizable AI-Generated Image Detection Based on Fractal Self-Similarity in the Spectrum Shengpeng Xiao et.al. 2503.08484 translate read null
2025-03-11 GAS-NeRF: Geometry-Aware Stylization of Dynamic Radiance Fields Nhat Phuong Anh Vu et.al. 2503.08483 translate read null
2025-03-11 DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank Zhanjie Zhang et.al. 2503.08392 translate read null
2025-03-11 Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens Qingsong Xie et.al. 2503.08377 translate read link
2025-03-10 V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation Guiwei Zhang et.al. 2503.07493 translate read null
2025-03-10 NeAS: 3D Reconstruction from X-ray Images using Neural Attenuation Surface Chengrui Zhu et.al. 2503.07491 translate read null
2025-03-10 GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models Ryugo Morita et.al. 2503.07463 translate read null
2025-03-10 PersonaBooth: Personalized Text-to-Motion Generation Boeun Kim et.al. 2503.07390 translate read null
2025-03-10 TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models Ruidong Chen et.al. 2503.07389 translate read link
2025-03-10 Inversion-Free Video Style Transfer with Trajectory Reset Attention Control and Content-Style Bridging Jiang Lin et.al. 2503.07363 translate read null
2025-03-10 Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment Xing Xie et.al. 2503.07334 translate read link
2025-03-10 AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models Bo Huang et.al. 2503.07307 translate read link
2025-03-10 WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation Yuwei Niu et.al. 2503.07265 translate read link
2025-03-10 Synthetic Lung X-ray Generation through Cross-Attention and Affinity Transformation Ruochen Pi et.al. 2503.07209 translate read null
2025-03-10 Effective and Efficient Masked Image Generation Models Zebin You et.al. 2503.07197 translate read link
2025-03-10 MIRAM: Masked Image Reconstruction Across Multiple Scales for Breast Lesion Risk Prediction Hung Q. Vo et.al. 2503.07157 translate read null
2025-03-07 VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control Yuxuan Bian et.al. 2503.05639 translate read link
2025-03-07 Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models Zheng Li et.al. 2503.05595 translate read null
2025-03-07 PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations? Martin Spitznagel et.al. 2503.05333 translate read link
2025-03-07 Frequency Autoregressive Image Generation with Continuous Tokens Hu Yu et.al. 2503.05305 translate read null
2025-03-07 Unified Reward Model for Multimodal Understanding and Generation Yibin Wang et.al. 2503.05236 translate read link
2025-03-07 RecipeGen: A Benchmark for Real-World Recipe Image Generation Ruoxuan Zhang et.al. 2503.05228 translate read null
2025-03-07 Development and Enhancement of Text-to-Image Diffusion Models Rajdeep Roshan Sahu et.al. 2503.05149 translate read null
2025-03-07 Accelerated Patient-specific Non-Cartesian MRI Reconstruction using Implicit Neural Representations Di Xu et.al. 2503.05051 translate read null
2025-03-06 Quantum generative adversarial networks for gluon initiated jets generation Rey Guadarrama et.al. 2503.05044 translate read null
2025-03-06 Iris Style Transfer: Enhancing Iris Recognition with Style Features and Privacy Preservation through Neural Style Transfer Mengdi Wang et.al. 2503.04707 translate read null
2025-03-06 Gradient-descent methods for fast quantum state tomography Akshay Gaikwad et.al. 2503.04526 translate read null
2025-03-06 IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement Zhihao Shi et.al. 2503.04501 translate read null
2025-03-06 ObjMST: An Object-Focused Multimodal Style Transfer Framework Chanda Grover Kamra et.al. 2503.04353 translate read null
2025-03-06 S2Gaussian: Sparse-View Super-Resolution 3D Gaussian Splatting Yecong Wan et.al. 2503.04314 translate read null
2025-03-06 ControlFill: Spatially Adjustable Image Inpainting from Prompt Learning Boseong Jeon et.al. 2503.04268 translate read null
2025-03-06 Synthetic Data is an Elegant GIFT for Continual Vision-Language Models Bin Wu et.al. 2503.04229 translate read null
2025-03-06 Energy-Guided Optimization for Personalized Image Editing with Pretrained Text-to-Image Diffusion Models Rui Jiang et.al. 2503.04215 translate read null
2025-03-06 SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer Chunnan Shang et.al. 2503.04119 translate read null
2025-03-06 Underlying Semantic Diffusion for Effective and Efficient In-Context Learning Zhong Ji et.al. 2503.04050 translate read null
2025-03-05 A Generative Approach to High Fidelity 3D Reconstruction from Text Data Venkat Kumar R et.al. 2503.03664 translate read null
2025-03-05 Generative Artificial Intelligence in Robotic Manipulation: A Survey Kun Zhang et.al. 2503.03464 translate read null
2025-03-05 GenColor: Generative Color-Concept Association in Visual Design Yihan Hou et.al. 2503.03236 translate read null
2025-03-05 An Analytical Theory of Power Law Spectral Bias in the Learning Dynamics of Diffusion Models Binxu Wang et.al. 2503.03206 translate read null
2025-03-05 Find Matching Faces Based On Face Parameters Setu A. Bhatt et.al. 2503.03204 translate read null
2025-03-05 From Architectural Sketch to Conceptual Representation: Using Structure-Aware Diffusion Model to Generate Renderings of School Buildings Zhengyang Wang et.al. 2503.03090 translate read null
2025-03-05 Multi-View Depth Consistent Image Generation Using Generative AI Models: Application on Architectural Design of University Buildings Xusheng Du et.al. 2503.03068 translate read null
2025-03-04 Can Diffusion Models Provide Rigorous Uncertainty Quantification for Bayesian Inverse Problems? Evan Scope Crafts et.al. 2503.03007 translate read link
2025-03-04 Robust time series generation via Schrödinger Bridge: a comprehensive evaluation Alexandre Alouadi et.al. 2503.02943 translate read null
2025-03-04 ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models Qinyu Zhao et.al. 2503.02883 translate read link
2025-03-04 Large-Angle Convergent-Beam Electron Diffraction Patterns via Conditional Generative Adversarial Networks Joseph J. Webb et.al. 2503.02852 translate read null
2025-03-04 Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts Marta Skreta et.al. 2503.02819 translate read link
2025-03-04 Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution Ru Ito et.al. 2503.02767 translate read null
2025-03-04 Generative Modeling of Microweather Wind Velocities for Urban Air Mobility Tristan A. Shah et.al. 2503.02690 translate read null
2025-03-04 YARE-GAN: Yet Another Resting State EEG-GAN Yeganeh Farahzadi et.al. 2503.02636 translate read null
2025-03-04 SPG: Improving Motion Diffusion by Smooth Perturbation Guidance Boseong Jeon et.al. 2503.02577 translate read null
2025-03-04 PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks Sheng Shang et.al. 2503.02547 translate read null
2025-03-04 RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification Zhen Yang et.al. 2503.02537 translate read link
2025-03-04 Q&C: When Quantization Meets Cache in Efficient Image Generation Xin Ding et.al. 2503.02508 translate read null
2025-03-03 MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing Xueyun Tian et.al. 2502.21291 translate read link
