Image Generation - 2025-03 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF	Translate	Read	Code
2025-03-31	RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy	Zhonghan Zhao et.al.	2503.24388	translate	read	null
2025-03-31	Consistent Subject Generation via Contrastive Instantiated Concepts	Lee Hsin-Ying et.al.	2503.24387	translate	read	null
2025-03-31	ERUPT: Efficient Rendering with Unposed Patch Transformer	Maxim V. Shugaev et.al.	2503.24374	translate	read	null
2025-03-31	Style Quantization for Data-Efficient GAN Training	Jian Wang et.al.	2503.24282	translate	read	null
2025-03-31	FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics	Yixuan Li et.al.	2503.24267	translate	read	null
2025-03-31	Beyond a Single Mode: GAN Ensembles for Diverse Medical Data Generation	Lorenzo Tronchin et.al.	2503.24258	translate	read	link
2025-03-31	Threats and Opportunities in AI-generated Images for Armed Forces	Raphael Meier et.al.	2503.24095	translate	read	null
2025-03-31	AI2Agent: An End-to-End Framework for Deploying AI Projects as Autonomous Agents	Jiaxiang Chen et.al.	2503.23948	translate	read	null
2025-03-31	Semantic Packet Aggregation and Repeated Transmission for Text-to-Image Generation	Seunghun Lee et.al.	2503.23734	translate	read	null
2025-03-28	Evaluation of Machine-generated Biomedical Images via A Tally-based Similarity Measure	Frank J. Brooks et.al.	2503.22658	translate	read	null
2025-03-28	RELD: Regularization by Latent Diffusion Models for Image Restoration	Pasquale Cascarano et.al.	2503.22563	translate	read	null
2025-03-28	Deterministic Medical Image Translation via High-fidelity Brownian Bridges	Qisheng He et.al.	2503.22531	translate	read	null
2025-03-28	Meta-LoRA: Meta-Learning LoRA Components for Domain-Aware ID Personalization	Barış Batuhan Topal et.al.	2503.22352	translate	read	null
2025-03-28	Semantix: An Energy Guided Sampler for Semantic Style Transfer	Huiang He et.al.	2503.22344	translate	read	null
2025-03-28	ABC-GS: Alignment-Based Controllable Style Transfer for 3D Gaussian Splatting	Wenjie Liu et.al.	2503.22218	translate	read	null
2025-03-28	Intrinsic Image Decomposition for Robust Self-supervised Monocular Depth Estimation on Reflective Surfaces	Wonhyeok Choi et.al.	2503.22209	translate	read	null
2025-03-28	ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation	Yunhong Min et.al.	2503.22194	translate	read	null
2025-03-28	Sell It Before You Make It: Revolutionizing E-Commerce with Personalized AI-Generated Items	Jianghao Lin et.al.	2503.22182	translate	read	null
2025-03-28	An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval	Min Cao et.al.	2503.22171	translate	read	null
2025-03-27	Optimal Stepsize for Diffusion Sampling	Jianning Pei et.al.	2503.21774	translate	read	link
2025-03-27	Lumina-Image 2.0: A Unified and Efficient Image Generative Framework	Qi Qin et.al.	2503.21758	translate	read	link
2025-03-27	A Unified Framework for Diffusion Bridge Problems: Flow Matching and Schrödinger Matching into One	Minyoung Kim et.al.	2503.21756	translate	read	null
2025-03-27	LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis	Shitian Zhao et.al.	2503.21749	translate	read	link
2025-03-27	CTRL-O: Language-Controllable Object-Centric Visual Representation Learning	Aniket Didolkar et.al.	2503.21747	translate	read	null
2025-03-27	3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models	Yuhan Zhang et.al.	2503.21745	translate	read	null
2025-03-27	Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance	Jaywon Koo et.al.	2503.21721	translate	read	null
2025-03-27	Zero-Shot Visual Concept Blending Without Text Guidance	Hiroya Makino et.al.	2503.21277	translate	read	null
2025-03-27	UGen: Unified Autoregressive Multimodal Model with Progressive Vocabulary Learning	Hongxuan Tang et.al.	2503.21193	translate	read	null
2025-03-27	Model as a Game: On Numerical and Spatial Consistency for Generative Games	Jingye Chen et.al.	2503.21172	translate	read	null
2025-03-26	High Quality Diffusion Distillation on a Single GPU with Relative and Absolute Position Matching	Guoqiang Zhang et.al.	2503.20744	translate	read	null
2025-03-26	RecTable: Fast Modeling Tabular Data with Rectified Flow	Masane Fuchi et.al.	2503.20731	translate	read	link
2025-03-26	BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation	Yuyang Peng et.al.	2503.20672	translate	read	link
2025-03-26	MMGen: Unified Multi-modal Image Generation and Understanding in One Go	Jiepeng Wang et.al.	2503.20644	translate	read	null
2025-03-26	Pluggable Style Representation Learning for Multi-Style Transfer	Hongda Liu et.al.	2503.20368	translate	read	link
2025-03-26	Beyond Words: Advancing Long-Text Image Generation via Multimodal Autoregressive Models	Alex Jinpeng Wang et.al.	2503.20198	translate	read	null
2025-03-26	AvatarArtist: Open-Domain 4D Avatarization	Hongyu Liu et.al.	2503.19906	translate	read	link
2025-03-25	Scaling Down Text Encoders of Text-to-Image Diffusion Models	Lifu Wang et.al.	2503.19897	translate	read	null
2025-03-26	In the Blink of an Eye: Instant Game Map Editing using a Generative-AI Smart Brush	Vitaly Gnatyuk et.al.	2503.19793	translate	read	null
2025-03-25	SITA: Structurally Imperceptible and Transferable Adversarial Attacks for Stylized Image Generation	Jingdan Kang et.al.	2503.19791	translate	read	null
2025-03-25	Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models	Kartik Thakral et.al.	2503.19783	translate	read	null
2025-03-25	PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models	Junhyuk So et.al.	2503.19731	translate	read	null
2025-03-25	VectorFit: Adaptive Singular & Bias Vector Fine-Tuning of Pre-trained Foundation Models	Suhas G Hegde et.al.	2503.19530	translate	read	null
2025-03-25	Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage	Zhengwentai Sun et.al.	2503.19486	translate	read	null
2025-03-25	Interpretable Generative Models through Post-hoc Concept Bottlenecks	Akshay Kulkarni et.al.	2503.19377	translate	read	link
2025-03-25	Efficient Adversarial Detection Frameworks for Vehicle-to-Microgrid Services in Edge Computing	Ahmed Omara et.al.	2503.19318	translate	read	null
2025-03-24	Equivariant Image Modeling	Ruixiao Dong et.al.	2503.18948	translate	read	link
2025-03-24	Training-free Diffusion Acceleration with Bottleneck Sampling	Ye Tian et.al.	2503.18940	translate	read	link
2025-03-24	SKDU at De-Factify 4.0: Vision Transformer with Data Augmentation for AI-Generated Image Detection	Shrikant Malviya et.al.	2503.18812	translate	read	null
2025-03-24	Self-Supervised Learning based on Transformed Image Reconstruction for Equivariance-Coherent Feature Representation	Qin Wang et.al.	2503.18753	translate	read	null
2025-03-24	Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings	Cong Liu et.al.	2503.18719	translate	read	null
2025-03-24	Adventurer: Exploration with BiGAN for Deep Reinforcement Learning	Yongshuai Liu et.al.	2503.18612	translate	read	null
2025-03-24	Anchor-based oversampling for imbalanced tabular data via contrastive and adversarial learning	Hadi Mohammadi et.al.	2503.18569	translate	read	null
2025-03-24	Advancing Cross-Organ Domain Generalization with Test-Time Style Transfer and Diversity Enhancement	Biwen Meng et.al.	2503.18567	translate	read	null
2025-03-24	Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models	Bin Li et.al.	2503.18556	translate	read	link
2025-03-24	PALATE: Peculiar Application of the Law of Total Expectation to Enhance the Evaluation of Deep Generative Models	Tadeusz Dziarmaga et.al.	2503.18462	translate	read	null
2025-03-21	Vision Transformer Based Semantic Communications for Next Generation Wireless Networks	Muhammad Ahmed Mohsin et.al.	2503.17275	translate	read	null
2025-03-21	Leveraging Text-to-Image Generation for Handling Spurious Correlation	Aryan Yazdan Parast et.al.	2503.17226	translate	read	null
2025-03-21	Generative adversarial framework to calibrate excursion set models for the 3D morphology of all-solid-state battery cathodes	Orkun Furat et.al.	2503.17171	translate	read	null
2025-03-21	D2C: Unlocking the Potential of Continuous Autoregressive Image Generation with Discrete Tokens	Panpan Wang et.al.	2503.17155	translate	read	null
2025-03-21	HiFi-Stream: Streaming Speech Enhancement with Generative Adversarial Networks	Ekaterina Dmitrieva et.al.	2503.17141	translate	read	null
2025-03-21	Halton Scheduler For Masked Generative Image Transformer	Victor Besnier et.al.	2503.17076	translate	read	link
2025-03-21	Zero-Shot Styled Text Image Generation, but Make It Autoregressive	Vittorio Pippi et.al.	2503.17074	translate	read	link
2025-03-21	DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech	Yongkang Cheng et.al.	2503.17059	translate	read	null
2025-03-21	Multiple Ultrasound Image Generation based on Tuned Alignment of Amplitude Hologram over Spatially non-Uniform Ultrasound Source	Keisuke Hasegawa et.al.	2503.16949	translate	read	null
2025-03-21	When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO	Lingfan Zhang et.al.	2503.16921	translate	read	null
2025-03-20	Tokenize Image as a Set	Zigang Geng et.al.	2503.16425	translate	read	link
2025-03-20	SynCity: Training-Free Generation of 3D Worlds	Paul Engstler et.al.	2503.16420	translate	read	link
2025-03-20	InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity	Liming Jiang et.al.	2503.16418	translate	read	link
2025-03-20	VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness	SeungJu Cha et.al.	2503.16406	translate	read	null
2025-03-20	LaPIG: Cross-Modal Generation of Paired Thermal and Visible Facial Images	Leyang Wang et.al.	2503.16376	translate	read	null
2025-03-20	Ultra-Resolution Adaptation with Ease	Ruonan Yu et.al.	2503.16322	translate	read	link
2025-03-20	Improving Autoregressive Image Generation through Coarse-to-Fine Token Prediction	Ziyao Guo et.al.	2503.16194	translate	read	link
2025-03-20	FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing	Tianyi Wei et.al.	2503.16153	translate	read	null
2025-03-20	Multi-Prompt Style Interpolation for Fine-Grained Artistic Control	Lei Chen et.al.	2503.16133	translate	read	null
2025-03-20	Controllable Segmentation-Based Text-Guided Style Editing	Jingwen Li et.al.	2503.16129	translate	read	null
2025-03-19	FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers	Ruichen Chen et.al.	2503.15465	translate	read	null
2025-03-19	Di $\mathtt{[M]}$ O: Distilling Masked Diffusion Models into One-step Generator	Yuanzhi Zhu et.al.	2503.15457	translate	read	null
2025-03-19	Visual Persona: Foundation Model for Full-Body Human Customization	Jisu Nam et.al.	2503.15406	translate	read	null
2025-03-19	TruthLens: A Training-Free Paradigm for DeepFake Detection	Ritabrata Chakraborty et.al.	2503.15342	translate	read	null
2025-03-19	TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models	Teng-Fang Hsiao et.al.	2503.15283	translate	read	null
2025-03-19	LEGION: Learning to Ground and Explain for Synthetic Image Detection	Hengrui Kang et.al.	2503.15264	translate	read	link
2025-03-19	Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization	Feifei Li et.al.	2503.15197	translate	read	null
2025-03-19	Volumetric Reconstruction From Partial Views for Task-Oriented Grasping	Fujian Yan et.al.	2503.15167	translate	read	null
2025-03-19	Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis	Imanol G. Estepa et.al.	2503.15060	translate	read	null
2025-03-19	Texture-Aware StarGAN for CT data harmonisation	Francesco Di Feola et.al.	2503.15058	translate	read	null
2025-03-18	Deeply Supervised Flow-Based Generative Models	Inkyu Shin et.al.	2503.14494	translate	read	null
2025-03-18	DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers	Minglei Shi et.al.	2503.14487	translate	read	null
2025-03-18	ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing	Yulin Pan et.al.	2503.14482	translate	read	null
2025-03-18	RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment	Chao Wang et.al.	2503.14358	translate	read	null
2025-03-18	Free-Lunch Color-Texture Disentanglement for Stylized Image Generation	Jiang Qin et.al.	2503.14275	translate	read	null
2025-03-18	DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection	Jaewoo Song et.al.	2503.13985	translate	read	null
2025-03-18	SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model	Xinqing Li et.al.	2503.13952	translate	read	null
2025-03-18	Scale-Aware Contrastive Reverse Distillation for Unsupervised Medical Anomaly Detection	Chunlei Li et.al.	2503.13828	translate	read	null
2025-03-18	VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences	Anukriti Singh et.al.	2503.13817	translate	read	null
2025-03-17	Unified Autoregressive Visual Generation and Understanding with Continuous Tokens	Lijie Fan et.al.	2503.13436	translate	read	null
2025-03-17	BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing	Yaowei Li et.al.	2503.13434	translate	read	link
2025-03-17	RainScaleGAN: a Conditional Generative Adversarial Network for Rainfall Downscaling	Marcello Iotti et.al.	2503.13316	translate	read	null
2025-03-17	MAME: Multidimensional Adaptive Metamer Exploration with Human Perceptual Feedback	Mina Kamao et.al.	2503.13212	translate	read	null
2025-03-17	MedLoRD: A Medical Low-Resource Diffusion Model for High-Resolution 3D CT Image Synthesis	Marvin Seyfarth et.al.	2503.13211	translate	read	null
2025-03-17	Rewards Are Enough for Fast Photo-Realistic Text-to-image Generation	Yihong Luo et.al.	2503.13070	translate	read	link
2025-03-17	FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks	Tong Lei et.al.	2503.12936	translate	read	null
2025-03-17	DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models	Dewei Zhou et.al.	2503.12885	translate	read	link
2025-03-17	Optimizing Ansatz Design in Quantum Generative Adversarial Networks Using Large Language Models	Kento Ueda et.al.	2503.12884	translate	read	null
2025-03-17	High-Resolution Range-Doppler Imaging from One-Bit PMCW Radar via Generative Adversarial Networks	Jingxian Wang et.al.	2503.12841	translate	read	null
2025-03-14	T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation	Seyed Mohammad Hadi Hosseini et.al.	2503.11481	translate	read	link
2025-03-14	MTV-Inpaint: Multi-Task Long Video Inpainting	Shiyuan Yang et.al.	2503.11412	translate	read	link
2025-03-14	Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking	Ziyi Wang et.al.	2503.11324	translate	read	null
2025-03-14	Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards	Zijing Hu et.al.	2503.11240	translate	read	link
2025-03-14	Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption	Du Chen et.al.	2503.11221	translate	read	link
2025-03-14	Simulating Dual-Pixel Images From Ray Tracing For Depth Estimation	Fengchen He et.al.	2503.11213	translate	read	null
2025-03-14	Provenance Detection for AI-Generated Images: Combining Perceptual Hashing, Homomorphic Encryption, and AI Detection Models	Shree Singhi et.al.	2503.11195	translate	read	null
2025-03-14	Direction-Aware Diagonal Autoregressive Image Generation	Yijia Xu et.al.	2503.11129	translate	read	null
2025-03-14	Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models	Zhenguang Liu et.al.	2503.11071	translate	read	null
2025-03-14	Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization	Kyle Sargent et.al.	2503.11056	translate	read	null
2025-03-13	GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing	Rongyao Fang et.al.	2503.10639	translate	read	link
2025-03-13	DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation	Chen Chen et.al.	2503.10618	translate	read	null
2025-03-13	ConsisLoRA: Enhancing Content and Style Consistency for LoRA-based Style Transfer	Bolin Chen et.al.	2503.10614	translate	read	link
2025-03-13	Autoregressive Image Generation with Randomized Parallel Decoding	Haopeng Li et.al.	2503.10568	translate	read	link
2025-03-13	RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation	Yuwen Du et.al.	2503.10410	translate	read	link
2025-03-13	RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models	Yijing Lin et.al.	2503.10406	translate	read	link
2025-03-13	ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation	Zirun Guo et.al.	2503.10358	translate	read	null
2025-03-13	Do I look like a `cat.n.01` to you? A Taxonomy Image Generation Benchmark	Viktor Moskvoretskii et.al.	2503.10357	translate	read	null
2025-03-13	MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment	Hao Zhou et.al.	2503.10287	translate	read	null
2025-03-13	PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models	Runze He et.al.	2503.10127	translate	read	null
2025-03-12	FCaS: Fine-grained Cardiac Image Synthesis based on 3D Template Conditional Diffusion Model	Jiahao Xia et.al.	2503.09560	translate	read	null
2025-03-12	DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction	Junjie Zhou et.al.	2503.09491	translate	read	null
2025-03-12	PromptMap: An Alternative Interaction Style for AI-Based Image Generation	Krzysztof Adamkiewicz et.al.	2503.09436	translate	read	null
2025-03-12	LHC Triggers using FPGA Image Recognition	James Brooke et.al.	2503.09428	translate	read	null
2025-03-12	Revealing the Implicit Noise-based Imprint of Generative Models	Xinghan Li et.al.	2503.09314	translate	read	null
2025-03-12	Revealing Unintentional Information Leakage in Low-Dimensional Facial Portrait Representations	Kathleen Anderson et.al.	2503.09306	translate	read	null
2025-03-12	UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer	Haoxuan Wang et.al.	2503.09277	translate	read	null
2025-03-12	NAMI: Efficient Image Generation via Progressive Rectified Flow Transformers	Yuhang Ma et.al.	2503.09242	translate	read	null
2025-03-12	Active Learning Inspired ControlNet Guidance for Augmenting Semantic Segmentation Datasets	Hannah Kniesel et.al.	2503.09221	translate	read	null
2025-03-12	WonderVerse: Extendable 3D Scene Generation with Video Generative Models	Hao Feng et.al.	2503.09160	translate	read	null
2025-03-11	GarmentCrafter: Progressive Novel View Synthesis for Single-View 3D Garment Reconstruction and Editing	Yuanhao Wang et.al.	2503.08678	translate	read	null
2025-03-11	Generating Robot Constitutions & Benchmarks for Semantic Safety	Pierre Sermanet et.al.	2503.08663	translate	read	link
2025-03-11	YuE: Scaling Open Foundation Models for Long-Form Music Generation	Ruibin Yuan et.al.	2503.08638	translate	read	link
2025-03-11	LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization	Xianfeng Wu et.al.	2503.08619	translate	read	link
2025-03-11	CellStyle: Improved Zero-Shot Cell Segmentation via Style Transfer	Rüveyda Yilmaz et.al.	2503.08603	translate	read	null
2025-03-11	DISTINGUISH Workflow: A New Paradigm of Dynamic Well Placement Using Generative Machine Learning	Sergey Alyaev et.al.	2503.08509	translate	read	null
2025-03-11	Generalizable AI-Generated Image Detection Based on Fractal Self-Similarity in the Spectrum	Shengpeng Xiao et.al.	2503.08484	translate	read	null
2025-03-11	GAS-NeRF: Geometry-Aware Stylization of Dynamic Radiance Fields	Nhat Phuong Anh Vu et.al.	2503.08483	translate	read	null
2025-03-11	DyArtbank: Diverse Artistic Style Transfer via Pre-trained Stable Diffusion and Dynamic Style Prompt Artbank	Zhanjie Zhang et.al.	2503.08392	translate	read	null
2025-03-11	Layton: Latent Consistency Tokenizer for 1024-pixel Image Reconstruction and Generation by 256 Tokens	Qingsong Xie et.al.	2503.08377	translate	read	link
2025-03-10	V2Flow: Unifying Visual Tokenization and Large Language Model Vocabularies for Autoregressive Image Generation	Guiwei Zhang et.al.	2503.07493	translate	read	null
2025-03-10	NeAS: 3D Reconstruction from X-ray Images using Neural Attenuation Surface	Chengrui Zhu et.al.	2503.07491	translate	read	null
2025-03-10	GenAIReading: Augmenting Human Cognition with Interactive Digital Textbooks Using Large Language Models and Image Generation Models	Ryugo Morita et.al.	2503.07463	translate	read	null
2025-03-10	PersonaBooth: Personalized Text-to-Motion Generation	Boeun Kim et.al.	2503.07390	translate	read	null
2025-03-10	TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models	Ruidong Chen et.al.	2503.07389	translate	read	link
2025-03-10	Inversion-Free Video Style Transfer with Trajectory Reset Attention Control and Content-Style Bridging	Jiang Lin et.al.	2503.07363	translate	read	null
2025-03-10	Unleashing the Potential of Large Language Models for Text-to-Image Generation through Autoregressive Representation Alignment	Xing Xie et.al.	2503.07334	translate	read	link
2025-03-10	AttenST: A Training-Free Attention-Driven Style Transfer Framework with Pre-Trained Diffusion Models	Bo Huang et.al.	2503.07307	translate	read	link
2025-03-10	WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation	Yuwei Niu et.al.	2503.07265	translate	read	link
2025-03-10	Synthetic Lung X-ray Generation through Cross-Attention and Affinity Transformation	Ruochen Pi et.al.	2503.07209	translate	read	null
2025-03-10	Effective and Efficient Masked Image Generation Models	Zebin You et.al.	2503.07197	translate	read	link
2025-03-10	MIRAM: Masked Image Reconstruction Across Multiple Scales for Breast Lesion Risk Prediction	Hung Q. Vo et.al.	2503.07157	translate	read	null
2025-03-07	VideoPainter: Any-length Video Inpainting and Editing with Plug-and-Play Context Control	Yuxuan Bian et.al.	2503.05639	translate	read	link
2025-03-07	Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models	Zheng Li et.al.	2503.05595	translate	read	null
2025-03-07	PhysicsGen: Can Generative Models Learn from Images to Predict Complex Physical Relations?	Martin Spitznagel et.al.	2503.05333	translate	read	link
2025-03-07	Frequency Autoregressive Image Generation with Continuous Tokens	Hu Yu et.al.	2503.05305	translate	read	null
2025-03-07	Unified Reward Model for Multimodal Understanding and Generation	Yibin Wang et.al.	2503.05236	translate	read	link
2025-03-07	RecipeGen: A Benchmark for Real-World Recipe Image Generation	Ruoxuan Zhang et.al.	2503.05228	translate	read	null
2025-03-07	Development and Enhancement of Text-to-Image Diffusion Models	Rajdeep Roshan Sahu et.al.	2503.05149	translate	read	null
2025-03-07	Accelerated Patient-specific Non-Cartesian MRI Reconstruction using Implicit Neural Representations	Di Xu et.al.	2503.05051	translate	read	null
2025-03-06	Quantum generative adversarial networks for gluon initiated jets generation	Rey Guadarrama et.al.	2503.05044	translate	read	null
2025-03-06	Iris Style Transfer: Enhancing Iris Recognition with Style Features and Privacy Preservation through Neural Style Transfer	Mengdi Wang et.al.	2503.04707	translate	read	null
2025-03-06	Gradient-descent methods for fast quantum state tomography	Akshay Gaikwad et.al.	2503.04526	translate	read	null
2025-03-06	IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement	Zhihao Shi et.al.	2503.04501	translate	read	null
2025-03-06	ObjMST: An Object-Focused Multimodal Style Transfer Framework	Chanda Grover Kamra et.al.	2503.04353	translate	read	null
2025-03-06	S2Gaussian: Sparse-View Super-Resolution 3D Gaussian Splatting	Yecong Wan et.al.	2503.04314	translate	read	null
2025-03-06	ControlFill: Spatially Adjustable Image Inpainting from Prompt Learning	Boseong Jeon et.al.	2503.04268	translate	read	null
2025-03-06	Synthetic Data is an Elegant GIFT for Continual Vision-Language Models	Bin Wu et.al.	2503.04229	translate	read	null
2025-03-06	Energy-Guided Optimization for Personalized Image Editing with Pretrained Text-to-Image Diffusion Models	Rui Jiang et.al.	2503.04215	translate	read	null
2025-03-06	SCSA: A Plug-and-Play Semantic Continuous-Sparse Attention for Arbitrary Semantic Style Transfer	Chunnan Shang et.al.	2503.04119	translate	read	null
2025-03-06	Underlying Semantic Diffusion for Effective and Efficient In-Context Learning	Zhong Ji et.al.	2503.04050	translate	read	null
2025-03-05	A Generative Approach to High Fidelity 3D Reconstruction from Text Data	Venkat Kumar R et.al.	2503.03664	translate	read	null
2025-03-05	Generative Artificial Intelligence in Robotic Manipulation: A Survey	Kun Zhang et.al.	2503.03464	translate	read	null
2025-03-05	GenColor: Generative Color-Concept Association in Visual Design	Yihan Hou et.al.	2503.03236	translate	read	null
2025-03-05	An Analytical Theory of Power Law Spectral Bias in the Learning Dynamics of Diffusion Models	Binxu Wang et.al.	2503.03206	translate	read	null
2025-03-05	Find Matching Faces Based On Face Parameters	Setu A. Bhatt et.al.	2503.03204	translate	read	null
2025-03-05	From Architectural Sketch to Conceptual Representation: Using Structure-Aware Diffusion Model to Generate Renderings of School Buildings	Zhengyang Wang et.al.	2503.03090	translate	read	null
2025-03-05	Multi-View Depth Consistent Image Generation Using Generative AI Models: Application on Architectural Design of University Buildings	Xusheng Du et.al.	2503.03068	translate	read	null
2025-03-04	Can Diffusion Models Provide Rigorous Uncertainty Quantification for Bayesian Inverse Problems?	Evan Scope Crafts et.al.	2503.03007	translate	read	link
2025-03-04	Robust time series generation via Schrödinger Bridge: a comprehensive evaluation	Alexandre Alouadi et.al.	2503.02943	translate	read	null
2025-03-04	ARINAR: Bi-Level Autoregressive Feature-by-Feature Generative Models	Qinyu Zhao et.al.	2503.02883	translate	read	link
2025-03-04	Large-Angle Convergent-Beam Electron Diffraction Patterns via Conditional Generative Adversarial Networks	Joseph J. Webb et.al.	2503.02852	translate	read	null
2025-03-04	Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts	Marta Skreta et.al.	2503.02819	translate	read	link
2025-03-04	Undertrained Image Reconstruction for Realistic Degradation in Blind Image Super-Resolution	Ru Ito et.al.	2503.02767	translate	read	null
2025-03-04	Generative Modeling of Microweather Wind Velocities for Urban Air Mobility	Tristan A. Shah et.al.	2503.02690	translate	read	null
2025-03-04	YARE-GAN: Yet Another Resting State EEG-GAN	Yeganeh Farahzadi et.al.	2503.02636	translate	read	null
2025-03-04	SPG: Improving Motion Diffusion by Smooth Perturbation Guidance	Boseong Jeon et.al.	2503.02577	translate	read	null
2025-03-04	PVTree: Realistic and Controllable Palm Vein Generation for Recognition Tasks	Sheng Shang et.al.	2503.02547	translate	read	null
2025-03-04	RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification	Zhen Yang et.al.	2503.02537	translate	read	link
2025-03-04	Q&C: When Quantization Meets Cache in Efficient Image Generation	Xin Ding et.al.	2503.02508	translate	read	null
2025-03-03	MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing	Xueyun Tian et.al.	2502.21291	translate	read	link