Image Generation - 2025-12

Publish Date Title Authors PDF Translate Read Code
2025-12-31 Comparative Evaluation of CNN Architectures for Neural Style Transfer in Indonesian Batik Motif Generation: A Comprehensive Study Happy Gery Pangestu et.al. 2601.00888 translate read null
2025-12-25 Can Generative Models Actually Forge Realistic Identity Documents? Alexander Vinogradov et.al. 2601.00829 translate read null
2025-12-24 Speak the Art: A Direct Speech to Image Generation Framework Mariam Saeed et.al. 2601.00827 translate read null
2025-12-31 Compositional Diffusion with Guided Search for Long-Horizon Planning Utkarsh A Mishra et.al. 2601.00126 translate read null
2025-12-31 Are First-Order Diffusion Samplers Really Slower? A Fast Forward-Value Approach Yuchen Jiao et.al. 2512.24927 translate read null
2025-12-31 Resolving the Origins and Pathways of Ionizing Radiation Escape with UV Integral Field Spectroscopy Cody Carr et.al. 2512.24895 translate read null
2025-12-30 F2IDiff: Real-world Image Super-resolution using Feature to Image Diffusion Foundation Model Devendra K. Jangid et.al. 2512.24473 translate read null
2025-12-30 Medical Image Classification on Imbalanced Data Using ProGAN and SMA-Optimized ResNet: Application to COVID-19 Sina Jahromi et.al. 2512.24214 translate read null
2025-12-30 RainFusion2.0: Temporal-Spatial Awareness and Hardware-Efficient Block-wise Sparse Attention Aiyue Chen et.al. 2512.24086 translate read null
2025-12-29 Lifelong Domain Adaptive 3D Human Pose Estimation Qucheng Peng et.al. 2512.23860 translate read null
2025-12-29 MiMo-Audio: Audio Language Models are Few-Shot Learners Xiaomi LLM-Core Team et.al. 2512.23808 translate read null
2025-12-29 Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion Hau-Shiang Shiu et.al. 2512.23709 translate read null
2025-12-29 PurifyGen: A Risk-Discrimination and Semantic-Purification Model for Safe Text-to-Image Generation Zongsheng Cao et.al. 2512.23546 translate read null
2025-12-29 Iterative Inference-time Scaling with Adaptive Frequency Steering for Image Super-Resolution Hexin Zhang et.al. 2512.23532 translate read null
2025-12-29 UniHetero: Could Generation Enhance Understanding for Vision-Language-Model at Large Data Scale? Fengjiao Chen et.al. 2512.23512 translate read null
2025-12-29 SPER: Accelerating Progressive Entity Resolution via Stochastic Bipartite Maximization Dimitrios Karapiperis et.al. 2512.23491 translate read null
2025-12-29 Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators Bohan Xiao et.al. 2512.23463 translate read null
2025-12-29 Direct Diffusion Score Preference Optimization via Stepwise Contrastive Policy-Pair Supervision Dohyun Kim et.al. 2512.23426 translate read null
2025-12-29 Multi Agents Semantic Emotion Aligned Music to Image Generation with Music Derived Captions Junchang Shi et.al. 2512.23320 translate read null
2025-12-29 Flow2GAN: Hybrid Flow Matching and GAN with Multi-Resolution Network for Few-step High-Fidelity Audio Generation Zengwei Yao et.al. 2512.23278 translate read null
2025-12-29 RS-Prune: Training-Free Data Pruning at High Ratios for Efficient Remote Sensing Diffusion Foundation Models Fan Wei et.al. 2512.23239 translate read null
2025-12-29 Anomaly Detection by Effectively Leveraging Synthetic Images Sungho Kang et.al. 2512.23227 translate read null
2025-12-29 Bridging Your Imagination with Audio-Video Generation via a Unified Director Jiaxu Zhang et.al. 2512.23222 translate read null
2025-12-29 PathoSyn: Imaging-Pathology MRI Synthesis via Disentangled Deviation Diffusion Jian Wang et.al. 2512.23130 translate read null
2025-12-28 RealCamo: Boosting Real Camouflage Synthesis with Layout Controls and Textual-Visual Guidance Chunyuan Chen et.al. 2512.22974 translate read null
2025-12-28 KANO: Kolmogorov-Arnold Neural Operator for Image Super-Resolution Chenyu Li et.al. 2512.22822 translate read null
2025-12-28 SwinCCIR: An end-to-end deep network for Compton camera imaging reconstruction Minghao Dong et.al. 2512.22766 translate read null
2025-12-27 CritiFusion: Semantic Critique and Spectral Alignment for Faithful Text-to-Image Generation ZhenQi Chen et.al. 2512.22681 translate read null
2025-12-27 Quantum Generative Models for Computational Fluid Dynamics: A First Exploration of Latent Space Learning in Lattice Boltzmann Simulations Achraf Hsain et.al. 2512.22672 translate read null
2025-12-27 FinPercep-RM: A Fine-grained Reward Model and Co-evolutionary Curriculum for RL-based Real-world Super-Resolution Yidi Liu et.al. 2512.22647 translate read null
2025-12-27 Quantum-Circuit Framework for Two-Stage Stochastic Programming via QAOA Integrated with a Quantum Generative Neural Network Taihei Kuroiwa et.al. 2512.22434 translate read null
2025-12-26 Self-Evaluation Unlocks Any-Step Text-to-Image Generation Xin Yu et.al. 2512.22374 translate read null
2025-12-25 Human-Aligned Generative Perception: Bridging Psychophysics and Generative Models Antara Titikhsha et.al. 2512.22272 translate read null
2025-12-22 Super-Resolution Enhancement of Medical Images Based on Diffusion Model: An Optimization Scheme for Low-Resolution Gastric Images Haozhe Jia et.al. 2512.22209 translate read null
2025-12-21 Complex Swin Transformer for Accelerating Enhanced SMWI Reconstruction Muhammad Usman et.al. 2512.22202 translate read null
2025-12-17 AudioGAN: A Compact and Efficient Framework for Real-Time High-Fidelity Text-to-Audio Generation HaeChun Chung et.al. 2512.22166 translate read null
2025-12-26 DPAR: Dynamic Patchification for Efficient Autoregressive Visual Generation Divyansh Srivastava et.al. 2512.21867 translate read null
2025-12-25 Deep Generative Models for Synthetic Financial Data: Applications to Portfolio and Risk Modeling Christophe D. Hounwanou et.al. 2512.21798 translate read null
2025-12-25 Diffusion Posterior Sampling for Super-Resolution under Gaussian Measurement Noise Abu Hanif Muhammad Syarubany et.al. 2512.21797 translate read null
2025-12-25 Synthetic Financial Data Generation for Enhanced Financial Modelling Christophe D. Hounwanou et.al. 2512.21791 translate read null
2025-12-25 InstructMoLE: Instruction-Guided Mixture of Low-rank Experts for Multi-Conditional Image Generation Jinqi Xiao et.al. 2512.21788 translate read null
2025-12-25 FUSE: Unifying Spectral and Semantic Cues for Robust AI-Generated Image Detection Md. Zahid Hossain et.al. 2512.21695 translate read null
2025-12-25 BeHGAN: Bengali Handwritten Word Generation from Plain Text Using Generative Adversarial Networks Md. Rakibul Islam et.al. 2512.21694 translate read null
2025-12-25 Dictionary-Transform Generative Adversarial Networks Angshul Majumdar et.al. 2512.21677 translate read null
2025-12-25 UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture Shuo Cao et.al. 2512.21675 translate read null
2025-12-25 Training-Free Disentangled Text-Guided Image Editing via Sparse Latent Constraints Mutiara Shabrina et.al. 2512.21637 translate read null
2025-12-25 Residual Prior Diffusion: A Probabilistic Framework Integrating Coarse Latent Priors with Diffusion Models Takuro Kutsuna et.al. 2512.21593 translate read null
2025-12-25 DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO Henglin Liu et.al. 2512.21514 translate read null
2025-12-25 Fixed-Threshold Evaluation of a Hybrid CNN-ViT for AI-Generated Image Detection Across Photos and Art Md Ashik Khan et.al. 2512.21512 translate read null
2025-12-24 A Reinforcement Learning Approach to Synthetic Data Generation Natalia Espinosa-Dice et.al. 2512.21395 translate read null
2025-12-24 GriDiT: Factorized Grid-Based Diffusion for Efficient Long Image Sequence Generation Snehal Singh Tomar et.al. 2512.21276 translate read null
2025-12-24 A Turn Toward Better Alignment: Few-Shot Generative Adaptation with Equivariant Feature Rotation Chenghao Xu et.al. 2512.21174 translate read null
2025-12-24 FreeInpaint: Tuning-free Prompt Alignment and Visual Rationality Enhancement in Image Inpainting Chao Gong et.al. 2512.21104 translate read null
2025-12-24 Beyond Pixel Simulation: Pathology Image Generation via Diagnostic Semantic Tokens and Prototype Control Minghao Han et.al. 2512.21058 translate read null
2025-12-24 Matrix Completion Via Reweighted Logarithmic Norm Minimization Zhijie Wang et.al. 2512.21050 translate read null
2025-12-24 A Large-Depth-Range Layer-Based Hologram Dataset for Machine Learning-Based 3D Computer-Generated Holography Jaehong Lee et.al. 2512.21040 translate read null
2025-12-24 Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising Yiwen Shan et.al. 2512.21038 translate read null
2025-12-24 Enhancing diffusion models with Gaussianization preprocessing Li Cunzhi et.al. 2512.21020 translate read null
2025-12-24 FluencyVE: Marrying Temporal-Aware Mamba with Bypass Attention for Video Editing Mingshu Cai et.al. 2512.21015 translate read null
2025-12-19 Dominating vs. Dominated: Generative Collapse in Diffusion Models Hayeon Jeong et.al. 2512.20666 translate read null
2025-12-12 Flow Gym Francesco Banelli et.al. 2512.20642 translate read null
2025-12-23 Optical Pin Beams: Research Progresses and Emerging Applications Ze Zhang et.al. 2512.20541 translate read null
2025-12-23 Multi-temporal Adaptive Red-Green-Blue and Long-Wave Infrared Fusion for You Only Look Once-Based Landmine Detection from Unmanned Aerial Systems James E. Gallagher et.al. 2512.20487 translate read null
2025-12-23 UTDesign: A Unified Framework for Stylized Text Editing and Generation in Graphic Design Images Yiming Zhao et.al. 2512.20479 translate read null
2025-12-23 Resolution and Robustness Bounds for Reconstructive Spectrometers Changyan Zhu et.al. 2512.20415 translate read null
2025-12-23 CRAFT: Continuous Reasoning and Agentic Feedback Tuning for Multimodal Text-to-Image Generation V. Kovalev et.al. 2512.20362 translate read null
2025-12-23 Field-Space Attention for Structure-Preserving Earth System Transformers Maximilian Witte et.al. 2512.20350 translate read null
2025-12-23 HGAN-SDEs: Learning Neural Stochastic Differential Equations with Hermite-Guided Adversarial Training Yuanjian Xu et.al. 2512.20272 translate read null
2025-12-23 How I Met Your Bias: Investigating Bias Amplification in Diffusion Models Nathan Roos et.al. 2512.20233 translate read null
2025-12-23 Generative Latent Coding for Ultra-Low Bitrate Image Compression Zhaoyang Jia et.al. 2512.20194 translate read null
2025-12-23 Target Classification for Integrated Sensing and Communication in Industrial Deployments Luca Barbieri et.al. 2512.20154 translate read null
2025-12-23 IoT-based Android Malware Detection Using Graph Neural Network With Adversarial Defense Rahul Yumlembam et.al. 2512.20004 translate read null
2025-12-22 Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning Mojtaba Safari et.al. 2512.19676 translate read null
2025-12-22 Generative diffusion models for agricultural AI: plant image generation, indoor-to-outdoor translation, and expert preference alignment Da Tan et.al. 2512.19632 translate read null
2025-12-22 Rethinking Coupled Tensor Analysis for Hyperspectral Super-Resolution: Recoverable Modeling Under Endmember Variability Meng Ding et.al. 2512.19489 translate read null
2025-12-22 Emotion-Director: Bridging Affective Shortcut in Emotion-Oriented Image Generation Guoli Jia et.al. 2512.19479 translate read null
2025-12-22 dMLLM-TTS: Self-Verified and Efficient Test-Time Scaling for Diffusion Multi-Modal Large Language Models Yi Xin et.al. 2512.19433 translate read null
2025-12-22 GANeXt: A Fully ConvNeXt-Enhanced Generative Adversarial Network for MRI- and CBCT-to-CT Synthesis Siyuan Mei et.al. 2512.19336 translate read null
2025-12-22 MixFlow Training: Alleviating Exposure Bias with Slowed Interpolation Mixture Hui Li et.al. 2512.19311 translate read null
2025-12-22 3SGen: Unified Subject, Style, and Structure-Driven Image Generation with Adaptive Task-specific Memory Xinyang Song et.al. 2512.19271 translate read null
2025-12-22 VisionDirector: Vision-Language Guided Closed-Loop Refinement for Generative Image Synthesis Meng Chu et.al. 2512.19243 translate read null
2025-12-22 Regression generation adversarial network based on dual data evaluation strategy for industrial application Zesen Wang et.al. 2512.19232 translate read null
2025-12-22 ALMA Observations of Cold Methanol Gas in the Large Magellanic Cloud (LMC): N79 South GMC Suman Kumar Mondal et.al. 2512.19185 translate read null
2025-12-22 Efficient Personalization of Generative Models via Optimal Experimental Design Guy Schacht et.al. 2512.19057 translate read null
2025-12-22 AI-Driven Subcarrier-Level CQI Feedback Chengyong Jiang et.al. 2512.19054 translate read null
2025-12-22 An Fluid Antenna Array-Enabled DOA Estimation Method: End-Fire Effect Suppression Jiaji Ren et.al. 2512.18981 translate read null
2025-12-22 LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer Raina Panda et.al. 2512.18930 translate read null
2025-12-21 Generative Modeling through Spectral Analysis of Koopman Operator Yuanchao Xu et.al. 2512.18837 translate read null
2025-12-21 MaskFocus: Focusing Policy Optimization on Critical Steps for Masked Image Generation Guohui Zhang et.al. 2512.18766 translate read null
2025-12-21 Uni-Neur2Img: Unified Neural Signal-Guided Image Generation, Editing, and Stylization via Diffusion Transformers Xiyue Bai et.al. 2512.18635 translate read null
2025-12-21 Image-to-Image Translation with Generative Adversarial Network for Electrical Resistance Tomography Reconstruction Wejian Yan et.al. 2512.18557 translate read null
2025-12-20 Plasticine: A Traceable Diffusion Model for Medical Image Translation Tianyang Zhanng et.al. 2512.18455 translate read null
2025-12-20 Imaging the LkCa 15 system in polarimetry and total intensity without self-subtraction artefacts C. Swastik et.al. 2512.18439 translate read null
2025-12-20 Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models Chao Wen et.al. 2512.18388 translate read null
2025-12-20 PSI3D: Plug-and-Play 3D Stochastic Inference with Slice-wise Latent Diffusion Prior Wenhan Guo et.al. 2512.18367 translate read null
2025-12-20 Loom: Diffusion-Transformer for Interleaved Generation Mingcheng Ye et.al. 2512.18254 translate read null
2025-12-20 Local Patches Meet Global Context: Scalable 3D Diffusion Priors for Computed Tomography Reconstruction Taewon Yang et.al. 2512.18161 translate read null
2025-12-19 SERA-H: Beyond Native Sentinel Spatial Limits for High-Resolution Canopy Height Mapping Thomas Boudras et.al. 2512.18128 translate read null
2025-12-17 SuperFlow: Training Flow Matching Models with RL on the Fly Kaijie Chen et.al. 2512.17951 translate read null
2025-12-19 Both Semantics and Reconstruction Matter: Making Representation Encoders Ready for Text-to-Image Generation and Editing Shilong Zhang et.al. 2512.17909 translate read null
2025-12-19 Inverse-Designed Phase Prediction in Digital Lasers Using Deep Learning and Transfer Learning Yu-Che Wu et.al. 2512.17879 translate read null
2025-12-19 InSPECT: Invariant Spectral Features Preservation of Diffusion Models Baohua Yan et.al. 2512.17873 translate read null
2025-12-19 UrbanDIFF: A Denoising Diffusion Model for Spatial Gap Filling of Urban Land Surface Temperature Under Dense Cloud Cover Arya Chavoshi et.al. 2512.17782 translate read null
2025-12-19 AdaptPrompt: Parameter-Efficient Adaptation of VLMs for Generalizable Deepfake Detection Yichen Jiang et.al. 2512.17730 translate read null
2025-12-19 An Empirical Study of Sampling Hyperparameters in Diffusion-Based Super-Resolution Yudhistira Arief Wibowo et.al. 2512.17675 translate read null
2025-12-19 Self-Supervised Weighted Image Guided Quantitative MRI Super-Resolution Alireza Samadifardheris et.al. 2512.17612 translate read null
2025-12-19 LumiCtrl: Learning Illuminant Prompts for Lighting Control in Personalized Text-to-Image Models Muhammad Atif Butt et.al. 2512.17489 translate read null
2025-12-19 Fetpype: An Open-Source Pipeline for Reproducible Fetal Brain MRI Analysis Thomas Sanchez et.al. 2512.17472 translate read null
2025-12-19 Super-resolution wavefront reconstruction in adaptive-optics with pyramid sensors Carlos M. Correia et.al. 2512.17469 translate read null
2025-12-19 Super-resolution-enabled atmospheric tomography for astronomical multi-wavefront-sensor adaptive-optics systems Carlos M. Correia et.al. 2512.17430 translate read null
2025-12-19 Beyond Semantic Features: Pixel-level Mapping for Generalized AI-Generated Image Detection Chenming Zhou et.al. 2512.17350 translate read null
2025-12-19 Multi-level distortion-aware deformable network for omnidirectional image super-resolution Cuixin Yang et.al. 2512.17343 translate read null
2025-12-19 A Benchmark for Ultra-High-Resolution Remote Sensing MLLMs Yunkai Dang et.al. 2512.17319 translate read null
2025-12-19 MatLat: Material Latent Space for PBR Texture Generation Kyeongmin Yeo et.al. 2512.17302 translate read null
2025-12-18 SFTok: Bridging the Performance Gap in Discrete Tokenizers Qihang Rao et.al. 2512.16910 translate read null
2025-12-18 FrameDiffuser: G-Buffer-Conditioned Diffusion for Neural Forward Frame Rendering Ole Beisswenger et.al. 2512.16670 translate read null
2025-12-18 REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion Giorgos Petsangourakis et.al. 2512.16636 translate read null
2025-12-18 DeContext as Defense: Safe Image Editing in Diffusion Transformers Linghui Shen et.al. 2512.16625 translate read null
2025-12-18 Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers Yifan Zhou et.al. 2512.16615 translate read null
2025-12-18 Yuan-TecSwin: A text conditioned Diffusion model with Swin-transformer blocks Shaohua Wu et.al. 2512.16586 translate read null
2025-12-18 StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models Senmao Li et.al. 2512.16483 translate read null
2025-12-18 Geometric Disentanglement of Text Embeddings for Subject-Consistent Text-to-Image Generation using A Single Prompt Shangxun Li et.al. 2512.16443 translate read null
2025-12-18 CogSR: Semantic-Aware Speech Super-Resolution via Chain-of-Thought Guided Flow Matching Jiajun Yuan et.al. 2512.16304 translate read null
2025-12-18 PixelArena: A benchmark for Pixel-Precision Visual Intelligence Feng Liang et.al. 2512.16303 translate read null
2025-12-18 Pixel Super-Resolved Fluorescence Lifetime Imaging Using Deep Learning Paloma Casteleiro Costa et.al. 2512.16266 translate read null
2025-12-18 Driving in Corner Case: A Real-World Adversarial Closed-Loop Evaluation Platform for End-to-End Autonomous Driving Jiaheng Geng et.al. 2512.16055 translate read null
2025-12-17 MCR-VQGAN: A Scalable and Cost-Effective Tau PET Synthesis Approach for Alzheimer’s Disease Imaging Jin Young Kim et.al. 2512.15947 translate read null
2025-12-17 Secure AI-Driven Super-Resolution for Real-Time Mixed Reality Applications Mohammad Waquas Usmani et.al. 2512.15823 translate read null
2025-12-13 Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real Yan Yang et.al. 2512.15774 translate read null
2025-12-17 Stylized Synthetic Augmentation further improves Corruption Robustness Georg Siedel et.al. 2512.15675 translate read null
2025-12-16 InpaintDPO: Mitigating Spatial Relationship Hallucinations in Foreground-conditioned Inpainting via Diverse Preference Optimization Qirui Li et.al. 2512.15644 translate read null
2025-12-16 ComMark: Covert and Robust Black-Box Model Watermarking with Compressed Samples Yunfei Yang et.al. 2512.15641 translate read null
2025-12-17 Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition Shengming Yin et.al. 2512.15603 translate read null
2025-12-17 VAAS: Vision-Attention Anomaly Scoring for Image Manipulation Detection in Digital Forensics Opeyemi Bamigbade et.al. 2512.15512 translate read null
2025-12-17 Copyright Infringement Risk Reduction via Chain-of-Thought and Task Instruction Prompting Neeraj Sarna et.al. 2512.15442 translate read null
2025-12-17 Can AI Generate more Comprehensive Test Scenarios? Review on Automated Driving Systems Test Scenario Generation Methods Ji Zhou et.al. 2512.15422 translate read null
2025-12-17 Time-Varying Audio Effect Modeling by End-to-End Adversarial Training Yann Bourdin et.al. 2512.15313 translate read null
2025-12-17 SynthSeg-Agents: Multi-Agent Synthetic Data Generation for Zero-Shot Weakly Supervised Semantic Segmentation Wangyu Wu et.al. 2512.15310 translate read null
2025-12-17 Quantum Machine Learning for Cybersecurity: A Taxonomy and Future Directions Siva Sai et.al. 2512.15286 translate read null
2025-12-17 MMMamba: A Versatile Cross-Modal In Context Fusion Framework for Pan-Sharpening and Zero-Shot Image Enhancement Yingying Wang et.al. 2512.15261 translate read null
2025-12-17 Is Nano Banana Pro a Low-Level Vision All-Rounder? A Comprehensive Evaluation on 14 Tasks and 40 Datasets Jialong Zuo et.al. 2512.15110 translate read null
2025-12-17 MVGSR: Multi-View Consistent 3D Gaussian Super-Resolution via Epipolar Guidance Kaizhe Zhang et.al. 2512.15048 translate read null
2025-12-16 Spherical Leech Quantization for Visual Tokenization and Generation Yue Zhao et.al. 2512.14697 translate read null
2025-12-16 MMGR: Multi-Modal Generative Reasoning Zefan Cai et.al. 2512.14691 translate read null
2025-12-16 JMMMU-Pro: Image-based Japanese Multi-discipline Multimodal Understanding Benchmark via Vibe Benchmark Construction Atsuyuki Miyai et.al. 2512.14620 translate read null
2025-12-16 TAT: Task-Adaptive Transformer for All-in-One Medical Image Restoration Zhiwen Yang et.al. 2512.14550 translate read null
2025-12-16 Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space Xingfu Zhou et.al. 2512.14448 translate read null
2025-12-16 Separation-free exponential fitting with structured noise, with applications to inverse problems in parabolic PDEs Rami Katz et.al. 2512.14301 translate read null
2025-12-16 On fractal minimizers and potentials of occupation measures Michael Hinz et.al. 2512.14248 translate read null
2025-12-16 MFE-GAN: Efficient GAN-based Framework for Document Image Enhancement and Binarization with Multi-scale Feature Extraction Rui-Yang Ju et.al. 2512.14114 translate read null
2025-12-16 ViewMask-1-to-3: Multi-View Consistent Image Generation via Multimodal Diffusion Models Ruishu Zhu et.al. 2512.14099 translate read null
2025-12-16 OUSAC: Optimized Guidance Scheduling with Adaptive Caching for DiT Acceleration Ruitong Sun et.al. 2512.14096 translate read null
2025-12-16 Bridging Fidelity-Reality with Controllable One-Step Diffusion for Image Super-Resolution Hao Chen et.al. 2512.14061 translate read null
2025-12-16 Sparse-LaViDa: Sparse Multimodal Discrete Diffusion Language Models Shufan Li et.al. 2512.14008 translate read null
2025-12-16 An intercomparison of generative machine learning methods for downscaling precipitation at fine spatial scales Bryn Ward-Leikis et.al. 2512.13987 translate read null
2025-12-16 Super-Resolution Posterior Ocular Microvascular Imaging Using 3-D Ultrasound Localization Microscopy With a 32X32 Matrix Array Junhang Zhang et.al. 2512.13966 translate read null
2025-12-15 From Unlearning to UNBRANDING: A Benchmark for Trademark-Safe Text-to-Image Generation Dawid Malarz et.al. 2512.13953 translate read null
2025-12-15 An evaluation of SVBRDF Prediction from Generative Image Models for Appearance Modeling of 3D Scenes Alban Gauthier et.al. 2512.13950 translate read null
2025-12-15 Coarse-to-Fine Hierarchical Alignment for UAV-based Human Detection using Diffusion Models Wenda Li et.al. 2512.13869 translate read null
2025-12-15 Time-aware UNet and super-resolution deep residual networks for spatial downscaling Mika Sipilä et.al. 2512.13753 translate read null
2025-12-13 Composite Classifier-Free Guidance for Multi-Modal Conditioning in Wind Dynamics Super-Resolution Jacob Schnell et.al. 2512.13729 translate read null
2025-12-15 Directional Textual Inversion for Personalized Text-to-Image Generation Kunhee Kim et.al. 2512.13672 translate read null
2025-12-15 Fast label-free point-scanning super-resolution imaging for endoscopy Ning Xu et.al. 2512.13432 translate read null
2025-12-15 ALMA view on the nature of the compact VLA continuum sources in the massive young stellar object G25.65+1.05 N. N. Shakhvorostova et.al. 2512.13382 translate read null
2025-12-15 Super-resolving Herschel - a deep learning based deconvolution and denoising technique Dennis Koopmans et.al. 2512.13353 translate read null
2025-12-15 ShowTable: Unlocking Creative Table Visualization with Collaborative Reflection and Refinement Zhihang Liu et.al. 2512.13303 translate read null
2025-12-15 A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis Xianchao Guan et.al. 2512.13164 translate read null
2025-12-15 Bi-Erasing: A Bidirectional Framework for Concept Removal in Diffusion Models Hao Chen et.al. 2512.13039 translate read null
2025-12-15 JoDiffusion: Jointly Diffusing Image with Pixel-Level Annotations for Semantic Segmentation Promotion Haoyu Wang et.al. 2512.13014 translate read null
2025-12-15 Few-Step Distillation for Text-to-Image Generation: A Practical Guide Yifan Pu et.al. 2512.13006 translate read null
2025-12-15 SCAdapter: Content-Style Disentanglement for Diffusion Style Transfer Luan Thanh Trinh et.al. 2512.12963 translate read null
2025-12-15 Qonvolution: Towards Learning High-Frequency Signals with Queried Convolution Abhinav Kumar et.al. 2512.12898 translate read null
2025-12-14 Learning Common and Salient Generative Factors Between Two Image Datasets Yunlong He et.al. 2512.12800 translate read null
2025-12-14 Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling Yuran Wang et.al. 2512.12675 translate read null
2025-12-14 Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space Chengzhi Liu et.al. 2512.12623 translate read null
2025-12-14 Geometry-Aware Scene-Consistent Image Generation Cong Xie et.al. 2512.12598 translate read null
2025-12-14 Vision-Enhanced Large Language Models for High-Resolution Image Synthesis and Multimodal Data Interpretation Karthikeya KV et.al. 2512.12595 translate read null
2025-12-14 Differentiable Energy-Based Regularization in GANs: A Simulator-Based Exploration of VQE-Inspired Auxiliary Losses David Strnadel et.al. 2512.12581 translate read null
2025-12-14 SafeGen: Embedding Ethical Safeguards in Text-to-Image Generation Dang Phuong Nam et.al. 2512.12501 translate read null
2025-12-13 From Particles to Fields: Reframing Photon Mapping with Continuous Gaussian Photon Fields Jiachen Tao et.al. 2512.12459 translate read null
2025-12-13 Can GPT replace human raters? Validity and reliability of machine-generated norms for metaphors Veronica Mangiaterra et.al. 2512.12444 translate read null
2025-12-13 Anchoring Values in Temporal and Group Dimensions for Flow Matching Model Alignment Yawen Shao et.al. 2512.12387 translate read null
2025-12-13 Hellinger loss function for Generative Adversarial Networks Giovanni Saraceno et.al. 2512.12267 translate read null
2025-12-13 ProImage-Bench: Rubric-Based Evaluation for Professional Image Generation Minheng Ni et.al. 2512.12220 translate read null
2025-12-13 AutoMV: An Automatic Multi-Agent System for Music Video Generation Xiaoxuan Tang et.al. 2512.12196 translate read link
2025-12-13 A comparative study of generative models for child voice conversion Protima Nomo Sudro et.al. 2512.12129 translate read null
2025-12-12 CreativeVR: Diffusion-Prior-Guided Approach for Structure and Motion Restoration in Generative and Real Videos Tejas Panambur et.al. 2512.12060 translate read null
2025-12-12 From Earths to Super-Earths: Five New Small Planets Transiting M Dwarf Stars Jonathan Gomez Barrientos et.al. 2512.11971 translate read null
2025-12-09 Aesthetic Alignment Risks Assimilation: How Image Generation and Reward Models Reinforce Beauty Bias and Ideological “Censorship” Wenqi Marshall Guo et.al. 2512.11883 translate read null
2025-12-12 Smudged Fingerprints: A Systematic Evaluation of the Robustness of AI Image Fingerprints Kai Yao et.al. 2512.11771 translate read null
2025-12-12 Reducing Domain Gap with Diffusion-Based Domain Adaptation for Cell Counting Mohammad Dehghanmanshadi et.al. 2512.11763 translate read link
2025-12-12 SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder Minglei Shi et.al. 2512.11749 translate read link
2025-12-12 Reframing Music-Driven 2D Dance Pose Generation as Multi-Channel Image Generation Yan Zhang et.al. 2512.11720 translate read null
2025-12-12 EditMGT: Unleashing Potentials of Masked Generative Transformers in Image Editing Wei Chow et.al. 2512.11715 translate read null
2025-12-12 Fast and Explicit: Slice-to-Volume Reconstruction via 3D Gaussian Primitives with Analytic Point Spread Function Modeling Maik Dannecker et.al. 2512.11624 translate read null
2025-12-12 Super-Resolved Canopy Height Mapping from Sentinel-2 Time Series Using LiDAR HD Reference Data across Metropolitan France Ekaterina Kalinicheva et.al. 2512.11524 translate read null
2025-12-12 Exploring MLLM-Diffusion Information Transfer with MetaCanvas Han Lin et.al. 2512.11464 translate read null
2025-12-12 VFMF: World Modeling by Forecasting Vision Foundation Model Features Gabrijel Boduljak et.al. 2512.11225 translate read null
2025-12-11 Chemical composition and enrichment of the Centaurus cluster core seen by XRISM/Resolve F. Mernier et.al. 2512.11028 translate read null
2025-12-11 Generative Adversarial Variational Quantum Kolmogorov-Arnold Network Hikaru Wakaura et.al. 2512.11014 translate read null
2025-12-11 Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration Sicheng Mo et.al. 2512.10954 translate read null
2025-12-11 Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation Yiwen Tang et.al. 2512.10949 translate read null
2025-12-11 GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting Madhav Agarwal et.al. 2512.10939 translate read null
2025-12-11 Quantifying classical and quantum bounds for resolving closely spaced, non-interacting, simultaneously emitting dipole sources in optical microscopy Armine I. Dingilian et.al. 2512.10889 translate read null
2025-12-11 Interpretable and Steerable Concept Bottleneck Sparse Autoencoders Akshay Kulkarni et.al. 2512.10805 translate read null
2025-12-11 OutLines: Modeling Spectral Lines from Winds, Bubbles, and Outflows Sophia R. Flury et.al. 2512.10650 translate read null
2025-12-11 Lang2Motion: Bridging Language and Motion through Joint Embedding Spaces Bishoy Galoaa et.al. 2512.10617 translate read null
2025-12-11 Topology-Guided Quantum GANs for Constrained Graph Generation Tobias Rohe et.al. 2512.10582 translate read null
2025-12-11 Blink: Dynamic Visual Token Resolution for Enhanced Multimodal Understanding Yuchen Feng et.al. 2512.10548 translate read null
2025-12-11 Topology-Agnostic Animal Motion Generation from Text Prompt Keyi Chen et.al. 2512.10352 translate read null
2025-12-11 Zero-shot Adaptation of Stable Diffusion via Plug-in Hierarchical Degradation Representation for Real-World Super-Resolution Yi-Cheng Liao et.al. 2512.10340 translate read null
2025-12-10 Independent Density Estimation Jiahao Liu et.al. 2512.10067 translate read null
2025-12-10 MetaVoxel: Joint Diffusion Modeling of Imaging and Clinical Metadata Yihao Liu et.al. 2512.10041 translate read null
2025-12-10 Hybrid Finite Element and Least Squares Support Vector Regression Method for solving Partial Differential Equations with Legendre Polynomial Kernels Maryam Babaei et.al. 2512.09967 translate read null
2025-12-10 DynaIP: Dynamic Image Prompt Adapter for Scalable Zero-shot Personalized Text-to-Image Generation Zhizhong Wang et.al. 2512.09814 translate read null
2025-12-10 Stylized Meta-Album: Group-bias injection with style transfer to study robustness against distribution shifts Romain Mussard et.al. 2512.09773 translate read null
2025-12-10 SynthPix: A lightspeed PIV images generator Antonio Terpin et.al. 2512.09664 translate read null
2025-12-10 A Dual-Domain Convolutional Network for Hyperspectral Single-Image Super-Resolution Murat Karayaka et.al. 2512.09546 translate read null
2025-12-10 FunPhase: A Periodic Functional Autoencoder for Motion Generation via Phase Manifolds Marco Pegoraro et.al. 2512.09423 translate read null
2025-12-10 LongT2IBench: A Benchmark for Evaluating Long Text-to-Image Generation with Graph-structured Annotations Zhichao Yang et.al. 2512.09271 translate read null
2025-12-10 OmniPSD: Layered PSD Generation with Diffusion Transformer Cheng Liu et.al. 2512.09247 translate read null
2025-12-09 Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation Hao Chen et.al. 2512.09185 translate read null
2025-12-09 SuperF: Neural Implicit Fields for Multi-Image Super-Resolution Sander Riisøen Jyhne et.al. 2512.09115 translate read null
2025-12-09 Food Image Generation on Multi-Noun Categories Xinyue Pan et.al. 2512.09095 translate read null
2025-12-09 AgentComp: From Agentic Reasoning to Compositional Mastery in Text-to-Image Models Arman Zarei et.al. 2512.09081 translate read null
2025-12-06 An Efficient Test-Time Scaling Approach for Image Generation Vignesh Sundaresha et.al. 2512.08985 translate read null
2025-12-09 OSMO: Open-Source Tactile Glove for Human-to-Robot Skill Transfer Jessica Yin et.al. 2512.08920 translate read null
2025-12-09 Differentially Private Synthetic Data Generation Using Context-Aware GANs Anantaa Kotal et.al. 2512.08869 translate read null
2025-12-09 CARLoS: Retrieval via Concise Assessment Representation of LoRAs at Scale Shahar Sarfaty et.al. 2512.08826 translate read null
2025-12-09 Refining Visual Artifacts in Diffusion Models via Explainable AI-based Flaw Activation Maps Seoyeon Lee et.al. 2512.08774 translate read null
2025-12-09 Chain-of-Image Generation: Toward Monitorable and Controllable Image Generation Young Kyung Kim et.al. 2512.08645 translate read null
2025-12-09 A Novel Wasserstein Quaternion Generative Adversarial Network for Color Image Generation Zhigang Jia et.al. 2512.08542 translate read null
2025-12-09 PaintFlow: A Unified Framework for Interactive Oil Paintings Editing and Generation Zhangli Hu et.al. 2512.08534 translate read null
2025-12-09 Beyond the Noise: Aligning Prompts with Latent Representations in Diffusion Models Vasco Ramos et.al. 2512.08505 translate read null
2025-12-09 Magneton: Optimizing Energy Efficiency of ML Systems via Differential Energy Debugging Yi Pan et.al. 2512.08365 translate read null
2025-12-09 SCU-CGAN: Enhancing Fire Detection through Synthetic Fire Image Generation and Dataset Augmentation Ju-Young Kim et.al. 2512.08362 translate read null
2025-12-09 Interpreting Structured Perturbations in Image Protection Methods for Diffusion Models Michael R. Martin et.al. 2512.08329 translate read null
2025-12-09 OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Yexin Liu et.al. 2512.08294 translate read null
2025-12-09 FlowSteer: Conditioning Flow Field for Consistent Image Restoration Tharindu Wickremasinghe et.al. 2512.08125 translate read null
2025-12-08 One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation Yuan Gao et.al. 2512.07829 translate read null
2025-12-08 Distribution Matching Variational AutoEncoder Sen Ye et.al. 2512.07778 translate read null
2025-12-08 Guiding What Not to Generate: Automated Negative Prompting for Text-Image Alignment Sangha Park et.al. 2512.07702 translate read null
2025-12-08 DIST-CLIP: Arbitrary Metadata and Image Guided MRI Harmonization via Disentangled Anatomy-Contrast Representations Mehmet Yigit Avci et.al. 2512.07674 translate read null
2025-12-08 LongCat-Image Technical Report Meituan LongCat Team et.al. 2512.07584 translate read null
2025-12-08 SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation Yao Teng et.al. 2512.07503 translate read null
2025-12-08 MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition Xinyu Wei et.al. 2512.07348 translate read null
2025-12-08 DGGAN: Degradation Guided Generative Adversarial Network for Real-time Endoscopic Video Enhancement Handing Xu et.al. 2512.07253 translate read null
2025-12-08 Generating Storytelling Images with Rich Chains-of-Reasoning Xiujie Song et.al. 2512.07198 translate read null
2025-12-07 Evaluating and Preserving High-level Fidelity in Super-Resolution Josep M. Rocafort et.al. 2512.07037 translate read null
2025-12-07 Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods Panagiota Kiourti et.al. 2512.06665 translate read null
2025-12-07 Masked Autoencoder Pretraining on Strong-Lensing Images for Joint Dark-Matter Model Classification and Super-Resolution Achmad Ardani Prasha et.al. 2512.06642 translate read null
2025-12-06 Generic visuality of war? How image-generative AI models (mis)represent Russia’s war against Ukraine Mykola Makhortykh et.al. 2512.06570 translate read null
2025-12-06 SUGAR: A Sweeter Spot for Generative Unlearning of Many Identities Dung Thuy Nguyen et.al. 2512.06562 translate read null
2025-12-06 AGORA: Adversarial Generation Of Real-time Animatable 3D Gaussian Head Avatars Ramazan Fazylov et.al. 2512.06438 translate read null
2025-12-06 TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search Kaicheng Yang et.al. 2512.06353 translate read null
2025-12-04 PrefGen: Multimodal Preference Learning for Preference-Conditioned Image Generation Wenyi Mo et.al. 2512.06020 translate read null
2025-12-05 EditThinker: Unlocking Iterative Reasoning for Any Image Editor Hongyu Li et.al. 2512.05965 translate read null
2025-12-05 Impugan: Learning Conditional Generative Models for Robust Data Imputation Zalish Mahmud et.al. 2512.05950 translate read null
2025-12-05 Underwater Image Reconstruction Using a Swin Transformer-Based Generator and PatchGAN Discriminator Md. Mahbub Hasan Akash et.al. 2512.05866 translate read null
2025-12-05 HQ-DM: Single Hadamard Transformation-Based Quantization-Aware Training for Low-Bit Diffusion Models Shizhuo Mao et.al. 2512.05746 translate read null
2025-12-05 China Regional 3km Downscaling Based on Residual Corrective Diffusion Model Honglu Sun et.al. 2512.05377 translate read null
2025-12-04 CARD: Correlation Aware Restoration with Diffusion Niki Nezakati et.al. 2512.05268 translate read null
2025-12-04 Invariance Co-training for Robot Visual Generalization Jonathan Yang et.al. 2512.05230 translate read null
2025-12-03 EFDiT: Efficient Fine-grained Image Generation Using Diffusion Transformer Models Kun Wang et.al. 2512.05152 translate read null
2025-12-04 DraCo: Draft as CoT for Text-to-Image Preview and Rare Concept Generation Dongzhi Jiang et.al. 2512.05112 translate read null
2025-12-04 NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation Yu Zeng et.al. 2512.05106 translate read null
2025-12-04 Semantic-Guided Two-Stage GAN for Face Inpainting with Hybrid Perceptual Encoding Abhigyan Bhattacharya et.al. 2512.05039 translate read null
2025-12-04 Generative Neural Video Compression via Video Diffusion Prior Qi Mao et.al. 2512.05016 translate read null
2025-12-04 Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models NaHyeon Park et.al. 2512.04981 translate read null
2025-12-04 Autoregressive Image Generation Needs Only a Few Lines of Cached Tokens Ziran Qin et.al. 2512.04857 translate read null
2025-12-04 FreeGen: Feed-Forward Reconstruction-Generation Co-Training for Free-Viewpoint Driving Scene Synthesis Shijie Chen et.al. 2512.04830 translate read null
2025-12-04 LaFiTe: A Generative Latent Field for 3D Native Texturing Chia-Hao Chen et.al. 2512.04786 translate read null
2025-12-02 PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling Bowen Ping et.al. 2512.04784 translate read null
2025-12-04 Multi Task Denoiser Training for Solving Linear Inverse Problems Clément Bled et.al. 2512.04709 translate read null
2025-12-04 OmniScaleSR: Unleashing Scale-Controlled Diffusion Prior for Faithful and Realistic Arbitrary-Scale Image Super-Resolution Xinning Chai et.al. 2512.04699 translate read null
2025-12-04 Controllable Long-term Motion Generation with Extended Joint Targets Eunjong Lee et.al. 2512.04487 translate read null
2025-12-04 Not All Birds Look The Same: Identity-Preserving Generation For Birds Aaron Sun et.al. 2512.04485 translate read null
2025-12-04 Adversarial Limits of Quantum Certification: When Eve Defeats Detection Davut Emre Tasar et.al. 2512.04391 translate read null
2025-12-04 FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring Geunhyuk Youk et.al. 2512.04390 translate read null
2025-12-03 Learning Single-Image Super-Resolution in the JPEG Compressed Domain Sruthi Srinivasan et.al. 2512.04284 translate read null
2025-12-03 Plug-and-Play Image Restoration with Flow Matching: A Continuous Viewpoint Fan Jia et.al. 2512.04283 translate read null
2025-12-03 UniLight: A Unified Representation for Lighting Zitian Zhang et.al. 2512.04267 translate read null
2025-12-03 Fast & Efficient Normalizing Flows and Applications of Image Generative Models Sandeep Nagar et.al. 2512.04039 translate read null
2025-12-03 Beyond the Ground Truth: Enhanced Supervision for Image Restoration Donghun Ryou et.al. 2512.03932 translate read null
2025-12-03 LSRS: Latent Scale Rejection Sampling for Visual Autoregressive Modeling Hong-Kai Zheng et.al. 2512.03796 translate read null
2025-12-03 Evaluation of Foundational Machine Learned Interatomic Potentials for Migration Barrier Predictions Achinthya Krishna Bheemaguli et.al. 2512.03642 translate read null
2025-12-03 CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation Ruoxuan Zhang et.al. 2512.03540 translate read null
2025-12-03 Text-Printed Image: Bridging the Image-Text Modality Gap for Text-centric Training of Large Vision-Language Models Shojiro Yamabe et.al. 2512.03463 translate read null
2025-12-02 PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement Haitian Zheng et.al. 2512.03247 translate read null
2025-12-02 Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models Xiwen Wei et.al. 2512.03125 translate read null
2025-12-02 DiverseAR: Boosting Diversity in Bitwise Autoregressive Image Generation Ying Yang et.al. 2512.02931 translate read null
2025-12-02 Glance: Accelerating Diffusion Models with 1 Sample Zhuobai Dong et.al. 2512.02899 translate read null
2025-12-02 Leveraging generative adversarial networks with spatially adaptive denormalization for multivariate stochastic seismic data inversion Roberto Miele et.al. 2512.02863 translate read null
2025-12-01 PhyCustom: Towards Realistic Physical Customization in Text-to-Image Generation Fan Wu et.al. 2512.02794 translate read null
2025-12-02 Channel Knowledge Map Construction via Physics-Inspired Diffusion Model Without Prior Observations Yunzhe Zhu et.al. 2512.02757 translate read null
2025-12-02 Training Data Attribution for Image Generation using Ontology-Aligned Knowledge Graphs Theodoros Aivalis et.al. 2512.02713 translate read null
2025-12-02 PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution Zhongbao Yang et.al. 2512.02681 translate read null
2025-12-02 OmniPerson: Unified Identity-Preserving Pedestrian Generation Changxiao Ma et.al. 2512.02554 translate read null
2025-12-02 Two-Stage Vision Transformer for Image Restoration: Colorization Pretraining + Residual Upsampling Aditya Chaudhary et.al. 2512.02512 translate read null
2025-12-02 Bayesian Physics-Informed Neural Networks for Inverse Problems (BPINN-IP): Application in Infrared Image Processing Ali Mohammad-Djafari et.al. 2512.02495 translate read null
2025-12-02 ClusterStyle: Modeling Intra-Style Diversity with Prototypical Clustering for Stylized Motion Generation Kerui Chen et.al. 2512.02453 translate read null
2025-12-02 SAGE: Style-Adaptive Generalization for Privacy-Constrained Semantic Segmentation Across Domains Qingmei Li et.al. 2512.02369 translate read null
2025-12-01 Progressive Image Restoration via Text-Conditioned Video Generation Peng Kang et.al. 2512.02273 translate read null
2025-12-01 Visible to Longwave-infrared imaging via an inverse-designed monolithic lens Syed N. Qadri et.al. 2512.02184 translate read null
2025-12-01 SplatSuRe: Selective Super-Resolution for Multi-view Consistent 3D Gaussian Splatting Pranav Asthana et.al. 2512.02172 translate read null
2025-12-01 FineGRAIN: Evaluating Failure Modes of Text-to-Image Models with Vision Language Model Judges Kevin David Hayes et.al. 2512.02161 translate read null
2025-12-01 Data-Centric Visual Development for Self-Driving Labs Anbang Liu et.al. 2512.02018 translate read null
2025-12-01 Envision: Benchmarking Unified Understanding & Generation for Causal World Process Insights Juanxi Tian et.al. 2512.01816 translate read null
2025-12-01 ViT$^3$: Unlocking Test-Time Training in Vision Dongchen Han et.al. 2512.01643 translate read null
2025-12-01 LLM2Fx-Tools: Tool Calling For Music Post-Production Seungheon Doh et.al. 2512.01559 translate read null
2025-12-01 ResDiT: Evoking the Intrinsic Resolution Scalability in Diffusion Transformers Yiyang Ma et.al. 2512.01426 translate read null
2025-12-01 FRAMER: Frequency-Aligned Self-Distillation with Adaptive Modulation Leveraging Diffusion Priors for Real-World Image Super-Resolution Seungho Choi et.al. 2512.01390 translate read null
2025-12-01 FOD-S2R: A FOD Dataset for Sim2Real Transfer Learning based Object Detection Ashish Vashist et.al. 2512.01315 translate read null
2025-12-01 Generative Adversarial Gumbel MCTS for Abstract Visual Composition Generation Zirui Zhao et.al. 2512.01242 translate read null
2025-12-01 PSR: Scaling Multi-Subject Personalized Image Generation with Pairwise Subject-Consistency Rewards Shulei Wang et.al. 2512.01236 translate read null
