Image Generation - 2026-01
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2026-01-27 | A General-Purpose Diversified 2D Seismic Image Dataset from NAMSS | Lucas de Magalhães Araujo et.al. | 2602.04890 | translate | read | null |
| 2026-01-31 | Trajectory Consistency for One-Step Generation on Euler Mean Flows | Zhiqi Li et.al. | 2602.02571 | translate | read | null |
| 2026-01-30 | Super-résolution non supervisée d’images hyperspectrales de télédétection utilisant un entraînement entièrement synthétique | Xinxin Xu et.al. | 2602.02552 | translate | read | null |
| 2026-01-31 | DIAMOND: Directed Inference for Artifact Mitigation in Flow Matching Models | Alicja Polowczyk et.al. | 2602.00883 | translate | read | null |
| 2026-01-31 | Lightweight Super Resolution-enabled Coding Model for the JPEG Pleno Learning-based Point Cloud Coding Standard | André F. R. Guarda et.al. | 2602.00863 | translate | read | null |
| 2026-01-31 | RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation | Yuhao Huang et.al. | 2602.00849 | translate | read | null |
| 2026-01-31 | Super-resolution Imaging of Limited-size Objects | Taeyong Chang et.al. | 2602.00719 | translate | read | null |
| 2026-01-31 | Improving Neuropathological Reconstruction Fidelity via AI Slice Imputation | Marina Crespo Aguirre et.al. | 2602.00669 | translate | read | null |
| 2026-01-31 | FaceSnap: Enhanced ID-fidelity Network for Tuning-free Portrait Customization | Benxiang Zhai et.al. | 2602.00627 | translate | read | null |
| 2026-01-31 | Tune-Your-Style: Intensity-tunable 3D Style Transfer with Gaussian Splatting | Yian Zhao et.al. | 2602.00618 | translate | read | null |
| 2026-01-31 | Inference-Only Prompt Projection for Safe Text-to-Image Generation with TV Guarantees | Minhyuk Lee et.al. | 2602.00616 | translate | read | null |
| 2026-01-31 | Bridging Degradation Discrimination and Generation for Universal Image Restoration | JiaKui Hu et.al. | 2602.00579 | translate | read | null |
| 2026-01-31 | Toward Autonomous Laboratory Safety Monitoring with Vision Language Models: Learning to See Hazards Through Scene Structure | Trishna Chakraborty et.al. | 2602.00414 | translate | read | null |
| 2026-01-31 | Alignment of Diffusion Model and Flow Matching for Text-to-Image Generation | Yidong Ouyang et.al. | 2602.00413 | translate | read | null |
| 2026-01-30 | PLACID: Identity-Preserving Multi-Object Compositing via Video Diffusion with Synthetic Trajectories | Gemma Canet Tarrés et.al. | 2602.00267 | translate | read | null |
| 2026-01-30 | MapDream: Task-Driven Map Learning for Vision-Language Navigation | Guoxin Lian et.al. | 2602.00222 | translate | read | null |
| 2026-01-30 | Benchmarking Vanilla GAN, DCGAN, and WGAN Architectures for MRI Reconstruction: A Quantitative Analysis | Humaira Mehwish et.al. | 2602.00221 | translate | read | null |
| 2026-01-30 | Stabilizing Diffusion Posterior Sampling by Noise–Frequency Continuation | Feng Tian et.al. | 2602.00176 | translate | read | null |
| 2026-01-30 | PaperBanana: Automating Academic Illustration for AI Scientists | Dawei Zhu et.al. | 2601.23265 | translate | read | null |
| 2026-01-30 | How well do generative models solve inverse problems? A benchmark study | Patrick Krüger et.al. | 2601.23238 | translate | read | null |
| 2026-01-30 | Solving Inverse Problems with Flow-based Models via Model Predictive Control | George Webber et.al. | 2601.23231 | translate | read | null |
| 2026-01-30 | Scale-Cascaded Diffusion Models for Super-Resolution in Medical Imaging | Darshan Thaker et.al. | 2601.23201 | translate | read | null |
| 2026-01-30 | Adaptive Edge Learning for Density-Aware Graph Generation | Seyedeh Ava Razi Razavi et.al. | 2601.23052 | translate | read | null |
| 2026-01-30 | Improving Supervised Machine Learning Performance in Optical Quality Control via Generative AI for Dataset Expansion | Dennis Sprute et.al. | 2601.22961 | translate | read | null |
| 2026-01-30 | MoVE: Mixture of Value Embeddings – A New Axis for Scaling Parametric Memory in Autoregressive Models | Yangyan Li et.al. | 2601.22887 | translate | read | null |
| 2026-01-30 | Synthetic Time Series Generation via Complex Networks | Jaime Vale et.al. | 2601.22879 | translate | read | null |
| 2026-01-30 | NativeTok: Native Visual Tokenization for Improved Image Generation | Bin Wu et.al. | 2601.22837 | translate | read | null |
| 2026-01-30 | Synthetic Abundance Maps for Unsupervised Super-Resolution of Hyperspectral Remote Sensing Images | Xinxin Xu et.al. | 2601.22755 | translate | read | null |
| 2026-01-30 | Stabilizing Consistency Training: A Flow Map Analysis and Self-Distillation | Youngjoong Kim et.al. | 2601.22679 | translate | read | null |
| 2026-01-30 | LINA: Linear Autoregressive Image Generative Models with Continuous Tokens | Jiahao Wang et.al. | 2601.22630 | translate | read | null |
| 2026-01-30 | Time-Annealed Perturbation Sampling: Diverse Generation for Diffusion Language Models | Jingxuan Wu et.al. | 2601.22629 | translate | read | null |
| 2026-01-30 | Corrected Samplers for Discrete Flow Models | Zhengyan Wan et.al. | 2601.22519 | translate | read | null |
| 2026-01-30 | DreamVAR: Taming Reinforced Visual Autoregressive Model for High-Fidelity Subject-Driven Image Generation | Xin Jiang et.al. | 2601.22507 | translate | read | null |
| 2026-01-30 | ScribbleSense: Generative Scribble-Based Texture Editing with Intent Prediction | Yudi Zhang et.al. | 2601.22455 | translate | read | null |
| 2026-01-29 | Jailbreaks on Vision Language Model via Multimodal Reasoning | Aarush Noheria et.al. | 2601.22398 | translate | read | null |
| 2026-01-29 | Little Red Dots on FIRE: The Ability of Bursty Galaxies to Host an Abundant Population of High-Redshift AGN | Andrew Marszewski et.al. | 2601.22213 | translate | read | null |
| 2026-01-29 | One-step Latent-free Image Generation with Pixel Mean Flows | Yiyang Lu et.al. | 2601.22158 | translate | read | null |
| 2026-01-29 | Creative Image Generation with Diffusion Models | Kunpeng Song et.al. | 2601.22125 | translate | read | null |
| 2026-01-29 | RefAny3D: 3D Asset-Referenced Diffusion Models for Image Generation | Hanzhuo Huang et.al. | 2601.22094 | translate | read | null |
| 2026-01-29 | Investigating Associational Biases in Inter-Model Communication of Large Generative Models | Fethiye Irmak Dogan et.al. | 2601.22093 | translate | read | null |
| 2026-01-29 | MetricAnything: Scaling Metric Depth Pretraining with Noisy Heterogeneous Sources | Baorui Ma et.al. | 2601.22054 | translate | read | null |
| 2026-01-29 | SmartMeterFM: Unifying Smart Meter Data Generative Tasks Using Flow Matching Models | Nan Lin et.al. | 2601.21706 | translate | read | null |
| 2026-01-29 | Noise as a Probe: Membership Inference Attacks on Diffusion Models Leveraging Initial Noise | Puwei Lian et.al. | 2601.21628 | translate | read | null |
| 2026-01-29 | SimGraph: A Unified Framework for Scene Graph-Based Image Generation and Editing | Thanh-Nhan Vo et.al. | 2601.21498 | translate | read | null |
| 2026-01-29 | Revisiting Diffusion Model Predictions Through Dimensionality | Qing Jin et.al. | 2601.21419 | translate | read | null |
| 2026-01-29 | SR$^{2}$-Net: A General Plug-and-Play Model for Spectral Refinement in Hyperspectral Image Super-Resolution | Ji-Xuan He et.al. | 2601.21338 | translate | read | null |
| 2026-01-29 | Optimization and Mobile Deployment for Anthropocene Neural Style Transfer | Po-Hsun Chen et.al. | 2601.21141 | translate | read | null |
| 2026-01-28 | Shape of Thought: Progressive Object Assembly via Visual Chain-of-Thought | Yu Huo et.al. | 2601.21081 | translate | read | null |
| 2026-01-28 | CompSRT: Quantization and Pruning for Image Super Resolution Transformers | Dorsa Zeinali et.al. | 2601.21069 | translate | read | null |
| 2026-01-28 | Non-Markov Multi-Round Conversational Image Generation with History-Conditioned MLLMs | Haochen Zhang et.al. | 2601.20911 | translate | read | null |
| 2026-01-28 | End-to-end example-based sim-to-real RL policy transfer based on neural stylisation with application to robotic cutting | Jamie Hathaway et.al. | 2601.20846 | translate | read | null |
| 2026-01-28 | Compressible Turbulence as a Source of Particle Beams and Ion Bernstein Waves in Collisionless Plasmas | Chuanpeng Hou et.al. | 2601.20842 | translate | read | null |
| 2026-01-28 | Detecting and Mitigating Memorization in Diffusion Models through Anisotropy of the Log-Probability | Rohan Asthana et.al. | 2601.20642 | translate | read | null |
| 2026-01-28 | CM-GAI: Continuum Mechanistic Generative Artificial Intelligence Theory for Data Dynamics | Shan Tang et.al. | 2601.20462 | translate | read | null |
| 2026-01-28 | Exploiting the Final Component of Generator Architectures for AI-Generated Image Detection | Yanzhu Liu et.al. | 2601.20461 | translate | read | null |
| 2026-01-28 | OSDEnhancer: Taming Real-World Space-Time Video Super-Resolution with One-Step Diffusion | Shuoyan Wei et.al. | 2601.20308 | translate | read | null |
| 2026-01-28 | Reversible Efficient Diffusion for Image Fusion | Xingxin Xu et.al. | 2601.20260 | translate | read | null |
| 2026-01-28 | BLENDER: Blended Text Embeddings and Diffusion Residuals for Intra-Class Image Synthesis in Deep Metric Learning | Jan Niklas Kolf et.al. | 2601.20246 | translate | read | null |
| 2026-01-28 | DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment | Haoyou Deng et.al. | 2601.20218 | translate | read | null |
| 2026-01-28 | TeleStyle: Content-Preserving Style Transfer in Images and Videos | Shiwen Zhang et.al. | 2601.20175 | translate | read | null |
| 2026-01-27 | DiffStyle3D: Consistent 3D Gaussian Stylization via Attention Optimization | Yitong Yang et.al. | 2601.19717 | translate | read | null |
| 2026-01-27 | Cortex-Grounded Diffusion Models for Brain Image Generation | Fabian Bongratz et.al. | 2601.19498 | translate | read | null |
| 2026-01-27 | Engineering Quantum Emission with Mie Voids | Yuchao Fu et.al. | 2601.19420 | translate | read | null |
| 2026-01-27 | LightSBB-M: Bridging Schrödinger and Bass for Generative Diffusion Modeling | Alexandre Alouadi et.al. | 2601.19312 | translate | read | null |
| 2026-01-27 | A sixth-order compact time-splitting Fourier pseudospectral method | Weiguo Gao et.al. | 2601.19172 | translate | read | null |
| 2026-01-27 | GTFMN: Guided Texture and Feature Modulation Network for Low-Light Image Enhancement and Super-Resolution | Yongsong Huang et.al. | 2601.19157 | translate | read | null |
| 2026-01-27 | CLIP-Guided Unsupervised Semantic-Aware Exposure Correction | Puzhen Wu et.al. | 2601.19129 | translate | read | null |
| 2026-01-27 | FBSDiff++: Improved Frequency Band Substitution of Diffusion Features for Efficient and Highly Controllable Text-Driven Image-to-Image Translation | Xiang Gao et.al. | 2601.19115 | translate | read | null |
| 2026-01-26 | Pay Attention to Where You Look | Alex Beriand et.al. | 2601.18970 | translate | read | null |
| 2026-01-26 | Advances in Diffusion-Based Generative Compression | Yibo Yang et.al. | 2601.18932 | translate | read | null |
| 2026-01-26 | SelfieAvatar: Real-time Head Avatar reenactment from a Selfie Video | Wei Liang et.al. | 2601.18851 | translate | read | null |
| 2026-01-26 | OptiGAN for Crystal Arrays: Physics-Informed Generative Modeling of Optical Photon Transport in PET Detector Arrays | Stephan Naunheim et.al. | 2601.18780 | translate | read | null |
| 2026-01-26 | GimmBO: Interactive Generative Image Model Merging via Bayesian Optimization | Chenxi Liu et.al. | 2601.18585 | translate | read | null |
| 2026-01-26 | GenAgent: Scaling Text-to-Image Generation via Agentic Multimodal Reasoning | Kaixun Jiang et.al. | 2601.18543 | translate | read | null |
| 2026-01-23 | SoS: Analysis of Surface over Semantics in Multilingual Text-To-Image Generation | Carolin Holtermann et.al. | 2601.16803 | translate | read | null |
| 2026-01-23 | A Novel Transfer Learning Approach for Mental Stability Classification from Voice Signal | Rafiul Islam et.al. | 2601.16793 | translate | read | null |
| 2026-01-23 | Evaluating Generative AI in the Lab: Methodological Challenges and Guidelines | Hyerim Park et.al. | 2601.16740 | translate | read | null |
| 2026-01-23 | Sim-to-Real Transfer via a Style-Identified Cycle Consistent Generative Adversarial Network: Zero-Shot Deployment on Robotic Manipulators through Visual Domain Adaptation | Lucía Güitta-López et.al. | 2601.16677 | translate | read | null |
| 2026-01-23 | Fast, faithful and photorealistic diffusion-based image super-resolution with enhanced Flow Map models | Maxence Noble et.al. | 2601.16660 | translate | read | null |
| 2026-01-23 | Edge-Aware Image Manipulation via Diffusion Models with a Novel Structure-Preservation Loss | Minsu Gong et.al. | 2601.16645 | translate | read | link |
| 2026-01-23 | Unsupervised Super-Resolution of Hyperspectral Remote Sensing Images Using Fully Synthetic Training | Xinxin Xu et.al. | 2601.16602 | translate | read | null |
| 2026-01-23 | DeMark: A Query-Free Black-Box Attack on Deepfake Watermarking Defenses | Wei Song et.al. | 2601.16473 | translate | read | null |
| 2026-01-23 | Secure Intellicise Wireless Network: Agentic AI for Coverless Semantic Steganography Communication | Rui Meng et.al. | 2601.16472 | translate | read | null |
| 2026-01-23 | Mode Conversion of Hyperbolic Phonon Polaritons in van der Waals terraces | Byung-Il Noh et.al. | 2601.16465 | translate | read | null |
| 2026-01-23 | A Cosine Network for Image Super-Resolution | Chunwei Tian et.al. | 2601.16413 | translate | read | null |
| 2026-01-22 | ProGiDiff: Prompt-Guided Diffusion-Based Medical Image Segmentation | Yuan Lin et.al. | 2601.16060 | translate | read | null |
| 2026-01-22 | Understanding the Transfer Limits of Vision Foundation Models | Shiqi Huang et.al. | 2601.15888 | translate | read | null |
| 2026-01-22 | PMPBench: A Paired Multi-Modal Pan-Cancer Benchmark for Medical Image Synthesis | Yifan Chen et.al. | 2601.15884 | translate | read | null |
| 2026-01-22 | Uncertainty-guided Generation of Dark-field Radiographs | Lina Felsner et.al. | 2601.15859 | translate | read | null |
| 2026-01-22 | TinySense: Effective CSI Compression for Scalable and Accurate Wi-Fi Sensing | Toan Gian et.al. | 2601.15838 | translate | read | null |
| 2026-01-22 | Diffusion Model-Based Data Augmentation for Enhanced Neuron Segmentation | Liuyun Jiang et.al. | 2601.15779 | translate | read | null |
| 2026-01-22 | Beyond Visual Safety: Jailbreaking Multimodal Large Language Models for Harmful Image Generation via Semantic-Agnostic Inputs | Mingyu Yu et.al. | 2601.15698 | translate | read | null |
| 2026-01-22 | Consistency-Regularized GAN for Few-Shot SAR Target Recognition | Yikui Zhai et.al. | 2601.15681 | translate | read | null |
| 2026-01-22 | Relative Classification Accuracy: A Calibrated Metric for Identity Consistency in Fine-Grained K-pop Face Generation | Sylvey Lin et.al. | 2601.15560 | translate | read | null |
| 2026-01-21 | Controllable Layered Image Generation for Real-World Editing | Jinrui Yang et.al. | 2601.15507 | translate | read | null |
| 2026-01-21 | Hybrid Vision Transformer_GAN Attribute Neutralizer for Mitigating Bias in Chest X_Ray Diagnosis | Jobeal Solomon et.al. | 2601.15490 | translate | read | null |
| 2026-01-21 | Ambient Dataloops: Generative Models for Dataset Refinement | Adrián Rodríguez-Muñoz et.al. | 2601.15417 | translate | read | null |
| 2026-01-21 | GeMM-GAN: A Multimodal Generative Model Conditioned on Histopathology Images and Clinical Descriptions for Gene Expression Profile Generation | Francesca Pia Panaccione et.al. | 2601.15392 | translate | read | link |
| 2026-01-21 | OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation | Letian Zhang et.al. | 2601.15369 | translate | read | null |
| 2026-01-21 | Aligned Stable Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency | Yikai Wang et.al. | 2601.15368 | translate | read | null |
| 2026-01-21 | Iterative Refinement Improves Compositional Image Generation | Shantanu Jaiswal et.al. | 2601.15286 | translate | read | null |
| 2026-01-21 | Field-Space Autoencoder for Scalable Climate Emulators | Johannes Meuer et.al. | 2601.15102 | translate | read | null |
| 2026-01-21 | The Pictorial Cortex: Zero-Shot Cross-Subject fMRI-to-Image Reconstruction via Compositional Latent Modeling | Jingyang Huo et.al. | 2601.15071 | translate | read | null |
| 2026-01-21 | Differential Privacy Image Generation with Reconstruction Loss and Noise Injection Using an Error Feedback SGD | Qiwei Ma et.al. | 2601.15061 | translate | read | null |
| 2026-01-21 | HyperNet-Adaptation for Diffusion-Based Test Case Generation | Oliver Weißl et.al. | 2601.15041 | translate | read | null |
| 2026-01-21 | TempViz: On the Evaluation of Temporal Knowledge in Text-to-Image Models | Carolin Holtermann et.al. | 2601.14951 | translate | read | null |
| 2026-01-21 | Synthetic Data Augmentation for Multi-Task Chinese Porcelain Classification: A Stable Diffusion Approach | Ziyao Ling et.al. | 2601.14791 | translate | read | null |
| 2026-01-21 | Semantic-Guided Unsupervised Video Summarization | Haizhou Liu et.al. | 2601.14773 | translate | read | null |
| 2026-01-21 | Enhancing Text-to-Image Generation via End-Edge Collaborative Hybrid Super-Resolution | Chongbin Yi et.al. | 2601.14741 | translate | read | null |
| 2026-01-21 | Mirai: Autoregressive Visual Generation Needs Foresight | Yonghao Yu et.al. | 2601.14671 | translate | read | null |
| 2026-01-21 | 3D Space as a Scratchpad for Editable Text-to-Image Generation | Oindrila Saha et.al. | 2601.14602 | translate | read | null |
| 2026-01-20 | Quantum Super-resolution by Adaptive Non-local Observables | Hsin-Yi Lin et.al. | 2601.14433 | translate | read | null |
| 2026-01-20 | When Generative AI Is Intimate, Sexy, and Violent: Examining Not-Safe-For-Work (NSFW) Chatbots on FlowGPT | Xian Li et.al. | 2601.14324 | translate | read | null |
| 2026-01-20 | Implicit Neural Representation Facilitates Unified Universal Vision Encoding | Matthew Gwilliam et.al. | 2601.14256 | translate | read | link |
| 2026-01-20 | Style Transfer as Bias Mitigation: Diffusion Models for Synthetic Mental Health Text for Arabic | Saad Mankarious et.al. | 2601.14124 | translate | read | null |
| 2026-01-20 | Likelihood-Separable Diffusion Inference for Multi-Image MRI Super-Resolution | Samuel W. Remedios et.al. | 2601.14030 | translate | read | null |
| 2026-01-20 | SHARE: A Fully Unsupervised Framework for Single Hyperspectral Image Restoration | Jiangwei Xie et.al. | 2601.13987 | translate | read | null |
| 2026-01-20 | Asymmetric regularization mechanism for GAN training with Variational Inequalities | Spyridon C. Giagtzoglou et.al. | 2601.13920 | translate | read | null |
| 2026-01-20 | Prospecting MeerKAT Continuum Data for Enigmatic Radio Sources with Unsupervised Vector-Quantised Variational Autoencoders | Fernando L. Ventura et.al. | 2601.13901 | translate | read | null |
| 2026-01-20 | Generative Adversarial Networks for Resource State Generation | Shahbaz Shaik et.al. | 2601.13708 | translate | read | null |
| 2026-01-20 | Dynamic Differential Linear Attention: Enhancing Linear Diffusion Transformer for High-Quality Image Generation | Boyuan Cao et.al. | 2601.13683 | translate | read | null |
| 2026-01-19 | SpatialBench-UC: Uncertainty-Aware Evaluation of Spatial Prompt Following in Text-to-Image Generation | Amine Rostane et.al. | 2601.13462 | translate | read | null |
| 2026-01-19 | Diffusion Representations for Fine-Grained Image Classification: A Marine Plankton Case Study | A. Nieto Juscafresa et.al. | 2601.13416 | translate | read | null |
| 2026-01-19 | StyMam: A Mamba-Based Generator for Artistic Style Transfer | Zhou Hong et.al. | 2601.12954 | translate | read | null |
| 2026-01-19 | AI-generated data contamination erodes pathological variability and diagnostic reliability | Hongyu He et.al. | 2601.12946 | translate | read | null |
| 2026-01-19 | Generalizable and Animatable 3D Full-Head Gaussian Avatar from a Single Image | Shuling Zhao et.al. | 2601.12770 | translate | read | null |
| 2026-01-19 | SSPFormer: Self-Supervised Pretrained Transformer for MRI Images | Jingkai Li et.al. | 2601.12747 | translate | read | null |
| 2026-01-16 | ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models | Linqing Zhong et.al. | 2601.11404 | translate | read | null |
| 2026-01-16 | Shape-morphing programming of soft materials on complex geometries via neural operator | Lu Chen et.al. | 2601.11126 | translate | read | null |
| 2026-01-16 | Generation of Chest CT pulmonary Nodule Images by Latent Diffusion Models using the LIDC-IDRI Dataset | Kaito Urata et.al. | 2601.11085 | translate | read | null |
| 2026-01-15 | The BigBite Calorimeter for the Super Bigbite Spectrometer Program at Jefferson Lab | Provakar Datta et.al. | 2601.10799 | translate | read | null |
| 2026-01-15 | A Safety Report on GPT-5.2, Gemini 3 Pro, Qwen3-VL, Grok 4.1 Fast, Nano Banana Pro, and Seedream 4.5 | Xingjun Ma et.al. | 2601.10527 | translate | read | link |
| 2026-01-15 | Think-Then-Generate: Reasoning-Aware Text-to-Image Diffusion with LLM Encoders | Siqi Kou et.al. | 2601.10332 | translate | read | link |
| 2026-01-15 | Multilingual-To-Multimodal (M2M): Unlocking New Languages with Monolingual Text | Piyush Singh Pasi et.al. | 2601.10096 | translate | read | link |
| 2026-01-15 | Thinking Like Van Gogh: Structure-Aware Style Transfer via Flow-Guided 3D Gaussian Splatting | Zhendong Wang et.al. | 2601.10075 | translate | read | null |
| 2026-01-15 | CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation | Chengzhuo Tong et.al. | 2601.10061 | translate | read | link |
| 2026-01-15 | Resistive Memory based Efficient Machine Unlearning and Continual Learning | Ning Lin et.al. | 2601.10037 | translate | read | null |
| 2026-01-14 | VibrantSR: Sub-Meter Canopy Height Models from Sentinel-2 Using Generative Flow Matching | Kiarie Ndegwa et.al. | 2601.09866 | translate | read | null |
| 2026-01-14 | NanoSD: Edge Efficient Foundation Model for Real Time Image Restoration | Subhajit Sanyal et.al. | 2601.09823 | translate | read | null |
| 2026-01-14 | Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning | Dongjie Cheng et.al. | 2601.09536 | translate | read | link |
| 2026-01-14 | Detail Loss in Super-Resolution Models Based on the Laplacian Pyramid and Repeated Upscaling and Downscaling Process | Sangjun Han et.al. | 2601.09410 | translate | read | null |
| 2026-01-14 | PhyRPR: Training-Free Physics-Constrained Video Generation | Yibo Zhao et.al. | 2601.09255 | translate | read | null |
| 2026-01-14 | Knowledge-Embedded and Hypernetwork-Guided Few-Shot Substation Meter Defect Image Generation Method | Jackie Alex et.al. | 2601.09238 | translate | read | null |
| 2026-01-14 | SpikeVAEDiff: Neural Spike-based Natural Visual Scene Reconstruction via VD-VAE and Versatile Diffusion | Jialu Li et.al. | 2601.09213 | translate | read | null |
| 2026-01-14 | Annealed Relaxation of Speculative Decoding for Faster Autoregressive Image Generation | Xingyao Li et.al. | 2601.09212 | translate | read | null |
| 2026-01-14 | Architecture inside the mirage: evaluating generative image models on architectural style, elements, and typologies | Jamie Magrill et.al. | 2601.09169 | translate | read | null |
| 2026-01-14 | How Many Human Judgments Are Enough? Feasibility Limits of Human Preference Evaluation | Wilson Y. Lee et.al. | 2601.09084 | translate | read | null |
| 2026-01-12 | TAG-MoE: Task-Aware Gating for Unified Generative Mixture-of-Experts | Yu Xu et.al. | 2601.08881 | translate | read | null |
| 2026-01-13 | S3-CLIP: Video Super Resolution for Person-ReID | Tamas Endrei et.al. | 2601.08807 | translate | read | null |
| 2026-01-13 | Aggregating Diverse Cue Experts for AI-Generated Image Detection | Lei Tan et.al. | 2601.08790 | translate | read | null |
| 2026-01-13 | Translating Light-Sheet Microscopy Images to Virtual H&E Using CycleGAN | Yanhua Zhao et.al. | 2601.08776 | translate | read | null |
| 2026-01-13 | SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models | Renyang Liu et.al. | 2601.08623 | translate | read | null |
| 2026-01-13 | MASH: Evading Black-Box AI-Generated Text Detectors via Style Humanization | Yongtong Gu et.al. | 2601.08564 | translate | read | null |
| 2026-01-13 | From Local Windows to Adaptive Candidates via Individualized Exploratory: Rethinking Attention for Image Super-Resolution | Chunyu Meng et.al. | 2601.08341 | translate | read | null |
| 2026-01-13 | IGAN: A New Inception-based Model for Stable and High-Fidelity Image Synthesis Using Generative Adversarial Networks | Ahmed A. Hashim et.al. | 2601.08332 | translate | read | null |
| 2026-01-13 | UM-Text: A Unified Multimodal Model for Image Understanding | Lichen Ma et.al. | 2601.08321 | translate | read | null |
| 2026-01-13 | SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices | Dongting Hu et.al. | 2601.08303 | translate | read | null |
| 2026-01-13 | A Usable GAN-Based Tool for Synthetic ECG Generation in Cardiac Amyloidosis Research | Francesco Speziale et.al. | 2601.08260 | translate | read | null |
| 2026-01-12 | MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head | Kewei Zhang et.al. | 2601.07832 | translate | read | link |
| 2026-01-12 | Evaluating the encoding competence of visual language models using uncommon actions | Chen Ling et.al. | 2601.07737 | translate | read | null |
| 2026-01-12 | Sub-Pixel Electron Beam Alignment for Machine Learning Characterization of Hybrid Pixel Detectors | Emiliya Poghosyan et.al. | 2601.07682 | translate | read | null |
| 2026-01-12 | Advancing Multinational License Plate Recognition Through Synthetic and Real Data Fusion: A Comprehensive Evaluation | Rayson Laroca et.al. | 2601.07671 | translate | read | null |
| 2026-01-12 | GenDet: Painting Colored Bounding Boxes on Images via Diffusion Model for Object Detection | Chen Min et.al. | 2601.07273 | translate | read | null |
| 2026-01-12 | Language-Grounded Multi-Domain Image Translation via Semantic Difference Guidance | Jongwon Ryu et.al. | 2601.07221 | translate | read | null |
| 2026-01-11 | When Humans Judge Irises: Pupil Size Normalization as an Aid and Synthetic Irises as a Challenge | Mahsa Mitcheff et.al. | 2601.06725 | translate | read | null |
| 2026-01-10 | Boosting Overlapping Organoid Instance Segmentation Using Pseudo-Label Unmixing and Synthesis-Assisted Learning | Gui Huang et.al. | 2601.06642 | translate | read | null |
| 2026-01-10 | Sissi: Zero-shot Style-guided Image Synthesis via Semantic-style Integration | Yingying Deng et.al. | 2601.06605 | translate | read | null |
| 2026-01-10 | UMLoc: Uncertainty-Aware Map-Constrained Inertial Localization with Quantified Bounds | Mohammed S. Alharbi et.al. | 2601.06602 | translate | read | null |
| 2026-01-10 | APEX: Learning Adaptive Priorities for Multi-Objective Alignment in Vision-Language Generation | Dongliang Chen et.al. | 2601.06574 | translate | read | null |
| 2026-01-10 | A novel RF-enabled Non-Destructive Inspection Method through Machine Learning and Programmable Wireless Environments | Stavros Tsimpoukis et.al. | 2601.06512 | translate | read | null |
| 2026-01-10 | GlobalPaint: Spatiotemporal Coherent Video Outpainting with Global Feature Guidance | Yueming Pan et.al. | 2601.06413 | translate | read | null |
| 2026-01-10 | From Easy to Hard++: Promoting Differentially Private Image Synthesis Through Spatial-Frequency Curriculum | Chen Gong et.al. | 2601.06368 | translate | read | null |
| 2026-01-09 | Circuit Mechanisms for Spatial Relation Generation in Diffusion Transformers | Binxu Wang et.al. | 2601.06338 | translate | read | null |
| 2026-01-08 | QwenStyle: Content-Preserving Style Transfer with Qwen-Image-Edit | Shiwen Zhang et.al. | 2601.06202 | translate | read | null |
| 2026-01-07 | Think Bright, Diffuse Nice: Enhancing T2I-ICL via Inductive-Bias Hint Instruction and Query Contrastive Decoding | Zhiyong Ma et.al. | 2601.06169 | translate | read | null |
| 2026-01-09 | Multi-Modal Style Transfer-based Prompt Tuning for Efficient Federated Domain Generalization | Yuliang Chen et.al. | 2601.05955 | translate | read | null |
| 2026-01-09 | Kidney Cancer Detection Using 3D-Based Latent Diffusion Models | Jen Dusseljee et.al. | 2601.05852 | translate | read | null |
| 2026-01-09 | GenCtrl – A Formal Controllability Toolkit for Generative Models | Emily Cheng et.al. | 2601.05637 | translate | read | link |
| 2026-01-09 | Text Detoxification in isiXhosa and Yorùbá: A Cross-Lingual Machine Learning Approach for Low-Resource African Languages | Abayomi O. Agbeyangi et.al. | 2601.05624 | translate | read | null |
| 2026-01-09 | MoGen: A Unified Collaborative Framework for Controllable Multi-Object Image Generation | Yanfeng Li et.al. | 2601.05546 | translate | read | null |
| 2026-01-09 | Multi-Image Super Resolution Framework for Detection and Analysis of Plant Roots | Shubham Agarwal et.al. | 2601.05482 | translate | read | null |
| 2026-01-08 | Coding the Visual World: From Image to Simulation Using Vision Language Models | Sagi Eppel et.al. | 2601.05344 | translate | read | null |
| 2026-01-08 | Multi-Scale Local Speculative Decoding for Image Generation | Elia Peruzzo et.al. | 2601.05149 | translate | read | null |
| 2026-01-08 | Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing | Runze He et.al. | 2601.05124 | translate | read | link |
| 2026-01-08 | From Rays to Projections: Better Inputs for Feed-Forward View Synthesis | Zirui Wu et.al. | 2601.05116 | translate | read | null |
| 2026-01-08 | Exponential capacity scaling of classical GANs compared to hybrid latent style-based quantum GANs | Milan Liepelt et.al. | 2601.05036 | translate | read | null |
| 2026-01-08 | OnomaCompass: A Texture Exploration Interface that Shuttles between Words and Images | Miki Okamura et.al. | 2601.04915 | translate | read | null |
| 2026-01-08 | Illumination Angular Spectrum Encoding for Controlling the Functionality of Diffractive Networks | Matan Kleiner et.al. | 2601.04825 | translate | read | null |
| 2026-01-08 | SRU-Pix2Pix: A Fusion-Driven Generator Network for Medical Image Translation with Few-Shot Learning | Xihe Qiu et.al. | 2601.04785 | translate | read | null |
| 2026-01-08 | Forge-and-Quench: Enhancing Image Generation for Higher Fidelity in Unified Multimodal Models | Yanbing Zeng et.al. | 2601.04706 | translate | read | link |
| 2026-01-08 | HATIR: Heat-Aware Diffusion for Turbulent Infrared Video Super-Resolution | Yang Zou et.al. | 2601.04682 | translate | read | null |
| 2026-01-08 | HyperAlign: Hyperbolic Entailment Cones for Adaptive Text-to-Image Alignment Assessment | Wenzhi Chen et.al. | 2601.04614 | translate | read | null |
| 2026-01-08 | 3D Conditional Image Synthesis of Left Atrial LGE MRI from Composite Semantic Masks | Yusri Al-Sanaani et.al. | 2601.04588 | translate | read | null |
| 2026-01-08 | Studies in Astronomical Time Series Analysis: The Double Lomb-Scargle Periodogram and Super Resolution | Jeffrey D. Scargle et.al. | 2601.04552 | translate | read | null |
| 2026-01-08 | FaceRefiner: High-Fidelity Facial Texture Refinement with Differentiable Rendering-based Style Transfer | Chengyang Li et.al. | 2601.04520 | translate | read | null |
| 2026-01-07 | UniDrive-WM: Unified Understanding, Planning and Generation World Model For Autonomous Driving | Zhexiao Xiong et.al. | 2601.04453 | translate | read | null |
| 2026-01-07 | Modifications to Image Phase Alignment Super-sampling Produce up to 4.4 times Increased Image Resolution | James N. Caron et.al. | 2601.04391 | translate | read | null |
| 2026-01-07 | Unified Text-Image Generation with Weakness-Targeted Post-Training | Jiahui Chen et.al. | 2601.04339 | translate | read | null |
| 2026-01-07 | Bridging the Discrete-Continuous Gap: Unified Multimodal Generation via Coupled Manifold Discrete Absorbing Diffusion | Yuanfeng Xu et.al. | 2601.04056 | translate | read | null |
| 2026-01-07 | Padé Neurons for Efficient Neural Models | Onur Keleş et.al. | 2601.04005 | translate | read | null |
| 2026-01-07 | ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation | Xu Zhang et.al. | 2601.03955 | translate | read | link |
| 2026-01-07 | Local Interpolation via Low-Rank Tensor Trains | Siddhartha E. Guzman et.al. | 2601.03885 | translate | read | null |
| 2026-01-07 | FLNet: Flood-Induced Agriculture Damage Assessment using Super Resolution of Satellite Images | Sanidhya Ghosal et.al. | 2601.03884 | translate | read | null |
| 2026-01-07 | Logic Tensor Network-Enhanced Generative Adversarial Network | Nijesh Upreti et.al. | 2601.03839 | translate | read | null |
| 2026-01-07 | Zak-OTFS ISAC with Bistatic Sensing via Semi-Blind Atomic Norm Denoising Scheme | Kecheng Zhang et.al. | 2601.03639 | translate | read | null |
| 2026-01-07 | Physics-Constrained Cross-Resolution Enhancement Network for Optics-Guided Thermal UAV Image Super-Resolution | Zhicheng Zhao et.al. | 2601.03526 | translate | read | null |
| 2026-01-07 | GeoDiff-SAR: A Geometric Prior Guided Diffusion Model for SAR Image Generation | Fan Zhang et.al. | 2601.03499 | translate | read | null |
| 2026-01-06 | Understanding Reward Hacking in Text-to-Image Reinforcement Learning | Yunqi Hong et.al. | 2601.03468 | translate | read | null |
| 2026-01-06 | ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing | Hengjia Li et.al. | 2601.03467 | translate | read | null |
| 2026-01-06 | Discriminating real and synthetic super-resolved audio samples using embedding-based classifiers | Mikhail Silaev et.al. | 2601.03443 | translate | read | null |
| 2026-01-06 | Edit2Restore: Few-Shot Image Restoration via Parameter-Efficient Adaptation of Pre-trained Editing Models | M. Akın Yılmaz et.al. | 2601.03391 | translate | read | link |
| 2026-01-06 | Muses: Designing, Composing, Generating Nonexistent Fantasy 3D Creatures without Training | Hexiao Lu et.al. | 2601.03256 | translate | read | link |
| 2026-01-06 | UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision | Ruiyan Han et.al. | 2601.03193 | translate | read | link |
| 2026-01-06 | Unified Thinker: A General Reasoning Modular Core for Image Generation | Sashuai Zhou et.al. | 2601.03127 | translate | read | null |
| 2026-01-06 | Stroke Patches: Customizable Artistic Image Styling Using Regression | Ian Jaffray et.al. | 2601.03114 | translate | read | null |
| 2026-01-06 | LAMS-Edit: Latent and Attention Mixing with Schedulers for Improved Content Preservation in Diffusion-Based Image and Style Editing | Wingwa Fu et.al. | 2601.02987 | translate | read | null |
| 2026-01-06 | VTONQA: A Multi-Dimensional Quality Assessment Dataset for Virtual Try-on | Xinyi Wei et.al. | 2601.02945 | translate | read | null |
| 2026-01-06 | GRRE: Leveraging G-Channel Removed Reconstruction Error for Robust Detection of AI-Generated Images | Shuman He et.al. | 2601.02709 | translate | read | null |
| 2026-01-05 | Annealed Langevin Posterior Sampling (ALPS): A Rapid Algorithm for Image Restoration with Multiscale Energy Models | Jyothi Rikhab Chand et.al. | 2601.02594 | translate | read | null |
| 2026-01-05 | VIBE: Visual Instruction Based Editor | Grigorii Alekseenko et.al. | 2601.02242 | translate | read | link |
| 2026-01-05 | Beam-Brainstorm: A Generative Site-Specific Beamforming Approach | Zihao Zhou et.al. | 2601.02219 | translate | read | null |
| 2026-01-05 | Unraveling MMDiT Blocks: Training-free Analysis and Enhancement of Text-conditioned Diffusion | Binglei Li et.al. | 2601.02211 | translate | read | null |
| 2026-01-05 | Seeing the Unseen: Zooming in the Dark with Event Cameras | Dachun Kai et.al. | 2601.02206 | translate | read | null |
| 2026-01-05 | Agentic Retoucher for Text-To-Image Generation | Shaocheng Shen et.al. | 2601.02046 | translate | read | null |
| 2026-01-05 | AlignVTOFF: Texture-Spatial Feature Alignment for High-Fidelity Virtual Try-Off | Yihan Zhu et.al. | 2601.02038 | translate | read | null |
| 2026-01-05 | SerpentFlow: Generative Unpaired Domain Alignment via Shared-Structure Decomposition | Julie Keisler et.al. | 2601.01979 | translate | read | null |
| 2026-01-05 | Score-based diffusion models for accurate crystal-structure inpainting and reconstruction of hydrogen positions | Timo Reents et.al. | 2601.01959 | translate | read | null |
| 2026-01-04 | Guiding Token-Sparse Diffusion Models | Felix Krause et.al. | 2601.01608 | translate | read | null |
| 2026-01-04 | Improving Flexible Image Tokenizers for Autoregressive Image Generation | Zixuan Fu et.al. | 2601.01535 | translate | read | null |
| 2026-01-04 | Domain Adaptation of Carotid Ultrasound Images using Generative Adversarial Network | Mohd Usama et.al. | 2601.01460 | translate | read | null |
| 2026-01-04 | Image Synthesis Using Spintronic Deep Convolutional Generative Adversarial Network | Saumya Gupta et.al. | 2601.01441 | translate | read | null |
| 2026-01-04 | SwinIFS: Landmark Guided Swin Transformer For Identity Preserving Face Super Resolution | Habiba Kausar et.al. | 2601.01406 | translate | read | null |
| 2026-01-04 | Poisson Centralisers and Polynomial Superintegrability for Magnetic Geodesic Flows on Reductive Homogeneous Spaces | Kai Jiang et.al. | 2601.01369 | translate | read | null |
| 2026-01-03 | Diffusion Timbre Transfer Via Mutual Information Guided Inpainting | Ching Ho Lee et.al. | 2601.01294 | translate | read | null |
| 2026-01-03 | Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment | Bac Nguyen et.al. | 2601.01224 | translate | read | null |
| 2026-01-03 | RefSR-Adv: Adversarial Attack on Reference-based Image Super-Resolution Models | Jiazhu Dai et.al. | 2601.01202 | translate | read | null |
| 2026-01-03 | Comparative Evaluation of VAE, GAN, and SMOTE for Tor Detection in Encrypted Network Traffic | Saravanan A et.al. | 2601.01183 | translate | read | null |
| 2026-01-03 | Neural Networks on Symmetric Spaces of Noncompact Type | Xuan Son Nguyen et.al. | 2601.01097 | translate | read | null |
| 2026-01-03 | Unsupervised Text Style Transfer for Controllable Intensity | Shuhuan Gu et.al. | 2601.01060 | translate | read | null |
| 2026-01-03 | Out-of-Band Power Side-Channel Detection for Semiconductor Supply Chain Integrity at Scale | Rajiv Thummala et.al. | 2601.01054 | translate | read | null |
| 2026-01-02 | Contractive Diffusion Policies: Robust Action Diffusion via Contractive Score-Based Sampling with Differential Equations | Amin Abyaneh et.al. | 2601.01003 | translate | read | null |
| 2026-01-02 | Peak-Nadir Encoding for Efficient CGM Data Compression and High-Fidelity Reconstruction | Clara Bender et.al. | 2601.00608 | translate | read | null |
| 2026-01-01 | Application Research of a Deep Learning Model Integrating CycleGAN and YOLO in PCB Infrared Defect Detection | Chao Yang et.al. | 2601.00237 | translate | read | null |
| 2026-01-01 | Unknown Aware AI-Generated Content Attribution | Ellie Thieu et.al. | 2601.00218 | translate | read | null |
| 2026-01-01 | MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing | Xiaokun Sun et.al. | 2601.00204 | translate | read | null |
| 2026-01-01 | DichroGAN: Towards Restoration of in-air Colours of Seafloor from Satellite Imagery | Salma Gonzalez-Sabbagh et.al. | 2601.00194 | translate | read | null |
| 2026-01-01 | SSI-GAN: Semi-Supervised Swin-Inspired Generative Adversarial Networks for Neuronal Spike Classification | Danial Sharifrazi et.al. | 2601.00189 | translate | read | null |