Image Generation - 2026-02
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2026-02-28 | Direct low-field MRI super-resolution using undersampled k-space | Daniel Tweneboah Anyimadu et.al. | 2603.00668 | translate | read | null |
| 2026-02-28 | IdGlow: Dynamic Identity Modulation for Multi-Subject Generation | Honghao Cai et.al. | 2603.00607 | translate | read | null |
| 2026-02-28 | AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution | Cencen Liu et.al. | 2603.00589 | translate | read | null |
| 2026-02-28 | Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation | Zhen Zhou et.al. | 2603.00526 | translate | read | null |
| 2026-02-28 | RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment | Liyao Jiang et.al. | 2603.00483 | translate | read | link |
| 2026-02-28 | Improved Adversarial Diffusion Compression for Real-World Video Super-Resolution | Bin Chen et.al. | 2603.00458 | translate | read | null |
| 2026-02-28 | SesaHand: Enhancing 3D Hand Reconstruction via Controllable Generation with Semantic and Structural Alignment | Zhuoran Zhao et.al. | 2603.00443 | translate | read | null |
| 2026-02-28 | Mamba-CAD: State Space Model For 3D Computer-Aided Design Generative Modeling | Xueyang Li et.al. | 2603.00439 | translate | read | null |
| 2026-02-28 | An Interpretable Local Editing Model for Counterfactual Medical Image Generation | Hyungi Min et.al. | 2603.00423 | translate | read | null |
| 2026-02-26 | SeeThrough3D: Occlusion Aware 3D Control in Text-to-Image Generation | Vaibhav Agrawal et.al. | 2602.23359 | translate | read | null |
| 2026-02-26 | Decomposing Private Image Generation via Coarse-to-Fine Wavelet Modeling | Jasmine Bayrooti et.al. | 2602.23262 | translate | read | null |
| 2026-02-26 | DMAligner: Enhancing Image Alignment via Diffusion Model Based View Synthesis | Xinglong Luo et.al. | 2602.23022 | translate | read | null |
| 2026-02-26 | Probing the Atmospheres of Young Long-Period Sub-Neptune Progenitors with ELT/ANDES | Spandan Dash et.al. | 2602.22830 | translate | read | null |
| 2026-02-26 | No Caption, No Problem: Caption-Free Membership Inference via Model-Fitted Embeddings | Joonsung Jeon et.al. | 2602.22689 | translate | read | null |
| 2026-02-26 | Instruction-based Image Editing with Planning, Reasoning, and Generation | Liya Ji et.al. | 2602.22624 | translate | read | null |
| 2026-02-26 | LoR-LUT: Learning Compact 3D Lookup Tables via Low-Rank Residuals | Ziqi Zhao et.al. | 2602.22607 | translate | read | null |
| 2026-02-26 | Guidance Matters: Rethinking the Evaluation Pitfall for Text-to-Image Generation | Dian Xie et.al. | 2602.22570 | translate | read | null |
| 2026-02-26 | DisQ-HNet: A Disentangled Quantized Half-UNet for Interpretable Multimodal Image Synthesis Applications to Tau-PET Synthesis from T1 and FLAIR MRI | Agamdeep S. Chopra et.al. | 2602.22545 | translate | read | null |
| 2026-02-25 | Flow Matching is Adaptive to Manifold Structures | Shivam Kumar et.al. | 2602.22486 | translate | read | null |
| 2026-02-25 | mmWave Radar Aware Dual-Conditioned GAN for Speech Reconstruction of Signals With Low SNR | Jash Karani et.al. | 2602.22431 | translate | read | null |
| 2026-02-25 | CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness | Wenhao Guo et.al. | 2602.22159 | translate | read | null |
| 2026-02-25 | CoLoGen: Progressive Learning of Concept-Localization Duality for Unified Image Generation | YuXin Song et.al. | 2602.22150 | translate | read | null |
| 2026-02-25 | GeoDiv: Framework For Measuring Geographical Diversity In Text-To-Image Models | Abhipsa Basu et.al. | 2602.22120 | translate | read | null |
| 2026-02-25 | Bayesian Generative Adversarial Networks via Gaussian Approximation for Tabular Data Synthesis | Bahrul Ilmi Nasution et.al. | 2602.21948 | translate | read | null |
| 2026-02-25 | SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model | Guibin Chen et.al. | 2602.21818 | translate | read | null |
| 2026-02-25 | RAMSeS: Robust and Adaptive Model Selection for Time-Series Anomaly Detection Algorithms | Mohamed Abdelmaksoud et.al. | 2602.21766 | translate | read | null |
| 2026-02-25 | Structure-to-Image: Zero-Shot Depth Estimation in Colonoscopy via High-Fidelity Sim-to-Real Adaptation | Juan Yang et.al. | 2602.21740 | translate | read | null |
| 2026-02-25 | Deep Learning-based Low-Overhead Beam Alignment for mmWave Massive MIMO Systems | Weijie Jin et.al. | 2602.21664 | translate | read | null |
| 2026-02-25 | A Hidden Semantic Bottleneck in Conditional Embeddings of Diffusion Transformers | Trung X. Pham et.al. | 2602.21596 | translate | read | null |
| 2026-02-25 | Deep Unfolding Real-Time Super-Resolution Using Subpixel-Shift Twin Image and Convex Self-Similarity Prior | Chia-Hsiang Lin et.al. | 2602.21513 | translate | read | null |
| 2026-02-25 | Perceptual Quality Optimization of Image Super-Resolution | Wei Zhou et.al. | 2602.21482 | translate | read | null |
| 2026-02-24 | Provably Safe Generative Sampling with Constricting Barrier Functions | Darshan Gadginmath et.al. | 2602.21429 | translate | read | null |
| 2026-02-24 | FlowFixer: Towards Detail-Preserving Subject-Driven Generation | Jinyoung Jun et.al. | 2602.21402 | translate | read | null |
| 2026-02-24 | RelA-Diffusion: Relativistic Adversarial Diffusion for Multi-Tracer PET Synthesis from Multi-Sequence MRI | Minhui Yu et.al. | 2602.21345 | translate | read | null |
| 2026-02-24 | SynthRender and IRIS: Open-Source Framework and Dataset for Bidirectional Sim-Real Transfer in Industrial Object Perception | Jose Moises Araya-Martinez et.al. | 2602.21141 | translate | read | null |
| 2026-02-24 | TextPecker: Rewarding Structural Anomaly Quantification for Enhancing Visual Text Rendering | Hanshen Zhu et.al. | 2602.20903 | translate | read | null |
| 2026-02-24 | RU4D-SLAM: Reweighting Uncertainty in Gaussian Splatting SLAM for 4D Scene Reconstruction | Yangfan Zhao et.al. | 2602.20807 | translate | read | null |
| 2026-02-24 | Generative Deep Learning for the Two-Dimensional Quantum Rotor Model | Yanyang Wang et.al. | 2602.20772 | translate | read | null |
| 2026-02-24 | Deep unfolding of MCMC kernels: scalable, modular & explainable GANs for high-dimensional posterior sampling | Jonathan Spence et.al. | 2602.20758 | translate | read | null |
| 2026-02-24 | Bridging Physically Based Rendering and Diffusion Models with Stochastic Differential Equation | Junwei Shu et.al. | 2602.20725 | translate | read | null |
| 2026-02-24 | CleanStyle: Plug-and-Play Style Conditioning Purification for Text-to-Image Stylization | Xiaoman Feng et.al. | 2602.20721 | translate | read | null |
| 2026-02-24 | Vanishing Watermarks: Diffusion-Based Image Editing Undermines Robust Invisible Watermarking | Fan Guo et.al. | 2602.20680 | translate | read | null |
| 2026-02-24 | VINA: Variational Invertible Neural Architectures | Shubhanshu Shekhar et.al. | 2602.20480 | translate | read | null |
| 2026-02-23 | GSNR: Graph Smooth Null-Space Representation for Inverse Problems | Romario Gualdrón-Hurtado et.al. | 2602.20328 | translate | read | null |
| 2026-02-23 | HelioSpectrotron 5000: An interactive multi-resolution solar spectral atlas | A. G. M. Pietrow et.al. | 2602.20101 | translate | read | null |
| 2026-02-23 | Training-Free Generative Modeling via Kernelized Stochastic Interpolants | Florentin Coeurdoux et.al. | 2602.20070 | translate | read | null |
| 2026-02-23 | LRG-BEASTS: Detection of sodium and evidence for water absorption in the hot Saturn HAT-P-44b | Alastair B. Claringbold et.al. | 2602.19986 | translate | read | null |
| 2026-02-23 | RL-RIG: A Generative Spatial Reasoner via Intrinsic Reflection | Tianyu Wang et.al. | 2602.19974 | translate | read | null |
| 2026-02-23 | Learning Positive-Incentive Point Sampling in Neural Implicit Fields for Object Pose Estimation | Yifei Shi et.al. | 2602.19937 | translate | read | null |
| 2026-02-23 | Fully Convolutional Spatiotemporal Learning for Microstructure Evolution Prediction | Michael Trimboli et.al. | 2602.19915 | translate | read | null |
| 2026-02-23 | DTT-BSR: GAN-based DTTNet with RoPE Transformer Enhancement for Music Source Restoration | Shihong Tan et.al. | 2602.19825 | translate | read | null |
| 2026-02-23 | Training Deep Stereo Matching Networks on Tree Branch Imagery: A Benchmark Study for Real-Time UAV Forestry Applications | Yida Lin et.al. | 2602.19763 | translate | read | null |
| 2026-02-23 | InfScene-SR: Spatially Continuous Inference for Arbitrary-Size Image Super-Resolution | Shoukun Sun et.al. | 2602.19736 | translate | read | null |
| 2026-02-23 | ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization | Minseo Kim et.al. | 2602.19575 | translate | read | null |
| 2026-02-23 | MICON-Bench: Benchmarking and Enhancing Multi-Image Context Image Generation in Unified Multimodal Models | Mingrui Wu et.al. | 2602.19497 | translate | read | null |
| 2026-02-23 | Laplacian Multi-scale Flow Matching for Generative Modeling | Zelin Zhao et.al. | 2602.19461 | translate | read | null |
| 2026-02-22 | PoseCraft: Tokenized 3D Body Landmark and Camera Conditioning for Photorealistic Human Image Synthesis | Zhilin Guo et.al. | 2602.19350 | translate | read | null |
| 2026-02-22 | MultiDiffSense: Diffusion-Based Multi-Modal Visuo-Tactile Image Generation Conditioned on Object Shape and Contact Pose | Sirine Bhouri et.al. | 2602.19348 | translate | read | null |
| 2026-02-22 | RegionRoute: Regional Style Transfer with Diffusion Model | Bowen Chen et.al. | 2602.19254 | translate | read | null |
| 2026-02-22 | JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation | Kai Liu et.al. | 2602.19163 | translate | read | null |
| 2026-02-22 | ReVision : A Post-Hoc, Vision-Based Technique for Replacing Unacceptable Concepts in Image Generation Pipeline | Gurjot Singh et.al. | 2602.19149 | translate | read | null |
| 2026-02-22 | A Markovian View of Iterative-Feedback Loops in Image Generative Models: Neural Resonance and Model Collapse | Vibhas Kumar Vats et.al. | 2602.19033 | translate | read | null |
| 2026-02-22 | Pushing the Limits of Inverse Lithography with Generative Reinforcement Learning | Haoyu Yang et.al. | 2602.19027 | translate | read | null |
| 2026-02-21 | CRAFT-LoRA: Content-Style Personalization via Rank-Constrained Adaptation and Training-Free Fusion | Yu Li et.al. | 2602.18936 | translate | read | null |
| 2026-02-21 | SCHEMA for Gemini 3 Pro Image: A Structured Methodology for Controlled AI Image Generation on Google’s Native Multimodal Model | Luca Cazzaniga et.al. | 2602.18903 | translate | read | null |
| 2026-02-21 | Structure-Level Disentangled Diffusion for Few-Shot Chinese Font Generation | Jie Li et.al. | 2602.18874 | translate | read | null |
| 2026-02-21 | Robust Self-Supervised Cross-Modal Super-Resolution against Real-World Misaligned Observations | Xiaoyu Dong et.al. | 2602.18822 | translate | read | null |
| 2026-02-21 | RadioGen3D: 3D Radio Map Generation via Adversarial Learning on Large-Scale Synthetic Data | Junshen Chen et.al. | 2602.18744 | translate | read | null |
| 2026-02-21 | Subtle Motion Blur Detection and Segmentation from Static Image Artworks | Ganesh Samarth et.al. | 2602.18720 | translate | read | null |
| 2026-02-20 | DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction | Jiayang Shi et.al. | 2602.18589 | translate | read | null |
| 2026-02-20 | Morphological Addressing of Identity Basins in Text-to-Image Diffusion Models | Andrew Fraser et.al. | 2602.18533 | translate | read | null |
| 2026-02-20 | Super-Resolution Structured-Illumination X-Ray Microscopy based on Fourier Decomposition | Stefan Schwaiger et.al. | 2602.18343 | translate | read | null |
| 2026-02-20 | Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation | Ziyue Liu et.al. | 2602.18309 | translate | read | null |
| 2026-02-20 | Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies | Zhuoran Li et.al. | 2602.18291 | translate | read | null |
| 2026-02-20 | Generative Model via Quantile Assignment | Georgi Hrusanov et.al. | 2602.18216 | translate | read | null |
| 2026-02-20 | Improving Sampling for Masked Diffusion Models via Information Gain | Kaisen Yang et.al. | 2602.18176 | translate | read | null |
| 2026-02-20 | Extremely Large Antenna Spacing Method for Enhanced Wideband Near-Field Sensing | Tommaso Bacchielli et.al. | 2602.18076 | translate | read | null |
| 2026-02-20 | Interactions that reshape the interfaces of the interacting parties | David I. Spivak et.al. | 2602.17917 | translate | read | null |
| 2026-02-19 | MeDUET: Disentangled Unified Pretraining for 3D Medical Image Synthesis and Analysis | Junkai Liu et.al. | 2602.17901 | translate | read | null |
| 2026-02-19 | Financial time series augmentation using transformer based GAN architecture | Andrzej Podobiński et.al. | 2602.17865 | translate | read | null |
| 2026-02-19 | LGD-Net: Latent-Guided Dual-Stream Network for HER2 Scoring with Task-Specific Domain Knowledge | Peide Zhu et.al. | 2602.17793 | translate | read | null |
| 2026-02-19 | Multi-material Multi-physics Topology Optimization with Physics-informed Gaussian Process Priors | Xiangyu Sun et.al. | 2602.17783 | translate | read | null |
| 2026-02-19 | Leveraging Contrastive Learning for a Similarity-Guided Tampered Document Data Generation Pipeline | Mohamed Dhouib et.al. | 2602.17322 | translate | read | null |
| 2026-02-19 | Physics Encoded Spatial and Temporal Generative Adversarial Network for Tropical Cyclone Image Super-resolution | Ruoyi Zhang et.al. | 2602.17277 | translate | read | null |
| 2026-02-19 | GASS: Geometry-Aware Spherical Sampling for Disentangled Diversity Enhancement in Text-to-Image Generation | Ye Zhu et.al. | 2602.17200 | translate | read | null |
| 2026-02-19 | CAFE: Channel-Autoregressive Factorized Encoding for Robust Biosignal Spatial Super-Resolution | Hongjun Liu et.al. | 2602.17011 | translate | read | null |
| 2026-02-18 | StereoAdapter-2: Globally Structure-Consistent Underwater Stereo Depth Estimation | Zeyu Ren et.al. | 2602.16915 | translate | read | null |
| 2026-02-18 | Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning | Zifan Wang et.al. | 2602.16796 | translate | read | null |
| 2026-02-13 | Speech to Speech Synthesis for Voice Impersonation | Bjorn Johnson et.al. | 2602.16721 | translate | read | null |
| 2026-02-18 | Unpaired Image-to-Image Translation via a Self-Supervised Semantic Bridge | Jiaming Liu et.al. | 2602.16664 | translate | read | null |
| 2026-02-18 | Steering diffusion models with quadratic rewards: a fine-grained analysis | Ankur Moitra et.al. | 2602.16570 | translate | read | null |
| 2026-02-18 | EasyControlEdge: A Foundation-Model Fine-Tuning for Edge Detection | Hiroki Nakamura et.al. | 2602.16238 | translate | read | null |
| 2026-02-17 | Surgical Activation Steering via Generative Causal Mediation | Aruna Sankaranarayanan et.al. | 2602.16080 | translate | read | null |
| 2026-02-17 | Chem-SIM: Super-resolution Chemical Imaging via Photothermal Modulation of Structured-Illumination Fluorescence | Dashan Dong et.al. | 2602.16079 | translate | read | null |
| 2026-02-17 | B-DENSE: Branching For Dense Ensemble Network Learning | Cherish Puniani et.al. | 2602.15971 | translate | read | null |
| 2026-02-17 | Entanglement-assisted Hamiltonian dynamics learning | Ayaka Usui et.al. | 2602.15931 | translate | read | null |
| 2026-02-15 | A Comprehensive Survey on Deep Learning-Based LiDAR Super-Resolution for Autonomous Driving | June Moh Goo et.al. | 2602.15904 | translate | read | null |
| 2026-02-17 | RPT-SR: Regional Prior attention Transformer for infrared image Super-Resolution | Youngwan Jin et.al. | 2602.15490 | translate | read | null |
| 2026-02-17 | Efficient Generative Modeling beyond Memoryless Diffusion via Adjoint Schrödinger Bridge Matching | Jeongwoo Shin et.al. | 2602.15396 | translate | read | null |
| 2026-02-17 | Consistency-Preserving Diverse Video Generation | Xinshuang Liu et.al. | 2602.15287 | translate | read | null |
| 2026-02-17 | Visual Persuasion: What Influences Decisions of Vision-Language Models? | Manuel Cherep et.al. | 2602.15278 | translate | read | null |
| 2026-02-17 | Enhancing Diversity and Feasibility: Joint Population Synthesis from Multi-source Data Using Generative Models | Farbod Abbasi et.al. | 2602.15270 | translate | read | null |
| 2026-02-16 | Distributional Deep Learning for Super-Resolution of 4D Flow MRI under Domain Shift | Xiaoyi Wen et.al. | 2602.15167 | translate | read | null |
| 2026-02-16 | Image Generation with a Sphere Encoder | Kaiyu Yue et.al. | 2602.15030 | translate | read | null |
| 2026-02-16 | Text Style Transfer with Parameter-efficient LLM Finetuning and Round-trip Translation | Ruoxi Liu et.al. | 2602.15013 | translate | read | null |
| 2026-02-16 | Efficient Text-Guided Convolutional Adapter for the Diffusion Model | Aryan Das et.al. | 2602.14514 | translate | read | null |
| 2026-02-16 | MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction | Zhicheng He et.al. | 2602.14512 | translate | read | null |
| 2026-02-16 | CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer | Wenbo Nie et.al. | 2602.14464 | translate | read | null |
| 2026-02-16 | Controlling Your Image via Simplified Vector Graphics | Lanqing Guo et.al. | 2602.14443 | translate | read | null |
| 2026-02-15 | UniRef-Image-Edit: Towards Scalable and Consistent Multi-Reference Image Editing | Hongyang Wei et.al. | 2602.14186 | translate | read | null |
| 2026-02-15 | UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model | Shaobin Zhuang et.al. | 2602.14178 | translate | read | null |
| 2026-02-15 | Convexity Meets Curvature: Lifted Near-Field Super-Resolution | Sajad Daei et.al. | 2602.14063 | translate | read | null |
| 2026-02-15 | BitDance: Scaling Autoregressive Generative Models with Binary Tokens | Yuang Ai et.al. | 2602.14041 | translate | read | null |
| 2026-02-15 | Inject Where It Matters: Training-Free Spatially-Adaptive Identity Preservation for Text-to-Image Personalization | Guandong Li et.al. | 2602.13994 | translate | read | null |
| 2026-02-14 | HybridFlow: A Two-Step Generative Policy for Robotic Manipulation | Zhenchen Dong et.al. | 2602.13718 | translate | read | null |
| 2026-02-14 | A WDLoRA-Based Multimodal Generative Framework for Clinically Guided Corneal Confocal Microscopy Image Synthesis in Diabetic Neuropathy | Xin Zhang et.al. | 2602.13693 | translate | read | null |
| 2026-02-14 | Diff-Aid: Inference-time Adaptive Interaction Denoising for Rectified Text-to-Image Generation | Binglei Li et.al. | 2602.13585 | translate | read | null |
| 2026-02-13 | FUTON: Fourier Tensor Network for Implicit Neural Representations | Pooya Ashtari et.al. | 2602.13414 | translate | read | null |
| 2026-02-13 | Preference-Guided Prompt Optimization for Text-to-Image Generation | Zhipeng Li et.al. | 2602.13131 | translate | read | null |
| 2026-02-13 | A Calibrated Memorization Index (MI) for Detecting Training Data Leakage in Generative MRI Models | Yash Deo et.al. | 2602.13066 | translate | read | null |
| 2026-02-13 | Diverging Flows: Detecting Extrapolations in Conditional Generation | Constantinos Tsakonas et.al. | 2602.13061 | translate | read | null |
| 2026-02-13 | Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation | Florinel-Alin Croitoru et.al. | 2602.13055 | translate | read | null |
| 2026-02-13 | TFTF: Training-Free Targeted Flow for Conditional Sampling | Qianqian Qu et.al. | 2602.12932 | translate | read | null |
| 2026-02-13 | PixelRush: Ultra-Fast, Training-Free High-Resolution Image Generation via One-step Diffusion | Hong-Phuc Lai et.al. | 2602.12769 | translate | read | null |
| 2026-02-13 | Towards reconstructing experimental sparse-view X-ray CT data with diffusion models | Nelas J. Thomsen et.al. | 2602.12755 | translate | read | null |
| 2026-02-13 | ImageRAGTurbo: Towards One-step Text-to-Image Generation with Retrieval-Augmented Diffusion Models | Peijie Qiu et.al. | 2602.12640 | translate | read | null |
| 2026-02-13 | The Constant Eye: Benchmarking and Bridging Appearance Robustness in Autonomous Driving | Jiabao Wang et.al. | 2602.12563 | translate | read | null |
| 2026-02-12 | ForeAct: Steering Your VLA with Efficient Visual Foresight Planning | Zhuoyang Zhang et.al. | 2602.12322 | translate | read | null |
| 2026-02-12 | Best of Both Worlds: Multimodal Reasoning and Generation via Unified Discrete Flow Matching | Onkar Susladkar et.al. | 2602.12221 | translate | read | null |
| 2026-02-12 | DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing | Dianyi Wang et.al. | 2602.12205 | translate | read | null |
| 2026-02-12 | FAIL: Flow Matching Adversarial Imitation Learning for Image Generation | Yeyao Ma et.al. | 2602.12155 | translate | read | null |
| 2026-02-12 | Neutral Prompts, Non-Neutral People: Quantifying Gender and Skin-Tone Bias in Gemini Flash 2.5 Image and GPT Image 1.5 | Roberto Balestri et.al. | 2602.12133 | translate | read | null |
| 2026-02-12 | GAN-based data augmentation for rare and exotic hadron searches in Pb–Pb collisions in ALICE | Anisa Khatun et.al. | 2602.12088 | translate | read | null |
| 2026-02-12 | CSEval: A Framework for Evaluating Clinical Semantics in Text-to-Image Generation | Robert Cronshaw et.al. | 2602.12004 | translate | read | null |
| 2026-02-12 | Spatial Chain-of-Thought: Bridging Understanding and Generation Models for Spatial Reasoning Generation | Wei Chen et.al. | 2602.11980 | translate | read | null |
| 2026-02-12 | DiffPlace: Street View Generation via Place-Controllable Diffusion Model Enhancing Place Recognition | Ji Li et.al. | 2602.11875 | translate | read | null |
| 2026-02-12 | U-DAVI: Uncertainty-Aware Diffusion-Prior-Based Amortized Variational Inference for Image Reconstruction | Ayush Varshney et.al. | 2602.11704 | translate | read | null |
| 2026-02-12 | Estimation of Electrical Characteristics of Complex Walls Using Deep Neural Networks | Kainat Yasmeen et.al. | 2602.11463 | translate | read | null |
| 2026-02-11 | Enhanced Portable Ultra Low-Field Diffusion Tensor Imaging with Bayesian Artifact Correction and Deep Learning-Based Super-Resolution | Mark D. Olchanyi et.al. | 2602.11446 | translate | read | null |
| 2026-02-11 | Latent Forcing: Reordering the Diffusion Trajectory for Pixel-Space Image Generation | Alan Baade et.al. | 2602.11401 | translate | read | null |
| 2026-02-11 | Exploring Real-Time Super-Resolution: Benchmarking and Fine-Tuning for Streaming Content | Evgeney Bogatyrev et.al. | 2602.11339 | translate | read | null |
| 2026-02-11 | LCIP: Loss-Controlled Inverse Projection of High-Dimensional Image Data | Yu Wang et.al. | 2602.11141 | translate | read | null |
| 2026-02-11 | FastFlow: Accelerating The Generative Flow Matching Models with Bandit Inference | Divya Jyoti Bajpai et.al. | 2602.11105 | translate | read | null |
| 2026-02-11 | Predicting integers from continuous parameters | Bas Maat et.al. | 2602.10751 | translate | read | null |
| 2026-02-11 | Self-Supervised Image Super-Resolution Quality Assessment based on Content-Free Multi-Model Oriented Representation Learning | Kian Majlessi et.al. | 2602.10744 | translate | read | null |
| 2026-02-11 | A Diffusion-Based Generative Prior Approach to Sparse-view Computed Tomography | Davide Evangelista et.al. | 2602.10722 | translate | read | null |
| 2026-02-11 | Dynamic Frequency Modulation for Controllable Text-driven Image Generation | Tiandong Shi et.al. | 2602.10662 | translate | read | null |
| 2026-02-11 | Towards Universal Spatial Transcriptomics Super-Resolution: A Generalist Physically Consistent Flow Matching Framework | Xinlei Huang et.al. | 2602.10644 | translate | read | null |
| 2026-02-11 | Eliminating VAE for Fast and High-Resolution Generative Detail Restoration | Yan Wang et.al. | 2602.10630 | translate | read | null |
| 2026-02-11 | MindPilot: Closed-loop Visual Stimulation Optimization for Brain Modulation with EEG-guided Diffusion | Dongyang Li et.al. | 2602.10552 | translate | read | null |
| 2026-02-11 | RealHD: A High-Quality Dataset for Robust Detection of State-of-the-Art AI-Generated Images | Hanzhe Yu et.al. | 2602.10546 | translate | read | null |
| 2026-02-10 | WildCat: Near-Linear Attention in Theory and Practice | Tobias Schröder et.al. | 2602.10056 | translate | read | null |
| 2026-02-10 | SciFlow-Bench: Evaluating Structure-Aware Scientific Diagram Generation via Inverse Parsing | Tong Zhang et.al. | 2602.09809 | translate | read | null |
| 2026-02-10 | Where Do Images Come From? Analyzing Captions to Geographically Profile Datasets | Abhipsa Basu et.al. | 2602.09775 | translate | read | null |
| 2026-02-10 | The mixture of glycerin with tartrazine: a solution to reversibly increase tissue transparency for in vitro quantitative phase imaging | Mikolaj Krysa et.al. | 2602.09732 | translate | read | null |
| 2026-02-10 | Robust Depth Super-Resolution via Adaptive Diffusion Sampling | Kun Wang et.al. | 2602.09510 | translate | read | null |
| 2026-02-10 | ArtifactLens: Hundreds of Labels Are Enough for Artifact Detection with VLMs | James Burgess et.al. | 2602.09475 | translate | read | null |
| 2026-02-10 | Motion Compensation for Multiple-Input-Multiple-Output Inverse Synthetic Aperture Imaging of Automotive Targets | Devansh Mathur et.al. | 2602.09452 | translate | read | null |
| 2026-02-10 | Look-Ahead and Look-Back Flows: Training-Free Image Generation with Trajectory Smoothing | Yan Luo et.al. | 2602.09449 | translate | read | null |
| 2026-02-10 | Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning | Xu Ma et.al. | 2602.09439 | translate | read | null |
| 2026-02-10 | Bridging the Modality Gap in Roadside LiDAR: A Training-Free Vision-Language Model Framework for Vehicle Classification | Yiqiao Li et.al. | 2602.09425 | translate | read | null |
| 2026-02-10 | Measuring Privacy Risks and Tradeoffs in Financial Synthetic Data Generation | Michael Zuo et.al. | 2602.09288 | translate | read | null |
| 2026-02-09 | Gradient Residual Connections | Yangchen Pan et.al. | 2602.09190 | translate | read | null |
| 2026-02-09 | All-in-One Conditioning for Text-to-Image Synthesis | Hirunima Jayasekara et.al. | 2602.09165 | translate | read | null |
| 2026-02-09 | Autoregressive Image Generation with Masked Bit Modeling | Qihang Yu et.al. | 2602.09024 | translate | read | null |
| 2026-02-09 | ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | Zihan Yang et.al. | 2602.09014 | translate | read | null |
| 2026-02-09 | GEBench: Benchmarking Image Generation Models as GUI Environments | Haodong Li et.al. | 2602.09007 | translate | read | null |
| 2026-02-09 | Shifting the Breaking Point of Flow Matching for Multi-Instance Editing | Carmine Zaccagnino et.al. | 2602.08749 | translate | read | null |
| 2026-02-09 | Forget Superresolution, Sample Adaptively (when Path Tracing) | Martin Bálint et.al. | 2602.08642 | translate | read | null |
| 2026-02-09 | Inspiration Seeds: Learning Non-Literal Visual Combinations for Generative Exploration | Kfir Goldberg et.al. | 2602.08615 | translate | read | null |
| 2026-02-09 | Trajectory Stitching for Solving Inverse Problems with Flow-Based Models | Alexander Denker et.al. | 2602.08538 | translate | read | null |
| 2026-02-09 | UReason: Benchmarking the Reasoning Paradox in Unified Multimodal Models | Cheng Yang et.al. | 2602.08336 | translate | read | null |
| 2026-02-09 | Room Temperature Collective Blinking and Photon Bunching from CsPbBr3 Quantum Dot Superlattice | Qiwen Tan et.al. | 2602.08301 | translate | read | null |
| 2026-02-09 | A Unified Framework for Multimodal Image Reconstruction and Synthesis using Denoising Diffusion Models | Weijie Gan et.al. | 2602.08249 | translate | read | null |
| 2026-02-04 | Reliable and Responsible Foundation Models: A Comprehensive Survey | Xinyu Yang et.al. | 2602.08145 | translate | read | null |
| 2026-02-08 | Enhanced Mixture 3D CGAN for Completion and Generation of 3D Objects | Yahia Hamdi et.al. | 2602.08046 | translate | read | null |
| 2026-02-08 | Deepfake Synthesis vs. Detection: An Uneven Contest | Md. Tarek Hasan et.al. | 2602.07986 | translate | read | null |
| 2026-02-08 | Accelerating Black Hole Image Generation via Latent Space Diffusion Models | Ao Liu et.al. | 2602.07786 | translate | read | null |
| 2026-02-07 | FlexID: Training-Free Flexible Identity Injection via Intent-Aware Modulation for Text-to-Image Generation | Guandong Li et.al. | 2602.07554 | translate | read | null |
| 2026-02-07 | PTB-XL-Image-17K: A Large-Scale Synthetic ECG Image Dataset with Comprehensive Ground Truth for Deep Learning-Based Digitization | Naqcho Ali Mehdi et.al. | 2602.07446 | translate | read | null |
| 2026-02-06 | The Double-Edged Sword of Data-Driven Super-Resolution: Adversarial Super-Resolution Models | Haley Duba-Sullivan et.al. | 2602.07251 | translate | read | null |
| 2026-02-06 | Lite-BD: A Lightweight Black-box Backdoor Defense via Reviving Multi-Stage Image Transformations | Abdullah Arafat Miah et.al. | 2602.07197 | translate | read | null |
| 2026-02-06 | WorldEdit: Towards Open-World Image Editing with a Knowledge-Informed Benchmark | Wang Lin et.al. | 2602.07095 | translate | read | null |
| 2026-02-05 | Bidirectional Reward-Guided Diffusion for Real-World Image Super-Resolution | Zihao Fan et.al. | 2602.07069 | translate | read | null |
| 2026-02-05 | Exploring Physical Intelligence Emergence via Omni-Modal Architecture and Physical Data Engine | Minghao Han et.al. | 2602.07064 | translate | read | null |
| 2026-02-04 | FADE: Selective Forgetting via Sparse LoRA and Self-Distillation | Carolina R. Kelsch et.al. | 2602.07058 | translate | read | null |
| 2026-02-02 | Condition Errors Refinement in Autoregressive Image Generation with Diffusion Loss | Yucheng Zhou et.al. | 2602.07022 | translate | read | null |
| 2026-02-06 | Prompt Reinjection: Alleviating Prompt Forgetting in Multimodal Diffusion Transformers | Yuxuan Yao et.al. | 2602.06886 | translate | read | null |
| 2026-02-06 | NanoFLUX: Distillation-Driven Compression of Large Text-to-Image Generation Models for Mobile Devices | Ruchika Chavhan et.al. | 2602.06879 | translate | read | null |
| 2026-02-06 | RFDM: Residual Flow Diffusion Model for Efficient Causal Video Editing | Mohammadreza Salehi et.al. | 2602.06871 | translate | read | null |
| 2026-02-06 | AEGPO: Adaptive Entropy-Guided Policy Optimization for Diffusion Models | Yuming Li et.al. | 2602.06825 | translate | read | null |
| 2026-02-06 | RAIGen: Rare Attribute Identification in Text-to-Image Generative Models | Silpa Vadakkeeveetil Sreelatha et.al. | 2602.06806 | translate | read | null |
| 2026-02-06 | PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks | Junxian Li et.al. | 2602.06663 | translate | read | link |
| 2026-02-06 | ChatUMM: Robust Context Tracking for Conversational Interleaved Generation | Wenxun Dai et.al. | 2602.06442 | translate | read | null |
| 2026-02-06 | Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO | Yunze Tong et.al. | 2602.06422 | translate | read | link |
| 2026-02-05 | GRP-Obliteration: Unaligning LLMs With a Single Unlabeled Prompt | Mark Russinovich et.al. | 2602.06258 | translate | read | null |
| 2026-02-05 | A Fast and Generalizable Fourier Neural Operator-Based Surrogate for Melt-Pool Prediction in Laser Processing | Alix Benoit et.al. | 2602.06241 | translate | read | null |
| 2026-02-05 | Learning Rate Scaling across LoRA Ranks and Transfer to Full Finetuning | Nan Chen et.al. | 2602.06204 | translate | read | null |
| 2026-02-05 | M3: High-fidelity Text-to-Image Generation via Multi-Modal, Multi-Agent and Multi-Round Visual Reasoning | Bangji Yang et.al. | 2602.06166 | translate | read | null |
| 2026-02-05 | From Blurry to Believable: Enhancing Low-quality Talking Heads with 3D Generative Priors | Ding-Jiun Huang et.al. | 2602.06122 | translate | read | null |
| 2026-02-05 | Shared LoRA Subspaces for almost Strict Continual Learning | Prakhar Kaushik et.al. | 2602.06043 | translate | read | null |
| 2026-02-05 | Discrete diffusion samplers and bridges: Off-policy algorithms and applications in latent spaces | Arran Carter et.al. | 2602.05961 | translate | read | null |
| 2026-02-05 | Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching | Junwan Kim et.al. | 2602.05951 | translate | read | null |
| 2026-02-05 | CLIP-Map: Structured Matrix Mapping for Parameter-Efficient CLIP Compression | Kangjie Zhang et.al. | 2602.05909 | translate | read | null |
| 2026-02-05 | Synthesizing Realistic Test Data without Breaking Privacy | Laura Plein et.al. | 2602.05833 | translate | read | null |
| 2026-02-05 | SSG: Scaled Spatial Guidance for Multi-Scale Visual Autoregressive Generation | Youngwoo Shin et.al. | 2602.05534 | translate | read | null |
| 2026-02-05 | DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching | Chang Zou et.al. | 2602.05449 | translate | read | null |
| 2026-02-05 | Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration towards High-Quality Speech Generation from SSL features | Hien Ohnaka et.al. | 2602.05443 | translate | read | null |
| 2026-02-04 | Rule-Based Spatial Mixture-of-Experts U-Net for Explainable Edge Detection | Bharadwaj Dogga et.al. | 2602.05100 | translate | read | null |
| 2026-02-04 | Untwisting RoPE: Frequency Control for Shared Attention in DiTs | Aryan Mikaeili et.al. | 2602.05013 | translate | read | null |
| 2026-02-04 | A uniformly accurate multiscale time integrator for the nonlinear Klein-Gordon equation in the nonrelativistic regime via simplified transmission conditions | Weizhu Bao et.al. | 2602.04988 | translate | read | null |
| 2026-02-04 | The Birthmark Standard: Privacy-Preserving Photo Authentication via Hardware Roots of Trust and Consortium Blockchain | Sam Ryan et.al. | 2602.04933 | translate | read | null |
| 2026-02-04 | ConvRML: High-Quality Lensless Imaging with Random Multi-Focal Lenslets | Leyla A. Kabuli et.al. | 2602.04834 | translate | read | null |
| 2026-02-04 | XtraLight-MedMamba for Classification of Neoplastic Tubular Adenomas | Aqsa Sultana et.al. | 2602.04819 | translate | read | null |
| 2026-02-04 | X2HDR: HDR Image Generation in a Perceptually Uniform Space | Ronghuan Wu et.al. | 2602.04814 | translate | read | null |
| 2026-02-04 | Adaptive Prompt Elicitation for Text-to-Image Generation | Xinyi Wen et.al. | 2602.04713 | translate | read | null |
| 2026-02-04 | Turbulence teaches equivariance to neural networks | Ryley McConkey et.al. | 2602.04695 | translate | read | null |
| 2026-02-04 | Investigating Disability Representations in Text-to-Image Models | Yang Yian et.al. | 2602.04687 | translate | read | null |
| 2026-02-04 | Rethinking the Design Space of Reinforcement Learning for Diffusion Models: On the Importance of Likelihood Estimation Beyond Loss Design | Jaemoo Choi et.al. | 2602.04663 | translate | read | null |
| 2026-02-04 | HoloHema: Digital Holographic Hematology Analyzer | Andreas Erik Gejl Madsen et.al. | 2602.04618 | translate | read | null |
| 2026-02-04 | Bayesian PINNs for uncertainty-aware inverse problems (BPINN-IP) | Ali Mohammad-Djafari et.al. | 2602.04459 | translate | read | null |
| 2026-02-04 | From Sparse Sensors to Continuous Fields: STRIDE for Spatiotemporal Reconstruction | Yanjie Tong et.al. | 2602.04201 | translate | read | null |
| 2026-02-04 | Continuous Degradation Modeling via Latent Flow Matching for Real-World Super-Resolution | Hyeonjae Kim et.al. | 2602.04193 | translate | read | null |
| 2026-02-04 | Spatial Angular Pseudo-Derivative Searching: A Single Snapshot Super-resolution Sparse DOA Scheme with Potential for Practical Application | Longxin Bai et.al. | 2602.04169 | translate | read | null |
| 2026-02-04 | PFluxTTS: Hybrid Flow-Matching TTS with Robust Cross-Lingual Voice Cloning and Inference-Time Model Fusion | Vikentii Pankov et.al. | 2602.04160 | translate | read | null |
| 2026-02-03 | Progressive Checkerboards for Autoregressive Multiscale Image Generation | David Eigen et.al. | 2602.03811 | translate | read | null |
| 2026-02-03 | Multi-Objective Optimization for Synthetic-to-Real Style Transfer | Estelle Chigot et.al. | 2602.03625 | translate | read | null |
| 2026-02-03 | Hierarchical Concept-to-Appearance Guidance for Multi-Subject Image Generation | Yijia Xu et.al. | 2602.03448 | translate | read | null |
| 2026-02-03 | Socratic-Geo: Synthetic Data Generation and Geometric Reasoning via Multi-Agent Interaction | Zhengbo Jiao et.al. | 2602.03414 | translate | read | null |
| 2026-02-03 | Enhancing Quantum Diffusion Models for Complex Image Generation | Jeongbin Jo et.al. | 2602.03405 | translate | read | null |
| 2026-02-03 | Tiled Prompts: Overcoming Prompt Underspecification in Image and Video Super-Resolution | Bryan Sangwoo Kim et.al. | 2602.03342 | translate | read | null |
| 2026-02-03 | Invisible Clean-Label Backdoor Attacks for Generative Data Augmentation | Ting Xiang et.al. | 2602.03316 | translate | read | null |
| 2026-02-03 | Spectral Evolution Search: Efficient Inference-Time Scaling for Reward-Aligned Image Generation | Jinyan Ye et.al. | 2602.03208 | translate | read | null |
| 2026-02-03 | LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution | Tianxing Wu et.al. | 2602.03182 | translate | read | null |
| 2026-02-03 | Inverse Design of Tunable Infrared Metasurface Absorbers via a Conditional Wasserstein Generative Adversarial Network | H. Shen et.al. | 2602.03062 | translate | read | null |
| 2026-02-03 | HP-GAN: Harnessing pretrained networks for GAN improvement with FakeTwins and discriminator consistency | Geonhui Son et.al. | 2602.03039 | translate | read | link |
| 2026-02-03 | Thinking inside the Convolution for Image Inpainting: Reconstructing Texture via Structure under Global and Local Side | Haipeng Liu et.al. | 2602.03013 | translate | read | null |
| 2026-02-03 | Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation | David McShannon et.al. | 2602.02955 | translate | read | null |
| 2026-02-02 | Training-Free Self-Correction for Multimodal Masked Diffusion Models | Yidong Ouyang et.al. | 2602.02927 | translate | read | null |
| 2026-02-02 | From Tokens to Numbers: Continuous Number Modeling for SVG Generation | Michael Ogezi et.al. | 2602.02820 | translate | read | null |
| 2026-02-02 | Super-Resolution and Denoising of Corneal B-Scan OCT Imaging Using Diffusion Model Plug-and-Play Priors | Yaning Wang et.al. | 2602.02795 | translate | read | null |
| 2026-02-02 | CryoLVM: Self-supervised Learning from Cryo-EM Density Maps with Large Vision Models | Weining Fu et.al. | 2602.02620 | translate | read | null |
| 2026-02-02 | PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss | Zehong Ma et.al. | 2602.02493 | translate | read | link |
| 2026-02-02 | UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing | Dianyi Wang et.al. | 2602.02437 | translate | read | null |
| 2026-02-02 | Trust Region Continual Learning as an Implicit Meta-Learner | Zekun Wang et.al. | 2602.02417 | translate | read | null |
| 2026-02-02 | Personalized Image Generation via Human-in-the-loop Bayesian Optimization | Rajalaxmi Rajagopalan et.al. | 2602.02388 | translate | read | null |
| 2026-02-02 | VQ-Style: Disentangling Style and Content in Motion with Residual Quantized Representations | Fatemeh Zargarbashi et.al. | 2602.02334 | translate | read | null |
| 2026-02-02 | Variational Entropic Optimal Transport | Roman Dyachenko et.al. | 2602.02241 | translate | read | null |
| 2026-02-02 | Geometry- and Relation-Aware Diffusion for EEG Super-Resolution | Laura Yao et.al. | 2602.02238 | translate | read | null |
| 2026-02-02 | Show, Don’t Tell: Morphing Latent Reasoning into Image Generation | Harold Haodong Chen et.al. | 2602.02227 | translate | read | link |
| 2026-02-02 | Lung Nodule Image Synthesis Driven by Two-Stage Generative Adversarial Networks | Lu Cao et.al. | 2602.02171 | translate | read | null |
| 2026-02-02 | Enhancing Diffusion-Based Quantitatively Controllable Image Generation via Matrix-Form EDM and Adaptive Vicinal Training | Xin Ding et.al. | 2602.02114 | translate | read | null |
| 2026-02-02 | SIDiffAgent: Self-Improving Diffusion Agent | Shivank Garg et.al. | 2602.02051 | translate | read | null |
| 2026-02-02 | One Size, Many Fits: Aligning Diverse Group-Wise Click Preferences in Large-Scale Advertising Image Generation | Shuo Lu et.al. | 2602.02033 | translate | read | null |
| 2026-02-02 | Edge-Aligned Initialization of Kernels for Steered Mixture-of-Experts | Martin Determann et.al. | 2602.02031 | translate | read | null |
| 2026-02-02 | Leveraging Latent Vector Prediction for Localized Control in Image Generation via Diffusion Models | Pablo Domingo-Gregorio et.al. | 2602.01991 | translate | read | null |
| 2026-02-02 | Trust but Verify: Adaptive Conditioning for Reference-Based Diffusion Super-Resolution via Implicit Reference Correlation Modeling | Yuan Wang et.al. | 2602.01864 | translate | read | null |
| 2026-02-02 | Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation | Jun He et.al. | 2602.01756 | translate | read | link |
| 2026-02-02 | Physics Informed Generative AI Enabling Labour Free Segmentation For Microscopy Analysis | Salma Zahran et.al. | 2602.01710 | translate | read | null |
| 2026-02-02 | Moonworks Lunara Aesthetic II: An Image Variation Dataset | Yan Wang et.al. | 2602.01666 | translate | read | null |
| 2026-02-02 | Cloud-Cloud Collisions Induce Filament-Mediated Super Star Cluster Formation in the Antennae Overlap Region: Evidence from ALMA and JWST | Tomonari Michiyama et.al. | 2602.01616 | translate | read | null |
| 2026-02-02 | Token Pruning for In-Context Generation in Diffusion Transformers | Junqing Lin et.al. | 2602.01609 | translate | read | null |
| 2026-02-02 | Know Your Step: Faster and Better Alignment for Flow Matching Models via Step-aware Advantages | Zhixiong Yue et.al. | 2602.01591 | translate | read | null |
| 2026-02-01 | Theoretical Analysis of Measure Consistency Regularization for Partially Observed Data | Yinsong Wang et.al. | 2602.01437 | translate | read | null |
| 2026-02-01 | PromptRL: Prompt Matters in RL for Flow-Based Image Generation | Fu-Yun Wang et.al. | 2602.01382 | translate | read | null |
| 2026-02-01 | Balancing Understanding and Generation in Discrete Diffusion Models | Yue Liu et.al. | 2602.01362 | translate | read | null |
| 2026-02-01 | FlowCast: Trajectory Forecasting for Scalable Zero-Cost Speculative Flow Matching | Divya Jyoti Bajpai et.al. | 2602.01329 | translate | read | null |
| 2026-02-01 | StoryState: Agent-Based State Control for Consistent and Editable Storybooks | Ayushman Sarkar et.al. | 2602.01305 | translate | read | null |
| 2026-02-01 | Gradient-Aligned Calibration for Post-Training Quantization of Diffusion Models | Dung Anh Hoang et.al. | 2602.01289 | translate | read | null |
| 2026-02-01 | Q-DiT4SR: Exploration of Detail-Preserving Diffusion Transformer Quantization for Real-World Image Super-Resolution | Xun Zhang et.al. | 2602.01273 | translate | read | null |
| 2026-02-01 | Bridging Lexical Ambiguity and Vision: A Mini Review on Visual Word Sense Disambiguation | Shashini Nilukshi et.al. | 2602.01193 | translate | read | null |
| 2026-02-01 | Generalized Radius and Integrated Codebook Transforms for Differentiable Vector Quantization | Haochen You et.al. | 2602.01140 | translate | read | null |
| 2026-02-01 | PISA: Piecewise Sparse Attention Is Wiser for Efficient Diffusion Transformers | Haopeng Li et.al. | 2602.01077 | translate | read | null |