AVSR-Diff (ECCV 2026)

Geunhyuk Youk¹, Jeonghyeok Do¹, Dayeon Kim¹, Jihyong Oh†², Munchurl Kim†¹
¹Korea Advanced Institute of Science and Technology (KAIST), South Korea
²Chung-Ang University, South Korea
†Co-corresponding authors

This repository is the official implementation of "AVSR-Diff: Scale-Agnostic Diffusion Priors for Temporally Consistent Arbitrary-Scale Video Super-Resolution".

📧 News

  • July 1, 2026: This repository was created.
  • June 18, 2026: AVSR-Diff was accepted to ECCV 2026 🎉

📖 Abstract

Diffusion models have significantly advanced video super-resolution (VSR) but remain largely constrained to fixed upsampling scales. Conversely, while coordinate-based arbitrary-scale VSR methods offer scale flexibility, they inherently suffer from severe over-smoothing at large scaling factors. Integrating generative priors with continuous decoding is promising but currently hindered by severe temporal flickering caused by the stochasticity of diffusion sampling. To address this, we propose AVSR-Diff (Arbitrary-scale Video Super-Resolution with Diffusion), a novel decoupled framework that separates scale-agnostic latent denoising from continuous coordinate rendering, effectively avoiding computationally heavy resolution-specific sampling. Our approach introduces a Temporally-Gated Feature Recurrence (TGFR) module to extract strictly aligned, temporally consistent latent priors. Furthermore, we design a continuous video VAE decoder incorporating a Scale-Aware Fourier Refinement (SAFR) module to dynamically adapt frequency components to any target scale. Extensive experiments demonstrate that AVSR-Diff consistently preserves high-frequency details and strong temporal stability across various scales, surpassing state-of-the-art arbitrary-scale baselines. Remarkably, our framework outperforms recent fixed-scale generative models even on their native resolution.

🖼️ Method Overview

AVSR-Diff is a decoupled framework built upon a pre-trained single-image super-resolution LDM (SD×4 Upscaler), decomposed into two stages: scale-agnostic latent denoising and arbitrary-scale continuous decoding.
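
The decoupling can be pictured as running the sampler once and reusing its output for every target scale. The sketch below is a toy illustration only, not the released code (which is not yet available): `denoise_latent` and `decode_at_scale` are hypothetical stand-ins, the "latent" is a small 2D grid of floats, and plain bilinear interpolation at continuous coordinates is swapped in for the learned Continuous Video Decoder.

```python
import math


def denoise_latent(lr_latent):
    """Hypothetical stand-in for scale-agnostic latent denoising.

    It never receives the target scale, so its cost depends only on the
    size of the LR latent. Here it is an identity placeholder.
    """
    return [row[:] for row in lr_latent]


def decode_at_scale(latent, scale):
    """Hypothetical stand-in for continuous decoding.

    Queries the latent grid at continuous coordinates with bilinear
    interpolation; the real decoder is a learned network.
    """
    h, w = len(latent), len(latent[0])
    out_h, out_w = round(h * scale), round(w * scale)

    def source_index(k, size):
        # Map output pixel centre k to a continuous latent coordinate,
        # clamped to the valid grid range.
        pos = min(max((k + 0.5) / scale - 0.5, 0.0), size - 1.0)
        lo = int(math.floor(pos))
        hi = min(lo + 1, size - 1)
        return lo, hi, pos - lo

    out = []
    for i in range(out_h):
        y0, y1, wy = source_index(i, h)
        row = []
        for j in range(out_w):
            x0, x1, wx = source_index(j, w)
            top = latent[y0][x0] * (1 - wx) + latent[y0][x1] * wx
            bottom = latent[y1][x0] * (1 - wx) + latent[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# Stage 1 runs once; stage 2 is repeated per target scale (integer or not).
latent = denoise_latent([[0.0, 1.0], [2.0, 3.0]])
outputs = {s: decode_at_scale(latent, s) for s in (1.5, 2.0, 3.7)}
```

The point of the structure is that only `decode_at_scale` takes the scale as an argument, so changing the output resolution does not require re-running the denoising stage.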

  • A trainable ControlNet guides the frozen denoising U-Net for scale-agnostic latent denoising. The Temporally-Gated Feature Recurrence (TGFR) module aligns and dynamically gates recurrent features across adjacent frames, suppressing the diffusion-inherent temporal flickering.
  • The denoised latent sequence is decoded by the Continuous Video Decoder, which employs the Scale-Aware Fourier Refinement (SAFR) module to conditionally modulate high-frequency details based on the target scale, enabling high-fidelity, temporally consistent rendering at any continuous resolution.

Figure: Overview of AVSR-Diff
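
The gating idea in the first bullet can be illustrated with a toy recurrence. This is a minimal sketch under loud assumptions, not the authors' TGFR module: features are plain lists of floats, alignment is assumed to have happened already, and the gate is a hand-written sigmoid of the absolute feature difference (`sharpness` and `threshold` are hypothetical parameters) rather than a learned network.

```python
import math


def gated_recurrence(current, aligned_prev, sharpness=10.0, threshold=0.5):
    """Blend the previous (already aligned) features into the current ones.

    ASSUMPTION: the gate is a fixed sigmoid of the feature difference.
    It is close to 1 where the two frames agree (reuse the past, which
    damps flicker) and close to 0 where they disagree (trust the new frame).
    """
    fused = []
    for c, p in zip(current, aligned_prev):
        # Clamp the exponent so very large differences cannot overflow exp().
        z = min(sharpness * (abs(c - p) - threshold), 60.0)
        gate = 1.0 / (1.0 + math.exp(z))
        fused.append(gate * p + (1.0 - gate) * c)
    return fused


def run_sequence(frames):
    """Propagate the fused state through the frames in temporal order."""
    state = frames[0]
    outputs = [state]
    for frame in frames[1:]:
        state = gated_recurrence(frame, state)
        outputs.append(state)
    return outputs


# Element 0 only jitters around 1.0; element 1 changes for real in frame 2.
frames = [[1.0, 0.0], [1.1, 0.0], [0.9, 5.0]]
smoothed = run_sequence(frames)
```

In this toy setting the jittering element stays close to 1.0 across frames, while the genuine change in the second element passes through almost unchanged, which is the behaviour a temporal gate is meant to provide.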

📊 Results

Memory Efficiency

AVSR-Diff performs diffusion sampling entirely within a fixed LR latent space, so its peak GPU memory stays nearly constant regardless of the target scale, whereas the memory of resolution-specific baselines (e.g., VEnhancer) grows rapidly as the scale increases.

Figure: Peak GPU Memory vs. Target Scale
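
A back-of-the-envelope count shows why sampling in a fixed LR latent space keeps the cost flat. The sketch below only counts tensor elements seen by the sampler; the 4-channel latent, the 64×64 LR latent size, and the assumption that a resolution-specific baseline denoises a latent whose side length grows with the scale are illustrative choices, not measurements from the paper. Real peak memory also depends on activations, attention, and the decoder.

```python
def latent_elements(height, width, channels=4, frames=1):
    """Number of values in a latent tensor of the given shape."""
    return frames * channels * height * width


def sampling_footprint(lr_h, lr_w, scale, fixed_lr_latent):
    """Elements the sampler has to hold for one denoising step.

    fixed_lr_latent=True  : sampling stays at the LR latent size (scale is ignored).
    fixed_lr_latent=False : ASSUMED baseline whose latent grows with the target scale.
    """
    if fixed_lr_latent:
        return latent_elements(lr_h, lr_w)
    return latent_elements(round(lr_h * scale), round(lr_w * scale))


for scale in (2, 4, 8):
    fixed = sampling_footprint(64, 64, scale, fixed_lr_latent=True)
    growing = sampling_footprint(64, 64, scale, fixed_lr_latent=False)
    print(f"x{scale}: fixed={fixed}, resolution-specific={growing}, "
          f"ratio={growing / fixed:.0f}x")
```

Under these assumptions the fixed-latent count is independent of the scale, while the resolution-specific count grows with the square of the scale.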

Qualitative Comparison

Across arbitrary and large scaling factors, AVSR-Diff synthesizes sharp, faithful textures while preserving structural fidelity, without the over-smoothing of regression-based methods or the structural distortions of prior generative baselines.

Figure: Qualitative Comparison across scales

🚀 Code Release Plan

The full code and pretrained models will be released soon.

  • Inference code
  • Pretrained models
  • Training scripts
  • Evaluation scripts

📑 Citation

If you find AVSR-Diff useful, please consider citing:

@inproceedings{youk2026avsr,
    author    = {Youk, Geunhyuk and Do, Jeonghyeok and Kim, Dayeon and Oh, Jihyong and Kim, Munchurl},
    title     = {AVSR-Diff: Scale-Agnostic Diffusion Priors for Temporally Consistent Arbitrary-Scale Video Super-Resolution},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year      = {2026}
}

📬 Contact

For any questions, please email rmsgurkjg@kaist.ac.kr.
