AIDASLab/Awesome-Diffusion-LLM

Awesome-Large-Language-Diffusion-Models

A comprehensive and structured list of research papers about Large-Language-Diffusion-Models (dLLMs).

Last major update: June 2026 — added ~90 new papers from Feb–May 2026: new scaling/training results (Scaling Beyond Masked DLMs, LIFT, TIDE MoE), decoding advances (DiCo, PSD, WINO, FeF-DLLM, S2D2), continuous diffusion (RePlaid, LangFlow, TextLDM, BitLM), hybrid/block methods (DCDM, Breaking Block Boundaries), RL/alignment (TraFL, RSPO, TRIMS), caching (MAGE, MetaState, EntropyCache, LoSA, PulseCol), theory/analysis, and downstream applications.


⚙️ Framework (Taxonomy)

  1. Surveys & Useful Resources
  2. Core Methodologies
  3. Reasoning & Policy Optimization
  4. Token Ordering & Generation Strategies
  5. System Efficiency & Acceleration
  6. Multi-modal & Physical AI
  7. Agentic & Tool-Use dLLMs
  8. Theory, Guidance & Applications
  9. Seminal Diffusion Papers

1. Surveys & Useful Resources

📚 Blogs & Reports

📝 Survey & Perspective Papers

Paper Title Year Venue Remark
Diffusion Models for Non-autoregressive Text Generation: A Survey 2023.03 IJCAI Early NAR-text survey
A Survey of Diffusion Models in NLP 2023.05 Arxiv Early NLP survey
Discrete Diffusion in Large Language and Multimodal Models: A Survey 2025.06 Arxiv dLLM + dMLLM
Diffusion-based Large Language Models Survey 2025.08 TechRxiv -
A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models 2025.08 Arxiv -
A Survey on Diffusion Language Models 2025.08 Arxiv VILA-Lab; comprehensive
Efficient Diffusion Language Models: A Comprehensive Survey 2026.01 - Efficiency-focused
Top 10 Open Challenges Steering the Future of Diffusion Language Model and Its Variants 2026.01 Arxiv Perspective / roadmap
A Tutorial on Diffusion Theory: From Differential Equations to Diffusion Models 2026.05 Arxiv Tutorial, Diffusion theory

2. Core Methodologies

2.1 Discrete & Masked Diffusion

Paper Title Year Venue Remark
DiffusER: Discrete Diffusion via Edit-based Reconstruction 2022.10 ICLR <7B
SSD-LM: Semi-autoregressive Simplex-based Diffusion for Modular Control 2022.10 ACL <7B, Simplex
DiffusionBERT: Improving Generative Masked Language Models 2022.11 ACL <7B, Masked
A Reparameterized Discrete Diffusion Model for Text Generation 2023.02 COLM <7B
David helps Goliath: Inference-Time Collaboration Between Small and Large Diffusion LMs 2023.05 NAACL >7B, Scale-collaboration
TESS: Text-to-Text Self-Conditioned Simplex Diffusion 2023.05 EACL <7B, Simplex
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning 2023.08 Arxiv >7B, Scaling
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (SEDD) 2023.10 ICML <7B, Discrete
Simplified and Generalized Masked Diffusion for Discrete Data (MD4) 2024.06 NeurIPS -
Simple and Effective Masked Diffusion Language Models (MDLM) 2024.06 NeurIPS <7B, Masked
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data (RADD) 2024.06 ICLR <7B, Masked
Scaling up Masked Diffusion Models on Text (SMDM) 2024.10 ICLR <7B, 1.1B Scaling
Energy-Based Diffusion Language Models for Text Generation (EDLM) 2024.10 ICLR <7B
Conditional MASK Discrete Diffusion Language Model 2024.11 EMNLP <7B
Non-Markovian Discrete Diffusion with Causal Language Models 2025.02 NeurIPS <7B
Large Language Diffusion Models (LLaDA) 2025.02 NeurIPS >7B, LLaDA-8B
Anchored Diffusion Language Model (ADLM) 2025.05 NeurIPS >7B; ANELBO objective
LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs 2025.06 Arxiv >7B, Context Scaling
Esoteric Language Models (Eso-LMs) 2025.06 Arxiv AR + MDM hybrid
Dream 7B: Diffusion Large Language Models 2025.08 Arxiv >7B, Dream-7B
Sequential Diffusion Language Models 2025.09 Arxiv >7B
LLaDA-MoE: A Sparse MoE Diffusion Language Model 2025.09 Arxiv >7B, 7B-A1B MoE from scratch
UltraLLaDA: Scaling Context to 128K 2025.10 Arxiv >7B, Context Scaling
Next Semantic Scale Prediction via Hierarchical Diffusion Language Models 2025.10 NeurIPS -
Masked Diffusion Models as Energy Minimization 2025.10 NeurIPS <7B
Soft-Masked Diffusion Language Models 2025.10 Arxiv <7B
Variational Masked Diffusion Models 2025.10 Arxiv <7B
Diffusion LLM with Native Variable Generation Lengths: Let [EOS] Lead the Way 2025.10 Arxiv >7B, Variable Length
Diffusion Language Models are Super Data Learners 2025.11 Arxiv Data efficiency
DiffuMamba: High-Throughput Diffusion LMs with Mamba Backbone 2025.11 Arxiv Non-Transformer Backbone
TiDAR: Think in Diffusion, Talk in Autoregression 2025.11 Arxiv >7B
C2DLM: Causal Concept-Guided Diffusion Large Language Models 2025.11 Arxiv >7B
Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models 2026.01 ACL Soft tokens, Masked
LLaDA2.0: Scaling Up Diffusion Language Models to 100B 2025.12 Arxiv >100B, MoE; Ant Group
LLaDA2.1: Speeding Up Text Diffusion via Token Editing 2026.02 Arxiv Editable State Evolution
Introspective Diffusion Language Models (I-DLM) 2026.04 Arxiv Introspective consistency
W1-4B-dLLM (WhaletechAI) 2026.04 HF Model 4B open dLLM; demo
Scaling Beyond Masked Diffusion Language Models 2026.02 Arxiv Uniform-state & interpolating diffusion scaling
dLLM: Simple Diffusion Language Modeling 2026.02 Arxiv Unified open-source dLLM framework
Generalized Discrete Diffusion from Snapshots 2026.03 Arxiv Unified arbitrary noising framework
Diffutron: A Masked Diffusion Language Model for Turkish Language 2026.03 Arxiv Multilingual, Turkish MDM
Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models 2026.04 Arxiv MoE, Expert-choice routing
Rethinking Token Prediction: Tree-Structured Diffusion Language Model 2026.04 Arxiv Tree-structured token prediction
Drifting Objectives for Refining Discrete Diffusion Language Models 2026.05 Arxiv TokenDrift, anti-symmetric objective

2.2 Continuous & Latent Space Diffusion

Paper Title Year Venue Remark
Diffusion-LM Improves Controllable Text Generation 2022.05 NeurIPS <7B, Embedding
DiffuSeq: Sequence to Sequence Text Generation 2022.10 ICLR <7B, Embedding
Latent Diffusion for Language Generation 2022.12 NeurIPS <7B, Latent
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning 2022.12 NAACL <7B
Empowering Diffusion Models on the Embedding Space for Text Generation 2022.12 NAACL <7B, Embedding
Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise 2022.12 ICML <7B, Embedding
DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises 2023.02 TACL <7B, Embedding
Likelihood-Based Diffusion Language Models (Plaid) 2023.05 NeurIPS <7B, Plaid 1B
PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model 2023.06 NeurIPS <7B, Latent
Edit Flows: Flow Matching with Edit Operations 2025.06 Arxiv -
Coevolutionary Continuous Discrete Diffusion: Latent Reasoner 2025.10 Arxiv >7B; CCDD
Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning 2026.02 Arxiv Latent planning + AR hybrid
CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think 2026.03 Arxiv Contextual AR decoder for continuous diffusion
LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling 2026.04 Arxiv Flow matching via Bregman divergence
Scaling Properties of Continuous Diffusion Spoken Language Models 2026.04 Arxiv Continuous diffusion SLM, scaling laws
Towards Closing the Autoregressive Gap via Entropy-Gated Continuous Bitstream Diffusion 2026.05 Arxiv Continuous bitstream diffusion
TextLDM: Language Modeling with Continuous Latent Diffusion 2026.05 Arxiv DiT-style flow matching for text
How to Train Your Latent Diffusion Language Model Jointly With the Latent Space 2026.05 Arxiv Joint latent encoder+diffusion training
BitLM: Unlocking Multi-Token Language Generation with Bitwise Continuous Diffusion 2026.05 Arxiv Bitwise continuous diffusion head
Language Generation as Optimal Control: Closed-Loop Diffusion in Latent Control Space 2026.05 Arxiv HJB-based latent optimal control
Continuous Diffusion Scales Competitively with Discrete Diffusion for Language 2026.05 Arxiv RePlaid scaling law, continuous vs discrete

2.3 AR-to-Diffusion Adaptation

Paper Title Year Venue Remark
Scaling Diffusion Language Models via Adaptation from Autoregressive Models (DiffuGPT, DiffuLLaMA) 2024.10 ICLR >7B, GPT2/LLaMA2 Adaptation
Large Language Models to Diffusion Finetuning 2025.01 ICML >7B
TESS 2: A Large-Scale Generalist Diffusion Language Model 2025.02 ACL >7B, Adapted from Mistral
SDAR: A Synergistic Diffusion-AutoRegression Paradigm 2025.10 Arxiv >7B, Qwen3-based BD
From Next-Token to Next-Block: Principled Adaptation Path 2025.11 Arxiv >7B, Adaptation Path
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed 2025.12 Arxiv >7B
LLaDA2.0: Scaling Up Diffusion Language Models to 100B 2025.12 Arxiv >7B, AR→dLLM at 100B
Where Should Diffusion Enter a Language Model? Geometry-Guided Hidden-State Replacement 2026.05 Arxiv Geometry-guided layer insertion

2.4 Hybrid AR-Diffusion (Block / Forcing)

A new section: hybrids that interleave block-level AR with intra-block diffusion, or "forcing" approaches that retain causal masks for KV-cache reuse.
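
The block-level AR plus intra-block diffusion pattern described above can be sketched as a toy decoding loop. This is a minimal illustration, not the algorithm of any listed paper: `toy_denoiser` is a hypothetical stand-in for a model forward pass that returns random tokens and confidences, and the confidence-ranked unmasking rule, block size, and step count are assumptions chosen for readability.

```python
import random

MASK = None  # placeholder for a position that has not been decoded yet


def toy_denoiser(tokens, vocab, rng):
    """Hypothetical stand-in for a dLLM forward pass.

    Returns {position: (proposed_token, confidence)} for every masked position.
    A real model would condition on the whole sequence; this one samples at random.
    """
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(tokens)
        if tok is MASK
    }


def block_diffusion_decode(prompt, num_blocks, block_size, steps_per_block, vocab, seed=0):
    """Decode blocks left to right; fill each block by iterative unmasking."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(num_blocks):
        # Block-level AR: append a fully masked block after the committed prefix.
        start = len(tokens)
        tokens.extend([MASK] * block_size)
        per_step = -(-block_size // steps_per_block)  # ceiling division
        # Intra-block diffusion: commit the most confident proposals each step.
        while any(tok is MASK for tok in tokens[start:]):
            proposals = toy_denoiser(tokens, vocab, rng)
            ranked = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
            for i in ranked[:per_step]:
                tokens[i] = proposals[i][0]
        # The finished block is now a fixed prefix, which is the point where a
        # real system could reuse cached keys/values for later blocks.
    return tokens
```

For example, `block_diffusion_decode(["<s>"], num_blocks=3, block_size=4, steps_per_block=2, vocab=["a", "b", "c"])` returns the prompt token followed by 12 decoded tokens, with each block filled in two unmasking steps.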

Paper Title Year Venue Remark
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models (BD3-LM) 2025.03 ICLR <7B, Interpolation
D2F: Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing 2025.08 ICLR >7B, Faster-than-AR
Blockwise SFT for Diffusion Language Models: Reconciling Bidirectional Attention and Autoregressive Decoding 2025.08 Arxiv >7B
SDAR: A Synergistic Diffusion-AutoRegression Paradigm 2025.10 Arxiv >7B, Block hybrid
Encoder-Decoder Block Diffusion Language Models for Efficient Training and Inference (E2D2) 2025.10 NeurIPS Block Enc-Dec
Fast-dLLM v2: Efficient Block-Diffusion LLM 2025.09 Arxiv >7B, Block Decoding
WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference 2025.12 Arxiv Causal-attn diffusion
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding 2025.12 Arxiv Slot-level interleaving
Swordsman: Entropy-Driven Adaptive Block Partition for Efficient Diffusion Language Models 2026.02 Arxiv Adaptive block
DFlash: Block Diffusion for Flash Speculative Decoding 2026.02 Arxiv Block + speculative
Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models 2026.04 Arxiv Anchor-based cross-block decoding
When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models 2026.04 Arxiv Variable-size blocks
Dynamic Chunking for Diffusion Language Models 2026.05 Arxiv Content-defined semantic chunks

3. Reasoning & Policy Optimization

3.1 Reasoning & Planning

Paper Title Year Venue Remark
Diffusion of Thought: Chain-of-Thought Reasoning in dLLMs 2024.02 NeurIPS <7B, CoT Foundation
Beyond Autoregression: Discrete Diffusion for Complex Reasoning 2024.10 ICLR <7B
Tree Reward-Aligned Search for TReASURe in Masked Diffusion Language Models 2024.10 Arxiv Planning
d1: Scaling Reasoning in dLLMs via RL 2025.04 NeurIPS >7B, Reasoning scaling
Reinforcing the Diffusion Chain of Lateral Thought 2025.05 NeurIPS >7B
Thinking Inside the Mask: In-Place Prompting in dLLMs 2025.08 Arxiv >7B
Reinforced Context Order Recovery for Adaptive Reasoning 2025.08 Arxiv <7B, Planning
d2: Improved Techniques for Training Reasoning dLLMs 2025.09 Arxiv >7B
LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning 2025.10 Arxiv >7B
Beyond Surface Reasoning: Unveiling Long CoT Capacity 2025.10 Arxiv >7B
Coevolutionary Continuous Discrete Diffusion: Latent Reasoner 2025.10 Arxiv >7B
On the Reasoning Abilities of Masked Diffusion Language Models 2025.10 Arxiv >7B
Planner and Executor: Collaboration between Discrete Diffusion And Autoregressive Models in Reasoning 2025.10 Arxiv Collaboration
Diffuse Thinking: Exploring Diffusion Language Models as Efficient Thought Proposers for Reasoning 2025.10 Arxiv >7B
Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching 2026.02 Arxiv Step-level rationale stitching
Reasoning or Rationalization? The Role of Justifications in Masked Diffusion Models for Fact Verification 2026.03 Arxiv CoT dynamics analysis
Diffusion LLMs can think EoS-by-EoS 2026.03 Arxiv EoS-guided reasoning via padding
LogicDiff: Logic-Guided Denoising Improves Reasoning in Masked Diffusion Language Models 2026.03 Arxiv Logic-guided unmasking order
Learnability-Informed Fine-Tuning of Diffusion Language Models 2026.05 Arxiv LIFT, SFT with learnability schedule

3.2 Alignment & Reinforcement Learning

Paper Title Year Venue Remark
Preference-Based Alignment of Discrete Diffusion Models 2025.03 Arxiv >7B
DiFFPO: Training dLLMs to Reason Fast and Furious via RL 2025.05 Arxiv >7B, Direct Preference
LLaDA 1.5: Variance-Reduced Preference Optimization 2025.05 Arxiv >7B
wd1: Weighted Policy Optimization for Reasoning 2025.07 Arxiv >7B
Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position 2025.08 Arxiv >7B, Safety
Jailbreaking Large Language Diffusion Models: Revealing Hidden Safety Flaws in Diffusion-Based Text Generation 2025.07 Arxiv Safety
The Devil behind the mask: An emergent safety vulnerability 2025.07 Arxiv Safety
MDPO: Overcoming the Training-Inference Divide 2025.08 Arxiv >7B
Reward-Weighted Sampling: Enhancing Non-Autoregressive Characteristics in Masked Diffusion LLMs 2025.08 EMNLP >7B
Inpainting-Guided Policy Optimization for dLLMs 2025.09 Arxiv >7B
Taming Masked Diffusion via Consistency Trajectory RL 2025.09 Arxiv >7B
TR2-D2: Tree Search Guided Trajectory-Aware Fine-Tuning 2025.09 Arxiv >7B
Revolutionizing RL Framework for Diffusion Large Language Models 2025.09 Arxiv >7B
A2D: Any-Order, Any-Step Safety Alignment for Diffusion Language Models 2025.09 Arxiv Safety
DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models 2025.09 Arxiv Safety
RFG: Test-Time Scaling for Diffusion Large Language Model Reasoning with Reward-Free Guidance 2025.09 Arxiv >7B
AGRPO: Simple Policy Gradients for Reasoning with Diffusion Language Models 2025.10 Arxiv >7B
Improving Reasoning via Group Diffusion Policy Optimization (GDPO) 2025.10 Arxiv >7B
Step-Aware Policy Optimization for Reasoning 2025.10 Arxiv >7B
MRO: Enhancing Reasoning via Multi-Reward Optimization 2025.10 Arxiv >7B
Enhancing Reasoning via Distribution Matching Policy Optimization 2025.10 Arxiv >7B
Boundary-Guided Policy Optimization for Memory-efficient RL 2025.10 Arxiv >7B
SPG: Sandwiched Policy Gradient for Masked Diffusion 2025.10 Arxiv >7B
Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies 2025.10 Arxiv >7B
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States 2025.10 Arxiv >7B
Principled RL for Diffusion LLMs Emerges from a Sequence-Level Perspective 2025.12 Arxiv >7B
d-TreeRPO: Towards More Reliable Policy Optimization for Diffusion Language Models 2025.12 Arxiv >7B
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor 2026.04 Arxiv Unified RL framework
DiRL: An Efficient Post-Training Framework for Diffusion Language Models 2025.12 Arxiv Post-training
Efficient and Stable Reinforcement Learning for Diffusion Language Models 2026.02 Arxiv Variance reduction
Agents of Diffusion: Enhancing Diffusion Language Models with Multi-Agent Reinforcement Learning for Structured Data Generation 2026.01 Arxiv Multi-agent RL
Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages 2026.03 Arxiv MDP formulation, entropy-guided steps
TRIMS: Trajectory-Ranked Instruction Masked Supervision for Diffusion Language Models 2026.04 Arxiv Trajectory-ranked SFT
Relative Score Policy Optimization for Diffusion Language Models 2026.05 Arxiv RSPO, RLVR for dLLMs
Adaptive Steering and Remasking for Safe Generation in Diffusion Language Models 2026.05 Arxiv Safety, contrastive steering
Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models 2026.05 Arxiv TraFL, trajectory-balance objective

4. Token Ordering & Generation Strategies

Paper Title Year Venue Remark
SSD-LM: Semi-autoregressive Simplex-based Diffusion for Modular Control 2022.10 ACL <7B, Blockwise
AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation 2023.05 NeurIPS <7B, AR-like noise
Train for the Worst, Plan for the Best: Understanding Token Ordering 2025.02 ICML <7B, Ordering Analysis
Block Diffusion: Interpolating Between Autoregressive and Diffusion LMs 2025.03 ICLR <7B, Interpolation
Review, Remask, Refine (R3): Process-Guided Block Diffusion 2025.07 ICML MOSS >7B, Block-wise
Any-Order Flexible Length Masked Diffusion 2025.09 Arxiv <7B, Order Flexibility
Don't Settle Too Early: Self-Reflective Remasking for Diffusion Language Models 2025.09 Arxiv >7B, Remasking
Don't Let It Fade: Preserving Edits via Token Timestep Allocation 2025.10 NeurIPS <7B, Edit preservation
Finish First, Perfect Later: Test-Time Token-Level Cross-Validation for Diffusion Large Language Models 2025.10 Arxiv >7B, Unmasking
Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies 2025.10 Arxiv >7B, Unmasking
Parallel Sampling from Masked Diffusion Models via Conditional Independence Testing 2025.10 Arxiv >7B, Unmasking
Diffusion Language Model Inference with Monte Carlo Tree Search 2025.12 Arxiv >7B, MCTS
Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty 2025.12 Arxiv >7B, Unmasking
Adaptation to Intrinsic Dependence in Diffusion Language Models 2026.02 Arxiv Distribution-agnostic schedule
Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration 2026.03 ACL Self-evaluation, Flexible length
D5P4: Partition Determinantal Point Process for Diversity in Parallel Discrete Diffusion Decoding 2026.03 Arxiv Diversity-aware decoding
Improving Sampling for Masked Diffusion Models via Information Gain 2026.02 Arxiv Info-Gain sampler
DOS: Dependency-Oriented Sampler for Masked Diffusion Language Models 2026.03 Arxiv Dependency-aware unmasking
Diffusion Language Models Are Natively Length-Aware 2026.03 Arxiv Length-aware EoS generation
Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models 2026.04 Arxiv Quality vs exploration trade-off
Remask, Don't Replace: Token-to-Mask Refinement in Masked Diffusion Language Models 2026.04 Arxiv T2M refinement, LLaDA2.1 analysis
Edit-Based Refinement for Parallel Masked Diffusion Language Models 2026.05 Arxiv ME-DLM, edit-based post-correction
When Confidence Misleads: Suffix Anchoring and Anchor-Proximity Confidence Modulation for Diffusion Language Models 2026.05 Arxiv Suffix anchor, confidence modulation

5. System Efficiency & Acceleration

5.1 Caching & Memory Strategy

Paper Title Year Venue Remark
dKV-Cache: The Cache for Diffusion Language Models 2025.05 NeurIPS >7B
FlashDLM: Accelerating Diffusion Language Model Inference via Efficient KV Caching and Guided Diffusion 2025.05 Arxiv >7B
Fast-dLLM: Training-free Acceleration via KV Cache + Parallel Decoding 2025.05 Arxiv NVIDIA; KV cache + parallel
dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching 2025.06 Arxiv >7B
d^2Cache: Accelerating via Dual Adaptive Caching 2025.09 Arxiv >7B
Attention Is All You Need for KV Cache in dLLMs 2025.10 Arxiv >7B
Attention Sinks in Diffusion Language Models 2025.10 Arxiv >7B
WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference 2025.12 Arxiv >7B, Causal cache
Stop the Flip-Flop: Context-Preserving Verification for Fast Revocable Diffusion Decoding (COVER) 2026.02 Arxiv KV-override verification
Focus-dLLM: Accelerating Long-Context Diffusion LLM Inference via Confidence-Guided Context Focusing 2026.02 Arxiv Long-context sparsity
Mosaic: Unlocking Long-Context Inference for Diffusion LLMs via Global Memory Planning and Dynamic Peak Taming 2026.01 Arxiv Long-context memory
Residual Context Diffusion Language Models 2026.01 Arxiv Recycle discarded tokens
MAGE: All-[MASK] Block Already Knows Where to Look in Diffusion LLM 2026.02 Arxiv MASK-guided sparse attention, block dLLM
MetaState: Persistent Working Memory for Discrete Diffusion Language Models 2026.03 Arxiv GRU-style cross-step memory
DyLLM: Efficient Diffusion LLM Inference via Saliency-based Token Selection and Partial Attention 2026.03 Arxiv Saliency-based partial attention
EntropyCache: Decoded Token Entropy Guided KV Caching for Diffusion Language Models 2026.03 Arxiv Entropy-guided KV cache refresh
LoSA: Locality Aware Sparse Attention for Block-Wise Diffusion Language Models 2026.04 Arxiv Locality-aware sparse KV, block dLLM
PulseCol: Periodically Refreshed Column-Sparse Attention for Accelerating Diffusion Language Models 2026.05 Arxiv Column-sparse attention, periodic refresh

5.2 Decoding & Sampling

Paper Title Year Venue Remark
Speculative Diffusion Decoding: Accelerating Language Generation through Diffusion 2024.08 NAACL <7B, Speculative Decoding
Wide-In, Narrow-Out: Revokable Decoding for Effective dLLMs 2025.07 Arxiv >7B
Accelerating Diffusion LLMs via Adaptive Parallel Decoding (APD) 2025.05 NeurIPS >7B
DLM-One: Diffusion Language Models for One-Step Generation 2025.06 Arxiv <7B
Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles 2025.06 Arxiv >7B
Plan for Speed: Dilated Scheduling for Masked Diffusion Language Models 2025.06 Arxiv >7B
Beyond Fixed: Training-Free Variable-Length Denoising for Diffusion Large Language Models 2025.08 Arxiv >7B
DPad: Efficient Diffusion Language Models with Suffix Dropout 2025.08 Arxiv >7B
Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning 2025.09 NeurIPS >7B
AdaBlock-dLLM: Semantic-Aware Inference via Adaptive Block Size 2025.09 Arxiv >7B
dParallel: Learnable Parallel Decoding for dLLMs 2025.09 Arxiv >7B
Learning to Parallel: Accelerating dLLMs via Learnable Parallel Decoding 2025.09 Arxiv >7B
Spiffy: Multiplying Acceleration via Lossless Speculative Decoding 2025.09 Arxiv >7B, Speculative Decoding
DiffuSpec: Unlocking dLLMs for Speculative Decoding 2025.09 Arxiv >7B, Speculative Decoding
Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall 2025.10 Arxiv >7B
Saber: Efficient Sampling with Backtracking Enhanced Remasking 2025.10 Arxiv >7B
CreditDecoding: Parallel Decoding with Trace Credits 2025.10 Arxiv >7B
Accelerating dLLM Inference via Local Determinism Propagation 2025.10 Arxiv >7B
Self Speculative Decoding for Diffusion Large Language Models 2025.10 Arxiv >7B, Speculative Decoding
SpecDiff-2: Scaling Diffusion Drafter Alignment 2025.11 Arxiv >7B, Speculative Decoding
Orchestrating Dual-Boundaries: An Arithmetic Intensity Inspired Acceleration Framework for Diffusion Language Models 2025.11 Arxiv >7B
Beyond Confidence: Adaptive and Coherent Decoding for Diffusion Language Models 2025.11 Arxiv >7B
Fail Fast, Win Big: Rethinking the Drafting Strategy in Speculative Decoding via Diffusion LLMs 2025.12 Arxiv >7B, Speculative Decoding
Fast-Decoding via Progress-Aware Confidence Schedules 2025.12 Arxiv >7B
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding 2025.12 Arxiv >7B
Context-Aware Initialization for Reducing Generative Path Length in Diffusion Language Models 2025.12 Arxiv >7B
DART: Diffusion-Inspired Speculative Decoding for Fast LLM Inference 2026.01 Arxiv Speculative drafting
DFlash: Block Diffusion for Flash Speculative Decoding 2026.02 Arxiv Block + speculative
Swordsman: Entropy-Driven Adaptive Block Partition for Efficient Diffusion Language Models 2026.02 Arxiv Entropy-adaptive blocks
Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding 2026.02 Arxiv DiCo, three-phase parallel decoding
Free Lunch for Pass@k? Low Cost Diverse Sampling for Diffusion Language Models 2026.03 Arxiv Diverse sampling, Pass@k
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation 2026.03 Arxiv Self-speculation, block diffusion
Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models 2026.04 Arxiv DEMASK, dependency predictor
DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models 2026.04 Arxiv Speculative + causal drafter
Accelerating Speculative Decoding with Block Diffusion Draft Trees 2026.04 Arxiv Draft trees for block diffusion
Stability-Weighted Decoding for Diffusion Language Models 2026.04 Arxiv KL-based token stability metric
R²-dLLM: Accelerating Diffusion Large Language Models via Spatio-Temporal Redundancy Reduction 2026.04 Arxiv Spatial + temporal redundancy reduction
Focus on the Core: Empowering Diffusion Large Language Models by Self-Contrast 2026.05 Arxiv Self-contrast, HD token focus
Factorization-Error-Free Discrete Diffusion Language Model via Speculative Decoding 2026.05 Arxiv FeF-DLLM, prefix-conditioned factorization
PSD: Pushing the Pareto Frontier of Diffusion LLMs via Parallel Speculative Decoding 2026.05 Arxiv Parallel speculative, hierarchical acceptance
Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers 2026.05 Arxiv WINO revokable parallel decoding

5.3 Distillation, Quantization & Sparsity

Paper Title Year Venue Remark
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time 2024.10 ICLR <7B, Distillation
Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction 2025.08 Arxiv >7B, Sparsity
DLLMQuant: Quantizing Diffusion-based Large Language Models 2025.08 Arxiv >7B, Quantization
Quantization Meets dLLMs: Post-training Quantization Study 2025.08 Arxiv >7B, Quantization
FS-DFM: Few-Step Diffusion Language Model 2025.09 Arxiv >7B
SparseD: Sparse Attention for Diffusion Language Models 2025.09 Arxiv >7B, Sparsity
LLaDA-MoE: A Sparse MoE Diffusion Language Model 2025.09 Arxiv >7B, MoE
Ultra-Fast Language Generation via Discrete Diffusion Divergence Instruct 2025.10 Arxiv >7B, Distillation
CDLM: Consistency Diffusion Language Models For Faster Sampling 2025.11 Arxiv >7B, Consistency
Sink-Aware Pruning for Diffusion Language Models 2026.02 Arxiv Unstable sink pruning
FastDiSS: Few-step Match Many-step Diffusion Language Model on Sequence-to-Sequence Generation 2026.04 Arxiv Few-step distillation, S2S
On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks 2026.04 Arxiv GPTQ/HAWQ on code dLLMs
Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models 2026.04 Arxiv Cross-architecture dLLM distillation
TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM 2026.05 Arxiv Trajectory self-distillation
Infinite Mask Diffusion for Few-Step Distillation 2026.05 Arxiv IMDM, stochastic infinite-state mask
Self-Distilled Trajectory-Aware Boltzmann Modeling for Diffusion Language Models 2026.05 Arxiv TABOM, Boltzmann ranking objective
DiLaDiff: Distilled Latent-Augmented Diffusion for Language Modeling 2026.05 Arxiv Latent + consistency distillation

5.4 Inference Frameworks & Systems

New section: production-grade frameworks and runtime engineering for dLLMs.

Paper Title Year Venue Remark
dlmserve 2026.05 Repo First OSS serving engine for diffusion LMs (LLaDA family); OpenAI-compatible HTTP, step-level batching (2.5x HF), LocalLeap (1.8x). MIT.
FOCUS: DLLMs Know How to Tame Their Compute Bound 2026.01 ICML Training-free inference system; token eviction for higher throughput
dInfer: An Efficient Inference Framework for Diffusion Language Models 2025.10 Arxiv Modular framework, >1100 TPS
JetEngine (SDAR) 2025.10 Repo Lightweight engine for SDAR (3700+ TPS on H200)
Mercury: Ultra-Fast Language Models Based on Diffusion 2025.06 Arxiv Inception Labs commercial dLLM
Seed Diffusion: Large-Scale dLLM with High-Speed Inference 2025.08 Arxiv ByteDance code-focused dLLM
TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload 2026.05 Arxiv MoE expert offload, lossless

6. Multi-modal & Physical AI

6.1 Multi-modal dLLMs

Paper Title Year Venue Remark
Dual Diffusion for Unified Image Generation and Understanding 2025.01 Arxiv Unified Task
Unified Multimodal Discrete Diffusion (UniDisc) 2025.03 Arxiv Unified Diffusion
LaViDa: A Large Diffusion LLM for Multimodal Understanding 2025.05 NeurIPS Spotlight Understanding
MMaDA: Multimodal Large Diffusion Language Models 2025.05 NeurIPS Native Multimodal
Dimple: Discrete Diffusion Multimodal LLM with Parallel Decoding 2025.05 Arxiv Parallel Multimodal
LLaDA-V: Diffusion LLMs with Visual Instruction Tuning 2025.06 Arxiv Visual Tuning
Muddit: Liberating Generation Beyond Text-to-Image 2025.05 Arxiv Multi-modal
Show-o2: Improved Native Unified Multimodal Models 2025.06 Arxiv Unified Generation
Diffuse Everything: Multimodal Diffusion on Arbitrary Spaces 2025.06 ICML Arbitrary Spaces
TBAC-UniImage: Unified Understanding and Generation by Ladder-Side Diffusion Tuning 2025.08 Arxiv Tencent ladder-side tuning
Lumina-DiMOO: Omni Diffusion LLM for Generation 2025.10 Arxiv Omni-generation
MMaDA-Parallel: Thinking-Aware Editing and Generation 2025.11 Arxiv Parallel Multimodal
DiffusionVL: Translating AR Models into Diffusion VL Models 2025.12 Arxiv VL Adaptation
SDAR-VL: Stable and Efficient Block-wise Diffusion for Vision-Language Understanding 2025.12 Arxiv Block-diffusion VL
Dream-VL: Open Vision-Language Model with Diffusion Backbone 2025.12 Arxiv dVLM from Dream-7B
LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models 2026.02 Arxiv Unified RL post-training
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion 2026.03 Arxiv Any-to-any (text/speech/image)
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation 2026.04 Arxiv SigLIP-VQ + block diffusion
Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM 2026.04 Arxiv AR-VLM → block-diffusion VLM
Analyzing Diffusion and Autoregressive VLMs in Multimodal Embedding Space 2026.02 Arxiv Embedding analysis
Dynin-Omni: Omnimodal Unified Large Diffusion Language Model 2026.03 Arxiv Omnimodal

6.2 Vision-Language-Action (VLA)

Scope note: this section covers VLA models that use a diffusion/masked-diffusion language model as the backbone (dVLM-based VLA) or apply discrete diffusion as the action-decoding mechanism (not continuous diffusion action heads grafted onto an AR VLM). Pure continuous-diffusion-policy VLAs such as DiVLA (Wen et al., 2024), HybridVLA, and ProgressVLA are intentionally excluded because their language model is autoregressive — only the action head is diffusion-based.

(a) dVLM-backbone VLA — language backbone itself is a diffusion language model.

Paper Title Year Venue Remark
LLaDA-VLA: Vision Language Diffusion Action Models 2025.06 Arxiv First LLaDA(d-VLM)-based VLA
dVLA: Diffusion VLA with Multimodal Chain-of-Thought 2025.09 Arxiv dLLM backbone + multimodal CoT
Dream-VLA: Open Vision-Language-Action Model with Diffusion Backbone 2025.12 Arxiv dVLA from Dream-7B; first dLLM pretrained VLA
MMaDA-VLA: Large Diffusion VLA with Unified Multi-Modal Instruction and Generation 2026.03 Arxiv Native discrete-diffusion VLA from MMaDA

(b) Discrete-diffusion action decoding — language backbone may still be AR-VLM, but action chunks are decoded via discrete diffusion. Closely tied to dLLM literature for inference techniques.

Paper Title Year Venue Remark
Discrete Diffusion VLA: Action Decoding in VLA Policies 2025.08 Arxiv Unified-transformer + discrete-diffusion actions
E0: Enhancing Generalization and Fine-Grained Control in VLA Models via Tweedie Discrete Diffusion 2025.11 Arxiv AR-VLM backbone + Tweedie discrete diffusion on action tokens

6.3 Autonomous Driving / World Models

Scope note: works that apply discrete diffusion / masked-diffusion language modeling to driving trajectories, action codebooks, or tokenized world states. Continuous trajectory-diffusion planners (e.g., classical Diffusion Policy applied to driving) are out of scope.

Paper Title Year Venue Remark
Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion 2023.11 ICLR Discrete diffusion on tokenized point-cloud world model
ReflectDrive: Discrete Diffusion for Reflective Vision-Language-Action Models in Autonomous Driving 2025.09 Arxiv dLLM finetuned on discretized 2D driving space
Efficient and Explainable End-to-End Autonomous Driving via Masked Vision-Language-Action Diffusion 2026.02 Arxiv Discrete action codebook + masked diffusion
Fast-dDrive: Efficient Block-Diffusion VLM for Autonomous Driving 2026.05 Arxiv Block-diffusion VLA, speculative scaffold decoding

7. Agentic & Tool-Use dLLMs

New section (emerging line of work): how dLLMs behave as agents (planning, multi-turn interaction, tool calling). This is critical for connecting dLLMs to robotics and physical-AI agent stacks.

Paper Title Year Venue Remark
The Bitter Lesson of Diffusion Language Models for Agentic Workflows: A Comprehensive Reality Check 2026.01 Arxiv Embodied + tool-call eval
Agents of Diffusion: Enhancing Diffusion Language Models with Multi-Agent Reinforcement Learning for Structured Data Generation 2026.01 Arxiv Multi-agent RL
DLLM Agent: See Farther, Run Faster 2026.02 Arxiv dLLM-as-agent comparison

8. Theory, Guidance & Applications

8.1 Theory & Analysis

Paper Title Year Venue Remark
Can Diffusion Model Achieve Better Performance in Text Generation? Bridging the Gap between Training and Inference! 2023.05 ACL Findings <7B
TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings 2024.02 AAAI <7B
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling 2024.10 ICLR <7B
Theoretical Benefit and Limitation of Diffusion Language Model 2025.02 NeurIPS TER vs SER analysis
Generalized Interpolating Discrete Diffusion (GIDD) 2025.03 ICML Noising
Understanding the Quality-Diversity Trade-off in Diffusion Language Models 2025.03 ICML Quality-Diversity Trade-off
Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes 2025.05 ACL <7B
The Diffusion Duality 2025.06 ICML <7B, Theoretical Duality
Your Absorbing Discrete Diffusion Secretly Models the Bayesian Posterior 2025.07 Arxiv <7B
Time Is a Feature: Exploiting Temporal Dynamics in dLLMs 2025.08 Arxiv Temporal focus
Diffusion LLMs Know the Answer Before Decoding 2025.08 Arxiv Semantic focus
What Makes Diffusion Language Models Super Data Learners? 2025.10 Arxiv Data efficiency
Why mask diffusion does not work 2025.10 Arxiv Failure analysis
Empirical Analysis of Decoding Biases in Masked Diffusion Models 2025.10 Arxiv Decoding Bias
Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models 2025.10 Arxiv Speed Analysis
Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration 2026.03 ACL Self-evaluation, Generalization analysis
On the Role of Discreteness in Diffusion LLMs 2025.12 Arxiv Discreteness analysis
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs 2025.10 ICLR Benchmark
Diffusion Language Models are Super Data Learners 2025.11 Arxiv Data learner analysis
Adaptation to Intrinsic Dependence in Diffusion Language Models 2026.02 Arxiv Distribution-agnostic schedule theory
Confidence-Based Decoding is Provably Efficient for Diffusion Language Models 2026.03 Arxiv First theory of confidence-based decoding
Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding? 2026.02 Arxiv AR-like decoding analysis, NAP approach
Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects 2026.03 Arxiv Memorization, privacy analysis
Skip to the Good Part: Representation Structure & Inference-Time Layer Skipping in Diffusion vs. Autoregressive LLMs 2026.03 Arxiv Layer skipping, representation analysis
Autoregressive vs. Masked Diffusion Language Models: A Controlled Comparison 2026.03 Arxiv Controlled AR vs MDM empirical study
Why Gaussian Diffusion Models Fail on Discrete Data? 2026.04 Arxiv Multimodal sampling interval theory
Generative Frontiers: Why Evaluation Matters for Diffusion Language Models 2026.04 Arxiv Evaluation methodology critique
Lost in Diffusion: Uncovering Hallucination Patterns and Failure Modes in Diffusion Large Language Models 2026.04 Arxiv Hallucination patterns analysis
Early Decisions Matter: Proximity Bias and Initial Trajectory Shaping in Non-Autoregressive Diffusion Language Models 2026.04 Arxiv Proximity bias analysis
Measuring Temporal Linguistic Emergence in Diffusion Language Models 2026.04 Arxiv Temporal probing, linguistic emergence
Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data 2026.04 Arxiv Associative memory theory
Understanding and Accelerating the Training of Masked Diffusion Language Models 2026.05 Arxiv Bell-shaped time sampling, training speed
Uncertainty Quantification for Large Language Diffusion Models 2026.05 Arxiv UQ, hallucination detection for dLLMs

8.2 Guidance & Downstream Applications

Paper Title Year Venue Remark
DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation 2023.06 ACL Dialogue
DiffuDetox: A Mixed Diffusion Model for Text Detoxification 2023.06 ACL Findings Detoxification
PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation 2023.06 AAAI Poetry Generation
ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer 2023.08 AAAI Text Style Transfer
P^3SUM: Preserving Author's Perspective in News Summarization with Diffusion Language Models 2023.11 NAACL Summarization
DiffuCOMET: Contextual Commonsense Knowledge Diffusion 2024.02 ACL Commonsense
DiffusionDialog: A Diffusion Model for Diverse Dialog Generation with Latent Space 2024.04 LREC-COLING Dialogue
Diffusion Guided Language Modeling 2024.08 ACL Findings Control
DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models 2024.11 ACL Findings Data Synthesis
Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models 2024.12 ACL Text Segmentation
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models 2025.02 ACL Text Editing
Constrained Discrete Diffusion 2025.03 NeurIPS Constraint
Planning with Diffusion Models for Target-Oriented Dialogue Systems 2025.04 ACL Dialogue
CtrlDiff: Boosting dLLMs with Dynamic Block Prediction 2025.05 Arxiv Control
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective 2025.05 Arxiv Embedding
DINGO: Constrained Inference for Diffusion LLMs 2025.05 Arxiv Constrained Decoding
Inference-Time Scaling of Discrete Diffusion Models via Importance Weighting and Optimal Proposal Design 2025.05 ICLR SMC test-time scaling
Mercury: Ultra-Fast Language Models Based on Diffusion 2025.06 Arxiv Code
DiffuCoder: Improving Masked Diffusion for Code Generation 2025.06 Arxiv Code
Unveiling the Potential of Diffusion Large Language Model in Controllable Generation 2025.07 Arxiv Control
Arg-LLaDA: Argument Summarization via Large Language Diffusion Models and Sufficiency-Aware Refinement 2025.07 Arxiv Summarization
Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling 2025.08 Arxiv Text Style Transfer
Seed Diffusion: Large-Scale dLLM with High-Speed Inference 2025.08 Arxiv Code
TreeDiff: AST-Guided Code Generation with Diffusion LLMs 2025.08 Arxiv Code (syntax-aware)
Beyond Autoregression: Empirical Study for Code Generation 2025.09 Arxiv Code
Tree Reward-Aligned Search for TReASURe in Masked Diffusion Language Models 2024.10 Arxiv Control
Syntax-Guided Diffusion Language Models with User-Integrated Personalization 2025.10 Arxiv Personalization
TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models 2025.10 Arxiv Hallucination
Don't Let It Fade: Preserving Edits via Token Timestep Allocation 2025.10 NeurIPS Control
Diffusion Language Models for Speech Recognition 2026.04 Arxiv ASR rescoring (MDLM/USDM)
CAGenMol: Condition-Aware Diffusion Language Model for Goal-Directed Molecular Generation 2026.04 Arxiv Molecular generation
TabDLM: Free-Form Tabular Data Generation via Joint Numerical-Language Diffusion 2026.02 Arxiv Tabular generation, masked diffusion
Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models 2026.03 Arxiv RAG, retrieval-prior conflict handling
DynHD: Hallucination Detection for Diffusion Large Language Models via Denoising Dynamics Deviation Learning 2026.03 Arxiv Hallucination detection
Unlocking Prompt Infilling Capability for Diffusion Language Models 2026.04 Arxiv Prompt infilling via full-sequence masking
DiffuMask: Diffusion Language Model for Token-level Prompt Pruning 2026.04 Arxiv Prompt compression
BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning 2026.04 Arxiv Molecule generation + captioning
HIVE: Hidden-Evidence Verification for Hallucination Detection in Diffusion Large Language Models 2026.04 Arxiv Hallucination detection, denoising dynamics
Chainwash: Multi-Step Rewriting Attacks on Diffusion Language Model Watermarks 2026.05 Arxiv Watermark attack, security
DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models 2026.05 Arxiv Dense retrieval with dLLMs
Guidance Is Not a Hyperparameter: Learning Dynamic Control in Diffusion Language Models 2026.05 Arxiv Dynamic CFG via RL
Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models 2026.05 Arxiv Adaptive guidance schedule, SAE analysis
Constrained Code Generation with Discrete Diffusion 2026.05 Arxiv Neurosymbolic constrained code generation
Prompt Compression in Diffusion Large Language Models: Evaluating LLMLingua-2 on LLaDA 2026.05 Arxiv Prompt compression study
Machine Unlearning for Masked Diffusion Language Models 2026.05 Arxiv MDU, unlearning framework

9. Seminal Diffusion Papers

Paper Title Year Venue Remark
Deep Unsupervised Learning using Nonequilibrium Thermodynamics 2015.03 ICML Formulation
Denoising Diffusion Probabilistic Models (DDPM) 2020.06 NeurIPS -
Denoising Diffusion Implicit Models (DDIM) 2020.10 ICLR -
Score-Based Generative Modeling through SDEs 2020.11 ICLR -
Diffusion Models Beat GANs on Image Synthesis 2021.05 NeurIPS Classifier guidance
Structured Denoising Diffusion in Discrete State-Spaces (D3PM) 2021.07 NeurIPS Discrete
Vector Quantized Diffusion Model (VQ-Diffusion) 2021.11 CVPR VQ
High-Resolution Image Synthesis with Latent Diffusion (LDM) 2021.12 CVPR -
Progressive Distillation for Fast Sampling 2022.02 ICLR Distillation
DPM-Solver: Fast ODE Solver for Sampling 2022.06 NeurIPS -
Classifier-Free Diffusion Guidance 2022.07 NeurIPS CFG
Analog Bits: Generating Discrete Data using Diffusion 2022.08 ICLR Self-conditioning
Scalable Diffusion Models with Transformers (DiT) 2022.12 ICCV Scalable focus
Consistency Models 2023.03 ICML -

🤝 Contact
