FLUX kohya LoRA conversion crashes on final_layer (KeyError on missing adaLN_modulation_1; "Incompatible keys" on final_layer alphas) #13998

@christopher5106

Describe the bug

The kohya / sd-scripts FLUX LoRA converter (_convert_sd_scripts_to_ai_toolkit in lora_conversion_utils.py) crashes on LoRAs that include final_layer weights, in two distinct ways. Both reproduce on diffusers==0.38.0 and on current main.

Case A — KeyError when final_layer.linear has no adaLN_modulation_1 companion.
assign_remaining_weights pops the adaLN_modulation_1 source key unconditionally, so a LoRA that trained final_layer.linear but not final_layer.adaLN_modulation.1 (a real, if uncommon, kohya export) raises KeyError.

Case B — Incompatible keys detected when final_layer carries .alpha entries.
assign_remaining_weights consumes only lora_down/lora_up, never the .alpha keys. The leftover lora_unet_final_layer_*.alpha keys then reach the remaining_keys guard, which only tolerates lora_te* prefixes, so the (otherwise valid) LoRA is rejected.

Reproduction

import torch
from diffusers.loaders.lora_pipeline import FluxLoraLoaderMixin

RANK, HID = 4, 3072

def kohya(name, out, inp, alpha=True):
    d = {f"{name}.lora_down.weight": torch.zeros(RANK, inp),
         f"{name}.lora_up.weight":   torch.zeros(out, RANK)}
    if alpha:
        d[f"{name}.alpha"] = torch.tensor(float(RANK))
    return d

def base():  # one recognized block so detection routes to the kohya converter
    return kohya("lora_unet_double_blocks_0_img_attn_proj", HID, HID)

# Case A: final_layer.linear without adaLN_modulation_1  ->  KeyError
sd = base() | kohya("lora_unet_final_layer_linear", 64, HID)
FluxLoraLoaderMixin.lora_state_dict(sd)
# KeyError: 'lora_unet_final_layer_adaLN_modulation_1.lora_down.weight'

# Case B: final_layer linear + adaLN present, alphas included  ->  Incompatible keys
sd = base() | kohya("lora_unet_final_layer_linear", 64, HID) \
            | kohya("lora_unet_final_layer_adaLN_modulation_1", 2 * HID, HID)
FluxLoraLoaderMixin.lora_state_dict(sd)
# ValueError: Incompatible keys detected:
#   lora_unet_final_layer_linear.alpha, lora_unet_final_layer_adaLN_modulation_1.alpha

Root cause

In _convert_sd_scripts_to_ai_toolkit:

  • assign_remaining_weights calls value = source.pop(source_key) (L554) with no fallback, which is Case A. It also handles only lora_down/lora_up, so final_layer .alpha keys are never consumed.
  • The unconsumed alphas then reach the remaining_keys guard, which raises unless every leftover key starts with lora_te/lora_te1, which is Case B.
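
Both failure modes can be shown without importing diffusers. The sketch below is a simplified model of the behavior described above, not the library source: pop_pair_strict and check_remaining are hypothetical names, and the state dict holds placeholder values instead of tensors.

```python
# Simplified stand-in for the behavior described above; NOT the diffusers source.
# `pop_pair_strict` and `check_remaining` are hypothetical names.

def pop_pair_strict(source, name):
    """Pop lora_down/lora_up for `name` with no fallback (models Case A)."""
    return {
        part: source.pop(f"{name}.{part}.weight")  # KeyError if the key is absent
        for part in ("lora_down", "lora_up")
    }


def check_remaining(source):
    """Raise unless every leftover key is a text-encoder key (models Case B)."""
    leftover = [k for k in source if not k.startswith("lora_te")]
    if leftover:
        raise ValueError(f"Incompatible keys detected: {', '.join(leftover)}")


# A LoRA that trained final_layer.linear only, with an alpha entry.
sd = {
    "lora_unet_final_layer_linear.lora_down.weight": 0,
    "lora_unet_final_layer_linear.lora_up.weight": 0,
    "lora_unet_final_layer_linear.alpha": 4.0,
}

pop_pair_strict(sd, "lora_unet_final_layer_linear")  # consumes down/up only

try:
    pop_pair_strict(sd, "lora_unet_final_layer_adaLN_modulation_1")
except KeyError as err:
    case_a = err.args[0]  # the missing adaLN_modulation_1 key

try:
    check_remaining(sd)  # the .alpha key was never consumed
except ValueError as err:
    case_b = str(err)
```

In this model the unconditional pop accounts for Case A and the untouched .alpha key accounts for Case B, matching the two tracebacks in the reproduction.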

Suggested fix

  • In assign_remaining_weights, skip an assignment whose source_key is absent (source.pop(source_key, None) + continue on None) so a lone final_layer.linear still converts.
  • Consume final_layer .alpha keys alongside lora_down/lora_up (or strip them before the remaining_keys guard), as the per-block _convert_to_ai_toolkit path already does.
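
A minimal sketch of both suggestions, written against plain dicts so it runs without diffusers. The function names, the (target_key, source_key, transform) tuple shape, and the target key names are assumptions made for illustration; the real assign_remaining_weights may be structured differently. The sketch only removes the alphas. An actual fix would probably also need to fold the alpha / rank scale into the weights, which is omitted here.

```python
# Hypothetical helpers illustrating the suggested fix; names and signatures
# are assumptions, not the diffusers API.

def assign_remaining_weights_tolerant(assignments, source, target):
    """Copy weights from source to target, skipping absent source keys.

    assignments: iterable of (target_key, source_key, transform_or_None).
    """
    for target_key, source_key, transform in assignments:
        value = source.pop(source_key, None)
        if value is None:
            continue  # e.g. final_layer.linear trained without adaLN_modulation_1
        target[target_key] = transform(value) if transform else value


def pop_final_layer_alphas(source, prefix="lora_unet_final_layer_"):
    """Remove final_layer .alpha keys so they never reach the leftover-key guard."""
    alpha_keys = [k for k in source if k.startswith(prefix) and k.endswith(".alpha")]
    return {k: source.pop(k) for k in alpha_keys}


# Lone final_layer.linear with an alpha entry; values are placeholders.
source = {
    "lora_unet_final_layer_linear.lora_down.weight": "down",
    "lora_unet_final_layer_linear.lora_up.weight": "up",
    "lora_unet_final_layer_linear.alpha": 4.0,
}
# Target key names below are placeholders, not the converter's real output keys.
assignments = [
    ("linear.lora_A.weight", "lora_unet_final_layer_linear.lora_down.weight", None),
    ("linear.lora_B.weight", "lora_unet_final_layer_linear.lora_up.weight", None),
    ("adaln.lora_A.weight", "lora_unet_final_layer_adaLN_modulation_1.lora_down.weight", None),
    ("adaln.lora_B.weight", "lora_unet_final_layer_adaLN_modulation_1.lora_up.weight", None),
]

target = {}
assign_remaining_weights_tolerant(assignments, source, target)
alphas = pop_final_layer_alphas(source)
```

After both calls, source is empty, so a guard that rejects non-text-encoder leftovers would have nothing to reject, and target holds only the weights that were actually present.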

System Info

  • diffusers 0.38.0 (and main @ 7bf0000)
  • transformers 5.9.0, peft 0.19.1, torch 2.11+cu128, Python 3.11, Linux
  • Model: black-forest-labs/FLUX.1 family (kohya/sd-scripts LoRAs with final_layer weights)

Who can help?

@sayakpaul @BenjaminBossan
