Describe the bug
The kohya / sd-scripts FLUX LoRA converter (_convert_sd_scripts_to_ai_toolkit in lora_conversion_utils.py) crashes on LoRAs that include final_layer weights, in two distinct ways. Both reproduce on diffusers==0.38.0 and on current main.
Case A — KeyError when final_layer.linear has no adaLN_modulation_1 companion.
assign_remaining_weights pops the adaLN_modulation_1 source key unconditionally, so a LoRA that trained final_layer.linear but not final_layer.adaLN_modulation.1 (a real, if uncommon, kohya export) raises KeyError.
Case B — Incompatible keys detected when final_layer carries .alpha entries.
assign_remaining_weights consumes only lora_down/lora_up, never the .alpha keys. The leftover lora_unet_final_layer_*.alpha keys then reach the remaining_keys guard, which only tolerates lora_te* prefixes, so the (otherwise valid) LoRA is rejected.
Reproduction
import torch
from diffusers.loaders.lora_pipeline import FluxLoraLoaderMixin

RANK, HID = 4, 3072

def kohya(name, out, inp, alpha=True):
    d = {f"{name}.lora_down.weight": torch.zeros(RANK, inp),
         f"{name}.lora_up.weight": torch.zeros(out, RANK)}
    if alpha:
        d[f"{name}.alpha"] = torch.tensor(float(RANK))
    return d

def base():  # one recognized block so detection routes to the kohya converter
    return kohya("lora_unet_double_blocks_0_img_attn_proj", HID, HID)

# Case A: final_layer.linear without adaLN_modulation_1 -> KeyError
sd = base() | kohya("lora_unet_final_layer_linear", 64, HID)
FluxLoraLoaderMixin.lora_state_dict(sd)
# KeyError: 'lora_unet_final_layer_adaLN_modulation_1.lora_down.weight'

# Case B: final_layer linear + adaLN present, alphas included -> Incompatible keys
sd = base() | kohya("lora_unet_final_layer_linear", 64, HID) \
            | kohya("lora_unet_final_layer_adaLN_modulation_1", 2 * HID, HID)
FluxLoraLoaderMixin.lora_state_dict(sd)
# ValueError: Incompatible keys detected:
#   lora_unet_final_layer_linear.alpha, lora_unet_final_layer_adaLN_modulation_1.alpha
Root cause
In _convert_sd_scripts_to_ai_toolkit:
- assign_remaining_weights does value = source.pop(source_key) (L554) with no fallback, which is Case A. It also only handles lora_down/lora_up, so final_layer .alpha keys are never consumed.
- The unconsumed alphas then hit the remaining_keys guard, which raises unless every leftover key starts with lora_te/lora_te1, which is Case B.
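Both failure modes can be modelled without diffusers, using plain dictionaries. The sketch below only mirrors the behavior described above: strict_pop and check_remaining are hypothetical stand-ins rather than the library's functions, string and float values stand in for tensors, and the lora_te1_text_model.alpha key is made up for illustration.

```python
def strict_pop(source, source_key):
    # Same shape as source.pop(source_key) with no default:
    # an absent key raises KeyError.
    return source.pop(source_key)


def check_remaining(source):
    # Model of the guard as described: only text-encoder keys may be left over.
    leftover = [k for k in source if not k.startswith("lora_te")]
    if leftover:
        raise ValueError(f"Incompatible keys detected: {', '.join(leftover)}")


# Case A: final_layer.linear is present, its adaLN companion is not.
sd_a = {"lora_unet_final_layer_linear.lora_down.weight": "down"}
try:
    strict_pop(sd_a, "lora_unet_final_layer_adaLN_modulation_1.lora_down.weight")
    case_a = None
except KeyError as err:
    case_a = err.args[0]

# Case B: weights were already consumed, but a final_layer alpha is left behind.
sd_b = {
    "lora_unet_final_layer_linear.alpha": 4.0,
    "lora_te1_text_model.alpha": 4.0,  # made-up text-encoder key, tolerated
}
try:
    check_remaining(sd_b)
    case_b = None
except ValueError as err:
    case_b = str(err)
```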
Suggested fix
- In assign_remaining_weights, skip an assignment whose source_key is absent (source.pop(source_key, None) + continue on None) so a lone final_layer.linear still converts.
- Consume final_layer .alpha keys alongside lora_down/lora_up (or strip them before the remaining_keys guard), as the per-block _convert_to_ai_toolkit path already does.
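The two suggestions could look roughly like the standalone sketch below. It is not the current diffusers implementation: the explicit target argument, the returned alpha dictionary, and the target key names (proj_out, norm_out.linear) are assumptions made for illustration, and strings stand in for tensors.

```python
def assign_remaining_weights(assignments, source, target):
    """Hypothetical tolerant variant; returns the .alpha values it consumed."""
    consumed_alphas = {}
    for target_fmt, source_fmt, transform in assignments:
        for lora_key, orig_lora_key in (("lora_A", "lora_down"), ("lora_B", "lora_up")):
            source_key = source_fmt.format(orig_lora_key=orig_lora_key)
            value = source.pop(source_key, None)
            if value is None:
                continue  # module was not trained in this LoRA: skip it
            if transform is not None:
                value = transform(value)
            target[target_fmt.format(lora_key=lora_key)] = value
        # Consume the matching alpha so it never reaches the remaining_keys guard.
        module = source_fmt.split(".", 1)[0]
        alpha = source.pop(f"{module}.alpha", None)
        if alpha is not None:
            consumed_alphas[module] = alpha
    return consumed_alphas


# Assumed target names; the source names follow the keys shown in this report.
assignments = [
    ("norm_out.linear.{lora_key}.weight",
     "lora_unet_final_layer_adaLN_modulation_1.{orig_lora_key}.weight", None),
    ("proj_out.{lora_key}.weight",
     "lora_unet_final_layer_linear.{orig_lora_key}.weight", None),
]
source = {
    "lora_unet_final_layer_linear.lora_down.weight": "down",
    "lora_unet_final_layer_linear.lora_up.weight": "up",
    "lora_unet_final_layer_linear.alpha": 4.0,
}
target = {}
alphas = assign_remaining_weights(assignments, source, target)
```

With a lone final_layer.linear, the absent adaLN entries are skipped and nothing is left in source. The sketch only collects the alphas; a real fix would presumably fold them into the weights as an alpha/rank scale, since silently dropping an alpha that differs from the rank would change the effective LoRA strength.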
System Info
- diffusers 0.38.0 (and main @ 7bf0000)
- transformers 5.9.0, peft 0.19.1, torch 2.11+cu128, Python 3.11, Linux
- Model: black-forest-labs/FLUX.1 family (kohya/sd-scripts LoRAs with final_layer weights)
Who can help?
@sayakpaul @BenjaminBossan