It has recently come to light that the t5_max_length for Stable Diffusion 3 models (both Large and Medium) is not 256 as previously documented.
Setting t5_max_length: 154 in the training configuration has yielded significant improvements in LoRA training results, as demonstrated with both the StableTuner and AI-Toolkit training scripts. While I have not yet tested this with kohya_ss, there is no indication that the improvement would not apply there as well.
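For anyone who wants to try this, the change is a single line in the trainer's YAML configuration. A minimal sketch, assuming an AI-Toolkit-style config layout (the surrounding keys and the model path are illustrative assumptions; only the t5_max_length line is the change being requested):

```yaml
# Illustrative training-config fragment; the surrounding keys are assumptions
# for context. The documented change is the t5_max_length override.
model:
  name_or_path: "stabilityai/stable-diffusion-3.5-large"  # assumed path
  # T5 text-encoder sequence length: 154, not the previously documented 256
  t5_max_length: 154
```

The exact nesting of the key differs between trainers, so check your script's config schema before applying it.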
Supporting References:
Reddit Post from terminusresearchorg:
"The eternal problem child, SD3.5, has some training parameter fixes that make it worth reattempting training for. The T5 text encoder, previously claimed by StabilityAI to use a sequence length of 256, is now understood to have actually used a sequence length of 154. Updating this results in more likeness being trained into the model with less degradation."
GitHub Pull Request by bghira (SimpleTuner creator):
This PR highlights that "256 tokens is total, not just T5." See the code change here.
Requested change: Update the official documentation for SD3.5 models to reflect the correct value of t5_max_length: 154.
Awaiting Dev Review
💡 Feature Request
About 1 year ago

doctor_diffusion