This is an automated archive made by the Lemmit Bot.

The original was posted on /r/stablediffusion by /u/More_Bid_2197 on 2025-07-15 02:39:33+00:00.

Original Title: People complain that training LoRAs in Flux destroys the text/anatomy after more than 4,000 steps. And, indeed, this happens. But I just read on Hugging Face that Alimama's Turbo LoRA was trained on 1 million images. How did they do this without destroying the model?


Can we apply this method to train smaller LoRAs?

Learning rate: 2e-5

Our method fixes the original FLUX.1-dev transformer as the discriminator backbone and adds multiple heads to every transformer layer. We fix the guidance scale at 3.5 during training and use a time shift of 3.
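
For illustration, here is a minimal PyTorch sketch of what that kind of setup could look like: a frozen backbone acting as the discriminator, a small classification head attached to each transformer layer's output, the quoted hyperparameters (learning rate 2e-5, fixed guidance scale 3.5, time shift 3), and the SD3/Flux-style timestep-shift formula. The head architecture, layer count, and hidden width are assumptions for the sketch, not Alimama's released code.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the post.
LEARNING_RATE = 2e-5
GUIDANCE_SCALE = 3.5   # held fixed during training
TIME_SHIFT = 3.0       # timestep-shift factor for the flow-matching schedule


class DiscriminatorHead(nn.Module):
    """Small real/fake head attached to one transformer layer's hidden states.
    (Hypothetical architecture, assumed for this sketch.)"""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.LayerNorm(hidden_dim),
            nn.Linear(hidden_dim, hidden_dim // 4),
            nn.SiLU(),
            nn.Linear(hidden_dim // 4, 1),  # single real/fake logit
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Pool over the token dimension, then classify.
        return self.proj(hidden_states.mean(dim=1))


def shift_timestep(t: torch.Tensor, shift: float = TIME_SHIFT) -> torch.Tensor:
    """SD3/Flux-style timestep shift: t' = shift * t / (1 + (shift - 1) * t)."""
    return shift * t / (1.0 + (shift - 1.0) * t)


# Assumed FLUX.1-dev dimensions (not taken from the model card):
hidden_dim = 3072        # transformer width
num_layers = 19 + 38     # double-stream + single-stream blocks

# The backbone itself stays frozen; only the per-layer heads
# (and, separately, the student LoRA) receive gradients.
heads = nn.ModuleList(DiscriminatorHead(hidden_dim) for _ in range(num_layers))
optimizer = torch.optim.AdamW(heads.parameters(), lr=LEARNING_RATE)
```

Because the discriminator backbone is frozen and only the lightweight heads (plus the distilled LoRA) are trained, the base model's weights are never pushed around by the adversarial objective, which is one plausible reason the 1-million-image run doesn't degrade text or anatomy the way a long direct fine-tune does.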