Commit d1e5939

committed
one-liner call to action.
1 parent 507c7c2 commit d1e5939

1 file changed: 3 additions, 0 deletions

lora-fast.md

Lines changed: 3 additions & 0 deletions
@@ -183,6 +183,9 @@ This post outlined an optimization recipe for fast LoRA inference with Flux, dem
 
 For consumer GPUs, specifically the RTX 4090, we tackled memory limitations by introducing T5 text encoder quantization (NF4) and leveraging regional compilation. This comprehensive recipe achieved a substantial 2.04x speedup, making LoRA inference on Flux viable and performant even with limited VRAM. The key insight is that by carefully managing compilation and quantization, the benefits of LoRA can be fully realized across different hardware configurations.
 
+Hopefully, the recipes from this post will inspire you to optimize your
+LoRA-based use cases, benefitting from speedy inference.
+
 ## Resources
 
 Below is a list of the important resources that we cited throughout this post: