AI Technical Discussion and Info

Ptar

There seems to be real interest in AI Image Generation here in the sub-forum, including lots of questions:
How do I do that?
How do I get started?
What software are you using; is it a website?

This thread is for these questions and technical discussion about AI image generation.

Personally, I started using Stable Diffusion.
Great tutorial here: https://www.datacamp.com/tutorial/how-to-run-stable-diffusion
With models downloaded from here: https://civitai.com/
How to Write Amazing Prompts: https://www.fotor.com/blog/stable-diffusion-prompts/

Without those resources, I would still be lost in 'free-for-a-week' online sites like https://sexy.ai/ or https://www.catbird.ai/

Feel free to ask questions and provide answers!


(Oh, and keep in mind, gentlemen, the Rules of the Forum still, and always will, apply to all our activity here. :cool:)
 
Not a word about your mentor? :LOL:
 
RuntimeError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 0; 15.90 GiB total capacity; 12.04 GiB already allocated; 2.72 GiB free; 12.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

So, I got this error and wrestled with it for a couple of days.
The solutions online look like you need a degree in the Python programming language. :rolleyes:
I did see comments about decreasing the batch size, the number of batches, and so on.
What finally worked for me was keeping my width and height reasonable. I was pushing out 800x1000 and decreased to 500x700. That seems to have fixed it for me. :)
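
If anyone wants to try the allocator tweak the error message itself suggests, you can set the PYTORCH_CUDA_ALLOC_CONF environment variable before launching the webui. The 512 value below is just a common starting point, not a guaranteed fix:

set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512

On Windows you can add that line to webui-user.bat; on Linux, use export instead of set.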
 
An example of positive and negative prompts I've used in the past: :)

Positive: (((full body portrait))), ((big breasts)), ((busty)), ((cleavage)), crystal clear photo, sharp focus, high detail, high image quality, (skin pores:0.2), (freckles:0.8), (wrinkles:0.1),

Negative prompt: deformed, ugly, distorted, disfigured, blurry, shiny, glossy, unrealistic, painting, cartoon, anime, extra limbs, extra fingers, too many fingers, bad anatomy, out of frame, perfect skin, cgi, photoshop, nude, naked, (((nipple))), topless, nsfw

(gigantic breasts:1.1), <lora:epiNoiseoffset_v2-pynoise:1> dim lighting, rim lighting <lora:hyperfusion_213k_32dim-LoCon-epoc6-v5:0.6> , round bulging breasts, lifting her breasts, presenting her breasts, <lora:fC-engorged_v1.1:0.4>, <lora:LoconBurstingBreasts_v1:1>, cleavage, skindentation
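
For anyone new to the syntax above, this is Automatic1111 webui prompt notation (other front ends may differ):
(word) boosts attention by 1.1x per pair of parentheses, so ((word)) is roughly 1.21x
(word:1.3) sets the weight explicitly
[word] de-emphasizes by a factor of 1.1
<lora:name:0.6> loads the named LoRA at strength 0.6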
 
The trick with this is to not generate at sizes that are too big, like 800x1000 pixels; that needs way too much memory. Keep it below 800, and when you have one you like, regenerate that particular pic and apply a resize script. There are many options; so far the best combo for me has been to generate a simple image at around 20 steps, then ask for a resize (4-6) and increase the steps to 30-40, so the extra details are added during the resize. This is also much faster than generating a large image from the start.
 
Yeah, start smaller and then enlarge to the size that you want. Unless you have one of those A100 graphics cards, in which case it probably won't matter.
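
If anyone wants to script that workflow outside the webui, here's a minimal sketch using the diffusers library. It assumes the diffusers and torch packages plus the runwayml/stable-diffusion-v1-5 weights, and it's just an illustration of the generate-small-then-img2img-upscale idea, not the exact script anyone here is running:

import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

prompt = "crystal clear photo, sharp focus, high detail"

# Generate small first: 512x704 needs far less VRAM than 800x1000.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
low = pipe(prompt, width=512, height=704, num_inference_steps=20).images[0]

# Upscale the keeper, then run img2img at more steps so the extra
# detail is added during the resize instead of at full size.
img2img = StableDiffusionImg2ImgPipeline(**pipe.components).to("cuda")
big = low.resize((1024, 1408))
final = img2img(prompt=prompt, image=big, strength=0.4,
                num_inference_steps=35).images[0]
final.save("upscaled.png")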
 
Prompts Advice: Want a number of ways to say busty? Or some other feature?
Synonyms are your friend! ;)
Search: busty synonym
 
I will admit I'm still a novice, but here are a few tips (once you have installed the SD webui):

Model: Epicrealism (I've tested loads and this one gives me the best results)
Sampling Method: DDIM is really fast and gives nice results but my favourite is DPM++ 2M SDE Karras

Openpose if you want more control over poses (lots of tutorials out there)
Dynamic Thresholding (CFG scale fix)
add_detail LoRA
Noise offset LoRA

Next Diffusion (it generates random prompts, but I just use it for the lighting and photo-style prompts it creates).

Learn about inpainting, it's really powerful and can make all the difference.
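
In the webui, inpainting lives under the img2img tab: you paint a mask over the region you want changed and regenerate just that area. A typical starting point (illustrative, not gospel):
Denoising strength: 0.4-0.7
Masked content: original
Inpaint area: Only masked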

Learn how to train your own LORAs for more original looking content: https://civitai.com/models/22530/guide-make-your-own-loras-easy-and-free

Stick to 512x768 or 768x512

If you are getting weird deformed bodies, try lowering the resolution to 512x720; too high a resolution can give you double bodies.

Play around with the number of steps (even if the model page recommends 20, that might not work well with the LoRAs you are using); also play around with the CFG scale.

Alternatively, you can create a batch of images to compare different CFG scales, steps, and other things like this:

Scripts can be found at the bottom of your generation parameters in txt2img or img2img.

  • X/Y/Z Plot
    Capable of generating a series of images, usually with the exact same seed, but varying parameters of your choice. Can compare almost anything you want, including different models, parts of your prompt, sampler, upscaler and much more. You can have 1, 2, or 3 variable parameters, hence the X, Y and Z.
    Your parameters in X/Y/Z Plot are separated by commas, but anything else can go in between. The most common parameter to compare is S/R Prompt, where the first term is a phrase in your prompt and each term afterwards will replace the original. Knowing this, you can compare, say, LoRA intensity, like this:
    <lora:my lora:0.4>, <lora:my lora:0.6>, <lora:my lora:0.8>, <lora:my lora:1>
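    Another example, with illustrative values: set X type to Steps with X values 20, 30, 40 and Y type to CFG Scale with Y values 5, 7, 9, and a single click of Generate gives you a grid of all nine combinations.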

Standard Negative prompts that work really well for me:
(worst quality:1.2), (low quality:1.2), (lowres:1.1), (monochrome:1.1), (greyscale), multiple views, comic, sketch, (((bad anatomy))), (((deformed))), (((disfigured))), watermark, mutation hands, mutation fingers, extra fingers, missing fingers

For more realistic skin texture etc., try these prompts:
masterpiece, best quality, ultra-detailed, (skin texture:1.1), (film grain:1.3), high res, best shadow

and optionally:
goosebumps, pores, (imperfect skin:1.2)

I have more but if I kept going I might never stop :D
 
Here's some info about Lora training that I've gathered, plus a Lora I just made on pics of 106 models and random celebrities. (all names are in the zip file, and I avoided DNP models)

Couldn't add the Lora as an attachment for some reason: https://www.dosyaupload.com/27cog/selected_subjects_v2.zip
Edit: New version here: https://www.dosyaupload.com/1rf33/selected_subjects_v3.zip

There's many full guides out there with more information. I'd recommend "The Other LoRA Training Rentry", which has lots of detailed info and is up-to-date.

My strategy is to take my folders of images of different women, tag them with the WD14 tagger, and discard images with certain tags (like "2girls" to filter out group pictures). Then I randomly take some number from each, 50 in this case, and train on them. The images are captioned with the full name plus the WD14 tags. For some women, I manually added "breast implants" and "butterface" as tags. The "breast implants" tag works as expected and is great as a negative prompt, but the "butterface" tag didn't end up capturing what I wanted it to.
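
As a concrete illustration of that filtering step, here's a minimal sketch in Python (my real scripts are messier; the folder name, tag list, and extension handling here are hypothetical):

from pathlib import Path

# Tags whose presence should exclude an image from training.
UNWANTED_TAGS = {"2girls", "multiple girls"}

dataset = Path("training_images")  # hypothetical folder layout
for caption_file in dataset.glob("*.txt"):
    # WD14 tagger output: one .txt per image, comma-separated tags.
    tags = {tag.strip() for tag in caption_file.read_text().split(",")}
    if tags & UNWANTED_TAGS:
        image_file = caption_file.with_suffix(".jpg")  # adjust per your extensions
        caption_file.unlink()
        image_file.unlink(missing_ok=True)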

The tool I use for training is Kohya's SD scripts. Some people have made easier interfaces for it; bmaltais's GUI is probably the most popular one and has WD14 tagging built in.

Once I had my training data prepared with a mess of Python scripts I wrote, I ran training with this command on Linux. You'd have to adapt it for Windows and fill in your own directories. I'm mainly putting it here for reference:

accelerate launch --num_cpu_threads_per_process=2 ./train_network.py \
--pretrained_model_name_or_path=${MODEL_DIR}/v1-5-pruned.ckpt \
--train_data_dir=${TRAINING_DIR}/selected_subjects \
--output_dir=${SD_DIR}/models/Lora/trained \
--logging_dir=${SD_DIR}/models/Lora/trained/logging \
--sample_prompts=cleaned_subjects_prompts.txt \
--enable_bucket --resolution=512,512 --bucket_no_upscale \
--network_alpha=128 --network_dim=256 \
--save_model_as=safetensors --network_module=networks.lora \
--output_name=kohya_selected_subjects_v2 \
--sample_every_n_steps=10000 \
--no_half_vae \
--learning_rate=1 --lr_scheduler=constant --lr_warmup=0 --optimizer_type="Prodigy" \
--optimizer_args "decouple=True" "d0=1e-5" "betas=0.9,0.999" "d_coef=1.0" "weight_decay=0.01" "use_bias_correction=False" "safeguard_warmup=False" \
--train_batch_size=2 \
--scale_weight_norms=1 \
--random_crop --caption_tag_dropout_rate=0.05 \
--save_every_n_epochs=5 --max_train_epochs=50 \
--mixed_precision=bf16 --save_precision=bf16 --seed=12345 --caption_extension=.txt --max_data_loader_n_workers=0 \
--bucket_reso_steps=32 --xformers

The model that I train on is the SD1.5 base model.

I'm running on a 24GB GPU, and you may need to adjust some of these settings to get training to fit into memory. Lowering the batch size and adding --gradient_checkpointing should help.
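
For example, on a smaller card you might swap in something like this (illustrative, not tested):

--train_batch_size=1 \
--gradient_checkpointing \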

I set 50 epochs for training, but didn't really want to train that long. After 20 epochs had finished, I killed the training job and resized the Lora with this command:

python networks/resize_lora.py --save_precision=fp16 --new_rank=24 \
--model=${SD_DIR}/models/Lora/trained/kohya_selected_subjects_v2-000020.safetensors \
--save_to=${SD_DIR}/models/Lora/trained/kohya_selected_subjects_v2_e20_resize24.safetensors

The likenesses aren't perfect here, and there's a lot I could improve in my training. Better image filtering (removing low-quality and messy images) and better captioning are the first things I'd want to improve. Messing around with training settings could also help, and if you have suggestions of things to change let me know. I understand some of these settings well, some a little bit, and some not at all. Feel free to ask about them.
 

Attachments

  • ComfyUI_temp_auxgs_00494_.jpg
  • ComfyUI_temp_auxgs_00479_.jpg
  • ComfyUI_temp_auxgs_00460_.jpg
  • ComfyUI_temp_auxgs_00456_.jpg
  • ComfyUI_temp_auxgs_00426_.jpg
  • ComfyUI_temp_auxgs_00404_.jpg
  • ComfyUI_temp_auxgs_00389_.jpg
  • ComfyUI_temp_auxgs_00348_.jpg
  • ComfyUI_temp_auxgs_00183_.jpg
I continued training the Lora from the previous post and added an updated download link. The likenesses seem to be better, and I uploaded the full 300MB Lora since I'm putting it on an external host anyway.

This was my training command:

accelerate launch --num_cpu_threads_per_process=2 ./train_network.py \
--pretrained_model_name_or_path=${MODEL_DIR}/models/Stable-diffusion/v1-5-pruned.ckpt \
--train_data_dir=${TRAINING_DIR}/selected_subjects \
--output_dir=${SD_DIR}/models/Lora/trained \
--logging_dir=${SD_DIR}/models/Lora/trained/logging \
--network_weights=${SD_DIR}/models/Lora/trained/kohya_selected_subjects_v2-000020.safetensors \
--sample_prompts=user/cleaned_subjects_prompts.txt \
--enable_bucket --resolution=512,512 --bucket_no_upscale \
--network_alpha=128 --network_dim=256 \
--save_model_as=safetensors --network_module=networks.lora \
--output_name=kohya_selected_subjects_v2_resume \
--sample_every_n_steps=10000 \
--no_half_vae \
--learning_rate=1 --lr_scheduler=constant --lr_warmup=0 --optimizer_type="Prodigy" \
--optimizer_args "decouple=True" "d0=1e-5" "betas=0.9,0.99" "d_coef=1.0" "weight_decay=0.01" "use_bias_correction=True" "safeguard_warmup=True" \
--train_batch_size=8 \
--scale_weight_norms=2 \
--random_crop --caption_tag_dropout_rate=0.05 \
--save_every_n_epochs=5 --max_train_epochs=50 \
--mixed_precision=bf16 --save_precision=bf16 --seed=12345 --caption_extension=.txt --max_data_loader_n_workers=0 \
--bucket_reso_steps=32 --xformers

The main changes were adding the "network_weights" option to resume training, increasing the batch size, tweaking a couple of "optimizer_args" (not sure if that made any difference), and changing "scale_weight_norms" from 1 to 2. scale_weight_norms is an option that helps prevent overfitting, and changing the value from 1 to 2 makes its effect weaker.
 

Attachments

  • Sexy_Venera_comparison_v2v3.png
  • Rachel_Bandini_comparison_v2v3.png
  • Nigella_Lawson_comparison_v2v3.png
  • Marie_Claude_Bourbonnais_comparison_v2v3.png
  • Luna_Amor_comparison_v2v3.png
  • Danica_Collins_comparison_v2v3.png
  • Christina_Hendricks_comparison_v2v3.png
  • Chloe_Vevrier_comparison_v2v3.png
  • Ala_Pastelle_comparison_v2v3.png
  • Valory_Irene_comparison_v2v3.png
This is going to take some time (and possibly get messy!) to investigate fully, but I'll thank you in advance for sharing your process and tools. The images are incredibly realistic and match the real models very well.
 
I installed Stable Diffusion and have played with it a bit. Any tips on using it to alter existing pics?
That's something I haven't fully explored. :)
I'm still working with txt2img.
 
Aitrepreneur has put out some good videos for newbies who want to get into generating their own AI art, including the easiest way to install Stable Diffusion and the Automatic1111 webui on your computer. Once you're comfortable running the program and making your own art, you can move on to the more advanced step of training a Lora for a particular subject, to make art of your favorite models or insert yourself into your artwork.

Is this the same for SD 1.5 with regard to training a Lora?
 