[2024-10-17 18:24:18] [INFO] Running bash "/content/fluxgym-Colab/train.sh"
[2024-10-17 18:24:22] [INFO] The following values were not passed to `accelerate launch` and had defaults used instead:
[2024-10-17 18:24:22] [INFO]     `--num_processes` was set to a value of `1`
[2024-10-17 18:24:22] [INFO]     `--num_machines` was set to a value of `1`
[2024-10-17 18:24:22] [INFO]     `--dynamo_backend` was set to a value of `'no'`
[2024-10-17 18:24:22] [INFO] To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
[2024-10-17 18:24:27] [INFO] 2024-10-17 18:24:27.973353: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
[2024-10-17 18:24:27] [INFO] 2024-10-17 18:24:27.998102: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
[2024-10-17 18:24:28] [INFO] 2024-10-17 18:24:28.004623: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[2024-10-17 18:24:29] [INFO] 2024-10-17 18:24:29.127857: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
[2024-10-17 18:24:32] [INFO] INFO     highvram is enabled                                          train_util.py:4090
[2024-10-17 18:24:32] [INFO] WARNING  cache_latents_to_disk is enabled, so cache_latents is also enabled    train_util.py:4110
[2024-10-17 18:24:32] [INFO] INFO     Checking the state dict: Diffusers or BFL, dev or schnell    flux_utils.py:62
[2024-10-17 18:24:32] [INFO] INFO     t5xxl_max_token_length: 512                                  flux_train_network.py:152
[2024-10-17 18:24:33] [INFO] /usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will then be set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
[2024-10-17 18:24:33] [INFO]   warnings.warn(
[2024-10-17 18:24:34] [INFO] You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used, so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
[2024-10-17 18:24:35] [INFO] INFO     Loading dataset config from /content/fluxgym-Colab/dataset.toml    train_network.py:280
[2024-10-17 18:24:35] [INFO] INFO     prepare images.                                              train_util.py:1942
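Note: the `accelerate launch` defaults and the two transformers warnings above are informational only. If you want to silence the tokenizer warnings, the flags they mention can be set explicitly when the tokenizer is created. A minimal sketch, not part of train.sh; the checkpoint name below is an assumption for illustration:

    # Hypothetical snippet: pin the flags the warnings mention instead of relying on
    # defaults that change across transformers versions.
    from transformers import T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(
        "google/t5-v1_1-xxl",                 # assumed checkpoint name, illustration only
        legacy=False,                         # opt in to the new tokenization behaviour
        clean_up_tokenization_spaces=False,   # fix the value the FutureWarning is about
    )
    print(tokenizer("a test prompt").input_ids)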
[2024-10-17 18:24:35] [INFO] INFO     get image size from name of cache files            train_util.py:1859
[2024-10-17 18:24:35] [INFO] 100%|██████████| 35/35 [00:00<00:00, 210617.85it/s]
[2024-10-17 18:24:35] [INFO] INFO     set image size from cache files: 0/35              train_util.py:1887
[2024-10-17 18:24:35] [INFO] INFO     found directory /content/fluxgym-Colab/datasets/erke-20241017 contains 35 image files    train_util.py:1889
[2024-10-17 18:24:35] [INFO] read caption: 100%|██████████| 35/35 [00:00<00:00, 20071.18it/s]
[2024-10-17 18:24:35] [INFO] INFO     350 train images with repeating.                   train_util.py:1983
[2024-10-17 18:24:35] [INFO] INFO     0 reg images.                                      train_util.py:1986
[2024-10-17 18:24:35] [INFO] WARNING  no regularization images                           train_util.py:1991
[2024-10-17 18:24:35] [INFO] INFO     [Dataset 0]                                        config_util.py:570
[2024-10-17 18:24:35] [INFO]            batch_size: 1
[2024-10-17 18:24:35] [INFO]            resolution: (512, 512)
[2024-10-17 18:24:35] [INFO]            enable_bucket: False
[2024-10-17 18:24:35] [INFO]            network_multiplier: 1.0
[2024-10-17 18:24:35] [INFO]
[2024-10-17 18:24:35] [INFO]            [Subset 0 of Dataset 0]
[2024-10-17 18:24:35] [INFO]              image_dir: "/content/fluxgym-Colab/datasets/erke-20241017"
[2024-10-17 18:24:35] [INFO]              image_count: 35
[2024-10-17 18:24:35] [INFO]              num_repeats: 10
[2024-10-17 18:24:35] [INFO]              shuffle_caption: False
[2024-10-17 18:24:35] [INFO]              keep_tokens: 1
[2024-10-17 18:24:35] [INFO]              keep_tokens_separator:
[2024-10-17 18:24:35] [INFO]              caption_separator: ,
[2024-10-17 18:24:35] [INFO]              secondary_separator: None
[2024-10-17 18:24:35] [INFO]              enable_wildcard: False
[2024-10-17 18:24:35] [INFO]              caption_dropout_rate: 0.0
[2024-10-17 18:24:35] [INFO]              caption_dropout_every_n_epoches: 0
[2024-10-17 18:24:35] [INFO]              caption_tag_dropout_rate: 0.0
[2024-10-17 18:24:35] [INFO]              caption_prefix: None
[2024-10-17 18:24:35] [INFO]              caption_suffix: None
[2024-10-17 18:24:35] [INFO]              color_aug: False
[2024-10-17 18:24:35] [INFO]              flip_aug: False
[2024-10-17 18:24:35] [INFO]              face_crop_aug_range: None
[2024-10-17 18:24:35] [INFO]              random_crop: False
[2024-10-17 18:24:35] [INFO]              token_warmup_min: 1,
[2024-10-17 18:24:35] [INFO]              token_warmup_step: 0,
[2024-10-17 18:24:35] [INFO]              alpha_mask: False,
[2024-10-17 18:24:35] [INFO]              is_reg: False
[2024-10-17 18:24:35] [INFO]              class_tokens: Erke
[2024-10-17 18:24:35] [INFO]              caption_extension: .txt
[2024-10-17 18:24:35] [INFO]
[2024-10-17 18:24:35] [INFO] INFO     [Dataset 0]                                        config_util.py:576
[2024-10-17 18:24:35] [INFO] INFO     loading image sizes.                               train_util.py:914
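Note: the [Dataset 0] / [Subset 0] block above is the parsed view of /content/fluxgym-Colab/dataset.toml. The actual file is not shown in this log; the snippet below is an assumed reconstruction in the kohya sd-scripts config layout, using the values the log reports, just to show where those numbers come from.

    # Hypothetical dataset.toml contents (reconstruction, not the real file), parsed
    # the same way the trainer does. tomllib needs Python 3.11+; on 3.10 use `tomli`.
    import tomllib

    config_text = """
    [general]
    shuffle_caption = false
    caption_extension = ".txt"
    keep_tokens = 1

    [[datasets]]
    resolution = 512
    batch_size = 1

      [[datasets.subsets]]
      image_dir = "/content/fluxgym-Colab/datasets/erke-20241017"
      class_tokens = "Erke"
      num_repeats = 10
    """

    config = tomllib.loads(config_text)
    subset = config["datasets"][0]["subsets"][0]
    # 35 images * num_repeats 10 = the "350 train images with repeating" line above.
    print(subset["image_dir"], subset["num_repeats"])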
[2024-10-17 18:24:35] [INFO] 100%|██████████| 35/35 [00:00<00:00, 28694.42it/s]
[2024-10-17 18:24:35] [INFO] INFO     prepare dataset                                    train_util.py:939
[2024-10-17 18:24:35] [INFO] INFO     preparing accelerator                              train_network.py:345
[2024-10-17 18:24:35] [INFO] accelerator device: cuda
[2024-10-17 18:24:35] [INFO] INFO     Checking the state dict: Diffusers or BFL, dev or schnell    flux_utils.py:62
[2024-10-17 18:24:35] [INFO] INFO     Building Flux model dev from BFL checkpoint        flux_utils.py:116
[2024-10-17 18:24:35] [INFO] INFO     Loading state dict from /content/fluxgym-Colab/models/unet/flux1-dev-fp8.safetensors    flux_utils.py:133
[2024-10-17 18:24:36] [INFO] INFO     Loaded Flux: <All keys matched successfully>       flux_utils.py:145
[2024-10-17 18:24:36] [INFO] INFO     Loaded fp8 FLUX model                              flux_train_network.py:76
[2024-10-17 18:24:36] [INFO] INFO     Building CLIP                                      flux_utils.py:165
[2024-10-17 18:24:36] [INFO] INFO     Loading state dict from /content/fluxgym-Colab/models/clip/clip_l.safetensors    flux_utils.py:258
[2024-10-17 18:24:37] [INFO] INFO     Loaded CLIP: <All keys matched successfully>       flux_utils.py:261
[2024-10-17 18:24:37] [INFO] INFO     Loading state dict from /content/fluxgym-Colab/models/clip/t5xxl_fp8.safetensors    flux_utils.py:306
[2024-10-17 18:24:38] [INFO] INFO     Loaded T5xxl: <All keys matched successfully>      flux_utils.py:309
[2024-10-17 18:24:38] [INFO] INFO     Loaded fp8 T5XXL model                             flux_train_network.py:98
[2024-10-17 18:24:38] [INFO] INFO     Building AutoEncoder                               flux_utils.py:152
[2024-10-17 18:24:38] [INFO] INFO     Loading state dict from /content/fluxgym-Colab/models/vae/ae.sft    flux_utils.py:157
[2024-10-17 18:24:38] [INFO] INFO     Loaded AE: <All keys matched successfully>         flux_utils.py:160
[2024-10-17 18:24:38] [INFO] import network module: networks.lora_flux
[2024-10-17 18:24:38] [INFO] INFO     [Dataset 0]                                        train_util.py:2466
[2024-10-17 18:24:38] [INFO] INFO     caching latents with caching strategy.             train_util.py:1039
[2024-10-17 18:24:38] [INFO] INFO     caching latents...                                 train_util.py:1084
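Note: each "Loaded ...: <All keys matched successfully>" line is the repr that PyTorch's load_state_dict() returns when no keys are missing or unexpected. A minimal sketch of what that message means, with stand-in objects (not the actual flux_utils.py code):

    # The state dict would really come from safetensors, e.g.
    #   from safetensors.torch import load_file
    #   sd = load_file("/content/fluxgym-Colab/models/unet/flux1-dev-fp8.safetensors")
    import torch

    model = torch.nn.Linear(4, 4)                                  # stand-in for FLUX/CLIP/T5
    sd = {k: v.clone() for k, v in model.state_dict().items()}     # stand-in state dict
    result = model.load_state_dict(sd, strict=False)
    print(f"Loaded: {result}")   # -> Loaded: <All keys matched successfully>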
[2024-10-17 18:24:57] [INFO] 100%|██████████| 35/35 [00:18<00:00, 1.85it/s]
[2024-10-17 18:24:57] [INFO] INFO     move vae and unet to cpu to save memory            flux_train_network.py:205
[2024-10-17 18:24:58] [INFO] INFO     move text encoders to gpu                          flux_train_network.py:213
[2024-10-17 18:25:21] [INFO] INFO     prepare T5XXL for fp8: set to torch.float8_e4m3fn, set embeddings to torch.bfloat16, add hooks    flux_train_network.py:493
[2024-10-17 18:25:21] [INFO] INFO     [Dataset 0]                                        train_util.py:2488
[2024-10-17 18:25:21] [INFO] INFO     caching Text Encoder outputs with caching strategy.    train_util.py:1218
[2024-10-17 18:25:21] [INFO] INFO     checking cache validity...                         train_util.py:1229
[2024-10-17 18:25:21] [INFO] 100%|██████████| 35/35 [00:00<00:00, 63330.73it/s]
[2024-10-17 18:25:21] [INFO] INFO     caching Text Encoder outputs...                    train_util.py:1260
[2024-10-17 18:25:21] [INFO] WARNING  T5 model is using fp8 weights for caching. This may affect the quality of the cached outputs.    strategy_flux.py:150
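Note: "prepare T5XXL for fp8" describes storing most weights as float8_e4m3fn while keeping embeddings in bfloat16, then adding hooks so activations are upcast at compute time. A rough sketch of the casting step only, under those assumptions (not the actual flux_train_network.py logic; the hooks are omitted):

    import torch
    import torch.nn as nn

    def cast_for_fp8_storage(model: nn.Module) -> None:
        # Linear weights stored as fp8 to save memory; embeddings kept in bf16.
        for module in model.modules():
            if isinstance(module, nn.Linear):
                module.weight.data = module.weight.data.to(torch.float8_e4m3fn)
            elif isinstance(module, nn.Embedding):
                module.weight.data = module.weight.data.to(torch.bfloat16)

    toy = nn.Sequential(nn.Embedding(10, 8), nn.Linear(8, 8))   # toy stand-in for T5XXL
    cast_for_fp8_storage(toy)
    print(toy[0].weight.dtype, toy[1].weight.dtype)             # torch.bfloat16 torch.float8_e4m3fn

The warning that follows is expected with this setup: caching text-encoder outputs from fp8 weights trades a little quality for VRAM.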
[2024-10-17 18:26:36] [INFO] 100%|██████████| 35/35 [01:14<00:00, 2.12s/it]
[2024-10-17 18:26:36] [INFO] INFO     cache Text Encoder outputs for sample prompt: /content/fluxgym-Colab/sample_prompts.txt    flux_train_network.py:229
[2024-10-17 18:26:36] [INFO] INFO     cache Text Encoder outputs for prompt: Generate a portrait of Erke in a serene setting, showcasing their unique facial features and expressions. Use soft, natural lighting to highlight their characteristics and maintain a realistic style.    flux_train_network.py:240
[2024-10-17 18:26:36] [INFO] INFO     cache Text Encoder outputs for prompt:             flux_train_network.py:240
[2024-10-17 18:26:38] [INFO] INFO     move t5XXL back to cpu                             flux_train_network.py:253
[2024-10-17 18:26:44] [INFO] INFO     move vae and unet back to original device          flux_train_network.py:258
[2024-10-17 18:26:44] [INFO] INFO     create LoRA network. base dim (rank): 4, alpha: 1  lora_flux.py:594
[2024-10-17 18:26:44] [INFO] INFO     neuron dropout: p=None, rank dropout: p=None, module dropout: p=None    lora_flux.py:595
[2024-10-17 18:26:44] [INFO] INFO     train all blocks only                              lora_flux.py:605
[2024-10-17 18:26:44] [INFO] INFO     create LoRA for Text Encoder 1:                    lora_flux.py:741
[2024-10-17 18:26:44] [INFO] INFO     create LoRA for Text Encoder 1: 72 modules.        lora_flux.py:744
[2024-10-17 18:26:45] [INFO] INFO     create LoRA for FLUX all blocks: 304 modules.      lora_flux.py:765
[2024-10-17 18:26:45] [INFO] INFO     enable LoRA for text encoder: 72 modules           lora_flux.py:911
[2024-10-17 18:26:45] [INFO] INFO     enable LoRA for U-Net: 304 modules                 lora_flux.py:916
[2024-10-17 18:26:45] [INFO] FLUX: Gradient checkpointing enabled. CPU offload: False
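Note: "create LoRA network. base dim (rank): 4, alpha: 1" means every targeted Linear module (304 in the FLUX transformer, 72 in CLIP-L) gets a low-rank trainable update while the base weights stay frozen. A minimal sketch of that structure, under those assumptions (this is the generic LoRA form, not the networks.lora_flux implementation):

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
            super().__init__()
            self.base = base
            self.base.weight.requires_grad_(False)      # frozen base weights
            self.lora_down = nn.Linear(base.in_features, rank, bias=False)
            self.lora_up = nn.Linear(rank, base.out_features, bias=False)
            nn.init.zeros_(self.lora_up.weight)         # adapter starts as a no-op
            self.scale = alpha / rank                   # alpha 1 / rank 4 here

        def forward(self, x):
            return self.base(x) + self.lora_up(self.lora_down(x)) * self.scale

    layer = LoRALinear(nn.Linear(64, 64), rank=4, alpha=1.0)
    print(layer(torch.randn(1, 64)).shape)              # torch.Size([1, 64])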
[2024-10-17 18:26:45] [INFO] prepare optimizer, data loader etc.
[2024-10-17 18:26:45] [INFO] INFO     Text Encoder 1 (CLIP-L): 72 modules, LR 0.0008     lora_flux.py:1018
[2024-10-17 18:26:45] [INFO] INFO     use Adafactor optimizer | {'relative_step': False, 'scale_parameter': False, 'warmup_init': False}    train_util.py:4735
[2024-10-17 18:26:45] [INFO] override steps. steps for 16 epochs: 5600
[2024-10-17 18:26:45] [INFO] enable fp8 training for U-Net.
[2024-10-17 18:26:45] [INFO] enable fp8 training for Text Encoder.
[2024-10-17 18:27:41] [INFO] INFO     prepare CLIP-L for fp8: set to torch.float8_e4m3fn, set embeddings to torch.bfloat16    flux_train_network.py:464
[2024-10-17 18:27:42] [INFO] running training
[2024-10-17 18:27:42] [INFO]   num train images * repeats: 350
[2024-10-17 18:27:42] [INFO]   num reg images: 0
[2024-10-17 18:27:42] [INFO]   num batches per epoch: 350
[2024-10-17 18:27:42] [INFO]   num epochs: 16
[2024-10-17 18:27:42] [INFO]   batch size per device: 1
[2024-10-17 18:27:42] [INFO]   gradient accumulation steps: 1
[2024-10-17 18:27:42] [INFO]   total optimization steps: 5600
[2024-10-17 18:28:58] [INFO] steps:   0%|          | 0/5600 [00:00<?, ?it/s]
[2024-10-17 18:28:58] [INFO] INFO     unet dtype: torch.float8_e4m3fn, device: cuda:0    train_network.py:1060
[2024-10-17 18:28:58] [INFO] INFO     text_encoder [0] dtype: torch.float8_e4m3fn, device: cuda:0    train_network.py:1066
[2024-10-17 18:28:58] [INFO] INFO     text_encoder [1] dtype: torch.float8_e4m3fn, device: cpu    train_network.py:1066
[2024-10-17 18:28:58] [INFO]
[2024-10-17 18:28:58] [INFO] epoch 1/16
[2024-10-17 18:28:58] [INFO] huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
[2024-10-17 18:28:58] [INFO] To disable this warning, you can either:
[2024-10-17 18:28:58] [INFO]     - Avoid using `tokenizers` before the fork if possible
[2024-10-17 18:28:58] [INFO]     - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
[2024-10-17 18:28:58] [INFO] INFO     epoch is incremented. current_epoch: 0, epoch: 1   train_util.py:706
[2024-10-17 18:29:08] [INFO] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:1399: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
[2024-10-17 18:29:08] [INFO]   with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
[2024-10-17 18:29:30] [INFO] /usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:295: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
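Note: the optimizer line and the step count above fit together: 350 images-with-repeats at batch size 1 gives 350 batches per epoch, and 350 x 16 epochs / 1 gradient-accumulation step = 5600 optimization steps. A hedged sketch of an equivalent Adafactor setup with the logged hyperparameters (the parameter group wiring is illustrative, not the actual sd-scripts code); it also sets the TOKENIZERS_PARALLELISM variable the fork warning mentions:

    import os
    import torch.nn as nn
    from transformers.optimization import Adafactor

    # Silences the "huggingface/tokenizers ... forked" warning above.
    os.environ["TOKENIZERS_PARALLELISM"] = "false"

    params = nn.Linear(8, 8).parameters()       # stand-in for the LoRA parameters
    optimizer = Adafactor(
        params,
        lr=8e-4,                                # "LR 0.0008" reported for Text Encoder 1
        relative_step=False,                    # values from the log line above
        scale_parameter=False,
        warmup_init=False,
    )

    steps = (350 // 1) * 16 // 1                # batches/epoch * epochs / grad-accum
    print(steps)                                # 5600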
[2024-10-17 18:29:30] [INFO] with torch.enable_grad(), device_autocast_ctx, torch.cpu.amp.autocast(**ctx.cpu_autocast_kwargs): # type: ignore[attr-defined]
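Note: the two FutureWarnings come from PyTorch's own checkpoint.py, so they are harmless here and nothing in train.sh needs to change. For user code, the replacement the warning asks for looks like this quick sketch:

    import torch

    x = torch.randn(4, 4)
    # New form: torch.amp.autocast(device_type, ...) instead of torch.cpu.amp.autocast(...)
    with torch.amp.autocast("cpu", dtype=torch.bfloat16):
        y = x @ x
    print(y.dtype)  # torch.bfloat16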