// loss suddenly becomes 0
python train.py -b=8
INFO: Using device cpu
INFO: Network:
1 input channels
7 output channels (classes)
Bilinear upscaling
INFO: Creating dataset with 868 examples
INFO: Starting training:
Epochs: 5
Batch size: 8
Learning rate: 0.001
Training size: 782
Validation size: 86
Checkpoints: True
Device: cpu
Images scaling: 1
Epoch 1/5:  10%|████                                    | 80/782 [01:33<13:21, 1.14s/img, loss (batch)=0.886]
INFO: Validation cross entropy: 1.86862473487854
Epoch 1/5:  20%|████████                                | 160/782 [03:34<11:51, 1.14s/img, loss (batch)=2.35e-7]
INFO: Validation cross entropy: 5.887489884504049e-10
Epoch 1/5:  31%|████████████                            | 240/782 [05:41<11:29, 1.27s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  41%|████████████████                        | 320/782 [07:49<09:16, 1.20s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  51%|████████████████████                    | 400/782 [09:55<07:31, 1.18s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  61%|████████████████████████                | 480/782 [12:02<05:58, 1.19s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  72%|████████████████████████████            | 560/782 [14:04<04:16, 1.15s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  82%|████████████████████████████████        | 640/782 [16:11<02:49, 1.20s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  92%|████████████████████████████████████    | 720/782 [18:21<01:18, 1.26s/img, loss (batch)=0]
INFO: Validation cross entropy: 0.0
Epoch 1/5:  94%|██████████████████████████████████████  | 736/782 [19:17<01:12, 1.57s/img, loss (batch)=0]
Traceback (most recent call last):
File "train.py", line 182, in module>
val_percent=args.val / 100)
File "train.py", line 66, in train_net
for batch in train_loader:
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 819, in __next__
return self._process_data(data)
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 4.
Original Traceback (most recent call last):
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 74, in dictcomp>
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/public/home/lidd/.conda/envs/lgg2/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: Expected object of scalar type Double but got scalar type Byte for sequence element 4 in sequence argument at position #1 'tensors'
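The stack trace says the DataLoader worker failed while stacking the batch: sample element 4 came out as a uint8 (Byte) tensor while the other samples were float64 (Double), so default_collate could not torch.stack them. A common cause is that the Dataset converts some image/mask files differently (e.g., palette vs. grayscale PNGs). Below is a minimal sketch of forcing every sample to a fixed dtype in __getitem__; the class and attribute names (SegDataset, image_paths, mask_paths) are placeholders, not the ones actually used in train.py.

```python
# Sketch only: force consistent dtypes so default_collate can stack the batch.
# SegDataset / image_paths / mask_paths are assumed names, not from train.py.
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class SegDataset(Dataset):
    def __init__(self, image_paths, mask_paths):
        self.image_paths = image_paths
        self.mask_paths = mask_paths

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img = np.array(Image.open(self.image_paths[idx]), dtype=np.float32) / 255.0
        mask = np.array(Image.open(self.mask_paths[idx]))
        return {
            # one input channel, always float32
            'image': torch.from_numpy(img).unsqueeze(0).float(),
            # class indices 0..6, always int64, never uint8/float64
            'mask': torch.from_numpy(mask.astype(np.int64)),
        }
```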
As for the loss itself: if it were because the labels are all 0, the loss would probably have been 0 from the very start.
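To test that hypothesis, it is easier to inspect a few samples directly than to infer it from the training loss. A rough check, assuming the dataset returns dicts with 'image' and 'mask' keys as in the sketch above and is bound to a variable named dataset:

```python
# Rough sanity check: dtype consistency and label distribution over a subset.
import torch
from collections import Counter

label_counts = Counter()
dtypes = set()
for i in range(min(len(dataset), 200)):            # sample a subset for speed
    sample = dataset[i]
    dtypes.add((sample['image'].dtype, sample['mask'].dtype))
    label_counts.update(torch.unique(sample['mask']).tolist())

print('dtype combinations seen:', dtypes)          # should be exactly one pair
print('label values seen:', dict(label_counts))    # all-zero masks show up here
```

If more than one dtype pair is printed, that explains the collate error; if the only label value seen is 0, that would explain the cross entropy collapsing to 0.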