PyTorch's DataLoader is pleasantly easy to use. While working on a classification problem, I found that analysis would be easier if I could see the file names of the images that were misclassified, so I am writing down how to retrieve the file names as a memo.
It is not difficult at all: the paths can be read directly from the DataLoader's dataset. First, create the dataloaders by following PyTorch's TRANSFER LEARNING FOR COMPUTER VISION TUTORIAL [1].
import os

import torch
from torchvision import datasets, transforms

IMAGE_SIZE = 224
BATCH_SIZE = 20
TRAIN = 'train'
VAL = 'val'
DATA_DIR = 'H:\\dataset/predata/'  # select your dataset directory
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Resize(int) scales the shorter side to IMAGE_SIZE and keeps the aspect ratio
data_transforms = {
    TRAIN: transforms.Compose([
        transforms.Resize(IMAGE_SIZE),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    VAL: transforms.Compose([
        transforms.Resize(IMAGE_SIZE),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

# one ImageFolder dataset and one DataLoader per split
image_datasets = {x: datasets.ImageFolder(os.path.join(DATA_DIR, x), data_transforms[x])
                  for x in [TRAIN, VAL]}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x],
                                              batch_size=BATCH_SIZE, shuffle=True, num_workers=4)
               for x in [TRAIN, VAL]}
dataset_sizes = {x: len(image_datasets[x]) for x in [TRAIN, VAL]}
class_names = image_datasets[TRAIN].classes
Extract the data path from the created dataloaders.
import os
from enum import Enum

class Dataset(Enum):
    # position of each field in an ImageFolder (path, label) entry
    FILE_PATH = 0
    LABEL = 1

# every entry in the dataset
for j in range(dataset_sizes[VAL]):
    # full path as stored by ImageFolder
    print(dataloaders[VAL].dataset.imgs[j][Dataset.FILE_PATH.value])
    # file name only
    print(os.path.basename(dataloaders[VAL].dataset.imgs[j][Dataset.FILE_PATH.value]))

# a single entry
print(dataloaders[VAL].dataset.imgs[0][Dataset.FILE_PATH.value])
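One caveat: dataset.imgs is listed in dataset order, while a DataLoader created with shuffle=True yields samples in a random order, so the index j above does not line up with the position of a sample inside a batch. A minimal sketch of one way around this is to wrap the dataset so that each item carries its own path. The class name WithPaths is hypothetical (it is not part of PyTorch or torchvision), and the sketch assumes the wrapped dataset exposes imgs as a list of (path, label) pairs, as ImageFolder does.

```python
class WithPaths:
    """Wrap a map-style dataset so each item also returns its file path.

    Assumes the wrapped dataset has an `imgs` attribute holding
    (path, label) pairs in index order, as torchvision's ImageFolder does.
    """

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        # the wrapped dataset returns (image, label)
        image, label = self.dataset[index]
        # look up the path that belongs to the same index
        path = self.dataset.imgs[index][0]
        return image, label, path
```

With the objects defined earlier, this could be used as torch.utils.data.DataLoader(WithPaths(image_datasets[VAL]), batch_size=BATCH_SIZE, shuffle=True), and each batch would then unpack as inputs, labels, paths, with the paths staying aligned to the images even when shuffling.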
Without file names, you have to look through the classified data directly, which is painful. With them, you can simply search for the misclassified samples by file name.
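As a rough sketch of that lookup, the function below pairs each (path, label) entry with a predicted class index and keeps only the mismatches. The function name misclassified_files and the sample data are hypothetical, and it assumes the predictions are in the same order as the entries (for example, collected from a loader with shuffle=False).

```python
import os


def misclassified_files(samples, predictions):
    """Return (file name, true label, predicted label) for each wrong prediction.

    samples: list of (path, label) pairs, e.g. dataloaders[VAL].dataset.imgs
    predictions: predicted class indices, one per sample, in the same order
    """
    if len(samples) != len(predictions):
        raise ValueError("samples and predictions must have the same length")
    return [
        (os.path.basename(path), label, predicted)
        for (path, label), predicted in zip(samples, predictions)
        if label != predicted
    ]


# Hypothetical data for illustration only.
samples = [("val/cat/001.jpg", 0), ("val/cat/002.jpg", 0), ("val/dog/001.jpg", 1)]
predictions = [0, 1, 1]

for name, label, predicted in misclassified_files(samples, predictions):
    print(f"{name}: true={label}, predicted={predicted}")
```

The label indices can be turned into readable names with class_names[label] if the class list from the dataset is available.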