[PYTHON] Creating training data for face image dataset sorting (#1)

Overview

This post creates training data for a model that sorts the UTKFace dataset by image characteristics. Feedback and corrections are welcome.

UTKFace (UTK Face page)
・ The download consists of three compressed folders: ① 900 MB, ② 500 MB, ③ 70 MB. Each contains images, and together they hold more than 20,000.
・ Each image file name carries its labels in the form "Age-Gender-Ethnicity-○○○.jpg".
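
The file-name convention above can be unpacked with a few lines of Python. This is only a minimal sketch: the helper `parse_utkface_name`, the `FaceLabel` tuple, and the example file names are hypothetical. It assumes the three leading fields are integers and that the separator is either "-" as written above or "_"; check both assumptions against the actual files.

```python
import re
from typing import NamedTuple, Optional


class FaceLabel(NamedTuple):
    age: int
    gender: int
    ethnicity: int


# Three leading integer fields separated by "-" or "_" (assumed), then anything, then .jpg
_NAME_RE = re.compile(r"^(\d+)[-_](\d+)[-_](\d+)[-_].*\.jpg$", re.IGNORECASE)


def parse_utkface_name(filename: str) -> Optional[FaceLabel]:
    """Return the labels encoded in a file name, or None if it does not match."""
    basename = filename.rsplit("/", 1)[-1]  # drop any folder prefix inside a ZIP
    m = _NAME_RE.match(basename)
    if m is None:
        return None
    age, gender, ethnicity = (int(g) for g in m.groups())
    return FaceLabel(age, gender, ethnicity)


print(parse_utkface_name("25-1-2-sample.jpg"))  # made-up file name
print(parse_utkface_name("notes.txt"))          # does not match, so None
```

Returning None for names that do not fit the pattern lets the caller skip or log unexpected files instead of crashing partway through a large archive.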

Sorting

Looking through the dataset, some images have been processed and some show more than one person. Using the images in folder ② above, I create training data for a model that decides which of the following four classes an image belongs to:
・ No problem
・ Grayscale
・ Multiple people shown
・ Processed

The following article helped me a lot this time: Predict age from face image

Environment

Google Colaboratory(GPU)


import os, zipfile, io, re
from PIL import Image           # needed for Image.open
import numpy as np

X=[]
Y=[]
im_size = 299

# Read the ZIP file of "no problem" (normal) images
z = zipfile.ZipFile('/content/drive/My Drive/image.zip')
imgfiles = [ x for x in z.namelist() if re.search(r"^image.*jpg$", x)]

for imgfile in imgfiles:
    image = Image.open(io.BytesIO(z.read(imgfile)))
    image = image.convert('RGB')
    image = image.resize((im_size, im_size))
    data = np.asarray(image)
    X.append(data)
    Y.append(0)

z.close()
del z, imgfiles

Beforehand, I manually sorted the images in folder ② into four folders, one per class. Each folder is then read with code like the block above.

The labels were assigned as follows:
・ Normal (no problem) images: Y.append(0)
・ Grayscale images: Y.append(1)
・ Images showing multiple people: Y.append(2)
・ Processed images: Y.append(3)


X = np.asarray(X)
Y = np.asarray(Y)

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

from keras.utils import to_categorical  # keras.utils.np_utils is not available in recent Keras releases

# One-hot encode the class labels
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

del X,Y
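
To make the one-hot step concrete, here is a small NumPy-only sketch of what the encoding produces for the four class labels used in this post. It is an illustration, not Keras's implementation; the helper name `one_hot` and the sample labels are made up for the example.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]


sample = [0, 1, 2, 3, 0]           # made-up labels using the classes 0..3
encoded = one_hot(sample, num_classes=4)
print(encoded.shape)               # (5, 4)
print(encoded.argmax(axis=1))      # recovers the original integer labels
print(np.bincount(sample, minlength=4))  # number of images per class
```

When `num_classes` is not given, `to_categorical` infers it from the largest label it sees. If the test split happened to contain no class-3 image, `y_test` would end up with fewer columns than `y_train`, so passing `num_classes=4` explicitly may be the safer choice.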

I wanted to read the four manually sorted ZIP files in a for loop to build the training data, but I couldn't work out how, so the actual code repeats the reading block for each file and is quite long.
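
One possible way to write that loop is sketched below. It assumes one ZIP file per class. The helper name `load_labeled_zips` is hypothetical, and apart from the `image.zip` path shown earlier, the archive names in the example call are placeholders rather than the actual files. To keep the sketch simple it accepts any `.jpg` entry instead of the `^image.*jpg$` pattern used above.

```python
import io
import zipfile

import numpy as np
from PIL import Image


def load_labeled_zips(zip_label_pairs, im_size=299):
    """Read every .jpg in each ZIP and tag it with that ZIP's class label."""
    X, Y = [], []
    for zip_path, label in zip_label_pairs:
        # The context manager closes the archive even if an image fails to load.
        with zipfile.ZipFile(zip_path) as z:
            for name in z.namelist():
                if not name.lower().endswith(".jpg"):
                    continue  # skip folders and non-image entries
                image = Image.open(io.BytesIO(z.read(name)))
                image = image.convert("RGB").resize((im_size, im_size))
                X.append(np.asarray(image))
                Y.append(label)
    return np.asarray(X), np.asarray(Y)


# Example call (all paths except image.zip are placeholders):
# X, Y = load_labeled_zips([
#     ("/content/drive/My Drive/image.zip", 0),       # no problem
#     ("/content/drive/My Drive/grayscale.zip", 1),   # grayscale
#     ("/content/drive/My Drive/multiple.zip", 2),    # multiple people
#     ("/content/drive/My Drive/processed.zip", 3),   # processed
# ])
```

Pairing each path with its label in a single list keeps the class mapping in one place, so adding a fifth class would only mean adding one more entry.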

Next time, I would like to use this data to take on deep learning. Next post: Model construction for face image data set sorting - Transfer learning of VGG19
