[PYTHON] Extract images from cifar and CUCUMBER-9 datasets

A better title might have been "Generating an image from a numpy array in Python 3", but I deliberately went with the niche one (^^;)

CIFAR is an image set of things like airplanes and cats, and CUCUMBER-9 is, as the name implies, an image set of cucumbers. Both are distributed as numpy arrays so that they are easy to use for machine learning, but my motivation here was simply to look at the actual images.

https://www.cs.toronto.edu/~kriz/cifar.html
https://github.com/workpiles/CUCUMBER-9

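CUCUMBER-9 looked as if it had the same format as CIFAR, but in reality the layout is different and it needed separate processing: CIFAR-10 stores one image per row (3072 = 3 x 32 x 32 values), while CUCUMBER-9 stores one color channel per row (1024 = 32 x 32 values), so each image spans three rows. Is the CUCUMBER-9 format more convenient for TensorFlow? (I don't know the reason.)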

data.shape = (10000, 3072) # cifar-10
data.shape = (1485, 1024) # CUCUMBER-9
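To make the difference between the two shapes concrete, here is a minimal sketch that packs the same synthetic images both ways and unpacks them again. It does not touch the real datasets; the helper names `cifar_row_to_hwc` and `cucumber_rows_to_hwc` are made up for this illustration, and the CUCUMBER-9 layout is assumed to be three consecutive rows (R, G, B) per image, which is what the shape (1485, 1024) suggests (1485 = 495 x 3).

```python
import numpy as np

SIDE = 32            # image width and height in pixels
PLANE = SIDE * SIDE  # 1024 values per color channel


def cifar_row_to_hwc(row):
    """One CIFAR-style row of 3072 values (R, G, B planes) -> (32, 32, 3)."""
    return row.reshape(3, SIDE, SIDE).transpose(1, 2, 0)


def cucumber_rows_to_hwc(data, index):
    """Three consecutive CUCUMBER-9-style rows of 1024 values -> (32, 32, 3)."""
    planes = data[index * 3:index * 3 + 3]  # rows holding R, G and B
    return planes.reshape(3, SIDE, SIDE).transpose(1, 2, 0)


# Synthetic stand-in for real data: two random 32x32 RGB images.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, SIDE, SIDE, 3), dtype=np.uint8)

# Pack the same images in both layouts.
chw = images.transpose(0, 3, 1, 2)         # (2, 3, 32, 32)
cifar_like = chw.reshape(2, 3 * PLANE)     # (2, 3072): one image per row
cucumber_like = chw.reshape(2 * 3, PLANE)  # (6, 1024): one channel per row

# Both unpacking routes give back the original pixels.
for i in range(2):
    assert np.array_equal(cifar_row_to_hwc(cifar_like[i]), images[i])
    assert np.array_equal(cucumber_rows_to_hwc(cucumber_like, i), images[i])
```

The only real difference is where the channel boundary sits: inside a row for the CIFAR-style array, between rows for the CUCUMBER-9-style one.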

The sample program is as follows. It is written for Python 3; for Python 2, use cPickle instead. It requires numpy and PIL (Pillow).

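import os
import pickle
import random

import numpy as np
from PIL import Image

def unpickle(file):
    # The batches were pickled with Python 2; latin-1 lets Python 3 read them.
    with open(file, 'rb') as fo:
        return pickle.load(fo, encoding='latin-1')

def save_png(img, name):
    # Assumption: write a PNG into the current directory, named after the stored filename.
    img.save(os.path.splitext(os.path.basename(name))[0] + '.png')

def image_from_cifar(index, dic):
    # CIFAR: one image per row, laid out as the R plane, then G, then B.
    name = dic['filenames'][index]
    data = dic['data'][index].reshape(3, 32, 32).transpose(1, 2, 0) # shape=(32, 32, 3)
    img = Image.fromarray(data, 'RGB')
    save_png(img, name)

def image_from_cucumber(index, dic):
    # CUCUMBER-9: one channel per row, so image i uses rows 3i (R), 3i+1 (G), 3i+2 (B).
    name = dic['filenames'][index]
    r = dic['data'][index*3]
    g = dic['data'][index*3+1]
    b = dic['data'][index*3+2]
    data = np.array([r, g, b]).T.reshape(32, 32, 3)
    img = Image.fromarray(data, 'RGB')
    save_png(img, name)

if __name__ == "__main__":
    # Pick one image at random from each batch and save it.
    dic = unpickle('cifar-10-batches-py/data_batch_1')
    image_from_cifar(random.randrange(len(dic['data'])), dic)

    dic = unpickle('CUCUMBER-9-master/prototype_1/cucumber-9-python/data_batch_1')
    image_from_cucumber(random.randrange(len(dic['data']) // 3), dic)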

If you run the program with each dataset extracted into the current directory, one image is picked at random from each dataset and saved as an image file.
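Looking at one random image at a time is slow if the goal is to get a feel for the whole set, so a possible extension is to tile several images into a single sheet. The following is only a sketch under assumptions: `contact_sheet` and `cifar_batch_to_images` are hypothetical helpers that are not part of the program above, and the input is random data in the CIFAR-style (N, 3072) layout rather than a real batch.

```python
import numpy as np


def cifar_batch_to_images(data):
    """(N, 3072) CIFAR-style rows -> (N, 32, 32, 3) image array."""
    return data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)


def contact_sheet(images, columns):
    """Tile an (N, H, W, 3) array into one (rows*H, columns*W, 3) array."""
    n, h, w, c = images.shape
    rows = -(-n // columns)  # ceiling division
    padded = np.zeros((rows * columns, h, w, c), dtype=images.dtype)
    padded[:n] = images      # cells past the last image stay black
    grid = padded.reshape(rows, columns, h, w, c)
    # Interleave so that each tile's pixel rows line up across a sheet row.
    return grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, columns * w, c)


# Synthetic stand-in for dic['data'][:10] from a CIFAR batch.
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(10, 3072), dtype=np.uint8)

tiles = cifar_batch_to_images(fake_batch)
sheet = contact_sheet(tiles, columns=4)
print(sheet.shape)  # (96, 128, 3): 3 rows x 4 columns of 32x32 tiles
```

With real data, the resulting array could be passed to `Image.fromarray(sheet)` and saved in the same way as a single image.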


