[PYTHON] [tensorflow, keras, mnist] Take n images for each label from the MNIST data to create a dataset of 10 * n images.

Preface

Why bother taking only a small amount of data?

Code

Load MNIST.

import pandas as pd
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Store y_train in a pandas DataFrame, sample rows for each label from it, and retrieve the indices of the sampled rows. The code below uses n = 100.

# Use pandas to pick 100 images of each label
df = pd.DataFrame(columns=["label"])
df["label"] = y_train.reshape([-1])

list_0 = df.loc[df.label==0].sample(n=100)  # randomly sample n = 100 rows with label 0
list_1 = df.loc[df.label==1].sample(n=100)
list_2 = df.loc[df.label==2].sample(n=100)
list_3 = df.loc[df.label==3].sample(n=100)
list_4 = df.loc[df.label==4].sample(n=100)
list_5 = df.loc[df.label==5].sample(n=100)
list_6 = df.loc[df.label==6].sample(n=100)
list_7 = df.loc[df.label==7].sample(n=100)
list_8 = df.loc[df.label==8].sample(n=100)
list_9 = df.loc[df.label==9].sample(n=100)

label_list = pd.concat([list_0,list_1,list_2,list_3,list_4,list_5,list_6,list_7,list_8,
                       list_9])
label_list = label_list.sort_index()
label_idx = label_list.index.values

train_label = label_list.label.values

# Index x_train with the sampled indices to extract the images
# that correspond to the sampled labels.
x_train = x_train[label_idx]
y_train = train_label
# Scale pixel values from [0, 255] to [0, 1].
x_train = x_train / 255
x_test = x_test / 255

This gives you 100 samples for each label, 1,000 images in total.
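The same flow can be written for an arbitrary n by looping over the labels instead of repeating one line per digit. The following is only a minimal sketch, not the original code: the function name sample_per_label, its seed parameter, and the synthetic x_demo / y_demo arrays (random stand-ins with MNIST-like shapes, so the example runs without downloading anything) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def sample_per_label(x, y, n, seed=None):
    """Return n randomly chosen samples for each label found in y.

    Hypothetical helper for this sketch; x and y are NumPy arrays
    whose first axes line up (one label per image).
    """
    labels = pd.Series(np.asarray(y).reshape(-1))
    rs = np.random.RandomState(seed)  # makes the draw repeatable when seed is set
    # Sample n rows per label, exactly as the explicit list_0 ... list_9 lines do.
    picked = pd.concat(
        [labels[labels == k].sample(n=n, random_state=rs) for k in np.unique(labels)]
    )
    idx = np.sort(picked.index.values)  # keep the original data order
    return x[idx], labels.values[idx]


# Synthetic stand-in for MNIST (assumption: 10 labels, 28x28 images).
rng = np.random.default_rng(0)
y_demo = rng.integers(0, 10, size=5000)
x_demo = rng.random((5000, 28, 28))

x_small, y_small = sample_per_label(x_demo, y_demo, n=100, seed=0)
print(x_small.shape)         # (1000, 28, 28)
print(np.bincount(y_small))  # 100 for every label
```

With the real data, calling sample_per_label(x_train, y_train, n=100) should produce the same kind of subset as the code above. Note that sample raises an error if a label has fewer than n rows.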
