This article first explains the specifications of the MNIST handwritten digit image data, then walks through Java code that vectorizes the data and displays or saves it as an image.
MNIST is a data set of handwritten digit images (0 to 9) published online. Each image is 28 × 28 pixels, stored as 8-bit grayscale with 256 levels (0 to 255). It is widely used for training and evaluating machine learning models.
This section describes the specifications of MNIST handwritten digit data.
The training set consists of 60,000 images of the digits 0 to 9 together with their correct labels.
train-images-idx3-ubyte.gz: the training image data, stored in a dedicated binary format.
| Offset | Data type | Value | Description |
|---|---|---|---|
| 0 | 32-bit integer | 2051 | Magic number (MSB first) |
| 4 | 32-bit integer | 60000 | Number of images |
| 8 | 32-bit integer | 28 | Number of rows (vertical pixels) per image |
| 12 | 32-bit integer | 28 | Number of columns (horizontal pixels) per image |
| 16 | Unsigned byte | 0 to 255 | Grayscale value of the pixel in row 1, column 1 of the 1st image |
| 17 | Unsigned byte | 0 to 255 | Grayscale value of the pixel in row 1, column 2 of the 1st image |
| … | Unsigned byte | 0 to 255 | Grayscale value of the pixel in row 28, column 28 of the 60,000th image |
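The layout in the table implies a simple formula for locating any pixel in the uncompressed file. The following is a minimal sketch of that arithmetic; the class and method names are hypothetical, and it assumes zero-based indices with pixels stored row by row.

```java
/** Sketch: byte offset of one pixel inside an uncompressed MNIST image file. */
class IdxPixelOffset {
    static final int HEADER_BYTES = 16; // magic number + image count + rows + columns
    static final int ROWS = 28;
    static final int COLS = 28;

    /** All indices are zero-based; images are assumed to be stored row by row. */
    static long offsetOf(int image, int row, int col) {
        return HEADER_BYTES + (long) image * ROWS * COLS + (long) row * COLS + col;
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(0, 0, 0));        // first pixel of the first image
        System.out.println(offsetOf(0, 0, 1));        // second pixel of the first image
        System.out.println(offsetOf(59_999, 27, 27)); // last pixel of the last training image
    }
}
```

The last offset plus one gives the expected size of the uncompressed training image file, which can serve as a quick sanity check after decompression.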
train-labels-idx1-ubyte.gz: the correct labels for the training data, stored in a dedicated binary format.
| Offset | Data type | Value | Description |
|---|---|---|---|
| 0 | 32-bit integer | 2049 | Magic number (MSB first) |
| 4 | 32-bit integer | 60000 | Number of labels |
| 8 | Unsigned byte | 0 to 9 | Correct label for the 1st image |
| 9 | Unsigned byte | 0 to 9 | Correct label for the 2nd image |
| … | Unsigned byte | 0 to 9 | Correct label for the 60,000th image |
The test set consists of 10,000 images of the digits 0 to 9 together with their correct labels, used for testing (evaluation).
t10k-images-idx3-ubyte.gz: the test image data, stored in a dedicated binary format.
| Offset | Data type | Value | Description |
|---|---|---|---|
| 0 | 32-bit integer | 2051 | Magic number (MSB first) |
| 4 | 32-bit integer | 10000 | Number of images |
| 8 | 32-bit integer | 28 | Number of rows (vertical pixels) per image |
| 12 | 32-bit integer | 28 | Number of columns (horizontal pixels) per image |
| 16 | Unsigned byte | 0 to 255 | Grayscale value of the pixel in row 1, column 1 of the 1st image |
| 17 | Unsigned byte | 0 to 255 | Grayscale value of the pixel in row 1, column 2 of the 1st image |
| … | Unsigned byte | 0 to 255 | Grayscale value of the pixel in row 28, column 28 of the 10,000th image |
t10k-labels-idx1-ubyte.gz: the correct labels for the test data, stored in a dedicated binary format.
| Offset | Data type | Value | Description |
|---|---|---|---|
| 0 | 32-bit integer | 2049 | Magic number (MSB first) |
| 4 | 32-bit integer | 10000 | Number of labels |
| 8 | Unsigned byte | 0 to 9 | Correct label for the 1st image |
| 9 | Unsigned byte | 0 to 9 | Correct label for the 2nd image |
| … | Unsigned byte | 0 to 9 | Correct label for the 10,000th image |
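The two magic numbers in these tables are not arbitrary. As I understand the IDX format, the third byte encodes the element type (0x08 for unsigned byte) and the fourth byte the number of dimensions, which is why image files use 2051 (0x00000803) and label files use 2049 (0x00000801). The sketch below decodes a magic number under that assumption; the class and method names are hypothetical.

```java
/** Sketch: decoding the magic number of an MNIST file (assumes the IDX layout described above). */
class IdxMagic {
    /** Third byte of the magic number: element type code (0x08 = unsigned byte). */
    static int dataType(int magic) {
        return (magic >>> 8) & 0xFF;
    }

    /** Fourth byte of the magic number: number of dimensions. */
    static int dimensions(int magic) {
        return magic & 0xFF;
    }

    /** Returns "images" or "labels", or throws if the value matches neither MNIST file kind. */
    static String describe(int magic) {
        if ((magic >>> 16) != 0 || dataType(magic) != 0x08) {
            throw new IllegalArgumentException("Not an unsigned-byte IDX file: " + magic);
        }
        switch (dimensions(magic)) {
            case 1:
                return "labels"; // one dimension: the label index
            case 3:
                return "images"; // three dimensions: image, row, column
            default:
                throw new IllegalArgumentException("Unexpected dimension count: " + dimensions(magic));
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(2051));
        System.out.println(describe(2049));
    }
}
```

Checking the magic number before reading the rest of the header is a cheap way to catch a wrong or corrupted file early.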
See the source code on GitHub for simple usage examples.
The image data is read with DataInputStream. readInt reads, in order, the magic number, the number of images, the number of rows (vertical pixels), and the number of columns (horizontal pixels). Each image therefore has 28 * 28 = 784 dimensions. Each pixel is read with readUnsignedByte into the two-dimensional double array features, where the first index is the image and the second index is the dimension (pixel) within that image. Each value is divided by 255.0 so that it is normalized to the range 0.0 to 1.0 for use in machine learning.
    private void loadFeatures() throws IOException {
        System.out.println("Loading feature data from " + fileName + " ...");
        // Requires java.io.* and java.util.zip.GZIPInputStream.
        try (DataInputStream is = new DataInputStream(
                new GZIPInputStream(new FileInputStream(Const.BASE_PATH + fileName)))) {
            is.readInt(); // magic number (2051); the value is not used
            numImages = is.readInt();
            numDimensions = is.readInt() * is.readInt(); // rows * columns = 784
            features = new double[numImages][numDimensions];
            for (int i = 0; i < numImages; i++) {
                for (int j = 0; j < numDimensions; j++) {
                    // Scale 0..255 to 0.0..1.0.
                    features[i][j] = is.readUnsignedByte() / 255.0;
                }
            }
        }
    }
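The reason readUnsignedByte matters here is that Java's byte type is signed (-128 to 127), so a bright pixel such as 200 would come back negative if read as a plain byte. The following is a minimal sketch of the unsigned conversion and the normalization step; the class and method names are hypothetical, and the rounding in denormalize is my own choice rather than something taken from the original code.

```java
/** Sketch: treating pixel bytes as unsigned and normalizing them to 0.0..1.0. */
class PixelNormalization {
    /** Interprets a signed Java byte as an unsigned value in 0..255. */
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    /** Maps a grayscale value 0..255 to 0.0..1.0. */
    static double normalize(int gray) {
        return gray / 255.0;
    }

    /** Maps a normalized value back to 0..255, rounding to the nearest integer. */
    static int denormalize(double value) {
        return (int) Math.round(value * 255.0);
    }

    public static void main(String[] args) {
        byte raw = (byte) 200;                  // what a plain byte read would hold
        System.out.println(raw);                // negative, because byte is signed
        System.out.println(toUnsigned(raw));    // recovered unsigned value
        System.out.println(normalize(toUnsigned(raw)));
    }
}
```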
The label data is also read with DataInputStream. readInt reads the magic number and then the number of labels. Each correct label is read with readUnsignedByte into the int array labels.
    private void loadLabels() throws IOException {
        System.out.println("Loading label data from " + fileName + " ...");
        try (DataInputStream is = new DataInputStream(
                new GZIPInputStream(new FileInputStream(Const.BASE_PATH + fileName)))) {
            is.readInt(); // magic number (2049); the value is not used
            numLabels = is.readInt();
            labels = new int[numLabels];
            for (int i = 0; i < numLabels; i++) {
                labels[i] = is.readUnsignedByte(); // label 0..9
            }
        }
    }
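To try the reading logic without downloading the data set, one option is to build a tiny gzip-compressed label file in memory and read it back through the same stream chain. The sketch below does that; the class and method names are hypothetical, the magic number check is an addition of mine, and IOException is wrapped in UncheckedIOException only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Sketch: writing and reading a label file in the MNIST layout, entirely in memory. */
class LabelRoundTrip {
    static final int LABEL_MAGIC = 2049;

    /** Produces gzip-compressed bytes: magic number, label count, then one byte per label. */
    static byte[] writeLabels(int[] labels) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(new GZIPOutputStream(buffer))) {
            out.writeInt(LABEL_MAGIC);   // DataOutputStream writes big-endian (MSB first)
            out.writeInt(labels.length);
            for (int label : labels) {
                out.writeByte(label);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();     // complete only after the gzip stream is closed
    }

    /** Reads labels back, rejecting data whose magic number is not 2049. */
    static int[] readLabels(byte[] gzipped) {
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)))) {
            int magic = in.readInt();
            if (magic != LABEL_MAGIC) {
                throw new IllegalArgumentException("Unexpected magic number: " + magic);
            }
            int[] labels = new int[in.readInt()];
            for (int i = 0; i < labels.length; i++) {
                labels[i] = in.readUnsignedByte();
            }
            return labels;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling readLabels(writeLabels(new int[] {5, 0, 4, 1, 9})) should return the same five labels, which makes the pair usable as a small unit test for the loader logic.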
Given the index of an image as an argument, the following method prints the outline of that image as text on the console.
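    public void showImageAsText(int index) {
        System.out.println("Label: " + labels[index]);
        for (int i = 0; i < 28; i++) {          // rows
            for (int j = 0; j < 28; j++) {      // columns
                double value = images[index][i * 28 + j];
                if (value > 0.0) {
                    System.out.print("*"); // inked pixel; the mark character is an arbitrary choice
                } else {
                    System.out.print(" ");
                }
            }
            System.out.println(); // end of row
        }
    }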
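A variation on the same idea is to map the normalized value to several shade characters instead of a single on/off mark, and to return a String so the output can be checked in a test. This is only a sketch under the assumption that pixel values lie in 0.0 to 1.0; the class name, method name, and shade characters are all hypothetical choices.

```java
/** Sketch: rendering a normalized pixel vector as multi-level ASCII art. */
class AsciiDigit {
    // Lightest to darkest; the particular characters are an arbitrary choice.
    static final String SHADES = " .:*#";

    /** Renders pixels (row-major, values in 0.0..1.0) as text with the given row width. */
    static String render(double[] pixels, int width) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < pixels.length; i++) {
            // Clamp defensively, then pick the nearest shade level.
            double clamped = Math.max(0.0, Math.min(1.0, pixels[i]));
            int level = (int) Math.round(clamped * (SHADES.length() - 1));
            sb.append(SHADES.charAt(level));
            if ((i + 1) % width == 0) {
                sb.append('\n'); // end of row
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] tiny = {0.0, 1.0, 0.5, 0.25}; // a 2 x 2 example
        System.out.print(render(tiny, 2));
    }
}
```

For an MNIST image, the call would pass the 784-element vector with a width of 28.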
Restore the vectorized image data to create a BufferedImage. The value is normalized, so multiply it by 255.0 to return it to its original grayscale value.
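    private BufferedImage makeImage(int index) {
        BufferedImage image =
                new BufferedImage(28, 28, BufferedImage.TYPE_INT_RGB);
        for (int i = 0; i < 28; i++) {
            for (int j = 0; j < 28; j++) {
                // Round rather than truncate so floating-point error cannot lower the value by one.
                int value = (int) Math.round(images[index][i * 28 + j] * 255.0);
                // The same value in R, G and B gives a gray pixel; x is column j, y is row i.
                image.setRGB(j, i, 0xff000000 | value << 16 | value << 8 | value);
            }
        }
        return image;
    }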
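The expression passed to setRGB packs one gray value into a 32-bit ARGB integer: alpha in the top byte, then red, green, and blue. The sketch below isolates that bit packing so it can be checked on its own; the class and method names are hypothetical.

```java
/** Sketch: packing a grayscale value 0..255 into a 32-bit ARGB pixel and back. */
class GrayPixels {
    /** Opaque alpha (0xff) followed by the same value in the red, green and blue bytes. */
    static int toArgb(int gray) {
        return 0xff000000 | gray << 16 | gray << 8 | gray;
    }

    /** Recovers the gray value from the blue byte (all three color bytes are equal). */
    static int grayOf(int argb) {
        return argb & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(toArgb(0)));   // black
        System.out.println(Integer.toHexString(toArgb(128))); // mid gray
        System.out.println(Integer.toHexString(toArgb(255))); // white
    }
}
```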
The BufferedImage created by makeImage is displayed in a dialog, with the correct label as the message.
    public void showImage(int index) {
        BufferedImage image = makeImage(index);
        Icon icon = new ImageIcon(image);
        JOptionPane.showMessageDialog(null, labels[index], "MnistImageViewer", JOptionPane.PLAIN_MESSAGE, icon);
    }
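A 28 × 28 icon is quite small on a modern display, so it may help to enlarge the image before showing it. The following is a minimal nearest-neighbor sketch that is not part of the original code; the class and method names are hypothetical, and it copies pixels by hand to keep the blocky look of the original digits.

```java
import java.awt.image.BufferedImage;

/** Sketch: enlarging an image by an integer factor with nearest-neighbor sampling. */
class ImageScaler {
    /** Returns a new image in which each source pixel becomes a factor x factor square. */
    static BufferedImage scale(BufferedImage src, int factor) {
        BufferedImage dst = new BufferedImage(
                src.getWidth() * factor, src.getHeight() * factor, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < dst.getHeight(); y++) {
            for (int x = 0; x < dst.getWidth(); x++) {
                // Integer division maps each destination pixel back to its source pixel.
                dst.setRGB(x, y, src.getRGB(x / factor, y / factor));
            }
        }
        return dst;
    }
}
```

With a factor of 10, a 28 × 28 digit becomes 280 × 280, and the result can be wrapped in an ImageIcon in the same way as the unscaled image.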
The BufferedImage created by makeImage is saved to a GIF file. The file name is built from the prefix, the zero-padded image index, and the correct label.
    public void saveImage(String dir, String prefix, int index) throws IOException {
        BufferedImage image = makeImage(index);
        // Example file name: <prefix>_00007_5.gif (index padded to five digits, then the label).
        File file = new File(dir + "/" + prefix + "_" + String.format("%05d", index) + "_" + labels[index] + ".gif");
        if (file.exists()) file.delete();
        ImageIO.write(image, "gif", file);
    }
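The file naming and writing steps can be separated so that the name is testable without touching the disk. The sketch below is one way to do that; the class and method names are hypothetical, it builds the path with the two-argument File constructor instead of string concatenation, and it checks the boolean returned by ImageIO.write, which is false when no suitable writer is found.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Sketch: naming and saving a digit image as a GIF file. */
class DigitFileWriter {
    /** Builds a name such as train_00007_5.gif from prefix, image index and label. */
    static String fileName(String prefix, int index, int label) {
        return String.format("%s_%05d_%d.gif", prefix, index, label);
    }

    /** Writes the image into dir and returns the file that was written. */
    static File save(BufferedImage image, File dir, String prefix, int index, int label) {
        File file = new File(dir, fileName(prefix, index, label));
        try {
            if (!ImageIO.write(image, "gif", file)) {
                throw new IllegalStateException("No GIF writer available for this image");
            }
        } catch (IOException e) {
            // Wrapped only to keep the sketch short; real code may prefer to rethrow.
            throw new UncheckedIOException(e);
        }
        return file;
    }
}
```

Zero-padding the index keeps the files in numeric order when a directory listing is sorted by name.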