[PYTHON] Simple implementation example of one kind of data augmentation

Overview of mixup

(Scheduled to be edited)

For details, see the original paper: https://arxiv.org/pdf/1710.09412.pdf
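As a brief orientation, the core idea of the linked paper can be written as follows. The symbols (x_i, y_i) and (x_j, y_j) denote two training examples drawn at random, and λ is the blending weight; the notation is a summary, not a full restatement of the paper.

```latex
\begin{aligned}
\lambda   &\sim \mathrm{Beta}(\alpha, \alpha), \\
\tilde{x} &= \lambda x_i + (1 - \lambda)\, x_j, \\
\tilde{y} &= \lambda y_i + (1 - \lambda)\, y_j .
\end{aligned}
```

In the implementation in this post, one λ is drawn per sample in the batch, and the partner (x_j, y_j) comes from a shuffled copy of the same batch.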

Simple implementation example

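import numpy as np
import sklearn.utils

def mixup(x, y, batch_size, alpha=0.2):
    # One blending weight per sample, drawn from Beta(alpha, alpha)
    l = np.random.beta(alpha, alpha, batch_size)
    # Shuffle x and y together so that each input keeps its own label
    x_, y_ = sklearn.utils.shuffle(x, y)
    # Reshape l to (batch_size, 1, ..., 1) so that it broadcasts over x
    shape = tuple(-1 if i == 0 else 1 for i in range(len(x.shape)))
    x = l.reshape(shape) * x + (1 - l).reshape(shape) * x_
    # Do the same for y
    shape = tuple(-1 if i == 0 else 1 for i in range(len(y.shape)))
    y = l.reshape(shape) * y + (1 - l).reshape(shape) * y_
    return x, y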
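To try the function quickly, something like the following may help. It is only a sketch: mixup_np is a hypothetical NumPy-only restatement of the function above, with a NumPy permutation standing in for sklearn.utils.shuffle so that the snippet needs nothing but NumPy. The rng argument and the random test data are also assumptions made for illustration.

```python
import numpy as np

def mixup_np(x, y, batch_size, alpha=0.2, rng=None):
    # NumPy-only restatement of mixup(); the permutation is an illustrative
    # substitute for sklearn.utils.shuffle.
    rng = np.random.default_rng() if rng is None else rng
    l = rng.beta(alpha, alpha, batch_size)
    perm = rng.permutation(batch_size)            # same order for x and y
    lx = l.reshape((-1,) + (1,) * (x.ndim - 1))   # (batch_size, 1, ..., 1)
    ly = l.reshape((-1,) + (1,) * (y.ndim - 1))
    return lx * x + (1 - lx) * x[perm], ly * y + (1 - ly) * y[perm]

rng = np.random.default_rng(0)
x = rng.random((8, 32, 32, 3))                    # 8 fake RGB images
y = np.eye(10)[rng.integers(0, 10, size=8)]       # one-hot labels, 10 classes
x_mix, y_mix = mixup_np(x, y, batch_size=8, rng=rng)
print(x_mix.shape, y_mix.shape)
```

The output shapes match the input shapes, and each blended label row still sums to 1.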

Code description

Arguments

Argument name | Description (type)
x | Input data (numpy.ndarray)
y | Output (label) data (numpy.ndarray)
batch_size | Batch size, i.e. the number of samples in x and y (int)
alpha | Parameter α of the Beta(α, α) distribution from which the blending weights are drawn (float)

The function assumes that x and y are passed in one batch at a time and that the dataset is reshuffled after each epoch.
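The batching assumption might look like the sketch below. iterate_batches is a hypothetical helper, not part of the original code, and the call to mixup is left as a comment so that the snippet runs on its own.

```python
import numpy as np

def iterate_batches(x, y, batch_size, epochs, rng):
    # Hypothetical helper: reshuffle the dataset at the start of every epoch
    # and yield it one batch at a time.
    n = len(x)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield x[idx], y[idx]

rng = np.random.default_rng(0)
x = np.arange(10, dtype=float).reshape(10, 1)
y = np.arange(10, dtype=float).reshape(10, 1)

sizes = []
for xb, yb in iterate_batches(x, y, batch_size=4, epochs=2, rng=rng):
    # The last batch of an epoch can be smaller than batch_size, so the
    # actual length would be passed to mixup:
    # xb, yb = mixup(xb, yb, len(xb))
    sizes.append(len(xb))
print(sizes)  # [4, 4, 2, 4, 4, 2]
```

Passing len(xb) rather than a fixed value keeps the shape of l equal to the number of samples in the batch, which the reshape inside the function relies on.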

The contents of the function

l = np.random.beta(alpha, alpha, batch_size)
Here, the blending weights are sampled, one per sample in the batch. The shape of l is (batch_size,).
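To get a feel for how alpha shapes these weights, a small experiment such as the following can be run. The alpha values, the sample count, and the 0.1 / 0.9 thresholds are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 10000

edge_fraction = {}
for alpha in (0.2, 1.0, 5.0):
    l = rng.beta(alpha, alpha, batch_size)   # shape: (batch_size,)
    # Fraction of weights that leave one of the two samples almost untouched
    edge_fraction[alpha] = float(np.mean((l < 0.1) | (l > 0.9)))
    print(alpha, l.shape, round(edge_fraction[alpha], 2))
```

Smaller alpha pushes the weights toward 0 and 1, so most blended samples stay close to one of the two originals; larger alpha concentrates the weights around 0.5.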

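x_, y_ = sklearn.utils.shuffle(x, y)
Here, x and y are shuffled with the same permutation, so the correspondence between each input and its label is preserved. x_ and y_ are the partners that x and y will be blended with.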

shape = tuple(-1 if i == 0 else 1 for i in range(len(x.shape)))
Here, we build a shape that makes l broadcastable to x along the batch axis. For example, if the shape of x is (batch_size, width, height, ch) and the shape of l is (batch_size,), then l * x aligns l with the last axis of x (ch) rather than with the batch axis. If ch happens to equal batch_size, the calculation runs but the result is not what was intended; otherwise an error such as `operands could not be broadcast ...` is raised. Reshaping l to (batch_size, 1, 1, 1) avoids both problems.
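The broadcasting behavior described here can be checked with a short experiment. The array sizes and weight values below are made up for illustration.

```python
import numpy as np

batch_size = 4
x = np.ones((batch_size, 8, 8, 3))
l = np.array([0.1, 0.2, 0.3, 0.4])

# (4,) against (4, 8, 8, 3): the trailing axes 4 and 3 do not match
try:
    l * x
    raised = False
except ValueError:
    raised = True

# Reshaping l to (4, 1, 1, 1) scales each sample by its own weight
shape = tuple(-1 if i == 0 else 1 for i in range(len(x.shape)))
scaled = l.reshape(shape) * x
print(shape, scaled[:, 0, 0, 0])

# Silent failure: if the last axis happens to equal batch_size, l * x runs
# but scales along the last axis instead of the batch axis
x_sq = np.ones((batch_size, 8, 8, batch_size))
wrong = l * x_sq
print(wrong[:, 0, 0, 0], wrong[0, 0, 0, :])
```

The silent case is the more dangerous of the two, because nothing signals that the weights were applied along the wrong axis.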

x = l.reshape(shape) * x + (1 - l).reshape(shape) * x_
Here, each sample in x is blended with its shuffled partner in x_.

shape = tuple(-1 if i == 0 else 1 for i in range(len(y.shape)))
Here, we build a shape that makes l broadcastable to y in the same way.

y = l.reshape(shape) * y + (1 - l).reshape(shape) * y_
Here, y is blended with y_ using the same weights that were applied to x.
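A small worked example of the label blending, with hand-picked weights and one-hot labels, may make the result easier to picture. The values of l, y, and y_ are invented for illustration.

```python
import numpy as np

l = np.array([0.9, 0.25])
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y_ = np.array([[0.0, 0.0, 1.0],     # partner labels after the shuffle
               [0.0, 1.0, 0.0]])

shape = tuple(-1 if i == 0 else 1 for i in range(len(y.shape)))
y_mix = l.reshape(shape) * y + (1 - l).reshape(shape) * y_
print(y_mix)
```

The first row becomes a soft label split 0.9 / 0.1 between two classes, while the second row is unchanged because both partners share the same class. This only makes sense when y holds one-hot or otherwise continuous targets; blending integer class indices would not give a meaningful label.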

Return value

Return value name | Description (type)
x | Mixed input data (numpy.ndarray)
y | Mixed output (label) data (numpy.ndarray)

Sample

Applying mixup with an identity matrix as the input data gives the following results.

(Figure: 3 × 3 grid of result images, one per value of α: 0.1, 0.2, 0.3 in the first row, 0.4, 0.5, 0.6 in the second, and 0.7, 0.8, 0.9 in the third.)
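An experiment of this kind can be approximated in text form with the sketch below. The matrix size, the seed, the three alpha values, and the use of a NumPy permutation in place of sklearn.utils.shuffle are all assumptions; the exact setup behind the original images is not stated.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed, only for reproducibility
n = 6
x = np.eye(n)                     # identity matrix as input data

results = {}
for alpha in (0.1, 0.5, 0.9):
    l = rng.beta(alpha, alpha, n).reshape(-1, 1)
    partner = x[rng.permutation(n)]       # stand-in for sklearn.utils.shuffle
    results[alpha] = l * x + (1 - l) * partner
    print(f"alpha = {alpha}")
    print(np.round(results[alpha], 2))
```

Each row keeps at most two nonzero entries, its own diagonal position and that of its partner, weighted by l and 1 - l. With small alpha most rows stay close to a single one-hot row (their own or their partner's); as alpha grows, more rows show two clearly visible entries.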

Reference
