[PYTHON] Running through the augmentations in Albumentations
What is Albumentations?
- https://github.com/albumentations-team/albumentations
- Python library for data augmentation for machine learning
- Offers a wide range of commonly used data augmentation transforms
pip install albumentations
This article
Blur
Blur
- Blur with a randomly sized kernel
- blur_limit (int) – Maximum blur kernel size. Default: (3, 7).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/164123a5-d8f5-89ee-87d8-2fb00ae1fd32.png)
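As a rough illustration of what a box blur does for one fixed kernel size, here is a minimal pure-Python sketch. It is not the library's implementation: `box_blur` is a hypothetical helper, the image is a plain 2D list, and clamping coordinates at the border is an assumption made for simplicity. The transform itself picks the kernel size at random up to `blur_limit`.

```python
def box_blur(image, ksize=3):
    """Average each pixel with its ksize x ksize neighbourhood (ksize must be odd)."""
    h, w = len(image), len(image[0])
    r = ksize // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # Assumption: clamp coordinates at the image border.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            row.append(total / (ksize * ksize))
        out.append(row)
    return out


# A single bright pixel is spread evenly over its neighbourhood.
spike = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
blurred = box_blur(spike, ksize=3)
```

A larger `ksize` averages over a wider neighbourhood, which is why a higher `blur_limit` can produce stronger blur.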
MotionBlur
- Apply motion blur (box filter) with random kernel size
- blur_limit (int) – Maximum blur kernel size. Default: (3, 7).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/034d003c-dad4-b8a5-eb3f-1c67c8b9f50f.png)
GaussianBlur
- Apply Gaussian filter with random kernel size
- blur_limit (int) – Maximum kernel size to blur, must be odd. Default: (3, 7).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/689f8272-29fb-28dc-a1e5-f5feb8046f7c.png)
GlassBlur
- Adds glass noise (effect like frosted glass)
- sigma (float) – Standard deviation of the Gaussian kernel. Default: 0.7.
- max_delta (int) – Maximum distance between swapped pixels. Default: 4.
- iterations (int) – Number of repeats. Default: 2.
- mode (str) – Computation mode, “fast” or “exact”. Default: “fast”.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/36121dd2-62ed-0223-8dc2-caf962f391ba.png)
Noise, Compression
GaussNoise
- Add Gaussian noise
- var_limit ((float, float) or float) – Range of the noise variance. Default: (10.0, 50.0).
- mean (float) – Mean of the noise. Default: 0.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/67660dd4-8849-2354-a721-c8ea0aed7f1f.png)
JpegCompression
- Apply Jpeg compression noise
- quality_lower (float) – The lower limit of quality.
- quality_upper (float) – The upper limit of quality.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/e5bb579b-297e-17f4-9752-e9f811023587.png)
ImageCompression
- Apply Jpeg / WebP compression noise
- quality_lower (float) – The lower limit of quality.
- quality_upper (float) – The upper limit of quality.
- compression_type (ImageCompressionType) – Compression type. Should be ImageCompressionType.JPEG or ImageCompressionType.WEBP. Default: ImageCompressionType.JPEG.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/415ed8fe-ab99-b4a0-88b2-22211a01a651.png)
ISONoise
- Add camera sensor noise
- color_shift ((float, float)) – Range of the hue change.
- intensity ((float, float)) – Range of the intensity of color / luminance noise.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/3b94d49a-b7d3-ae38-1b54-3c6dfa1a3acf.png)
MultiplicativeNoise
- Multiply by random number array
- multiplier (float or tuple of floats) – A range of numbers to multiply. Default: (0.9, 1.1).
- per_channel (bool) – False: use the same multiplier for all channels. True: sample a separate multiplier for each channel. Default: False.
- elementwise (bool) – False: multiply every pixel by the same value. True: sample a separate multiplier for each pixel. Default: False.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/6d901474-c118-3500-14e1-908a0d1657ff.png)
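The interaction of `per_channel` and `elementwise` is easier to see in code. The following is a minimal pure-Python sketch, not the library's implementation: `multiplicative_noise` is a hypothetical helper, the image is a nested list of shape height x width x channels, and rounding and clipping to 0..255 are assumptions.

```python
import random


def multiplicative_noise(image, multiplier=(0.9, 1.1), per_channel=False,
                         elementwise=False, rng=None):
    """Multiply pixel values by random factors drawn from `multiplier`."""
    rng = rng or random.Random()
    low, high = multiplier
    channels = len(image[0][0])

    def sample_factors():
        # One factor per channel, or a single factor shared by all channels.
        if per_channel:
            return [rng.uniform(low, high) for _ in range(channels)]
        return [rng.uniform(low, high)] * channels

    shared = sample_factors()  # used for every pixel unless elementwise=True
    out = []
    for row in image:
        new_row = []
        for pixel in row:
            factors = sample_factors() if elementwise else shared
            new_row.append([min(255, max(0, round(v * f)))
                            for v, f in zip(pixel, factors)])
        out.append(new_row)
    return out


flat = [[[100, 100, 100] for _ in range(4)] for _ in range(4)]
same_everywhere = multiplicative_noise(flat, rng=random.Random(0))
per_pixel = multiplicative_noise(flat, elementwise=True, rng=random.Random(0))
```

With both flags off, the whole image is scaled by one number, which is close to a global brightness change. Turning on `elementwise` gives grain-like noise instead.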
Downscale
- Downscale and then upscale to reduce image quality
- scale_min (float) – The lower limit of the scale. Should be <1. Default: 0.25.
- scale_max (float) – The upper limit of the scale. Default: 0.25.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/ca7ef9d8-43a8-abaa-f6aa-ea708183b1ec.png)
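The downscale-then-upscale idea can be sketched in a few lines of pure Python. This is only an illustration: the helper names are hypothetical, the image is a 2D list, and nearest-neighbour sampling is an assumed interpolation mode.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2D list of pixel values with nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def downscale_then_upscale(image, scale):
    """Shrink by `scale` (< 1), then enlarge back to the original size."""
    h, w = len(image), len(image[0])
    small = resize_nearest(image, max(1, int(h * scale)), max(1, int(w * scale)))
    return resize_nearest(small, h, w)


image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
degraded = downscale_then_upscale(image, scale=0.5)
```

The output has the same size as the input but only a quarter of the distinct samples survive, which is the quality loss the transform is meant to simulate.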
Simple geometric transforms (Flip, Crop, Rotate, Scale, Transpose)
Flip
- Randomly flips horizontally, vertically, or both horizontally and vertically
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/7ead49a5-db86-f540-a3b8-f781408bc0ce.png)
VerticalFlip
- Flip vertically around the x-axis
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/a2c32338-c9ec-3770-dd7e-d1c06f50e876.png)
HorizontalFlip
- Flip horizontally around the y-axis
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/24468832-9643-1e77-0c22-5b1aa7d9bf87.png)
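The three flips above reduce to reversing rows, columns, or both. A minimal sketch on a 2D list follows; the function names are hypothetical and chosen only for illustration.

```python
def horizontal_flip(image):
    """Mirror left-right: reverse the pixels within each row."""
    return [row[::-1] for row in image]


def vertical_flip(image):
    """Mirror top-bottom: reverse the order of the rows."""
    return [row[:] for row in image[::-1]]


def flip_both(image):
    """Flip horizontally and vertically (equivalent to a 180 degree rotation)."""
    return horizontal_flip(vertical_flip(image))


image = [[1, 2, 3],
         [4, 5, 6]]
```

Flip chooses among these three at random, while VerticalFlip and HorizontalFlip each apply only one of them.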
Crop
- Crop a specified region
- x_min (int) – x coordinate of the top-left corner.
- y_min (int) – y coordinate of the top-left corner.
- x_max (int) – x coordinate of the bottom-right corner.
- y_max (int) – y coordinate of the bottom-right corner.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/778cbdd4-2c99-1bd0-bc2f-fd97511d46fe.png)
RandomCrop
- Crop a random region
- height (int) – Height of the crop.
- width (int) – Width of the crop.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/308412ad-9d56-2490-d0b9-ba8dc51838f2.png)
CenterCrop
- Crop the center part
- height (int) – The height to crop.
- width (int) – The width to crop.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/a461a8e5-9334-383f-a7a5-8ea2a5166634.png)
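Crop, RandomCrop, and CenterCrop differ only in how the top-left corner of the window is chosen. The sketch below shows that relationship on a 2D list. The helper names are hypothetical, and it assumes the right and bottom bounds are exclusive, as in Python slicing.

```python
import random


def crop(image, x_min, y_min, x_max, y_max):
    """Keep rows y_min..y_max-1 and columns x_min..x_max-1."""
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def center_crop(image, height, width):
    """Crop a height x width window centred in the image."""
    h, w = len(image), len(image[0])
    y_min, x_min = (h - height) // 2, (w - width) // 2
    return crop(image, x_min, y_min, x_min + width, y_min + height)


def random_crop(image, height, width, rng=None):
    """Crop a height x width window at a random position."""
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    y_min, x_min = rng.randint(0, h - height), rng.randint(0, w - width)
    return crop(image, x_min, y_min, x_min + width, y_min + height)


image = [[r * 4 + c for c in range(4)] for r in range(4)]
```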
RandomSizedCrop
- Crop a random region and rescale it to a specific size
- min_max_height ((int, int)) – Range of the crop height.
- height (int) – Height after resizing.
- width (int) – Width after resizing.
- w2h_ratio (float) – Crop aspect ratio. Default: 1.0.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/e88d42fe-5156-9c18-0f68-51671606ab88.png)
RandomResizedCrop
- Crop a random region and rescale it to a specific size (Torchvision variant)
- height (int) – Height after resizing.
- width (int) – Width after resizing.
- scale ((float, float)) – The size range of the area to crop. Default: (0.08, 1.0).
- ratio ((float, float)) – The range of aspect ratios for the area to be cropped. Default: (0.75, 1.3333333333333333).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/5479daf6-ad4b-bdb8-5234-197ba7a958f9.png)
Rotate
- Rotate at random angles
- limit – Range of rotation angles. A single number is interpreted as (-limit, limit). Default: (-90, 90).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/980d9511-12f7-6f03-2afc-565488331eea.png)
RandomScale
- Randomly change the image size
- scale_limit ((float, float) or float) – Scaling range (note that 0 is unchanged). Default: (0.9, 1.1).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/773027c3-e5f8-d803-5d18-983186e2d2d0.png)
- The previews are displayed at the same size, so the change is not visible, but the actual image size has changed.
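The note that 0 means "unchanged" suggests the limit is an offset around a scale factor of 1. The sketch below works under that assumption. The helper names are hypothetical, and the exact rounding of the output size is also an assumption.

```python
def scale_range(scale_limit):
    """Turn a scale limit (offset around 1.0) into a range of scale factors."""
    if isinstance(scale_limit, (int, float)):
        scale_limit = (-scale_limit, scale_limit)
    low, high = scale_limit
    # Assumption: a limit of 0 maps to a factor of 1.0 (image unchanged).
    return (1.0 + low, 1.0 + high)


def scaled_size(height, width, factor):
    """Image size after applying one sampled scale factor."""
    return (int(height * factor), int(width * factor))
```

Under this convention a limit of 0.1 gives factors between 0.9 and 1.1, which matches the documented default range.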
RandomRotate90
- Randomly rotate in 90° increments
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/415afe65-9b29-2b05-73b6-fc04db451f41.png)
Transpose
- Transpose rows and columns
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/b1065804-13cf-6be8-8f15-e6162e328932.png)
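Transpose and 90° rotation are closely related: a counter-clockwise quarter turn is a transpose followed by reversing the row order. A minimal sketch on a 2D list, with hypothetical function names:

```python
def transpose(image):
    """Swap rows and columns."""
    return [list(col) for col in zip(*image)]


def rotate90(image, k=1):
    """Rotate counter-clockwise by k quarter turns."""
    for _ in range(k % 4):
        image = transpose(image)[::-1]
    return [list(row) for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]
```

For a non-square image, both operations swap the height and the width. An odd `k` changes the shape and an even `k` keeps it.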
Advanced geometric transforms (Affine, Distortion)
ShiftScaleRotate
- Randomly apply affine transformations (translation, scaling, rotation)
- shift_limit: The range of translation. Default: (-0.0625, 0.0625).
- scale_limit: Scale range (note that 0 is unchanged). Default: (-0.1, 0.1).
- rotate_limit: The range of rotation. Default: (-45, 45).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/850c7b34-f8c8-0e71-1a87-39d299fd0a03.png)
OpticalDistortion
- Reproduce optical distortion
- distort_limit (float, (float, float)) – Range of distortion. Default: (-0.05, 0.05).
- shift_limit (float, (float, float)) – The range to shift. Default: (-0.05, 0.05).
- Not applicable to Bounding Box and Keypoints
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/f78f4c8f-d52c-af72-b5e8-e832ec9459c6.png)
GridDistortion
- Reproduce Grid distortion
- num_steps (int) – Specify the number of grid cells on each side. Default: 5.
- distort_limit (float, (float, float)) – Range of distortion. Default: (-0.03, 0.03).
- Not applicable to Bounding Box and Keypoints
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/1939c444-0f6e-7d0b-0a63-d3a8f5b178ab.png)
ElasticTransform
- Elastic deformation
- alpha (float) – transformation parameters. Default: 1.
- sigma (float) – Gaussian filter parameters. Default: 50.
- alpha_affine (float) – Range of alpha_affine. Default: 50.
- Not applicable to Bounding Box and Keypoints
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/d6c4408c-bcf1-1abc-8d85-643c75612469.png)
RandomGridShuffle
- Randomly shuffle cells in the grid
- grid ((int, int)) – The size of the grid that divides the image. Default: (3,3).
- Not applicable to Bounding Box and Keypoints
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/158e7d36-460e-47bd-7261-c1450dc4d70b.png)
Color adjustments
HueSaturationValue
- Randomly change hue, saturation, and brightness
- hue_shift_limit ((int, int) or int) – Hue range. Default: (-20, 20).
- sat_shift_limit ((int, int) or int) – Saturation range. Default: (-30, 30).
- val_shift_limit ((int, int) or int) – Brightness range. Default: (-20, 20).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/27a07b22-7eb7-febc-f42a-260f08c1bcdf.png)
RGBShift
- Randomly change the value of each RGB channel
- r_shift_limit ((int, int) or int) – Red channel range of change. Default: (-20, 20).
- g_shift_limit ((int, int) or int) – Green channel range of change. Default: (-20, 20).
- b_shift_limit ((int, int) or int) – Blue channel range of change. Default: (-20, 20).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/d4fb3754-9c5f-681c-5579-5a7ceceae9ad.png)
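Per pixel, a channel shift is an addition followed by clipping. The sketch below assumes 8-bit values clipped to 0..255. `rgb_shift` is a hypothetical helper, and in the actual transform the three shifts are sampled at random from the limits above.

```python
def rgb_shift(pixel, r_shift, g_shift, b_shift):
    """Add a separate offset to each channel and clip to the 8-bit range."""
    shifts = (r_shift, g_shift, b_shift)
    return tuple(min(255, max(0, v + s)) for v, s in zip(pixel, shifts))


shifted = rgb_shift((250, 10, 100), r_shift=20, g_shift=-20, b_shift=5)
```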
ChannelShuffle
- Randomly reorder the RGB channels
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/f308555c-8663-539f-1874-d8d70d44da3c.png)
ChannelDropout
- Randomly drop channels
- channel_drop_range ((int, int)) – Range of the number of channels to drop.
- fill_value (int, float) – Pixel value used to fill the dropped channel.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/40fc70b4-48c5-50f7-e32f-28108e67422a.png)
Posterize
- Reduce the number of bits in each color channel
- num_bits ((int, int)) – Range of bits. Default: 4.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/aa5f3f6b-20be-ba62-a4f3-2c85df555cc8.png)
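Reducing the number of bits can be illustrated with a bit mask. This sketch assumes 8-bit channels and that the high bits are the ones kept. `posterize` is a hypothetical helper operating on a single value.

```python
def posterize(value, num_bits):
    """Keep only the top `num_bits` bits of an 8-bit value."""
    mask = (0xFF << (8 - num_bits)) & 0xFF
    return value & mask
```

With `num_bits=4` every channel can take only 16 distinct levels, which produces the banding visible in the preview.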
ToGray
- Convert RGB image to grayscale
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/506e9259-4c2c-e245-fec6-cf08f6e884f8.png)
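A common way to convert RGB to grayscale is a weighted sum of the channels. The sketch below uses the widely used BT.601 luma weights. Whether the library uses exactly these weights is an assumption, as is repeating the gray value in all three channels so the output shape stays the same. `to_gray` is a hypothetical helper.

```python
def to_gray(pixel):
    """Weighted sum of R, G, B (BT.601 weights), repeated in three channels."""
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)
```

Green contributes the most because the eye is most sensitive to it, so pure green maps to a brighter gray than pure red or blue.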
ToSepia
- Apply a sepia filter to an RGB image
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/7870cc17-46d2-74b9-573d-82baeb181e1b.png)
Brightness and contrast adjustments
InvertImg
- Invert the image by subtracting each pixel value from 255
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/0808bc0e-f89a-945a-a1a1-5624cc8f4ce8.png)
Normalize
- Divide the pixel value by 255 → subtract the average value for each channel → divide by the standard deviation for each channel
- **The mean and standard deviation are plain parameters; they are not computed from the image internally**
- mean (float, list of float) – Mean. Default: (0.485, 0.456, 0.406).
- std (float, list of float) – Standard deviation. Default: (0.229, 0.224, 0.225).
- max_pixel_value (float) – Maximum pixel value. Default: 255.0.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/b7d1ce48-ba54-4796-0774-f2b3568c3133.png)
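The three steps listed for Normalize (divide by the maximum pixel value, subtract the mean, divide by the standard deviation) can be written out per pixel. This is a minimal sketch with a hypothetical `normalize` helper. The default values are copied from the parameter list above.

```python
def normalize(pixel, mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225),
              max_pixel_value=255.0):
    """(value / max_pixel_value - mean) / std, channel by channel."""
    return tuple((v / max_pixel_value - m) / s
                 for v, m, s in zip(pixel, mean, std))


# With mean = std = 0.5 the 0..255 range is mapped to -1..1.
white = normalize((255, 255, 255), mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
black = normalize((0, 0, 0), mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
```

Because the statistics are passed in rather than measured, using the defaults on a dataset with very different channel statistics will not give zero-mean, unit-variance output.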
RandomGamma
- Randomly apply gamma conversion
- gamma_limit (float or (float, float)) – Range of gamma values. Default: (80, 120).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/9230bd39-798a-1d25-c525-cc998b8ca9d4.png)
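A gamma conversion raises the normalized pixel value to a power. The sketch below assumes the limits are expressed as percentages, so that 100 corresponds to a gamma of 1.0 and leaves the image unchanged. That reading of the default (80, 120) is an assumption, and `apply_gamma` is a hypothetical helper.

```python
def apply_gamma(value, gamma_percent, max_value=255):
    """Apply out = max * (value / max) ** gamma, with gamma given in percent."""
    gamma = gamma_percent / 100.0  # assumption: 100 means gamma = 1.0
    return round(max_value * (value / max_value) ** gamma)
```

A gamma below 1 brightens the mid-tones and a gamma above 1 darkens them, while pure black and pure white stay fixed.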
RandomBrightness
- Randomly change the brightness
- limit ((float, float) or float) – Range of change in brightness. Default: (-0.2, 0.2).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/4a2edcc9-581f-87d9-5882-25b3bc2c786f.png)
RandomContrast
- Randomly change the contrast
- limit ((float, float) or float) – Range of change in contrast. Default: (-0.2, 0.2).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/f1a3b737-3b12-6158-20a9-079f4f812de6.png)
RandomBrightnessContrast
- Randomly change brightness and contrast
- brightness_limit ((float, float) or float) – The range of change in brightness. Default: (-0.2, 0.2).
- contrast_limit ((float, float) or float) – Contrast range of change. Default: (-0.2, 0.2).
- brightness_by_max (Boolean) – If True, the brightness offset is scaled by the maximum value of the image dtype; otherwise it is scaled by the image mean. Default: True.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/82a62803-2bd7-a6bc-ac25-03c4dbc5d206.png)
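One way to picture how the two limits and `brightness_by_max` combine is a linear map: contrast scales the values and brightness adds an offset. The sketch below is written under that assumption and is not taken from the library source. `brightness_contrast` is a hypothetical helper working on a flat list of 8-bit values.

```python
def brightness_contrast(pixels, brightness=0.0, contrast=0.0,
                        brightness_by_max=True, max_value=255):
    """Scale by (1 + contrast), then add an offset proportional to brightness."""
    alpha = 1.0 + contrast
    scaled = [v * alpha for v in pixels]
    # Assumption: the offset is relative to the dtype maximum or the image mean.
    reference = max_value if brightness_by_max else sum(scaled) / len(scaled)
    offset = brightness * reference
    return [min(max_value, max(0, round(v + offset))) for v in scaled]


pixels = [0, 100, 200]
```

With `brightness_by_max=True` the same brightness value shifts every image by the same amount. With False, dark images are shifted less than bright ones.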
CLAHE
- Contrast Limited Adaptive Histogram Equalization
- clip_limit (float or (float, float)) – The upper threshold of the contrast limit. Default: (1, 4).
- tile_grid_size ((int, int)) – Grid size for histogram equalization. Default: (8, 8).
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/d6707309-1d38-bb39-1d3f-b4ecb1b428d0.png)
Solarize
- Invert pixel values above the threshold (solarization)
- threshold ((int, int) or int, or (float, float) or float) – Inversion threshold. Default: 128.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/c5cb2b6c-60c3-2cb5-a900-7aaa693d5a74.png)
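Solarization can be sketched as a conditional inversion. This example assumes 8-bit values and treats a value equal to the threshold as "above" it, which is an assumption. `solarize` is a hypothetical helper.

```python
def solarize(pixels, threshold=128, max_value=255):
    """Invert every value at or above the threshold, leave the rest untouched."""
    return [max_value - v if v >= threshold else v for v in pixels]


result = solarize([0, 127, 128, 255])
```

Dark regions are unchanged while bright regions turn dark, which gives the partially inverted look in the preview.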
Dropout
Cutout
- Coarse dropout of rectangular regions
- num_holes (int) – Number of regions to drop to zero. Default: 8.
- max_h_size (int) – Maximum height of the area. Default: 8.
- max_w_size (int) – Maximum width of the area. Default: 8.
- fill_value (int, float, list of int, list of float) – Pixel value of the dropped area. Default: 0.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/23d488f4-9ddc-3ba3-a298-760411316b1e.png)
CoarseDropout
- Coarse dropout of rectangular regions (minimum values can also be specified)
- max_holes (int) – Maximum number of regions to drop to zero.
- max_height (int) – Maximum height of the area. Default: 8.
- max_width (int) – Maximum width of the area. Default: 8.
- min_holes (int) – The minimum number of regions to drop to zero. Default: None.
- min_height (int) – Minimum height of the area. Default: None.
- min_width (int) – The minimum width of the area. Default: None.
- fill_value (int, float, list of int, list of float) – Pixel value of the dropped area. Default: 0.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/d6071c76-b3fb-bd94-1b35-56d4f502066a.png)
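The rectangular dropout described above can be sketched as "pick a few random rectangles and overwrite them". The helper below is hypothetical and simplified: it works on a 2D list, and the way hole counts and sizes are sampled (uniformly from 1 up to the maximum) is an assumption.

```python
import random


def coarse_dropout(image, max_holes=8, max_height=8, max_width=8,
                   fill_value=0, rng=None):
    """Overwrite up to `max_holes` random rectangles with `fill_value`."""
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]  # leave the input untouched
    for _ in range(rng.randint(1, max_holes)):
        hole_h = rng.randint(1, min(max_height, h))
        hole_w = rng.randint(1, min(max_width, w))
        y0 = rng.randint(0, h - hole_h)
        x0 = rng.randint(0, w - hole_w)
        for y in range(y0, y0 + hole_h):
            for x in range(x0, x0 + hole_w):
                out[y][x] = fill_value
    return out


image = [[255] * 16 for _ in range(16)]
dropped = coarse_dropout(image, rng=random.Random(0))
```

Cutout corresponds to the special case where the number of holes is fixed instead of sampled.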
Weather, environment, and optical effects
RandomSnow
- Simulate snow
- snow_point_lower (float) – The lower limit for snow. Default: 0.1.
- snow_point_upper (float) – Upper limit for snow. Default: 0.3.
- brightness_coeff (float) – Higher values result in more snow. Should be >= 0. Default: 2.5.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/1ff27889-ce9b-05b0-0713-f9b66584fa1a.png)
RandomRain
- Rain effect
- slant_lower – Lower limit of the rain slant. Default: -10.
- slant_upper – Upper limit of the rain slant. Default: 10.
- drop_length – The length of the rain. Default: 20.
- drop_width – The width of the rain. Default: 1.
- drop_color (list of (r, g, b)) – The color of the rain line. Default: (200, 200, 200).
- blur_value (int) – A raindrop blur. Default: 7.
- brightness_coefficient (float) – Brightness. Default: 0.7.
- rain_type – The type of rain. One of [None, “drizzle”, “heavy”, “torrential”]. Default: None.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/8ae98cd6-e913-b6fd-f6ae-0a07d0fc375b.png)
RandomFog
- Simulate fog
- fog_coef_lower (float) – Lower limit of fog intensity.
- fog_coef_upper (float) – Upper limit of fog intensity.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/ec44172f-5e32-429d-1d24-23a42600b521.png)
RandomSunFlare
- Simulate sun flare
- flare_roi (float, float, float, float) – The area where flares appear (x_min, y_min, x_max, y_max).
- angle_lower (float) – Lower limit of angle.
- angle_upper (float) – Upper limit of angle.
- num_flare_circles_lower (int) – Lower limit for the number of flares.
- num_flare_circles_upper (int) – Upper limit for the number of flares.
- src_radius (int) – Flare radius.
- src_color ((int, int, int)) – Flare color.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/b7f620f4-17b3-806d-64e2-dcc2d5841797.png)
RandomShadow
- Simulate shadows
- shadow_roi (float, float, float, float) – The area where the shadow appears (x_min, y_min, x_max, y_max).
- num_shadows_lower (int) – Lower limit for the number of shadows.
- num_shadows_upper (int) – Upper limit for the number of shadows.
- shadow_dimension (int) – The number of sides of the shadow polygon.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/a75d1c83-a2d3-0387-049a-be8bfc231240.png)
Other
FancyPCA
- Augmentation with Fancy PCA
- alpha (float) – How much to scale the eigenvectors and eigenvalues.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/94d0276f-e6ac-bd8d-ddfa-fab2b972ba41.png)
PadIfNeeded
- Pad the edges of the image if it is smaller than the desired size
- min_height (int) – Minimum image height. Default: 1024.
- min_width (int) – Minimum image width. Default: 1024.
![image.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/215987/6f90d6b8-1484-c231-65a5-344d673a7d2e.png)
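The "pad only if needed" logic amounts to computing how many rows and columns are missing and distributing them around the image. The sketch below uses a hypothetical `pad_if_needed` helper on a 2D list. Centring the image and filling with a constant value are assumptions made for simplicity, and the library may use a different border mode by default.

```python
def pad_if_needed(image, min_height, min_width, fill_value=0):
    """Pad so the result is at least min_height x min_width."""
    h, w = len(image), len(image[0])
    pad_h, pad_w = max(0, min_height - h), max(0, min_width - w)
    top, left = pad_h // 2, pad_w // 2
    bottom, right = pad_h - top, pad_w - left
    new_w = w + pad_w
    padded = [[fill_value] * new_w for _ in range(top)]
    padded += [[fill_value] * left + list(row) + [fill_value] * right
               for row in image]
    padded += [[fill_value] * new_w for _ in range(bottom)]
    return padded


small = [[1, 2],
         [3, 4]]
padded = pad_if_needed(small, min_height=4, min_width=3)
```

An image that already meets both minimums passes through unchanged, which makes the transform safe to place in front of a fixed-size crop.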