[PYTHON] I tried using ESPCN

Introduction

This article is a rewrite for Qiita of [this article](https://aiotadiary.wp.xdomain.jp/2020/02/27/espcn%e3%82%92%e7%94%a8%e3%81%84%e3%81%9f%e8%b6%85%e8%a7%a3%e5%83%8f%e3%82%92%e3%82%84%e3%81%a3%e3%81%a6%e3%81%bf%e3%81%9f/).

Super resolution

In a nutshell, super-resolution means "increasing the resolution of an image." Hearing that, you might think it is enough to just stretch the image, but once an image has been reduced, some of the information in the original is inevitably lost.

Let's do a simple check. First, reduce a suitable image to half its size. Then enlarge it with the bilinear method so that it is the same size as the original. The result is the image below.

image.png

The image on the left is the original, and the image on the right is the one that was reduced once and then enlarged.

It is hard to tell at a glance, but even though the resolution is the same, the enlarged image is blurry and small features have disappeared. The task of super-resolution is not just to increase the resolution, but to produce a high-resolution image while restoring those small features.
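The check above can be reproduced at small scale without any image library. The following is only a minimal sketch: the article does not say how the image was reduced, so the 2x2 averaging in `downscale_half` is an assumption, and the 4x4 checkerboard is a hypothetical worst case chosen because it contains nothing but fine detail. Both function names are made up for this illustration.

```python
def downscale_half(img):
    """Halve width and height by averaging each 2x2 block (assumes even sizes)."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def upscale_bilinear(img, out_h, out_w):
    """Enlarge to (out_h, out_w) with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for oy in range(out_h):
        # Map the output pixel center back into input coordinates, clamped to the image.
        sy = min(max((oy + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for ox in range(out_w):
            sx = min(max((ox + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Blend horizontally on the two neighboring rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# A 4x4 checkerboard of 0s and 1s: the finest detail a 4x4 image can hold.
original = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
restored = upscale_bilinear(downscale_half(original), 4, 4)

# Mean absolute error between the original and the round-tripped image.
error = sum(
    abs(o - r)
    for row_o, row_r in zip(original, restored)
    for o, r in zip(row_o, row_r)
) / 16

print(restored[0])  # [0.5, 0.5, 0.5, 0.5]
print(error)        # 0.5
```

Every 2x2 block of the checkerboard averages to 0.5, so the reduced image is uniformly gray and no interpolation can bring the pattern back. Real photographs lose less than this extreme case, but the mechanism is the same.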

About ESPCN

ESPCN (Efficient Sub-Pixel Convolutional Neural Network) is a super-resolution model based on a deep neural network, published in 2016.

A similar model is SRCNN (Super-Resolution Convolutional Neural Network), published in 2015. It applies a neural network to the super-resolution task and converts the input into a high-resolution image with three convolution layers.

What sets ESPCN apart is how it raises the resolution. Upsampling inside a network is usually done with an operation called deconvolution (transposed convolution), but this operation may generate grid-like noise. To avoid this, ESPCN raises the resolution with an operation called Pixel Shuffle. For example, to double the size of an image, the network produces four images (channels) just before the output and interleaves them in a fixed positional arrangement to form the output.

image.png
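The rearrangement described above can be sketched in plain Python for a single output channel. This is an illustration only, not the code used in this article: the function name `pixel_shuffle` and the nested-list representation are assumptions made here, and the channel ordering is chosen to follow the one used by `torch.nn.PixelShuffle`.

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r feature maps of size H x W into one (H*r) x (W*r) image.

    channels[k][y][x] is the k-th feature map. The output pixel at
    (y*r + dy, x*r + dx) is taken from channel dy*r + dx.
    """
    if len(channels) != r * r:
        raise ValueError("expected r*r input channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0] * (w * r) for _ in range(h * r)]
    for k, fmap in enumerate(channels):
        # Each channel fills one fixed position inside every r x r output cell.
        dy, dx = divmod(k, r)
        for y in range(h):
            for x in range(w):
                out[y * r + dy][x * r + dx] = fmap[y][x]
    return out


# Four 2x2 maps, each filled with its own channel index, upscaled by r=2.
channels = [[[k] * 2 for _ in range(2)] for k in range(4)]
for row in pixel_shuffle(channels, 2):
    print(row)
# [0, 1, 0, 1]
# [2, 3, 2, 3]
# [0, 1, 0, 1]
# [2, 3, 2, 3]
```

Each 2x2 cell of the output contains one pixel from each of the four channels, always in the same position. The operation only moves values around and has no weights of its own; the convolution layers before it are what learn to put the right value in each channel.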

Implementation and results

I trained the model using the code from here and images collected by scraping. About 10,000 images were used for training.

image.png

The image on the left is the original, the image in the center was reduced once and then enlarged, and the image on the right was produced with ESPCN.

Compared with the original image, the details look slightly unnatural, but most of them have been restored. With this, it should also be possible to increase the resolution of videos.

Conclusion

This time, I tried super-resolution using ESPCN. It would be interesting to build an API around a model like this. The model could also be used to increase the resolution of images generated by a GAN and the like.

The following are the sites and papers that I referred to when creating this article.

[Super-resolution images: ESPCN pytorch implementation / learning](https://nykergoto.hatenablog.jp/entry/2019/05/28/%E7%94%BB%E5%83%8F%E3%81%AE%E8%B6%85%E8%A7%A3%E5%83%8F%E5%BA%A6%E5%8C%96%3A_ESPCN_%E3%81%AE_pytorch_%E5%AE%9F%E8%A3%85_/_%E5%AD%A6%E7%BF%92)
Image Super-Resolution Using Deep Convolutional Networks (2015)
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network (2016)
