This article is [here](https://aiotadiary.wp.xdomain.jp/2020/02/27/espcn%e3%82%92%e7%94%a8%e3%81%84%e3%81% 9f% e8% b6% 85% e8% a7% a3% e5% 83% 8f% e3% 82% 92% e3% 82% 84% e3% 81% a3% e3% 81% a6% e3% 81% bf% This is a rewrite of the e3% 81% 9f /) article for qiita.
In a nutshell, super-resolution means "increasing the resolution of an image." When you hear that, you may think that you just stretch the image, but when you reduce the image, the information of the original image is inevitably lost. Let's do a simple verification. First, reduce the size of the appropriate image to half. Then use the bilinear method to enlarge the image so that it is the same size as the original image. The result is the image below. The image on the left is the image below, and the image on the right is the image that was once reduced and then enlarged.
It's hard to understand at a glance, but even if the resolution is the same, it becomes blurry and small features disappear. The task of super-resolution is not just to increase the resolution, but to convert it into a high-resolution image while compensating for small features.
ESPCN (efficient sub-pixel convolutional neural network) is a deep neural network-based super-resolution model announced in 2016.
A similar model is the SRCNN (Super-Resolution Convolutional Neural Network) announced in 2015. This is a model that uses a neural network for super-resolution tasks and transforms it into a high-resolution image with three convolution layers.
As a special point of ESPCN, when increasing the resolution, an operation called deconvolution is usually added, but this operation may generate grid noise. To overcome this, ESPCN raises the resolution by an operation called Pixel Shuffle. For example, if you want to double the size of an image, create four images just before output and combine the four images in a fixed positional relationship for output.
I learned using the code of here and the image obtained by scraping. By the way, about 10,000 images were used for learning. The image on the left is the lower image, the image in the center is enlarged after being reduced once, and the image on the right is using ESPCN.
There is a slight sense of incongruity in the details compared to the original image, but most of it has been restored. If you use this, it seems that you can also increase the resolution of videos.
This time, I tried super-resolution using ESPCN. It seems interesting to try creating APIs using such a model. In addition, it seems that you can also use this model to increase the resolution of images generated using GAN etc.
The following are the sites and papers that I referred to when creating this article.
[Super-resolution images: ESPCN pytorch implementation / learning](https://nykergoto.hatenablog.jp/entry/2019/05/28/%E7%94%BB%E5%83%8F%E3%81% AE% E8% B6% 85% E8% A7% A3% E5% 83% 8F% E5% BA% A6% E5% 8C% 96% 3A_ESPCN_% E3% 81% AE_pytorch_% E5% AE% 9F% E8% A3% 85_ / _% E5% AD% A6% E7% BF% 92) Image Super-Resolution Using Deep Convolutional Networks(2015) Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network(2016)
Recommended Posts