Source: http://mscoco.org/
Microsoft COCO is a dataset for image recognition, segmentation, and captioning.
When you want to try image processing, the first obstacle is usually getting a dataset. Microsoft COCO provides the data you need, along with APIs for Python and Matlab, so it is easy to get started.
First, download the dataset from the following page.
http://mscoco.org/dataset/#download
The download page will appear. The image data is large, so if you just want to see what kind of analysis is possible, I recommend downloading only the validation data.
MS COCO API
This is the API for analyzing the data above. Once you have downloaded the data, you can analyze it freely. I mostly use Python, so I set up a Python environment and tried it.
https://github.com/pdollar/coco
# API Download
git clone https://github.com/pdollar/coco
# API install
cd coco/PythonAPI
python setup.py install
Create the following directories in the coco folder and store the annotation and image data in them.
annotations images
The coco folder then contains the following.
MatlabAPI README.txt images results
PythonAPI annotations license.txt
The PythonAPI folder contains the following. The demos can be run with IPython Notebook.
Makefile pycocoDemo.ipynb pycocotools
build pycocoEvalDemo.ipynb setup.py
For an introduction to IPython Notebook, see the following.
http://qiita.com/icoxfog417/items/175f69d06f4e590face9
Now let's start the analysis.
I will pick out and explain only the parts that seem important.
Set dataType to one of three values: train2014, val2014, or test2014.
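dataDir='..'
dataType='val2014'
annFile='%s/annotations/instances_%s.json'%(dataDir,dataType)
# assumption: images are stored under <dataDir>/images/<dataType>, matching the folder layout above
dataDirImage='%s/images'%dataDir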
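As a small illustration of how these file names are put together, the sketch below builds the annotation path for a given split. The helper name annotation_path and its kind parameter are made up for this example; it assumes the annotation files follow the <kind>_<dataType>.json naming seen in annFile.

```python
def annotation_path(data_dir, data_type, kind='instances'):
    """Build the path of an annotation file under <data_dir>/annotations."""
    return '{}/annotations/{}_{}.json'.format(data_dir, kind, data_type)

# Instance annotations for the training and validation splits
for data_type in ['train2014', 'val2014']:
    print(annotation_path('..', data_type))

# Caption annotations for the validation split
print(annotation_path('..', 'val2014', kind='captions'))
```

Keeping the path logic in one place makes it easy to switch between instance and caption annotations without retyping the format string.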
The code below lists the categories and supercategories in the data. A supercategory is a broader concept that groups several categories. For example, the categories dog and cat belong to the supercategory animal.
from pycocotools.coco import COCO
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt
# initialize COCO api for instance annotations
coco=COCO(annFile)
# display COCO categories and supercategories
cats = coco.loadCats(coco.getCatIds())
nms=[cat['name'] for cat in cats]
print('COCO categories: \n{}\n'.format(' '.join(nms)))
nms = set([cat['supercategory'] for cat in cats])
print('COCO supercategories: \n{}'.format(' '.join(nms)))
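To make the relationship between categories and supercategories concrete, here is a minimal sketch that groups category names under their supercategory. It assumes records shaped like the dicts returned by coco.loadCats (with 'id', 'name', and 'supercategory' keys). The helper name group_by_supercategory and the sample records are hand-written for illustration and are not read from the dataset.

```python
from collections import defaultdict

def group_by_supercategory(cats):
    """Map each supercategory to the sorted names of its categories."""
    groups = defaultdict(list)
    for cat in cats:
        groups[cat['supercategory']].append(cat['name'])
    return {sup: sorted(names) for sup, names in groups.items()}

# Hypothetical sample records, written by hand for this example
sample_cats = [
    {'id': 17, 'name': 'cat', 'supercategory': 'animal'},
    {'id': 18, 'name': 'dog', 'supercategory': 'animal'},
    {'id': 3, 'name': 'car', 'supercategory': 'vehicle'},
]

grouped = group_by_supercategory(sample_cats)
for sup in sorted(grouped):
    print(sup, '->', ', '.join(grouped[sup]))
```

With the real data, passing the cats list from the code above instead of sample_cats should work the same way, as long as the records have these keys.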
The code below chooses which category of data to work with. Here, "dog" is selected, and one image is picked at random from the images in the "dog" category.
# get all images containing given categories, select one at random
catIds = coco.getCatIds(catNms=['dog'])
imgIds = coco.getImgIds(catIds=catIds)
img = coco.loadImgs(imgIds[np.random.randint(0,len(imgIds))])[0]
The code below loads and displays the image.
# load and display image
I = io.imread('%s/%s/%s'%(dataDirImage,dataType,img['file_name']))
plt.figure()
plt.imshow(I)
This is the displayed image. Very soothing, lol.
Let's analyze this soothing image.
# load and display instance annotations
plt.imshow(I)
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)
You can see that the dog region in the image is highlighted in green.
The filtering happens in the coco.getAnnIds call. Here, `imgIds=img['id']` restricts the annotations to the image that was loaded, and `catIds=catIds` restricts them to the selected category. The docstring of this function, quoted below, describes each filter.
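The filter behavior described here can be sketched in plain Python. This is a simplified re-implementation for illustration only, not the library's code: the function name filter_ann_ids, the sample annotations, and the choice to treat the area range as exclusive at both ends are assumptions of this sketch. The key idea is that an empty or None filter is skipped, so only the conditions you pass narrow the result.

```python
def filter_ann_ids(anns, img_ids=(), cat_ids=(), area_rng=(), iscrowd=None):
    """Return ids of annotations that pass every given filter.

    An empty sequence (or None for iscrowd) means that filter is skipped.
    """
    ids = []
    for ann in anns:
        if img_ids and ann['image_id'] not in img_ids:
            continue
        if cat_ids and ann['category_id'] not in cat_ids:
            continue
        # Assumption of this sketch: the area must lie strictly inside the range
        if area_rng and not (area_rng[0] < ann['area'] < area_rng[1]):
            continue
        if iscrowd is not None and ann['iscrowd'] != iscrowd:
            continue
        ids.append(ann['id'])
    return ids

# Hypothetical annotations, written by hand for this example
sample_anns = [
    {'id': 1, 'image_id': 10, 'category_id': 18, 'area': 500.0, 'iscrowd': 0},
    {'id': 2, 'image_id': 10, 'category_id': 1, 'area': 1200.0, 'iscrowd': 0},
    {'id': 3, 'image_id': 11, 'category_id': 18, 'area': 80.0, 'iscrowd': 1},
]

print(filter_ann_ids(sample_anns, img_ids=[10], cat_ids=[18]))  # [1]
print(filter_ann_ids(sample_anns, cat_ids=[18]))                # [1, 3]
```

Passing both imgIds and catIds, as in the article's code, keeps only the annotations of the chosen category that belong to the chosen image.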
"""
Get ann ids that satisfy given filter conditions. default skips that filter
:param imgIds (int array) : get anns for given imgs
catIds (int array) : get anns for given cats
areaRng (float array) : get anns for given area range (e.g. [0 inf])
iscrowd (boolean) : get anns for given crowd label (False or True)
:return: ids (int array) : integer array of ann ids
"""
Next, let's look at the captions of the image. An image caption is a sentence describing the image.
Captions for a specific image are retrieved with caps.getAnnIds.
The rest of the code displays the retrieved captions along with the image.
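# initialize COCO api for caption annotations
annFile = '%s/annotations/captions_%s.json'%(dataDir,dataType)
caps = COCO(annFile)
# load and display caption annotations
annIds = caps.getAnnIds(imgIds=img['id'])
anns = caps.loadAnns(annIds)
caps.showAnns(anns)
plt.imshow(I)
plt.show()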
I was able to confirm that there are multiple captions for the image. Different people describe the same image differently, so having multiple captions is helpful.
When you want to analyze images, it is easy to run into a lack of data. Microsoft provides a dataset and tools that make this kind of analysis easy to start. I am also watching what Microsoft, which has become more open with projects such as Visual Studio Code, does next.
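As a rough sketch of what "multiple captions per image" looks like in data terms, the code below groups caption strings by image id. It assumes caption annotations are dicts with 'id', 'image_id', and 'caption' keys, like those returned by caps.loadAnns. The helper name captions_by_image and the sample captions are invented for this example.

```python
from collections import defaultdict

def captions_by_image(caption_anns):
    """Collect the caption strings belonging to each image id."""
    grouped = defaultdict(list)
    for ann in caption_anns:
        grouped[ann['image_id']].append(ann['caption'].strip())
    return dict(grouped)

# Hypothetical caption annotations, written by hand for this example
sample_caption_anns = [
    {'id': 101, 'image_id': 10, 'caption': 'A dog lying on a couch.'},
    {'id': 102, 'image_id': 10, 'caption': 'A brown dog rests on a sofa. '},
    {'id': 103, 'image_id': 11, 'caption': 'Two people riding bicycles.'},
]

grouped = captions_by_image(sample_caption_anns)
for image_id, captions in sorted(grouped.items()):
    print(image_id, len(captions), captions)
```

Grouping like this makes it easy to compare how different annotators described the same image.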
RuntimeError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are Working with Matplotlib in a virtual enviroment see 'Working with Matplotlib in Virtual environments' in the Matplotlib FAQ
If you get the above error, add the following line to ~/.matplotlib/matplotlibrc
backend : TkAgg