The first step in building a machine learning model that recognizes objects in images is to collect a large number of training images. Images of common subjects such as dogs and cars can be downloaded from datasets such as ImageNet, but those datasets do not include images of Japanese celebrities, for example. In this article, I introduce how to collect image data for machine learning using the Tumblr API.
Click here for the version that uses the Google Custom Search API (Pikachu).
Click Register App
Next, enter the application information. A URL (application website, App Store URL, or Google Play Store URL) is required, but since we are not actually creating an OAuth application, any suitable URL will do. (This time, I used the URL of an app I made a long time ago.)
Then a screen like this is displayed, so click Explore API.
Grant permission when asked.
The screen then looks like this. Click Show Keys in the upper right.
The API key is displayed here. This is what we need, so make a note of it.
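If you would rather not paste the key directly into a script, one option is to keep it in an environment variable and read it at run time. The following is only a minimal sketch: the helper name `build_payload` and the variable name `TUMBLR_API_KEY` are my own assumptions, not something Tumblr defines. The returned dictionary uses the same `api_key` and `tag` parameters as the script in this article.

```python
import os


def build_payload(tag, env_var='TUMBLR_API_KEY'):
    """Return query parameters for a tag search, reading the key from the environment.

    Both this helper and the default variable name are hypothetical.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending a request without a key.
        raise RuntimeError('Set the ' + env_var + ' environment variable to your API key')
    return {'api_key': api_key, 'tag': tag}
```

After setting the variable in your shell, `build_payload('Yoshioka Riho')` would give a dictionary that can be passed as `params` to `requests.get`.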
Now, let's actually fetch images using the API key we obtained. Tumblr has many photo posts, so it does not seem well suited to collecting images of characters such as Pikachu. So this time I will collect pictures of Riho Yoshioka, who has been popular recently. The downloaded images are saved in a directory called images. (Reference: http://taka-say.hateblo.jp/entry/2016/12/19/235554)
import os
import shutil

import requests

LOOP = 10  # maximum number of result pages to request
URL = 'https://api.tumblr.com/v2/tagged'
payload = {
    'api_key': 'YOUR API KEY HERE',
    'tag': 'Yoshioka Riho'
}

# Step 1: collect the URLs of the photos in posts that carry the tag.
photo_urls = []
for i in range(LOOP):
    response_json = requests.get(URL, params=payload).json()
    posts = response_json['response']
    if len(posts) == 0:
        # No more posts; repeating the same request would not help.
        break
    for data in posts:
        if data['type'] != 'photo':
            continue
        for photo in data['photos']:
            photo_urls.append(photo['original_size']['url'])
    # Next time, ask for posts older than the last one on this page.
    payload['before'] = posts[-1]['timestamp']

# Step 2: download each photo into the images directory.
os.makedirs('images', exist_ok=True)
image_idx = 0
for photo_url in photo_urls:
    path = "images/" + str(image_idx) + ".png"
    r = requests.get(photo_url, stream=True)
    if r.status_code == 200:
        with open(path, 'wb') as f:
            # Let urllib3 undo gzip/deflate so the raw bytes are the image itself.
            r.raw.decode_content = True
            shutil.copyfileobj(r.raw, f)
        image_idx += 1
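The script above saves every file with a .png extension, even though the photo URLs may point to JPEG or GIF files. As a possible refinement, the extension could be taken from the URL itself. This is only a sketch: the helper name `image_path`, the list of accepted extensions, and the `.jpg` fallback are all assumptions I made for illustration.

```python
import os
from urllib.parse import urlparse

# Extensions accepted as-is; anything else falls back to DEFAULT_EXT.
# Both values are assumptions chosen for this sketch.
KNOWN_EXTS = {'.jpg', '.jpeg', '.png', '.gif', '.webp'}
DEFAULT_EXT = '.jpg'


def image_path(photo_url, image_idx, directory='images'):
    """Build a local file path that keeps the extension found in the photo URL."""
    # Use only the path component so a query string does not confuse splitext.
    ext = os.path.splitext(urlparse(photo_url).path)[1].lower()
    if ext not in KNOWN_EXTS:
        ext = DEFAULT_EXT
    return os.path.join(directory, str(image_idx) + ext)
```

In the download loop, `path = image_path(photo_url, image_idx)` could then replace the line that builds the path by hand.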
Yes, I got a lot of images like this. So cute!