I used the following article as a reference: https://qiita.com/taedookim/items/63759e79426514c8a729
This uses google-images-download. The original repository appears to be no longer updated, so clone the fork instead.
git clone https://github.com/Joeclinton1/google-images-download.git
Open the following file in the folder you cloned into: <download folder>\google-images-download\google_images_download\google_images_download.py
Change line 935 of google_images_download.py as follows:
Before: json_file = json.load(open(arguments['config_file']))
After: json_file = json.load(open(arguments['config_file'], encoding='utf-8'))
Now you can search in Japanese.
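The change matters because `open()` without an `encoding` argument typically falls back to the platform's locale encoding (on Japanese Windows this is often cp932), so a config.json saved as UTF-8 can fail to decode or come out garbled. The following is a minimal sketch of the same pattern in isolation. `load_config` is a hypothetical helper for illustration, not part of google-images-download, and the Japanese keyword is only sample data.

```python
import json
import os
import tempfile


def load_config(path):
    """Load a JSON config file, always decoding it as UTF-8."""
    # The with-block also closes the file, unlike a bare open() call.
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Write a small UTF-8 config containing a Japanese keyword, then read it back.
sample = {"Records": [{"keywords": "りんご", "limit": 5}]}
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(sample, f, ensure_ascii=False)
    loaded = load_config(config_path)

print(loaded["Records"][0]["keywords"])
```

Passing the encoding explicitly makes the result independent of the machine's locale settings.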
image_scraper.py
import os
from google_images_download import googleimagesdownload  # import the library
# Change the current directory to the folder containing this script,
# so that the relative path "config.json" resolves next to it
os.chdir(os.path.dirname(os.path.abspath(__file__)))
print('Changed current working directory')
response = googleimagesdownload()  # instantiate the downloader class
paths = response.download({"config_file": "config.json"})  # pass the arguments to download()
print(paths)  # print the absolute paths of the downloaded images
Download ChromeDriver from the following page: https://chromedriver.chromium.org/downloads
Once this is done, the tool is ready to use. A single config file can apparently hold multiple search queries.
Renaming is unnecessary, but sample will be inappropriate.
config.json
{
"Records": [
{
"keywords": "apple",
"limit": 5,
"color": "green",
"print_urls": true
},
{
"keywords": "universe",
"limit": 15,
"size": "large",
"print_urls": true
}
]
}
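Since each entry under "Records" is an independent search query, a config like the one above can also be generated from Python instead of being written by hand. This is a rough sketch only. `build_config` is a hypothetical helper, not part of the library, and the option values are the ones from the sample above.

```python
import json


def build_config(queries):
    """Wrap a list of per-query option dicts in the "Records" structure."""
    return {"Records": [dict(q) for q in queries]}


queries = [
    {"keywords": "apple", "limit": 5, "color": "green", "print_urls": True},
    {"keywords": "universe", "limit": 15, "size": "large", "print_urls": True},
]

# ensure_ascii=False keeps non-ASCII keywords (e.g. Japanese) readable in the file.
config_text = json.dumps(build_config(queries), ensure_ascii=False, indent=2)
print(config_text)
```

To save the result, write `config_text` to config.json with `open("config.json", "w", encoding="utf-8")` so that it matches the UTF-8 decoding used when the file is read.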
Run image_scraper.py from VS Code or from the console.
Details (English) https://google-images-download.readthedocs.io/en/latest/arguments.html
Follow the structure below. If "limit" is 100 or more, the "chromedriver" setting is required.
{
"Records": [
{
"keywords": "hoge",
"limit": 777,
"format": "png",
"print_urls": true,
"chromedriver": "chromedriver.exe"
}
]
}
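The chromedriver rule is easy to overlook when a config holds many records, so it can help to check it before starting a long download. Below is a minimal sketch of such a check. `records_missing_chromedriver` and the threshold constant are assumptions of this example (the threshold follows the note above), and "fuga" is just a placeholder keyword.

```python
# Threshold taken from the note above; an assumption of this sketch.
CHROMEDRIVER_THRESHOLD = 100


def records_missing_chromedriver(config):
    """Return the keywords of records whose limit needs chromedriver but lacks it."""
    missing = []
    for record in config.get("Records", []):
        limit = int(record.get("limit", 0))
        if limit >= CHROMEDRIVER_THRESHOLD and not record.get("chromedriver"):
            missing.append(record.get("keywords"))
    return missing


config = {
    "Records": [
        {"keywords": "hoge", "limit": 777, "chromedriver": "chromedriver.exe"},
        {"keywords": "fuga", "limit": 300},  # large limit, no chromedriver set
        {"keywords": "apple", "limit": 5},   # small limit, nothing required
    ]
}
print(records_missing_chromedriver(config))  # → ['fuga']
```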
Setting | Key | Value |
---|---|---|
Maximum number of images | "limit" | Integer, such as 200 |
Image format | "format" | "jpg", "gif", "png", "bmp", "svg", "webp", "ico", "raw" |
Related images | "related_images" | true, false |
Size | "size" | "large", "medium", "icon", ">400*300", ">640*480", ">800*600", ">1024*768", ">2MP", ">4MP", ">6MP", ">8MP", ">10MP", ">12MP", ">15MP", ">20MP", ">40MP", ">70MP" |
Aspect ratio | "aspect_ratio" | "tall", "square", "wide", "panoramic" |
Color | "color" | "red", "orange", "yellow", "green", "teal", "blue", "purple", "pink", "white", "gray", "black", "brown" |
Color type | "color_type" | "full-color", "black-and-white", "transparent" |
Type | "type" | "face", "photo", "clip-art", "line-drawing", "animated" |
Time | "time" | "past-24-hours", "past-7-days", "past-month", "past-year" |
Time range | "time_range" | '{"time_min":"MM/DD/YYYY","time_max":"MM/DD/YYYY"}' |
License | "usage_rights" | "labeled-for-reuse-with-modifications", "labeled-for-reuse", "labeled-for-noncommercial-reuse-with-modification", "labeled-for-nocommercial-reuse" |
Console output | "print_urls" | true, false |
chromedriver | "chromedriver" | "chromedriver.exe" |
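Of the settings in the table, "time_range" is the easiest to mistype, because its value is itself a small JSON object with dates in MM/DD/YYYY form. The sketch below builds that string from `datetime.date` values. `make_time_range` is a hypothetical helper, and it assumes the library expects the value as a string containing JSON, as the quoting in the table suggests.

```python
import json
from datetime import date


def make_time_range(start, end):
    """Build a time_range string with MM/DD/YYYY dates."""
    if start > end:
        raise ValueError("start must not be after end")
    return json.dumps(
        {
            "time_min": start.strftime("%m/%d/%Y"),
            "time_max": end.strftime("%m/%d/%Y"),
        },
        separators=(",", ":"),  # compact form, no spaces
    )


time_range = make_time_range(date(2020, 1, 5), date(2020, 12, 31))
print(time_range)  # → {"time_min":"01/05/2020","time_max":"12/31/2020"}
```

When this string is stored under "time_range" in config.json, writing the file with `json.dump` escapes the inner double quotes automatically.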