[PYTHON] Download images using requests

This is just a personal note. It builds a small Python program for quickly downloading images with a library called requests. `urllib.request` looks useful in Python 3, but it does not seem to be available in Python 2 (I have not researched this thoroughly), so I used requests instead. requests lets you configure things such as cookies, but here I only write a simple program that accesses each URL and downloads it.

Official: python-requests

Installation

$ pip install requests

Try it out

$ python
>>> import requests
>>> url = "http://docs.python-requests.org/en/master/#"
>>> res = requests.get(url)
>>> res.status_code
200
>>> res.headers["content-type"]
'text/html'
>>> res.content
'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.org/1999/xhtml">\n  <head>\n...
>>> res.text  
u'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html xmlns="http://www.w3.org/1999/xhtml">\n  <head>\n ...

How to use (excerpt)

See The User Guide for more information.

1. How to send a request

To add URL query parameters, pass them as a dictionary to the params argument.

res = requests.get('http://httpbin.org/get', params={'key':'value'})

print(res.url)  #=> http://httpbin.org/get?key=value
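
To see how `params` ends up in the URL without sending anything over the network, the request can be prepared locally. This is only a minimal sketch, assuming Python 3 and a reasonably recent requests release; the parameter names `key` and `tag` are made up for illustration.

```python
import requests

# Build the request locally; nothing is sent over the network here.
req = requests.Request(
    "GET",
    "http://httpbin.org/get",
    params={"key": "value", "tag": ["a", "b"]},  # hypothetical parameters
)
prepared = req.prepare()

# A list value is expanded into one key=value pair per element.
print(prepared.url)
```

Preparing a request this way is handy for checking the final URL before actually calling `requests.get`.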

For post and put, form data can be sent with the data argument.

res = requests.post('http://httpbin.org/post', data = {'key':'value'})
res = requests.put('http://httpbin.org/put', data = {'key':'value'})
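
What the `data` argument produces can also be inspected locally. The sketch below assumes Python 3 and prepares a POST request without sending it, then prints the header and body that requests generated from the dictionary.

```python
import requests

# Prepare (but do not send) a POST request carrying form data.
prepared = requests.Request(
    "POST",
    "http://httpbin.org/post",
    data={"key": "value"},
).prepare()

# Header that requests sets for a dictionary passed as data
print(prepared.headers["Content-Type"])

# The form-encoded request body
print(prepared.body)
```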

A function is provided for each type of HTTP request.

res = requests.get('http://httpbin.org/get')
res = requests.post('http://httpbin.org/post', data = {'key':'value'})
res = requests.put('http://httpbin.org/put', data = {'key':'value'})
res = requests.delete('http://httpbin.org/delete')
res = requests.head('http://httpbin.org/get')
res = requests.options('http://httpbin.org/get')

2. Response processing

The response object exposes the following attributes.

res = requests.get('http://httpbin.org/get')

# HTTP status code
print(res.status_code)

# Content-Type response header
print(res.headers["content-type"])

# Response body as raw bytes
print(res.content)

# Response body decoded to text, and the encoding used
print(res.text)
print(res.encoding)
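
The attributes above can be gathered in one place for quick inspection. The helper below is a hypothetical sketch, not part of requests: the name `summarize_response` and the dictionary keys are my own choices, and it only assumes that `res` exposes `status_code`, `headers`, `encoding`, and `content`.

```python
def summarize_response(res):
    """Collect the main response attributes into a plain dictionary.

    `res` is assumed to be a requests.Response, or any object that
    exposes the same four attributes.
    """
    return {
        "status_code": res.status_code,
        # .get() avoids a KeyError when the server sends no Content-Type.
        "content_type": res.headers.get("content-type", ""),
        "encoding": res.encoding,
        # res.content is the raw body, so its length is the size in bytes.
        "size_in_bytes": len(res.content),
    }
```

With a real response it would be called as `summarize_response(requests.get(url, timeout=10))`.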

Let's actually download some images

The input is a text file, `input.txt`, containing a list of URLs, one per line. The images are written to the output directory `images/` as 0.jpg, 1.jpg, 2.jpg, and so on. The code is a little rough in places, so please bear with it.

import requests
import os
import sys

# Download an image and return its raw bytes
def download_image(url, timeout=10):
    response = requests.get(url, allow_redirects=False, timeout=timeout)
    if response.status_code != 200:
        # status_code is an int, so convert it before concatenating
        raise Exception("HTTP status: " + str(response.status_code))

    # Use .get() so a missing header is reported below instead of raising KeyError
    content_type = response.headers.get("content-type", "")
    if 'image' not in content_type:
        raise Exception("Content-Type: " + content_type)

    return response.content

# Decide the file name of the image
def make_filename(base_dir, number, url):
    ext = os.path.splitext(url)[1]  # Get the extension from the URL
    filename = str(number) + ext    # number is an int, so convert it before adding the extension

    fullpath = os.path.join(base_dir, filename)
    return fullpath

# Save the image
def save_image(filename, image):
    with open(filename, "wb") as fout:
        fout.write(image)

# Main
if __name__ == "__main__":
    urls_txt = "input.txt"
    images_dir = "images"
    idx = 0

    # Create the output directory if it does not exist yet
    if not os.path.isdir(images_dir):
        os.makedirs(images_dir)

    with open(urls_txt, "r") as fin:
        for line in fin:
            url = line.strip()
            if not url:
                continue  # Skip blank lines
            filename = make_filename(images_dir, idx, url)

            print(url)
            try:
                image = download_image(url)
                save_image(filename, image)
                idx += 1
            except KeyboardInterrupt:
                break
            except Exception as err:
                print(err)
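
One thing to watch in `make_filename`: `os.path.splitext(url)` works on the whole URL, so a query string such as `?size=large` ends up in the extension. The following is a possible workaround rather than part of the original program. It assumes Python 3 (for `urllib.parse`), and the helper names and the `.jpg` fallback are hypothetical choices of mine.

```python
import os
from urllib.parse import urlparse


def extension_from_url(url, default=".jpg"):
    """Return the extension of the URL path, ignoring any query string."""
    path = urlparse(url).path          # e.g. "/a/b.png" without "?size=large"
    ext = os.path.splitext(path)[1]
    # Fall back to an assumed default when the path has no extension.
    return ext if ext else default


def numbered_filename(base_dir, number, url):
    """Build a path such as images/0.jpg from a running number and a URL."""
    return os.path.join(base_dir, str(number) + extension_from_url(url))


print(numbered_filename("images", 0, "http://example.com/a/b.png?size=large"))
```

The fallback extension is only a guess. A more careful version could instead look at the Content-Type header that `download_image` already checks.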
