[PYTHON] Determine the extension and download the image

The blog I wrote a long time ago has no export function. I built a list of image URLs with curl and grep, but because the URLs have a form like /Img?hogehoge, saving them with wget -i produces files named like Img0.1 or Img0.2, with no usable extension.

curl or wget may well have an option that handles this, but looking for it was more trouble than writing a script, so I wrote a script.

The file name is built from the update date in the Last-Modified header, and the extension comes from the Content-Type header. Because some files had the same update date, I also appended a serial number.
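The naming rule can be tried on its own, without any network access. The following is only a minimal sketch: `build_filename` and `example_headers` are hypothetical names that do not appear in the script, and the standard library's `email.utils.parsedate_to_datetime` stands in for the script's `strptime` call. It also assumes both headers are present and that a plain dict is passed in, so the keys are case-sensitive here.

```python
from email.utils import parsedate_to_datetime


def build_filename(headers, index):
    """Build a name like 20151021-072800-000.jpeg (hypothetical helper)."""
    # Drop any parameter: "image/jpeg; charset=binary" -> "image/jpeg"
    media_type = headers["Content-Type"].split(";")[0].strip()
    # The subtype after the slash is used as the extension
    ext = media_type.split("/")[1]
    # Parse the HTTP date carried by Last-Modified
    modified = parsedate_to_datetime(headers["Last-Modified"])
    # The serial number keeps names unique when two dates are identical
    return "%s-%03d.%s" % (modified.strftime("%Y%m%d-%H%M%S"), index, ext)


# Made-up headers for illustration only
example_headers = {
    "Content-Type": "image/jpeg",
    "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT",
}
print(build_filename(example_headers, 0))  # 20151021-072800-000.jpeg
```

Using the subtype as the extension works for types such as image/jpeg or image/png, but it would give odd results for something like image/svg+xml, so treat it as good enough for a one-off export rather than a general rule.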

How to use

cat url_list.txt | python get-contents.py

Script

get-contents.py


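# -*- coding: utf-8 -*-

import sys
import datetime

import requests

cnt = 0
for line in sys.stdin:
    url = line.strip()
    if not url:
        continue  # skip blank lines in the URL list
    r = requests.get(url)
    # Use the Content-Type subtype as the extension,
    # dropping any parameter such as "; charset=..."
    ext = r.headers['Content-Type'].split(';')[0].strip().split('/')[1]
    # Use the Last-Modified time as the base of the file name
    lm = datetime.datetime.strptime(
        r.headers['Last-Modified'], '%a, %d %b %Y %H:%M:%S GMT')
    # The serial number keeps names unique when update dates collide
    fname = lm.strftime('%Y%m%d-%H%M%S') + ('-%03d.' % cnt) + ext
    print(fname)
    with open(fname, "wb") as fout:
        fout.write(r.content)  # r.content is already bytes
    cnt = cnt + 1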
