Bulk download images from specific site URLs with python

This script does not check whether an acquired URL is a relative path or an absolute path, and it does not convert relative paths into absolute ones. It was written on the assumption that the target site uses only absolute paths for its img sources, so trying to download images from a site that uses relative paths will raise an error. ~~I will write down a detailed explanation of the code in the blog linked below.~~ (The blog has been released) (Scheduled as of August 11, 2014)
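As a minimal sketch of the step this script skips, a relative `src` can be resolved against the page URL with the standard library's `urljoin`. This assumes Python 3's `urllib.parse`; in Python 2, which the script below targets, the same functions live in the `urlparse` module. The helper name `to_absolute` and the example.com URLs are hypothetical and not part of the original script.

```python
from urllib.parse import urljoin, urlparse


def to_absolute(page_url, src):
    """Return src as an absolute URL, resolving it against page_url if needed."""
    # An absolute URL carries both a scheme (e.g. "http") and a host.
    parsed = urlparse(src)
    if parsed.scheme and parsed.netloc:
        return src
    # urljoin resolves "img/a.png", "/img/a.png", "../a.png" and "//host/a.png".
    return urljoin(page_url, src)


if __name__ == "__main__":
    page = "http://example.com/blog/index.html"  # hypothetical page URL
    for src in ("http://example.com/a.png", "img/b.png", "/c.png"):
        print(to_absolute(page, src))
```

Passing each `src` value through a helper like this before downloading would let the same loop handle sites that mix relative and absolute paths.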

downloadImg.py


# -*- coding: utf-8 -*- 

import urllib
import urllib2
import os.path
import sys
from HTMLParser import HTMLParser

def download(url):
    img = urllib.urlopen(url)
    localfile = open(os.path.basename(url),'wb')
    localfile.write(img.read())
    img.close()
    localfile.close()

class imgParser(HTMLParser):

    def __init__(self):
        HTMLParser.__init__(self)

    def handle_starttag(self,tagname,attribute):
        if tagname.lower() == "img":
            for i in attribute:
                if i[0].lower() == "src":
                    img_url=i[1]
                    #Creating a file that collects the URLs of the acquired photos
                    f = open("collection_url.txt","a")
                    f.write("%s\t"%img_url)
                    f.close()
        
if __name__ == "__main__":

    print('Enter the URL of the site where you want to get the photo.')
    input_url = raw_input('>>>  ')
    search_url = input_url
    htmldata = urllib2.urlopen(search_url)
    
    print('Currently getting image files...')

    parser = imgParser()
    parser.feed(htmldata.read())

    parser.close()
    htmldata.close()

    #Read the generated file (it only exists if at least one img tag was found)
    urls = []
    if os.path.exists("collection_url.txt"):
        f = open("collection_url.txt","r")
        #URLs are tab-separated; drop the empty entry left by the trailing tab
        urls = [u for u in f.read().split('\t') if u]
        f.close()
        #Delete file
        os.remove("collection_url.txt")

    for url in urls:
        download(url)

    print('The image download is complete.')
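The `collection_url.txt` file is only used to hand the URLs from the parser to the download loop. As a minimal sketch of an alternative, the `src` values could be kept in a list on the parser itself. This assumes Python 3's `html.parser` module (the script above uses the Python 2 `HTMLParser` module); the class name `ImgSrcCollector` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names already lowercased.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value:
                self.urls.append(value)


if __name__ == "__main__":
    # Hypothetical HTML standing in for the downloaded page.
    sample = ('<p><img src="http://example.com/a.png">'
              '<IMG SRC="http://example.com/b.jpg" alt=""></p>')
    collector = ImgSrcCollector()
    collector.feed(sample)
    collector.close()
    print(collector.urls)
```

With the URLs held in `collector.urls`, the download loop can iterate over the list directly and no temporary file has to be created or deleted.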

Twitter: @fantmsite ~~Blog: Fantm Site-BLOG~~
