[Python] Let's run jobs regularly using schedule

This article introduces the schedule module, which is useful for running jobs on a regular basis. schedule is handy when you want to perform the same work at regular intervals (every few minutes, hours, or days), for example when collecting information by web scraping.

After summarizing the basic usage of the schedule module, we will look at a script that periodically runs a scraping job as an implementation example.

schedule installation

The schedule module can be installed with the pip command. Run the following command at the command prompt.

pip install schedule

Sample code

sample_schedule.py


import schedule
import time

# Function to run as the job
def job():
    print("job execution")


# Run job every minute
schedule.every(1).minutes.do(job)

# Run job every hour
schedule.every(1).hours.do(job)

# Run job every day at 11:00
schedule.every().day.at("11:00").do(job)

# Run job every Sunday
schedule.every().sunday.do(job)

# Run job every Wednesday at 13:15
schedule.every().wednesday.at("13:15").do(job)

# Keep checking for due jobs; each job runs when its scheduled time arrives
while True:
    schedule.run_pending()
    time.sleep(1)

This is a sample script that uses schedule. The job function runs at the registered intervals and prints "job execution" each time. schedule.every registers the job to run together with its execution interval. A job can run every few minutes, every few hours, or at a specific time of day or day of the week.

#Run job every minute
schedule.every(1).minutes.do(job)

The jobs registered with schedule.every are executed by schedule.run_pending(). A single call to schedule.run_pending() only runs the jobs that are due at that moment and then returns, so the script would end right away. To keep the schedule running, call it inside an infinite loop with a while statement. This makes it possible to keep performing the same processing at regular intervals.

while True:
    schedule.run_pending()
    time.sleep(1)

Periodically collect Yahoo news using schedule

Using the schedule module introduced above, I created a script that performs scraping on a regular basis. It accesses Yahoo News every hour and gets the title and URL of each news item.

scraping_schedule.py


from urllib.request import urlopen
from urllib.error import HTTPError
from urllib.error import URLError
from bs4 import BeautifulSoup
import re
import schedule
import time

def job():
    try:
        html = urlopen('https://news.yahoo.co.jp/topics')
    except HTTPError as e:
        print(e)
    except URLError as e:
        print(e)
    else:
        # Access Yahoo topics and collect news information
        bs = BeautifulSoup(html.read(), 'lxml')
        topics = bs.find('div', {'class': 'topicsListAllMain'})
        if topics is None:
            # The page layout may have changed; skip this run
            print('topics list not found')
            return
        newsList = topics.find_all('a')

        # Get the news title and URL from the obtained list and display them
        for news in newsList:
            if re.match('^(https://)', news.attrs.get('href', '')):
                print(news.get_text())
                print(news.attrs['href'])

# Run job every hour
schedule.every(1).hours.do(job)

while True:
    schedule.run_pending()
    time.sleep(1)

Summary

This time, I introduced the schedule module, which can run jobs at regular intervals. I mainly use it for scraping, but I think it has a wide range of uses because it can handle any kind of recurring work.
