[PYTHON] How to insert a specific process at the start and end of spider with scrapy

It is a method to write a function that hooks at the start and end of spider.

Place the following content directly under the project.

import scrapy

class SpiderHook(object):

    @classmethod
    def from_crawler(cls, crawler):
        ext = cls

        crawler.signals.connect(ext.spider_opened, signal=scrapy.signals.spider_opened)
        crawler.signals.connect(ext.spider_closed, signal=scrapy.signals.spider_closed)

        return ext

    def spider_opened(self, spider):
        #Processing at the start of spider

    def spider_closed(self, spider):
        #Processing at the end of spider

Then write the settings to load this class in settings.py.

EXTENSIONS = {
    '<project name>.<file name>. SpiderHook': 100,
}

reference: https://doc.scrapy.org/en/latest/topics/extensions.html

Recommended Posts

How to insert a specific process at the start and end of spider with scrapy
Specify the start and end positions of files to be included with qiitap
How to start the PC at a fixed time every morning and execute the python program
Node.js: How to kill offspring of a process started with child_process.fork ()
How to display the CPU usage, pod name, and IP address of a pod created with Kubernetes
How to divide and process a data frame using the groupby function
[Introduction to Python] How to sort the contents of a list efficiently with list sort
How to put a line number at the beginning of a CSV file
Try to react only the carbon at the end of the chain with SMARTS
How to calculate the volatility of a brand
Send Gmail at the end of the process [Python]
Remove specific strings at the end of python
How to put a lot of pipelines together and put them away at once
How to apply updlock, rowlock, etc. with a combination of SQLAlchemy and SQLServer
How to get a list of files in the same directory with python
[Introduction to Python] How to get the index of data with a for statement
How to create a submenu with the [Blender] plugin
Tasks at the start of a new python project
How to identify the element with the smallest number of characters in a Python list?
A memo on how to overcome the difficult problem of capturing FX with AI
[Linux] [C / C ++] How to get the return address value of a function and the function name of the caller
[Ubuntu] How to delete the entire contents of a directory
I tried to automatically post to ChatWork at the time of deployment with fabric and ChatWork Api
How to kill a process instantly with Python's Process Pool Executor
How to start a simple WEB server that can execute cgi of php and python
How to shuffle a part of a Python list (at random.shuffle)
Get UNIXTIME at the beginning of today with a command
To extract the data of a specific column in a specific sheet in multiple Excel files at once and put the data in each column in one row
Recursively get the Excel list in a specific folder with python and write it to Excel.
Detect objects of a specific color and size with Python
A story about how to deal with the CORS problem
How to find the scaling factor of a biorthogonal wavelet
Create a 2D array by adding a row to the end of an empty array with numpy
How to process camera images with Teams and Zoom Volume of processing in animation style
[End of 2020] A memo to start using AWS CLI (Version 2)
How to plot a lot of legends by changing the color of the graph continuously with matplotlib
A story about porting the code of "Try and understand how Linux works" to Rust
How to start the program
How to connect the contents of a list into a string
I want to improve efficiency with Python even in the experimental system (5) I want to send a notification at the end of the experiment with the slack API
I tried to make a script that traces the tweets of a specific user on Twitter and saves the posted image at once
Process the gzip file UNLOADed with Redshift with Python of Lambda, gzip it again and upload it to S3
Find the white Christmas rate by prefecture with Python and map it to a map of Japan
How to intercept or tamper with the SSL communication of the actual iOS device by a proxy
Learn the flow of Bayesian estimation and how to use Pystan through a simple regression model
[Python scraping] Output the URL and title of the site containing a specific keyword to a text file
How to set a shortcut to switch full-width and half-width with IBus
How to batch start a python program created with Jupyter notebook
Process the contents of the file in order with a shell script
Overview of how to create a server socket and how to establish a client socket
Save the results of crawling with Scrapy to the Google Data Store
[Python] How to force a method of a subclass to do something specific
[Introduction to Python] How to split a character string with the split function
How to get the ID of Type2Tag NXP NTAG213 with nfcpy
[EC2] How to install chrome and the contents of each command
How to check the memory size of a variable in Python
[Python] How to get the first and last days of the month
[Introduction to StyleGAN] I played with "The Life of a Man" ♬
How to make a command to read the configuration file with pyramid
How to make a surveillance camera (Security Camera) with Opencv and Python
Here's a brief summary of how to get started with Django