[PYTHON] Find refurbished iPads by scraping and notify Slack

Introduction

With the government also handing out a benefit payment, I thought: I want to buy an iPad, which I didn't own yet, cheaply as a refurbished product! Refurbished products are listed here → https://www.apple.com/jp/shop/refurbished

What I wanted was the iPad Air (3rd generation), but at the time of writing, competition was fierce: even when new stock was listed, it sold out almost instantly. So I decided to scrape the page regularly and have it notify me on Slack, so that I could buy one before other people got ahead of me. I quickly wrote a Python program and successfully bought a refurbished iPad Air, so I'll introduce it here.

What I did this time

What was used

・ Python
・ Beautiful Soup
・ Slack API
・ Windows Task Scheduler

Installation of scraping environment

pip install requests
pip install beautifulsoup4
pip install lxml

Get the Incoming Webhook URL for the Slack API

I created a Slack workspace with only myself in it and set up Incoming Webhooks as described in this article (https://qiita.com/ik-fib/items/b4a502d173a22b3947a0). Once the webhook is linked to a channel, sending a notification only takes a POST to the webhook URL in the format shown there.
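
As a rough sketch of what such a post looks like, the snippet below builds a JSON body with a "text" field and sends it to the webhook. It uses only the standard library (urllib.request) so that it runs without extra packages; this is a swap for illustration, not the approach used in apple.py. The function names build_request and notify are hypothetical, and the webhook URL must be supplied by you.

```python
import json
import urllib.request


def build_request(webhook_url, message):
    """Build a POST request carrying {"text": message} as JSON (hypothetical helper)."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify(webhook_url, message):
    """Send the message and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = build_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling notify(your_webhook_url, "<!channel> test") should make a test message appear in the linked channel, assuming the webhook is configured as in the linked article.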

Python code

I scrape with Beautiful Soup on Python 3. I wanted the list of iPad Air units on sale, so the script notifies me when the text of a link contains both "Air" and "Wi-Fi". ("Air" alone also matches the menu bar, so I added the "Wi-Fi" string as well.) If you want an iPad Pro, use "Pro", and if you want an iPad mini, use "mini" (I don't know what to do for the plain iPad, which has no suffix ...).

apple.py


# coding:utf-8
import json

import requests
from bs4 import BeautifulSoup

# List of refurbished products for iPad
URL = "https://www.apple.com/jp/shop/refurbished/ipad"
# Incoming Webhook URL of the Slack workspace you want to post to
SLACK_WEBHOOK_URL = "Incoming Webhook URL for the Slack workspace you want to post to"

if __name__ == "__main__":
    headers = {"User-Agent": "Mozilla/5.0"}
    # Fetch the page and parse the HTML
    response = requests.get(URL, headers=headers, timeout=10)
    soup = BeautifulSoup(response.content, "html.parser")
    # Check every <a> tag on the page
    for entry in soup.select("a"):
        # iPad Air product links contain both "Air" and "Wi-Fi"
        if "Air" in entry.text and "Wi-Fi" in entry.text:
            text = "<!channel> " + entry.text
            # Notify Slack
            requests.post(SLACK_WEBHOOK_URL, data=json.dumps({"text": text}), timeout=10)
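
To switch the target model more easily, the keyword check can be pulled out into a small helper. The following is only a sketch: matching_titles is a hypothetical name, and the sample titles are made up for illustration rather than taken from the actual page.

```python
def matching_titles(titles, keywords):
    """Return the titles that contain every keyword (case-sensitive)."""
    return [title for title in titles if all(word in title for word in keywords)]


# Made-up link texts for illustration only
sample_titles = [
    "iPad Air",  # menu-bar style entry without "Wi-Fi"
    "Refurbished iPad Air Wi-Fi 64GB",
    "Refurbished iPad Pro Wi-Fi 256GB",
    "Refurbished iPad mini Wi-Fi 64GB",
]

air_hits = matching_titles(sample_titles, ["Air", "Wi-Fi"])
pro_hits = matching_titles(sample_titles, ["Pro", "Wi-Fi"])
```

With this shape, looking for an iPad Pro or iPad mini only means changing the keyword list, and the menu-bar entry is still skipped because it lacks "Wi-Fi".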

Periodic execution

I had a Windows desktop that is always on, so I used the Windows Task Scheduler to run the Python script periodically. (AWS Lambda would probably be a good option too.)

See here (https://qiita.com/kawa-Kotaro/items/4005a43eb686eae41448) for how to set up the Task Scheduler. By the way, I created two triggers that each repeat every 5 minutes and offset them from each other, so the script runs every two and a half minutes.

Scraping frequently over a short period can overload the web server, which is a nuisance to the site being crawled and may be regarded as a DoS attack (an attack that overloads a server and disrupts its service). So be careful.
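
One way to guard against accidentally fetching too often, for example when the script is also started by hand, is to record the time of the last fetch and skip the request if too little time has passed. This is only a sketch under assumptions: the file name last_fetch.txt, the function names, and the 150-second minimum (the two and a half minutes mentioned above) are all illustrative choices, not part of apple.py.

```python
from pathlib import Path

STAMP_FILE = Path("last_fetch.txt")  # hypothetical file holding the last fetch time
MIN_INTERVAL_SECONDS = 150  # assumed minimum gap: two and a half minutes


def read_last_fetch(path=STAMP_FILE):
    """Return the recorded fetch time in seconds, or None if missing or unreadable."""
    try:
        return float(path.read_text())
    except (FileNotFoundError, ValueError):
        return None


def record_fetch(now, path=STAMP_FILE):
    """Store the time of the current fetch."""
    path.write_text(str(now))


def may_fetch(last_fetch, now, min_interval=MIN_INTERVAL_SECONDS):
    """Return True if no fetch is recorded yet or enough time has passed."""
    return last_fetch is None or now - last_fetch >= min_interval
```

A script could call may_fetch(read_last_fetch(), time.time()) before requesting the page, and call record_fetch(time.time()) right after a successful request.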

Execution result

If all goes well, the notification arrives in Slack like this. In some cases it was faster than the Twitter bot that posts refurbished products.

(Screenshot: 2020-05-07 01.44.32.png)
