[PYTHON] Monitor web page updates with LINE BOT
tl;dr
- For purely personal use, a LINE bot is easy to create.
- Broadcast (sending to all registered friends) can be run without a server.
- Posting to a small number of specific groups was also possible without a permanent server, by using ngrok or similar.
Introduction
- There are times when you want to check a web page regularly, such as a sales page for a limited edition product.
- Here, I wrote Python code that monitors a page for updates and sends a notification through a LINE bot.
Flow
- Register with LINE Developers
- Get a Channel access token
- Add the bot as a friend on LINE
- Write a LINE bot using line/line-bot-sdk-python
- Write update monitoring code using psf/requests-html
- To post to a specific user or a specific group, you need an ID that identifies the destination.
- Getting that ID requires a web server, because events for the bot are delivered to your webhook.
- If setting one up is difficult, use ngrok or similar for the time being.
- See "Simply receive a webhook in ngrok and Python".
Description
Register with LINE Developers
- LINE Developers: To use Messaging API
- It is best to follow the official documentation. Note that the UI may change slightly.
- You need a LINE account to log in. Proceed to channel creation.
- The channel to be created is Messaging API channel.
[Figure: Channel selection]
Get a Channel access token
- After creating the channel, press Issue on the Messaging API tab to get the `Channel access token (long-lived)`.
[Figure: Channel access token]
Register as a friend
- Scan the QR code at the top of the screen where you obtained the channel access token, and add the bot as a friend on LINE.
Write a LINE bot using line/line-bot-sdk-python
- LINE officially publishes an SDK for Python, so use it.
Install SDK
$ pip install line-bot-sdk
Post to all friends
- `broadcast` sends the message to all registered friends.
from linebot import LineBotApi
from linebot.models import TextSendMessage
access_token = 'XXXXXXXXXXXXXXX'
line_bot_api = LineBotApi(access_token)
line_bot_api.broadcast(TextSendMessage(text='Broadcast to all friends'))
Operation check
- The bot should have sent the message "Broadcast to all friends" as shown below.
- If it doesn't work, review the access_token, the friend registration, and so on.
[Figure: Message example]
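When troubleshooting, it can help to see what `broadcast` amounts to at the HTTP level. The following is a minimal sketch using only the standard library, assuming the Messaging API broadcast endpoint `https://api.line.me/v2/bot/message/broadcast`; `build_broadcast_request` is a hypothetical helper name, and the request is only built here, not sent.

```python
import json
import urllib.request

# Broadcast endpoint of the Messaging API (assumed; check the official reference).
BROADCAST_URL = 'https://api.line.me/v2/bot/message/broadcast'


def build_broadcast_request(access_token, text):
    """Build (but do not send) a POST request that broadcasts one text message."""
    body = json.dumps({'messages': [{'type': 'text', 'text': text}]}).encode('utf-8')
    return urllib.request.Request(
        BROADCAST_URL,
        data=body,
        headers={
            'Content-Type': 'application/json',
            'Authorization': 'Bearer ' + access_token,
        },
        method='POST',
    )


request = build_broadcast_request('XXXXXXXXXXXXXXX', 'Broadcast to all friends')
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen(request)` would actually send it. A rejected token surfaces as `urllib.error.HTTPError`, which is one way to tell a token problem apart from a friend registration problem.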
Write update monitoring code using psf/requests-html
- To monitor a web page, the script has to interpret the displayed content.
- Here, I will use psf/requests-html.
Installation
$ pip install requests-html
How to use
Description
- As an example, let's check whether a product on the following page can be ordered.
- On this web page, whether an order can be placed is expressed as follows.
- The "Items that cannot be ordered" text changes to "In stock" or similar, so this element looks like a good one to watch.
<div class="status-text">
<div class="status-heading">
<span class="status">Items that cannot be ordered*</span>
</div>
<div class="status-note">
<p>
</p>
</div><!-- /div#status-note -->
</div>
Try scraping
- First, access the web page as follows.
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://XXXXXXXXXXXXX/') # <----Enter the URL you want to access here
- Then use r.html.find() to narrow down the elements.
from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://XXXXXXXXXXXXX/') # <----Enter the URL you want to access here
r.html.find('div.status-heading span.status')[0].text # -> 'Items that cannot be ordered*'
- The string was retrieved correctly, so it can be used to judge the stock status.
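Indexing with `[0]` raises `IndexError` if the page layout changes and the selector no longer matches anything. A minimal sketch of a more defensive lookup follows, using the `first=True` option of requests-html's `find()`, which returns `None` when there is no match. `extract_status` is a hypothetical helper name, and `html` is assumed to be the `r.html` object obtained above.

```python
def extract_status(html, selector):
    """Return the stripped text of the first match, or None if nothing matches.

    `html` is expected to behave like requests-html's `r.html`:
    `find(selector, first=True)` returns a single element or None.
    """
    element = html.find(selector, first=True)
    if element is None:
        return None
    return element.text.strip()
```

With this, `extract_status(r.html, 'div.status-heading span.status')` returns the status string, and a `None` result can be treated as "the page changed" rather than as "in stock".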
Try to combine it with the notification part of LINE
- Try to send a notification when the out-of-stock string disappears.
# -*- coding:utf-8 -*-
from linebot import LineBotApi
from linebot.models import TextSendMessage
from requests_html import HTMLSession

ACCESS_TOKEN = 'XXXXXXXXXXX'
TARGET_URL = 'https://XXXXXXXXXXXXX/rb/16057071/'
STATUS_CSS_SELECTOR = 'div.status-heading span.status'
NG_STATUS = 'Items that cannot be ordered'


def get_status():
    # Fetch the page and return the text of the status element
    session = HTMLSession()
    r = session.get(TARGET_URL)
    return r.html.find(STATUS_CSS_SELECTOR)[0].text


def broadcast_to_friends(message):
    # Send a text message to every friend of the bot
    line_bot_api = LineBotApi(ACCESS_TOKEN)
    line_bot_api.broadcast(TextSendMessage(text=message))


# Notify only when the out-of-stock string is no longer shown
if NG_STATUS not in get_status():
    broadcast_to_friends("You can buy the product: " + TARGET_URL)
- When I executed it, the following message was sent.
[Figure: Message example]
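Run on a schedule, the script above would broadcast on every run for as long as the product stays orderable. One way to avoid that, sketched here with the standard library only, is to remember the previous status in a file and notify only on the change. The helper names and the state file are assumptions for illustration, not part of the original script.

```python
from pathlib import Path

NG_STATUS = 'Items that cannot be ordered'


def became_available(previous, current, ng_status=NG_STATUS):
    """True only when the status moves from unavailable (or unknown) to available."""
    was_unavailable = previous is None or ng_status in previous
    return was_unavailable and ng_status not in current


def check_and_remember(current, state_file):
    """Compare `current` with the saved status, save it, and report the transition."""
    path = Path(state_file)
    # No file yet means this is the first run, so there is no previous status.
    previous = path.read_text(encoding='utf-8') if path.exists() else None
    path.write_text(current, encoding='utf-8')
    return became_available(previous, current)
```

In the script above, the final condition would then become `if check_and_remember(get_status(), 'last_status.txt'):`, where `last_status.txt` is a hypothetical file name.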
When you want to post to a specific user or a specific group
- Use a push message instead of broadcast.
- https://developers.line.biz/en/reference/messaging-api/#send-push-message
- To send one, you need an ID that identifies the group or user.
- LINE sends this ID via webhook when someone interacts with the bot, so prepare to receive it.
Receive webhook
- Prepare an environment that can receive webhooks by referring to "Simply receive webhooks with ngrok and Python".
- When it is ready, enter its URL as the Webhook URL on the Messaging API tab.
- Press Verify, and if it succeeds, enable Use webhook.
[Figure: Setting example]
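For reference, a receiver can also be sketched with the standard library alone. This is not the setup from the article referenced above, just a minimal illustration. It assumes that LINE signs each request with an `X-Line-Signature` header containing the base64-encoded HMAC-SHA256 of the raw body, keyed with the channel secret. `CHANNEL_SECRET`, `WebhookHandler`, `run`, and port 8000 are placeholders.

```python
import base64
import hashlib
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

CHANNEL_SECRET = 'YYYYYYYYYYYYYYY'  # placeholder: the channel secret shown in LINE Developers


def is_valid_signature(channel_secret, body, signature):
    """Check X-Line-Signature: base64(HMAC-SHA256(channel secret, raw request body))."""
    digest = hmac.new(channel_secret.encode('utf-8'), body, hashlib.sha256).digest()
    return hmac.compare_digest(base64.b64encode(digest), signature.encode('utf-8'))


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        body = self.rfile.read(length)
        signature = self.headers.get('X-Line-Signature', '')
        if not is_valid_signature(CHANNEL_SECRET, body, signature):
            # Reject requests that were not signed with our channel secret
            self.send_response(400)
            self.end_headers()
            return
        print(json.dumps(json.loads(body), indent=2))  # show the received events
        self.send_response(200)
        self.end_headers()


def run(port=8000):
    """Serve until interrupted; the port can be exposed with ngrok."""
    HTTPServer(('', port), WebhookHandler).serve_forever()
```

Calling `run()` starts the server. The public URL that ngrok assigns to the port is what goes into the Webhook URL field.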
Put a BOT in a group and get a group_id
- Next, set `Allow bot to join group chats` on the Messaging API tab to Enabled.
[Figure: Setting example]
- In that state, add this LINE bot to a suitable group, and a webhook event is sent.
- ngrok has a screen for inspecting requests, so you can check the event there.
- All you need is `events[0].source.groupId`.
[Figure: Example of checking on the inspect screen of ngrok]
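Instead of reading the ID off the inspect screen, it can also be pulled out of the webhook body in code. The sketch below assumes a heavily trimmed, hypothetical payload. Real events carry more fields, and the ID shown is a placeholder.

```python
import json

# Hypothetical, trimmed webhook body; real payloads contain more fields.
SAMPLE_BODY = '''
{
  "events": [
    {
      "type": "join",
      "source": {"type": "group", "groupId": "Cxxxxxxxxxx"}
    }
  ]
}
'''


def extract_destination_ids(body):
    """Return one ID per event source, preferring groupId, then roomId, then userId."""
    ids = []
    for event in json.loads(body).get('events', []):
        source = event.get('source', {})
        for key in ('groupId', 'roomId', 'userId'):
            if key in source:
                ids.append(source[key])
                break
    return ids


print(extract_destination_ids(SAMPLE_BODY))
```

The group ID is checked first because an event raised inside a group may also carry the ID of the user who triggered it.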
Use push_message instead of broadcast
- Once you have the groupId, use `push_message` instead of `broadcast`.
- https://github.com/line/line-bot-sdk-python#push_messageself-to-messages-notification_disabledfalse-timeoutnone
- Pass the ID as the first argument.
-line_bot_api.broadcast(TextSendMessage(text='Broadcast to all friends'))
+line_bot_api.push_message(group_id, TextSendMessage(text='Push message to an individual'))
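If both destinations are useful, the choice can be kept in one place. A small sketch follows, assuming `line_bot_api` is the `LineBotApi` instance created earlier and `message` is a message object such as `TextSendMessage`; `send` is a hypothetical helper name.

```python
def send(line_bot_api, message, to=None):
    """Push to `to` (a user or group ID) if given, otherwise broadcast to all friends."""
    if to is None:
        line_bot_api.broadcast(message)
    else:
        line_bot_api.push_message(to, message)
```

For example, `send(line_bot_api, TextSendMessage(text='Hello'), to=group_id)` posts only to that group, while omitting `to` keeps the original broadcast behavior.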
Finally
- All that remains is to run the script regularly with Task Scheduler or cron.
- Accessing a web page mechanically like this can put load on the target server, so reading the following pages before actually operating it is recommended.
- http://librahack.jp/
- https://vaaaaaanquish.hatenablog.com/entry/2017/12/01/064227
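Where cron or Task Scheduler is not available, the periodic check can also be sketched as a loop in Python. The version below enforces a minimum wait between requests to limit load on the target server. `poll`, `MIN_INTERVAL_SECONDS`, and the 600-second value are assumptions for illustration.

```python
import time

MIN_INTERVAL_SECONDS = 600  # assumed floor; choose a value the target site can tolerate


def poll(check, interval_seconds, max_checks=None, sleep=time.sleep):
    """Call `check()` repeatedly until it returns True or `max_checks` is reached.

    Waits at least MIN_INTERVAL_SECONDS between calls and returns the number
    of checks performed.
    """
    interval_seconds = max(interval_seconds, MIN_INTERVAL_SECONDS)
    count = 0
    while True:
        count += 1
        if check() or (max_checks is not None and count >= max_checks):
            return count
        sleep(interval_seconds)
```

Here `check` would be a function that fetches the status, sends the notification when the product becomes orderable, and returns `True` once it has done so.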