[PYTHON] Streamline information gathering with the Twitter API and Slack bots

Keeping up with the latest topics means collecting information from websites, Twitter, blogs, papers, and so on, which is tedious. To make this more efficient, I set up the following in Slack.

- Feed RSS into Slack
- Feed emails (mainly Google Alerts) into Slack
- Feed tweets from specific users into Slack

The convenient part is that you can manage the information you need in one place, without checking each site's timeline, and that your read history is kept.

The automatic Twitter collection system was the hardest part to build (although most of it can be copied). Technically, it uses the Twitter API, heroku, and Python (Tweepy).

Feed RSS into Slack

Slack provides an RSS reader as an app, so this is very easy to set up. Feeds can be sent to individual channels, and Slack's own bookmark and read-state features apply, so you can use it much like Feedly.

There is already a lot of information on how to do this.

https://slack.com/intl/ja-jp/help/articles/218688467-Slack-に-RSS-フィードを追加する

https://ferret-plus.com/14641

You can also collect Qiita posts with specific tags via RSS. This is handy because you don't have to check the timeline every time.

Send email to Slack

I also collect information through various email newsletters and Google Alerts, so I want those in Slack as well. There is already plenty of information on this too. There are several methods; I do the following.

  1. Issue an email address that forwards to a Slack direct message
     - See https://slack.com/intl/ja-jp/help/articles/206819278-Slack- (sending email to Slack), specifically the section "Set the forwarding email address" on that page.
  2. In Gmail, forward specific emails to the address above

There are other methods as well. I tried several, but the one above was the most usable.

Zapier

- Sends specific emails to a specific channel.
- For details on how to do this, see https://qiita.com/mr_t_free/items/fd05ac8306fd5fcc1b37 and similar articles.
- However, Slack does not render HTML in the first place, so the layout of Google Alerts and similar emails breaks and is hard to read. (When forwarded to a direct message, the email is embedded and displayed neatly.)

Slack Email App

- I haven't tried it because it is a paid feature. However, it is easy to set up and can forward mail to a specific channel, so it seems convenient if you are willing to pay.

Slack for Gmail

- Forwarding is done manually, so it does not automate information collection.

Custom implementation using the API

- I haven't tried it because it seems time-consuming.

Stream specific user tweets into Slack

At first I thought this would be easy with IFTTT, but it turned out to be hard to use and I gave up. Instead, I built a bot that collects tweets automatically. The general procedure is as follows.

  1. Environment setup, service registration, token acquisition
  2. Tweet collection and forwarding to Slack
  3. Files required by heroku
  4. Deployment

The following two articles were very helpful. In fact, this is almost a copy of them.

https://qiita.com/yuhkan/items/805159f88dd0ad6b21c7

https://stefafafan.hatenablog.com/entry/2015/11/14/220601

Environment

I mainly use Python. The Tweepy package is required.

https://kurozumi.github.io/tweepy/

pip install tweepy
pip install requests

I also use pytz for finer details such as time zone handling. git is also required.

Registration for various services

I use the Twitter API, a Slack Webhook, and heroku.

The Twitter API requires an access token.

https://qiita.com/kngsym2018/items/2524d21455aac111cdee#consumer-api-keysアクセストークン情報を使用したpythonスクリプト

The Slack Webhook requires the webhook URL for your channel, so make a note of that as well.

https://slack.com/intl/ja-jp/help/articles/115005265063-Slack-での-Incoming-Webhook-の利用
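Once the tokens and the webhook URL are issued, one option is to keep them out of the source code and read them from environment variables. The following is a minimal sketch under that assumption; the variable names and the load_settings helper are hypothetical and not part of the original setup.

```python
import os

# Hypothetical variable names; use whatever names you actually configure.
REQUIRED_KEYS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
    "SLACK_WEBHOOK_URL",
)


def load_settings(environ=os.environ):
    """Return the required credentials as a dict, failing early if any is missing."""
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}
```

On heroku, values like these can be set with heroku config:set NAME=value, so the same code runs locally and after deployment.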

I use the free tier of heroku (no credit card registered). However, without a registered card the free dyno hours are not enough to keep the app running for a whole month, so I plan to register one in the future.

Collecting Tweets and forwarding them to Slack

I use the Streaming API instead of the REST API, because it is much easier to implement. Specifically, pass the list of user IDs you want to collect as query to tweepy.Stream.filter(follow=query). (In Tweepy 3.x, filter is a method of Stream; StreamListener is the class that receives the results.) Converting a username to a user ID is also easy with Tweepy. In addition to following specific users, you can also monitor keywords with the track parameter.
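As a rough sketch of the setup described above, assuming Tweepy 3.8.0 (the version pinned in requirements.txt later in this article): the helper names resolve_user_ids and start_stream are hypothetical, and MyListener in the comments stands for your own StreamListener subclass.

```python
def resolve_user_ids(api, screen_names):
    """Look up the numeric user ID (as a string) for each screen name."""
    return [api.get_user(screen_name=name).id_str for name in screen_names]


def start_stream(auth, listener, user_ids):
    """Open a filtered stream that delivers tweets from the given user IDs."""
    import tweepy  # imported here so resolve_user_ids has no hard dependency

    stream = tweepy.Stream(auth=auth, listener=listener)
    stream.filter(follow=user_ids)  # blocks until the stream is closed
    return stream


# Typical wiring (needs real credentials, so it is left as comments):
#   auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
#   auth.set_access_token(access_token, access_token_secret)
#   api = tweepy.API(auth)
#   query = resolve_user_ids(api, ["some_screen_name"])
#   start_stream(auth, MyListener(), query)
```

The follow parameter expects user IDs rather than screen names, which is why the lookup step comes first.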

In practice, the Slack forwarding part is implemented in the `on_status` method of `tweepy.StreamListener`, which runs the following steps in order.

- Filter out unnecessary tweets such as retweets
- Format the data for posting to Slack
- Post to Slack

Filtering out unwanted tweets looks like this, for example.

def is_invalid_tweet(self, status):
    # Replies carry the ID of the tweet they answer; skip them
    if isinstance(status.in_reply_to_status_id, int):
        return True

    # Retweets start with "RT @"; skip them
    if status.text.startswith("RT @"):
        return True

    return False

https://www.pytry3g.com/entry/twitter-api-seq2seq
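The filter can be checked without connecting to Twitter by passing stand-in objects that carry only the two attributes it reads. This is a small sketch written as a plain function (no self); the sample texts are made up for illustration.

```python
from types import SimpleNamespace


def is_invalid_tweet(status):
    """Return True for replies and retweets, which should not be forwarded."""
    if isinstance(status.in_reply_to_status_id, int):
        return True
    return status.text.startswith("RT @")


# Stand-ins with only the attributes the filter reads.
plain = SimpleNamespace(in_reply_to_status_id=None, text="New paper is out")
reply = SimpleNamespace(in_reply_to_status_id=123, text="@someone thanks")
retweet = SimpleNamespace(in_reply_to_status_id=None, text="RT @someone: New paper is out")

results = [is_invalid_tweet(s) for s in (plain, reply, retweet)]
print(results)  # [False, True, True]
```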

Below is the formatting into JSON.

import json
from datetime import timedelta

def format_status(status):
    channel = '#twitter_collection'
    text = status.text
    # Shift created_at from UTC to Japan time (UTC+9)
    status.created_at += timedelta(hours=9)
    username = str(status.user.name) + '@' + str(status.user.screen_name) + ' (from twitter)'

    # Fields of the Slack Incoming Webhook payload
    json_dat = {
        "channel": channel,
        "username": username,
        "icon_url": status.user.profile_image_url,
        "text": text
    }
    json_dat = json.dumps(json_dat)

    return json_dat
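The function above converts created_at to Japan time but does not include it in the payload. One possible variation, sketched here, appends the converted time to the message text without modifying the status object. It assumes created_at is a naive UTC datetime, as Tweepy 3.x provides; the function name and the stand-in status are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

JST = timezone(timedelta(hours=9), "JST")


def format_status_with_time(status, channel="#twitter_collection"):
    """Build a Slack webhook payload whose text ends with the tweet time in JST."""
    # Assumption: created_at is naive and expressed in UTC (Tweepy 3.x behavior).
    created_jst = status.created_at.replace(tzinfo=timezone.utc).astimezone(JST)
    payload = {
        "channel": channel,
        "username": "{}@{} (from twitter)".format(status.user.name, status.user.screen_name),
        "icon_url": status.user.profile_image_url,
        "text": "{}\n({})".format(status.text, created_jst.strftime("%Y-%m-%d %H:%M JST")),
    }
    return json.dumps(payload)


# Stand-in status for illustration only.
example = SimpleNamespace(
    text="hello",
    created_at=datetime(2020, 1, 1, 15, 30),
    user=SimpleNamespace(
        name="Alice",
        screen_name="alice",
        profile_image_url="https://example.com/a.png",
    ),
)
payload = json.loads(format_status_with_time(example))
print(payload["text"])
```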
                   

Posting to Slack is very easy: just use requests.post.

import requests

def post_to_slack(json_dat):
    # SLACK_WEBHOOK_URL is the Incoming Webhook URL noted earlier
    url = SLACK_WEBHOOK_URL
    requests.post(url, data=json_dat)
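The three steps (filter, format, post) can be tied together roughly as follows. This is a sketch, not the original implementation: handle_status and the injectable post argument are hypothetical names added so the flow can be exercised without network access, and the timeout value is an arbitrary choice.

```python
def post_to_slack(json_dat, url, post=None):
    """POST a JSON payload to a Slack Incoming Webhook and raise on HTTP errors."""
    if post is None:
        import requests  # only needed for real network calls

        post = requests.post
    response = post(
        url,
        data=json_dat,
        headers={"Content-Type": "application/json"},
        timeout=10,
    )
    response.raise_for_status()
    return response


def handle_status(status, is_invalid, format_status, send):
    """Filter, format, and forward one tweet; return True if it was sent."""
    if is_invalid(status):
        return False
    send(format_status(status))
    return True


# Inside a tweepy.StreamListener subclass (sketch):
#     def on_status(self, status):
#         handle_status(status, is_invalid_tweet, format_status, send)
#         # Do not return the result: Tweepy 3.x closes the stream when
#         # on_status returns False.
```

Checking the response status makes a failed post show up in heroku logs instead of being silently dropped.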

Files required by heroku

When using heroku, you need a Procfile and a requirements.txt. The Procfile describes the main process to run at startup. The required packages are listed in requirements.txt.

Procfile

worker: python main.py

requirements.txt

tweepy==3.8.0
requests==2.22.0
pytz==2019.3

Note that the dyno type is worker, not web. With web, `Error R10` (boot timeout) occurs, because a web process is expected to bind to a port shortly after startup and this bot never does.

https://qiita.com/m_rn/items/9a580d04781b34f64693

Deploy

Deployment looks like the following.

heroku login
git init 
git add .
git commit -m "test"
heroku create #Register the app
git push heroku master #Transfer app
heroku scale worker=1 #Run the app

A web process starts with git push heroku master alone, but a worker process also requires heroku scale worker=1.

To monitor the logs and check the process status:

heroku logs -t
heroku ps

To stop the worker:

heroku scale worker=0

https://qiita.com/naberina/items/da4a6d3c480aa7a62b06

To delete the app:

heroku apps:destroy --app <app name>

https://qiita.com/chihiro/items/5c3ff400f6cb99deb945

Improvement points

- Register a credit card with heroku. Without one, the app cannot run around the clock ...
