I made a Python program to collect images from tweets I liked on Twitter

This article is day 23 of the Akatsuki Advent Calendar 2020.

Introduction

I have been running this program for a couple of years but never wrote an article about it, so I decided to do so this time.

First, some background. While browsing Twitter, images that I want to keep show up on my timeline. Doing **right-click -> download -> specify path** every time is very tedious. And what if such tweets come one after another? I can't keep up.

So **I wondered whether some kind of trigger could make the download happen automatically**. And **if there were also a web page for browsing the results**, it would be perfect. That is why I decided to build it, driven half by the desire to collect and half by technical curiosity.

This article covers only the collection part. (You might ask whether pixiv would do. I want something where I can view only the images I picked ...)

Requirements

・ I decided to download only **liked tweets**, not RTs. A like takes **one less click**, so I can move on to the next tweet right away.
・ Collection runs on a rental server or similar. I have no PC at home that stays on all the time, and having to turn one on whenever I want to collect would be inconvenient.
・ When I first implemented this, the streaming API was still alive, so I fetched liked tweets in real time. -> It has since been shut down, so I now fetch them at intervals with the REST API.
・ Save the tweet text, URL, @ID, and so on so that they can be searched from the web page.

Development environment

Rental server ・ Sakura server

Language ・ Python 3.6

Libraries used ・ requests_oauthlib ・ mysql.connector

Register for Twitter Dev

https://developer.twitter.com/ Register here so that you can authenticate with the API.

Implementation

Some parts are omitted here and there.

1. Session generation

Create an app in the Twitter developer portal. Its Keys and Tokens tab lists the keys and tokens required for authentication, so copy them.

twitter_screenshot.png

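When creating a session with requests_oauthlib, specify the keys and tokens as in the following code.

from requests_oauthlib import OAuth1Session

# Consumer key/secret and access token/secret copied from the developer portal
twitter_session = OAuth1Session(consumerKey,
                                consumerSecret,
                                accessToken,
                                accessSecret)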
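
Rather than hard-coding the four values, one option is to read them from environment variables. The following is only a minimal sketch and not the code used in this article: the environment variable names and the helpers `load_credentials` and `make_session` are hypothetical.

```python
import os

# Hypothetical environment variable names; choose whatever suits your server.
CREDENTIAL_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four credentials in order, failing early if one is missing."""
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    return tuple(env[name] for name in CREDENTIAL_VARS)


def make_session(env=os.environ):
    """Create an OAuth1Session from the loaded credentials."""
    # Imported here so that load_credentials works without the library installed.
    from requests_oauthlib import OAuth1Session

    consumer_key, consumer_secret, access_token, access_secret = load_credentials(env)
    return OAuth1Session(consumer_key, consumer_secret, access_token, access_secret)
```

With the variables set on the server, `twitter_session = make_session()` would take the place of the hard-coded call, and the secrets stay out of the source file.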

2. Get API results

https://developer.twitter.com/en/docs/twitter-api/v1/tweets/post-and-engage/api-reference/get-favorites-list This is the documentation for the likes API used this time.

In the code, I use it like this:

# Specify the API resource URL and the parameters to pass
# The parameters are listed on the page above, but this time I mainly use the following two
# screen_name: the user's @ID. Specifying user_id is more reliable, but I do not plan to change the collecting account, so this is enough
# count: number of tweets to get at one time (max: 200)
request = twitter_session.get("https://api.twitter.com/1.1/favorites/list.json", params={"screen_name": ID, "count": 200})

# If you want to get older tweets, add the following to the parameters
# Tweets with an ID at or below tweet_id are returned. Without it, results start from the latest tweet
"max_id": tweet_id

The result is returned as JSON. Check status_code to determine whether an error occurred. The JSON data is in request.text, so parse it with json.loads.

import json

# This snippet runs inside a function, so return early on an error
if request.status_code == 200:
    print("rest connect")
else:
    print("rest Error code %d" % request.status_code)
    return ""
tweets = json.loads(request.text)

You should also be aware of the following restrictions when doing this:

・ **75 requests / 15 minutes**
・ Tweets are returned **not in the order you liked them, but by the posting date of the liked tweets, newest first**. So getting old tweets takes some extra work.
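
To stay inside the request limit, one possible approach is to look at the rate limit headers that the v1.1 API returns with each response (`x-rate-limit-remaining` and `x-rate-limit-reset`, the latter in epoch seconds). The following is a rough sketch; the function name `seconds_to_wait` is hypothetical and it is not part of the original program.

```python
import time


def seconds_to_wait(headers, now=None):
    """Return how many seconds to sleep before the next request (0 if none)."""
    now = time.time() if now is None else now
    # If the header is missing, assume at least one request is still allowed.
    remaining = int(headers.get("x-rate-limit-remaining", 1))
    if remaining > 0:
        return 0.0
    # The window is used up: wait until the reset time reported by the API.
    reset_at = int(headers.get("x-rate-limit-reset", now))
    return max(0.0, reset_at - now)
```

In the request loop this could be used as `time.sleep(seconds_to_wait(request.headers))`. The headers object of a requests response ignores case, so the lowercase names work there.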

3. Get an image based on json data

Since the tweets variable obtained earlier contains an array of tweets, it can be processed one by one with a for statement.

for tweet in tweets:
    # Get information from the tweet

    # Tweet ID. Pass the last tweet["id"] as max_id to twitter_session.get (explained above)
    # so that the next request continues from where this one ended
    tweet["id"]
    # Display name
    tweet["user"]["name"]
    # @ID
    tweet["user"]["screen_name"]
    # Image list (URLs); present only when the tweet has media attached
    tweet["extended_entities"]["media"]
    # Tweet text
    tweet["text"]
    # The tweet URL is the following combination
    "https://twitter.com/" + tweet["user"]["screen_name"] + "/status/" + tweet["id_str"]

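The fields above could be gathered into one dictionary per tweet, which also gives a natural place to handle tweets without media. This is a minimal sketch; the helper name `extract_tweet_info` and the dictionary keys are hypothetical.

```python
def extract_tweet_info(tweet):
    """Collect the fields used for saving and searching from one tweet dict."""
    user = tweet["user"]
    # extended_entities is absent when the tweet has no media attached,
    # so fall back to an empty list instead of raising KeyError.
    media = tweet.get("extended_entities", {}).get("media", [])
    return {
        "tweet_id": tweet["id"],
        "name": user["name"],
        "screen_name": user["screen_name"],
        "text": tweet["text"],
        "tweet_url": "https://twitter.com/%s/status/%s"
        % (user["screen_name"], tweet["id_str"]),
        "image_urls": [m["media_url"] for m in media],
    }
```

Inside the loop, `info = extract_tweet_info(tweet)` would then give an empty `info["image_urls"]` for a liked tweet that has no images.
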
Of these, this time I use the image list:

    #Image list(URL)
    image_list = tweet["extended_entities"]["media"]

The image list is also an array, so process it with a for statement. I have omitted various details, but basically the loop just repeats downloading and saving.

import urllib.request

for image in image_list:
    url = image["media_url"]
    img = urllib.request.urlopen(url, timeout=5).read()
    # path: destination file path for this image (how it is built is omitted)
    with open(path, 'wb') as f:
        f.write(img)
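
How the destination path is built is omitted above. One possibility, shown here purely as an assumption, is to name each file after the tweet ID and the image's position in the list. The helpers `build_image_path` and `download_image` are hypothetical names.

```python
import os
import urllib.request
from urllib.parse import urlparse


def build_image_path(directory, tweet_id, index, url):
    """Build a destination path such as <directory>/<tweet_id>_<index>.jpg."""
    # Reuse the extension from the URL; fall back to .jpg when there is none.
    extension = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return os.path.join(directory, "%s_%d%s" % (tweet_id, index, extension))


def download_image(url, path, timeout=5):
    """Download one image and write it to path."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    with open(path, "wb") as f:
        f.write(data)
```

Looping with `enumerate(image_list)` would supply the index, and using the tweet ID in the name makes it easy to get back from a saved file to its tweet.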

The flow is as follows:

・ Create the session
・ Loop GET requests up to the 75-request limit
・ Download the images of the tweets obtained
・ Keep the ID of the last tweet obtained and pass it with the next request

To pick up older tweets as well, it is a good idea to scan through all of your likes regularly.
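
The paging part of this flow could be sketched as below. This is an illustration under assumptions, not the original code: `collect_likes` and `make_fetch_page` are hypothetical helpers, and `session` is assumed to be the OAuth1Session created earlier. Because `max_id` is inclusive, the sketch subtracts 1 from the oldest ID so that the same tweet is not fetched twice.

```python
FAVORITES_URL = "https://api.twitter.com/1.1/favorites/list.json"


def collect_likes(fetch_page, max_requests=75):
    """Fetch pages of liked tweets until the request budget or the likes run out.

    fetch_page(max_id) must return a list of tweet dicts; max_id is None
    for the first request.
    """
    collected = []
    max_id = None
    for _ in range(max_requests):
        tweets = fetch_page(max_id)
        if not tweets:
            break
        collected.extend(tweets)
        # max_id is inclusive, so step just below the oldest tweet seen so far.
        max_id = min(tweet["id"] for tweet in tweets) - 1
    return collected


def make_fetch_page(session, screen_name):
    """Wrap the session in a fetch_page callable for collect_likes."""
    def fetch_page(max_id):
        params = {"screen_name": screen_name, "count": 200}
        if max_id is not None:
            params["max_id"] = max_id
        response = session.get(FAVORITES_URL, params=params)
        # Treat any error as "no more tweets" to end the loop.
        if response.status_code != 200:
            return []
        return response.json()
    return fetch_page
```

A run would then look like `tweets = collect_likes(make_fetch_page(twitter_session, ID))`. Passing the fetch function in as an argument keeps the loop itself easy to try out with dummy data.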

4. CRON setting on the rental server

The request limit resets every 15 minutes, so set CRON to match. (Screenshot 2020-12-23 23.28.51.png)

This completes the automatic collection of likes. If you want to browse and search on the web, you need to save the tweet information in the DB.
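
As a rough idea of what saving the tweet information might look like, here is a sketch that uses Python's built-in sqlite3 in place of the MySQL database so that it runs on its own. The table name, the columns and the helper names are all hypothetical. With mysql.connector the placeholders would be `%s` instead of `?`, and `INSERT IGNORE` would replace `INSERT OR IGNORE`.

```python
import sqlite3


def create_table(conn):
    """Create the (hypothetical) table that holds one row per liked tweet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS liked_tweets (
               tweet_id    INTEGER PRIMARY KEY,
               screen_name TEXT NOT NULL,
               name        TEXT NOT NULL,
               text        TEXT NOT NULL,
               tweet_url   TEXT NOT NULL
           )"""
    )


def save_tweet(conn, tweet):
    """Insert one tweet; a tweet that is already stored is skipped."""
    user = tweet["user"]
    url = "https://twitter.com/%s/status/%s" % (user["screen_name"], tweet["id_str"])
    conn.execute(
        "INSERT OR IGNORE INTO liked_tweets VALUES (?, ?, ?, ?, ?)",
        (tweet["id"], user["screen_name"], user["name"], tweet["text"], url),
    )
    conn.commit()


def search_tweets(conn, keyword):
    """Return the URLs of stored tweets whose text contains keyword."""
    rows = conn.execute(
        "SELECT tweet_url FROM liked_tweets WHERE text LIKE ? ORDER BY tweet_id DESC",
        ("%" + keyword + "%",),
    )
    return [row[0] for row in rows]
```

Using the tweet ID as the primary key means that running the collection repeatedly over the same likes does not create duplicate rows.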

Finally

After that, I set up the DB and a web page so that I could browse the collection. **Honestly, I rarely look at it.** Collecting the images and getting the program to work was satisfying enough. Still, I think it was a good subject for learning. I had never touched a scripting language before, and this was the project that got me most interested in automation.

The number of images collected so far: **approximately 170,000**

The number of likes: **about 150,000**. I thought to myself that this person must be crazy. That person is me ...

A better system would use machine learning to pick the images to collect automatically, but when I tried it once I gave up because it was too hard to judge which images I would want. The difficulty is that you have to capture your own taste ... I do have plenty of images for training, though ...
