[Python] Get celebrity tweet history from Twitter

This is a wall I ran into right away when I wanted to use IBM Cloud's Personality Insights. These are my notes from that time.

Problem

Input text of 3,000 words or more is required (or rather, recommended). So I borrowed celebrities' tweets from Twitter. Assuming about 15 words per tweet, I fetch 200 tweets per person.
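The figure of 200 tweets follows from the two numbers above. As a minimal sketch of that arithmetic (the function name `tweets_needed` is made up for illustration, and 15 words per tweet is only the rough estimate used here, not a measured value):

```python
import math

TARGET_WORDS = 3000      # amount of text aimed for
WORDS_PER_TWEET = 15     # rough estimate from the text above, not measured


def tweets_needed(target_words, words_per_tweet):
    """Smallest number of tweets whose estimated word total reaches the target."""
    return math.ceil(target_words / words_per_tweet)


print(tweets_needed(TARGET_WORDS, WORDS_PER_TWEET))
```

If the real average turns out lower than 15 words, the same function shows how many more tweets would be needed.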

As a prerequisite, you need to register for the Twitter API. I had already applied and registered, so I skip that step here.

Implementation method

# -*- coding: utf-8 -*-

import tweepy
import re
import subprocess

# User list
import user_list
# twitter API authentication key
# Access_token, Access_secret, Consumer_key, Consumer_secret
from auth import twitter_credentials as tc


def get_twitterdata(username, rfile):

  # Load the authentication keys and set up the API client
  auth = tweepy.OAuthHandler(tc.Consumer_key, tc.Consumer_secret)
  auth.set_access_token(tc.Access_token, tc.Access_secret)
  api = tweepy.API(auth, wait_on_rate_limit=True)

  # List to store the tweets
  tweets_data = []
  # Fetch up to 200 tweets
  for tweet in api.user_timeline(screen_name=username, count=200):
    # Get the tweet text
    tmp_text = tweet.text
    # Collapse consecutive line breaks into one
    tmp_text = re.sub('\n+', '\n', tmp_text)
    # Add the tweet to the list
    tweets_data.append(tmp_text + '\n')

  # Write the tweets to the output file
  with open(rfile, "w", encoding="utf-8") as wf:
    wf.writelines(tweets_data)


if __name__ == '__main__':

  # Get the list of Twitter usernames
  userlist = user_list.username

  for i, username in enumerate(userlist):
    rfile = "./data/tweet_" + str(i).zfill(3) + ".csv"

    try:
      get_twitterdata(username, rfile)
    except Exception:
      # Create an empty file if the tweets cannot be fetched,
      # for example when the account is set to private
      subprocess.run(["touch", rfile])
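The script above relies on the `touch` command, which is only available on Unix-like systems, and it assumes the `./data` directory already exists. As a hedged alternative, here is a minimal sketch that builds the same file name and creates the empty file with `pathlib` from the standard library. The helper names `output_path` and `create_empty_file` are hypothetical and not part of the original script.

```python
import tempfile
from pathlib import Path


def output_path(data_dir, index):
    """Build the per-user output path, e.g. data/tweet_007.csv."""
    return Path(data_dir) / f"tweet_{index:03d}.csv"


def create_empty_file(path):
    """Create an empty file (and its parent directory) without calling `touch`."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()
    return path


# Usage: write into a temporary directory so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    created = create_empty_file(output_path(Path(tmp) / "data", 7))
    print(created.name, created.stat().st_size)
```

Like `touch`, `Path.touch()` leaves an existing file's contents alone, so the behavior should match the original on Unix while also working on Windows.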

User list example

This is rough, and the real account names are disguised. The list is defined in user_list.py:

username = ["ariyoshihiroiki", "matsu_bouzu", "takapon_jp"]

Result

The file contents are omitted because they may be copyrighted. The script creates the following files: tweet_000.csv, tweet_001.csv, and tweet_002.csv.

Afterword

Since the amount of data that can be fetched every 15 minutes (?) is limited, you will have to wait a long time if you get greedy. You may also want to exclude retweets and other tweets that are not the person's own text.
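Excluding retweets and link-only tweets could be sketched as a small post-processing step on the fetched text. The sketch below assumes that classic retweets show up in `tweet.text` with an `RT @name:` prefix; the helper names `is_retweet` and `clean_tweets` and the sample strings are made up for illustration. It is a heuristic, not a complete filter.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")


def is_retweet(text):
    """Heuristic: classic retweets appear in the text as 'RT @name: ...'."""
    return text.startswith("RT @")


def clean_tweets(texts):
    """Drop retweets, strip URLs, and drop tweets left with no text."""
    cleaned = []
    for text in texts:
        if is_retweet(text):
            continue
        text = URL_PATTERN.sub("", text).strip()
        if text:
            cleaned.append(text)
    return cleaned


# Made-up sample data, not real tweets.
sample = [
    "RT @someone: a retweeted message",
    "Good morning! https://example.com/photo",
    "https://example.com/only-a-link",
    "Just my own words",
]
print(clean_tweets(sample))
```

Retweets can probably also be left out at fetch time by passing `include_rts=False` to `api.user_timeline`, though that was not tried here. Either way, filtering leaves fewer than 200 tweets per person, so the 3,000-word target may need more requests.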
