[PYTHON] Extract Twitter data to CSV

Tweepy (https://github.com/tweepy/tweepy) is a good library for this.

The sample code below is compatible with Python 3.

Step 1: Install Tweepy

pip install tweepy

Collecting tweepy
  Downloading tweepy-3.5.0-py2.py3-none-any.whl
Requirement already satisfied: requests-oauthlib>=0.4.1 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from tweepy)
Requirement already satisfied: requests>=2.4.3 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from tweepy)
Requirement already satisfied: six>=1.7.3 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from tweepy)
Requirement already satisfied: oauthlib>=0.6.2 in /Users/aws/Documents/Anaconda/anaconda/lib/python3.6/site-packages (from requests-oauthlib>=0.4.1->tweepy)
Installing collected packages: tweepy
Successfully installed tweepy-3.5.0
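
To confirm that the package landed in the interpreter you will actually run, a quick check like the following can help. It is a minimal sketch using only the standard library (Python 3.8 or later); the helper name `installed_version` is made up for this example and is not part of Tweepy.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "tweepy" is the distribution name passed to pip in Step 1.
    print("tweepy version:", installed_version("tweepy"))
```

If this prints `None`, pip most likely installed into a different Python environment than the one running the script.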

Step 2: Sample code

#!/usr/bin/env python
# encoding: utf-8

import tweepy  
import csv

# Twitter API credentials
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

def get_all_tweets(screen_name):
    # Twitter only allows access to a user's most recent 3,200 or so tweets with this method

    # authorize twitter, initialize tweepy
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    # initialize a list to hold all the tweepy Tweets
    alltweets = []

    # make initial request for most recent tweets (200 is the maximum allowed count)
    new_tweets = api.user_timeline(screen_name=screen_name, count=200)

    # save most recent tweets
    alltweets.extend(new_tweets)

    # stop early if the account has no tweets (avoids an IndexError below)
    if not alltweets:
        print("no tweets found for %s" % screen_name)
        return

    # save the id of the oldest tweet less one
    oldest = alltweets[-1].id - 1

    # keep grabbing tweets until there are no tweets left to grab
    while len(new_tweets) > 0:
        print("getting tweets before %s" % (oldest))

        # all subsequent requests use the max_id param to prevent duplicates
        new_tweets = api.user_timeline(screen_name=screen_name, count=200, max_id=oldest)

        # save most recent tweets
        alltweets.extend(new_tweets)

        # update the id of the oldest tweet less one
        oldest = alltweets[-1].id - 1

        print("...%s tweets downloaded so far" % (len(alltweets)))

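    # transform the tweepy tweets into a 2D array that will populate the csv
    # (in Python 3, keep the text as str; encoding it would write b'...' literals)
    outtweets = [[tweet.id_str, tweet.created_at, tweet.text] for tweet in alltweets]

    # write the csv (newline='' lets the csv module handle line endings)
    with open('%s_tweets.csv' % screen_name, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created_at", "text"])
        writer.writerows(outtweets)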


if __name__ == '__main__':
    # pass in the screen name of the account you want to download
    get_all_tweets("twitter Username")
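
The paging loop in the sample code can be tried without API credentials. The sketch below is not Tweepy code: `fetch_page` is a hypothetical stand-in for `api.user_timeline`, written on the assumption that `max_id` is inclusive and results come back newest first. It shows why the loop subtracts one from the oldest id before the next request.

```python
def fetch_page(timeline, count, max_id=None):
    """Stand-in for api.user_timeline: newest first, ids <= max_id, at most `count`."""
    eligible = [t for t in timeline if max_id is None or t["id"] <= max_id]
    return eligible[:count]


def collect_all(timeline, count=200):
    """Page backwards through `timeline` the same way get_all_tweets does."""
    collected = []
    page = fetch_page(timeline, count)
    while page:
        collected.extend(page)
        # max_id is inclusive, so subtract one to skip the tweet we already have
        oldest = collected[-1]["id"] - 1
        page = fetch_page(timeline, count, max_id=oldest)
    return collected


if __name__ == "__main__":
    # 450 fake tweets, newest (highest id) first
    fake_timeline = [{"id": i} for i in range(450, 0, -1)]
    tweets = collect_all(fake_timeline)
    # prints: 450 450 (every tweet collected, none duplicated)
    print(len(tweets), len({t["id"] for t in tweets}))
```

Without the `- 1`, each new page would start with the last tweet of the previous page, and once the timeline was exhausted every request would return that one tweet again, so the loop would never end.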

Reference: https://gist.github.com/yanofsky/5436496
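
To check how the CSV step behaves under Python 3, here is a small round trip that writes rows and reads them back. It is a sketch only: the sample rows are invented, and `write_rows` and `read_rows` are hypothetical helper names, not part of the referenced gist.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    """Write id/created_at/text rows to a CSV file, header first."""
    # newline="" lets the csv module control line endings (avoids blank rows on Windows)
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "created_at", "text"])
        writer.writerows(rows)


def read_rows(path):
    """Read the file back as a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


if __name__ == "__main__":
    # Made-up rows; the text deliberately contains a comma, a newline and non-ASCII
    sample = [
        ["1", "2017-01-01 00:00:00", "hello, world"],
        ["2", "2017-01-02 00:00:00", "line one\nline two ✓"],
    ]
    path = os.path.join(tempfile.mkdtemp(), "sample_tweets.csv")
    write_rows(path, sample)
    for row in read_rows(path):
        print(row["id"], repr(row["text"]))
```

Because the text is passed as `str` and the file is opened with an explicit encoding, commas, line breaks and non-ASCII characters in a tweet come back unchanged.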
