Search Twitter using Python

Purpose

This article shows how to write a Python script that searches Twitter. The finished script can be downloaded from the following repository.

https://github.com/mima3/searchTwitter

Preparation

(1) Install Python 2.7.

(2) Install python_twitter:

easy_install python_twitter

(3) Obtain the Twitter API keys (consumer key, consumer secret, access token key, access token secret) from the following page: https://dev.twitter.com/

For a detailed walkthrough of how to obtain them, see the following page: http://support.dreamone.co.jp/Pandora/dp.do?jumpTo=DreamX&variables%28LPID%29=162

Twitter API to use

application/rate_limit_status.json
https://dev.twitter.com/docs/api/1.1/get/application/rate_limit_status
Returns the rate limits for each API. From this you can tell how many more times each API can be called and when the limit resets.
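As a rough illustration, the values for the search API can be pulled out of the response like this. The key path is the one used by the sample script later in this article; the helper name search_limit and the numbers in the sample dictionary are made up for the example, so no API call is needed to run it.

```python
import time


def search_limit(rate_status):
    """Return (remaining, limit, reset_epoch) for the search/tweets API."""
    entry = rate_status['resources']['search']['/search/tweets']
    return entry['remaining'], entry['limit'], entry['reset']


# Hand-written stand-in for the API response; the values are illustrative.
sample = {
    'resources': {
        'search': {
            '/search/tweets': {
                'limit': 180,
                'remaining': 175,
                'reset': 1400000000,
            }
        }
    }
}

remaining, limit, reset = search_limit(sample)
# 'reset' is a Unix timestamp; convert it to local time for display.
reset_at = time.strftime('%H:%M', time.localtime(reset))
print("Limit %d / %d, resets at %s" % (remaining, limit, reset_at))
```

With a real connection, the dictionary would come from api.GetRateLimitStatus() instead of the hand-written sample.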

search/tweets.json
https://dev.twitter.com/docs/api/1.1/get/search/tweets
The API used for searching.

A single API call returns at most 100 tweets. Results from the API may differ from those of the official search page: the official page also returns tweets more than 7 days old, whereas the API does not return older tweets.

The tweets returned also depend on result_type:

recent returns the most recent tweets, in chronological order.
popular returns popular tweets.
mixed returns a mixture of both.

Search string

The search API accepts the same search operators as "advanced search". https://twitter.com/search-advanced

Search example using AND, OR

(Erin OR Eirin) AND (BBA OR Auntie OR Baba)

This finds tweets that contain "Erin" or "Eirin" and also contain "BBA", "Auntie", or "Baba".
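When the word lists live in a program, a query like the one above can be assembled rather than typed by hand. The following is only a sketch: the helper names quote, any_of and all_of are invented for this example, and wrapping a term that contains spaces in double quotes assumes it is meant as an exact phrase.

```python
def quote(term):
    # Assumption: a term containing spaces is meant as an exact phrase.
    return '"%s"' % term if " " in term else term


def any_of(terms):
    """Join alternatives with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def all_of(groups):
    """Join already parenthesized groups with AND."""
    return " AND ".join(groups)


query = all_of([
    any_of(["Erin", "Eirin"]),
    any_of(["BBA", "Auntie", "Baba"]),
])
print(query)  # (Erin OR Eirin) AND (BBA OR Auntie OR Baba)
```

The resulting string can be passed as the search string to the search API.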

Search example for tweets by a specific user

Enter the user name after "from:".

from:mima_ita

With the API, a search using "from:" seems to be limited to 100 results.

Search example for tweets from a specific place

Specify the coordinates (latitude, longitude) and the radius after "geocode:". The following example searches for tweets within a 500 m radius of Tokyo Tower.

geocode:35.65858,139.745433,0.5km

With the API, a search using "geocode:" also seems to be limited to 100 results.
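If the coordinates come from a program, the operator string can be formatted with a small helper. This is a minimal sketch; the function name geocode_operator is made up for the example, and it only builds the text shown above, without calling the API.

```python
def geocode_operator(lat, lon, radius_km):
    """Build a "geocode:" search operator from a point and a radius in km."""
    return "geocode:%s,%s,%skm" % (lat, lon, radius_km)


# Tokyo Tower, 500 m radius (the same values as the example above).
print(geocode_operator(35.65858, 139.745433, 0.5))
# geocode:35.65858,139.745433,0.5km
```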

Implementation example

This section shows an implementation that searches for a given search string. A single call to the search API returns at most 100 tweets, so the call is repeated to retrieve as many tweets as the limits allow.

The first call uses "result_type = recent" to get tweets in chronological order. This returns only the latest 100 tweets.

The second call retrieves tweets older than the oldest one returned by the first call. To do this, specify "max_id = (smallest id from the previous call) - 1".

Repeat this until no more tweets are returned, or until the API limit reported by rate_limit_status is reached.

A simple sample is shown below.

#!/usr/bin/python
# -*- coding: utf-8 -*-
# Requires python_twitter 1.1 (Python 2.7)
import sys
import time

from twitter import Api

# Python 2 workaround so that non-ASCII tweet text can be printed.
reload(sys)
sys.setdefaultencoding('utf-8')

maxcount = 1000  # stop once this many tweets have been printed
maxid = 0        # smallest (oldest) tweet id seen so far; 0 = none yet
terms = ["Rin Yainaga", "Eirin", "Erin"]
search_str = " OR ".join(terms)

api = Api(base_url="https://api.twitter.com/1.1",
          consumer_key='XXXXX',
          consumer_secret='XXXXX',
          access_token_key='XXXXX',
          access_token_secret='XXXXX')


def print_rate_limit():
    # Show the remaining calls and the reset time for search/tweets.
    rate = api.GetRateLimitStatus()
    limit = rate['resources']['search']['/search/tweets']
    print "Limit %d / %d" % (limit['remaining'], limit['limit'])
    tm = time.localtime(limit['reset'])
    print "Reset Time  %02d:%02d" % (tm.tm_hour, tm.tm_min)


print_rate_limit()
print "-----------------------------------------\n"

# First call: the latest 100 tweets.
found = api.GetSearch(term=search_str, count=100, result_type='recent')
i = 0
while True:
    for f in found:
        # Track the oldest tweet id returned so far.
        if maxid > f.id or maxid == 0:
            maxid = f.id
        print f.text
        i = i + 1
    if len(found) == 0:
        break
    if maxcount <= i:
        break
    print maxid
    # Following calls: only tweets older than the oldest one seen so far.
    found = api.GetSearch(term=search_str, count=100,
                          result_type='recent', max_id=maxid - 1)

print "-----------------------------------------\n"
print_rate_limit()
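The sample stops after maxcount tweets or when a call returns nothing. The other stopping rule described earlier, stopping when the API call budget runs out, could be sketched as follows. This is not part of the original script: search_all, search_page and fake_search are hypothetical names, and fake_search is a stand-in for the real API so that the loop can be run without credentials.

```python
def search_all(search_page, max_count, max_calls):
    """Collect up to max_count tweets using at most max_calls API calls.

    search_page(max_id) is a hypothetical callable that returns a list of
    (tweet_id, text) tuples, newest first. max_id=None means no upper bound.
    """
    results = []
    max_id = None
    calls = 0
    while calls < max_calls and len(results) < max_count:
        page = search_page(max_id)
        calls += 1
        if not page:
            break  # nothing older is left
        results.extend(page)
        # Next page: strictly older than the oldest tweet seen so far.
        max_id = min(tweet_id for tweet_id, _ in page) - 1
    return results[:max_count]


# Stand-in for the real API: 250 fake tweets with ids 250..1.
FAKE_TWEETS = [(i, "tweet %d" % i) for i in range(250, 0, -1)]


def fake_search(max_id, page_size=100):
    matching = [t for t in FAKE_TWEETS if max_id is None or t[0] <= max_id]
    return matching[:page_size]


tweets = search_all(fake_search, max_count=1000, max_calls=5)
print(len(tweets))  # 250
```

In real use, search_page would wrap api.GetSearch, and max_calls would be taken from the "remaining" value reported by rate_limit_status.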

Extended version

A script that builds on the sample above can be downloaded from the following repository. https://github.com/mima3/searchTwitter

That script saves the search results in SQLite. On each run it searches past tweets until the API call limit is reached. The next time it is run, it behaves as follows.

If all searchable past tweets have already been retrieved:
Search for tweets newer than the tweets registered in the DB.

If past tweets remain:
Search for tweets older than the last acquired tweet.

With this script, a large number of search results can be easily obtained.
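The resume logic described above could look roughly like the sketch below. It is not taken from the repository: the table layout tweets(id, text), the function name next_search_params and the history_exhausted flag are all assumptions made for illustration, and the oldest stored id is assumed to be the last acquired tweet.

```python
import sqlite3


def next_search_params(conn, history_exhausted):
    """Decide which id bound the next search should use.

    Assumes a hypothetical table tweets(id INTEGER PRIMARY KEY, text TEXT).
    """
    row = conn.execute("SELECT MIN(id), MAX(id) FROM tweets").fetchone()
    oldest, newest = row
    if newest is None:
        return {}  # empty DB: start from the most recent tweets
    if history_exhausted:
        # All past tweets were retrieved: only look for newer ones.
        return {"since_id": newest}
    # Past tweets remain: continue toward older tweets.
    return {"max_id": oldest - 1}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany("INSERT INTO tweets VALUES (?, ?)",
                 [(101, "a"), (150, "b"), (199, "c")])

print(next_search_params(conn, history_exhausted=False))  # {'max_id': 100}
print(next_search_params(conn, history_exhausted=True))   # {'since_id': 199}
```

The returned dictionary is meant to be merged into the parameters of the next search call; since_id and max_id are the search API parameters for the lower and upper id bounds.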
