Get information on the 100 most influential tech Twitter users in the world with Python

This code uses Beautiful Soup to extract the Twitter handles from the Business Insider article "The 100 Most Influential Tech People On Twitter", then fetches each account's information with the Twitter REST API.

The 100 Most Influential Tech People On Twitter

The page looks like this (screenshot: 100-influential-compressor.png).

# coding: utf-8

import json

import requests
from bs4 import BeautifulSoup
from requests_oauthlib import OAuth1Session

# Download the article and parse it; naming the parser avoids a bs4 warning.
res = requests.get("http://www.businessinsider.com/100-influential-tech-people-on-twitter-2014-4?op=1")
soup = BeautifulSoup(res.text, "html.parser")
count = 100  # the article counts down from rank 100

user_list = []
user = {}
for line in soup.body.get_text().split('\n'):
    line = line.replace('\xa0', ' ')  # non-breaking spaces from the HTML
    if 'Occupation:' in line:
        # An 'Occupation:' line starts a new profile; store the previous one.
        if user:
            user_list.append(user)
            user = {}
        print(count, line)
        user['rank'] = count
        user['occupation'] = line.replace('Occupation:', '').strip()
        count -= 1
    # Check 'Tech PI:' before 'PI:' (a substring of it) and 'Why:' before '@',
    # so that a mention inside the description is not taken for the handle.
    for label, key in [('Tech PI:', 'tech_pi'), ('PI:', 'pi'),
                       ('Why:', 'why'), ('@', 'handle')]:
        if label in line:
            print('   ', line)
            value = line.replace(label, '')
            if key == 'handle':
                value = value.replace('Handle:', '')
            user[key] = value.strip()
            break

if user:  # the last profile is not followed by another 'Occupation:' line
    user_list.append(user)

handle_list = [d['handle'] for d in user_list if 'handle' in d]


KEYS = {  # set the keys issued for your account
    'consumer_key': '**********',
    'consumer_secret': '**********',
    'access_token': '**********',
    'access_secret': '**********',
}

twitter = OAuth1Session(KEYS['consumer_key'], KEYS['consumer_secret'],
                        KEYS['access_token'], KEYS['access_secret'])

url = 'https://api.twitter.com/1.1/users/lookup.json'
params = {'screen_name': ','.join(handle_list)}

req = twitter.get(url, params=params)
# Keep the API result separate from the scraped user_list.
twitter_users = json.loads(req.text)

for u in twitter_users:
    d_data = json.dumps(u, sort_keys=True, indent=4)
    print(d_data)
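
A note on the lookup request: as far as I know, users/lookup in API v1.1 accepts at most 100 screen names per call, so the 100 handles here should fit in a single request. If the list ever grows, it could be split into batches first. The following is only a sketch under that assumption; `chunked` and `lookup_params` are hypothetical helper names, not part of requests or requests_oauthlib, and the handles in the usage part are made up.

```python
def chunked(items, size=100):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lookup_params(handles, size=100):
    """Build one `params` dict per batch of screen names."""
    return [{'screen_name': ','.join(batch)} for batch in chunked(handles, size)]


if __name__ == '__main__':
    handles = ['user{}'.format(i) for i in range(250)]  # made-up handles
    for params in lookup_params(handles):
        # Each dict could be passed as twitter.get(url, params=params).
        print(len(params['screen_name'].split(',')))
```

Each resulting dict would be sent as its own request, and the returned lists concatenated.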

The scraping loop prints output like this.

100 Occupation: CEO/founder of News Corporation; Creator of FOX Broadcasting
    Handle: @rupertmurdoch
    Why: See how tech fits into the greater news cycle from Rupert himself. Yeah, he writes his own tweets.
    Tech PI: 83
    PI: 86
99 Occupation: Assistant professor at the University of North Carolina, Chapel Hill with her own tech site at www.technosociology.org
    Handle: @zeynep
    Why: Catch Zeynep's musings on everything ranging from international Web policies to  social justice.
    Tech PI: 84
    PI: 77
98 Occupation: Data Scientist in Residence at Accel, Scientist Emeritus at bitly, co-founder of HackNY, co-host of DataGotham, and member of NYCResistor
    Handle: @hmason
    Why: Hilary is on top of the chatter when it comes to today's tech news.
    Tech PI: 84
    PI: 77
・
・
・
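
The scraping loop can also be written as a function that takes lines of text, which makes it possible to try it against the sample output above without touching the network. This is a sketch, not the original code: `parse_profiles` is a hypothetical name, and it assumes every field line starts with one of the labels printed above.

```python
def parse_profiles(lines, start_rank=100):
    """Group labelled lines into one dict per person, counting ranks down."""
    # 'Tech PI:' is listed separately from 'PI:'; startswith keeps them apart.
    labels = [('Handle:', 'handle'), ('Why:', 'why'),
              ('Tech PI:', 'tech_pi'), ('PI:', 'pi')]
    profiles, current, rank = [], {}, start_rank
    for raw in lines:
        line = raw.replace('\xa0', ' ').strip()
        if 'Occupation:' in line:
            if current:
                profiles.append(current)
            value = line.split('Occupation:', 1)[1].strip()
            current = {'rank': rank, 'occupation': value}
            rank -= 1
            continue
        if not current:
            continue  # ignore text before the first profile
        for label, key in labels:
            if line.startswith(label):
                value = line[len(label):].strip()
                current[key] = value.lstrip('@') if key == 'handle' else value
                break
    if current:  # the last profile has no following 'Occupation:' line
        profiles.append(current)
    return profiles


if __name__ == '__main__':
    sample = [
        '100 Occupation: CEO/founder of News Corporation; Creator of FOX Broadcasting',
        '    Handle: @rupertmurdoch',
        '    Tech PI: 83',
        '    PI: 86',
        '99 Occupation: Assistant professor at the University of North Carolina, Chapel Hill with her own tech site at www.technosociology.org',
        '    Handle: @zeynep',
        '    Tech PI: 84',
        '    PI: 77',
    ]
    for profile in parse_profiles(sample):
        print(profile['rank'], profile['handle'], profile['tech_pi'], profile['pi'])
```

Appending the pending record after the loop matters: without it the final person on the page would be dropped.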

The REST API returns each account's information as JSON.

    "contributors_enabled": false, 
    "created_at": "Sat Dec 31 18:29:24 +0000 2011", 
    "default_profile": true, 
    "default_profile_image": false, 
    "description": "", 
    "entities": {
        "description": {
            "urls": []
        }
    }, 
    "favourites_count": 13, 
    "follow_request_sent": false, 
    "followers_count": 570445, 
    "following": false, 
    "friends_count": 96, 
    "geo_enabled": false, 
    "id": 451586190, 
    "id_str": "451586190", 
    "is_translation_enabled": false, 
    "is_translator": false, 
    "lang": "en", 
    "listed_count": 7145, 
    "location": "", 
    "name": "Rupert Murdoch ", 
    "notifications": false, 
    "profile_background_color": "C0DEED", 
    "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png", 
    "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png", 
    "profile_background_tile": false, 
    "profile_image_url": "http://pbs.twimg.com/profile_images/1732184156/Twitter_normal.jpg", 
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/1732184156/Twitter_normal.jpg", 
    "profile_link_color": "0084B4", 
    "profile_location": null, 
    "profile_sidebar_border_color": "C0DEED", 
    "profile_sidebar_fill_color": "DDEEF6", 
    "profile_text_color": "333333", 
    "profile_use_background_image": true, 
    "protected": false, 
    "screen_name": "rupertmurdoch", 
    "status": {
        "contributors": null, 
        "coordinates": null, 
        "created_at": "Fri Apr 10 12:33:22 +0000 2015", 
        "entities": {
            "hashtags": [], 
            "symbols": [], 
            "urls": [], 
            "user_mentions": []
        }, 
        "favorite_count": 63, 
        "favorited": false, 
        "geo": null, 
        "id": 586507259578032128, 
        "id_str": "586507259578032128", 
        "in_reply_to_screen_name": null, 
        "in_reply_to_status_id": null, 
        "in_reply_to_status_id_str": null, 
        "in_reply_to_user_id": null, 
        "in_reply_to_user_id_str": null, 
        "lang": "en", 
        "place": null, 
        "retweet_count": 89, 
        "retweeted": false, 
        "source": "<a href=\"http://twitter.com/#!/download/ipad\" rel=\"nofollow\">Twitter for iPad</a>", 
        "text": "Guardian today suggests my dad's expose of Gallipoli fiasco led to my anti-establishment views.  Maybe, but confirmed by many later \nevents.", 
        "truncated": false
    }, 
    "statuses_count": 1423, 
    "time_zone": null, 
    "url": null, 
    "utc_offset": null, 
    "verified": true
・
・
・
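
To work with the response, the fields of interest can be picked out of each user object and joined back to the scraped ranking by handle. The sketch below assumes the scraped records are dicts with `rank` and `handle` keys; `merge_profiles` is a hypothetical helper, and the choice of `name`, `followers_count` and `verified` is only an example. Handles are compared case-insensitively on the assumption that the article's spelling may differ in case from `screen_name`.

```python
def merge_profiles(scraped, api_users):
    """Attach selected API fields to each scraped record, matched by handle."""
    by_name = {u['screen_name'].lower(): u for u in api_users}
    merged = []
    for record in scraped:
        api = by_name.get(record['handle'].lower())
        if api is None:
            continue  # no user object came back for this handle
        row = dict(record)
        row.update(name=api['name'].strip(),
                   followers_count=api['followers_count'],
                   verified=api['verified'])
        merged.append(row)
    return merged


if __name__ == '__main__':
    # Values copied from the sample output and JSON shown above.
    scraped = [{'rank': 100, 'handle': 'rupertmurdoch'}]
    api_users = [{'screen_name': 'rupertmurdoch', 'name': 'Rupert Murdoch ',
                  'followers_count': 570445, 'verified': True}]
    for row in merge_profiles(scraped, api_users):
        print(row['rank'], row['name'], row['followers_count'])
```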
