[PYTHON] Get AKB48 Google+ Posts

I wrote a set of scripts that fetch the Google+ posts of AKB48 members. They also serve as an example of how to access Google+ from Python.

It consists of four scripts.

1. gactget.py: Accesses the Google+ API. You pass it a user ID and maxResults, the number of records fetched in one call. Because the number of API calls is limited, adjust this value as you watch how the quota is being used.

2. settings.py: Holds the authentication information in a separate source file, for safety. Keeping credentials apart from the main code like this seems to be the usual practice.

3. gidlist.py: Gets the member IDs from AKB48's Google+ member list page. It scrapes the page (cuts out the needed strings) with a module called BeautifulSoup.

4. gactprint.py: Uses the three scripts above to get and print activities (the Google+ counterpart of tweets on Twitter).

gactget.py



#!/usr/local/pythonz/ENV/Python-2.7.3/bin/python
# coding: utf-8

import apiclient.discovery
import httplib2
import settings
import logging
import sys

#logging.basicConfig()
logging.getLogger().setLevel(getattr(logging, 'ERROR'))

def build_service(credentials, http, api_key=None):
    # Authorize the HTTP object only when OAuth credentials are supplied.
    if credentials is not None:
        http = credentials.authorize(http)
    service = apiclient.discovery.build('plus', 'v1', http=http, developerKey=api_key)
    return service

def gact(ggtsid, maxr):
    # Fetch up to maxr public activities for the user ID ggtsid,
    # using only the API key (no OAuth credentials).
    httpUnauth = httplib2.Http()
    try:
        serviceUnauth = build_service(None, httpUnauth, settings.API_KEY)
    except Exception:
        print('build_service err')
        raise
    try:
        request = serviceUnauth.activities().list(userId=ggtsid, collection='public', maxResults=maxr)
    except Exception:
        print('serviceUnauth.activities().list err')
        raise

    # Errors raised by execute() (for example HttpError 403) go to the caller.
    activity = request.execute(httpUnauth)

    # Defensive: fall back to an empty list if the response has no 'items'.
    return activity.get('items', [])
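
gact() makes a single call, so it returns at most maxResults items. If more are wanted, the activities.list response carries a nextPageToken that can be passed back as pageToken, and every extra page costs one more call against the quota. The following is only a sketch of that loop, not part of the original scripts: collect_activities, fetch_page and the fake data are hypothetical names, and the API call is replaced by a stand-in so the example runs offline.

```python
def collect_activities(fetch_page, limit):
    """Follow nextPageToken until `limit` items are collected or pages run out.

    fetch_page(token) is a hypothetical callable returning one response dict;
    with the real client it would wrap an activities().list(...) request that
    passes pageToken=token.
    """
    items = []
    token = None
    while len(items) < limit:
        page = fetch_page(token)            # one API call per page
        items.extend(page.get('items', []))
        token = page.get('nextPageToken')
        if not token:                       # no further pages
            break
    return items[:limit]


# Fake three-page feed standing in for the API (made-up data).
FAKE_PAGES = {
    None: {'items': ['a1', 'a2'], 'nextPageToken': 'p2'},
    'p2': {'items': ['a3', 'a4'], 'nextPageToken': 'p3'},
    'p3': {'items': ['a5']},
}

calls = []

def fake_fetch(token):
    calls.append(token)                     # record how many calls were made
    return FAKE_PAGES[token]

result = collect_activities(fake_fetch, 3)
```

Here the loop stops after the second page because the limit has been reached, so only two calls are spent.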

settings.py



import os

# 1. Go to:
#    https://code.google.com/apis/console
# 2. Select your project.
# 3. Choose 'API Access'.
# 4. If you have not generated a client ID, do so.
# 5. Set your callback URL to:
#    http://localhost:8090

CLIENT_SECRET="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
CLIENT_ID="xxxxxxxxxxxxxxxxxxxxxxxxxxxx"
API_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
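
Another way to keep the keys out of the source is to read them from environment variables. This is a sketch of an alternative, not what the original settings.py does; load_setting and the GPLUS_* variable names are made up for this example.

```python
import os


def load_setting(name, default=""):
    """Return the environment variable `name`, or `default` if it is unset."""
    return os.environ.get(name, default)


# Hypothetical variable names; pick whatever fits your environment.
CLIENT_SECRET = load_setting("GPLUS_CLIENT_SECRET")
CLIENT_ID = load_setting("GPLUS_CLIENT_ID")
API_KEY = load_setting("GPLUS_API_KEY")
```

The rest of the code can keep using settings.API_KEY unchanged, since the module still exposes the same three names.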

gidlist.py


#!bin/python
#coding: utf-8

from BeautifulSoup import BeautifulSoup #@UnresolvedImport
import urllib

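def gidlist_make():
    # Fetch the member list page and collect the data-gplusid attribute
    # of every <li> element that has one.
    soup = BeautifulSoup(urllib.urlopen("http://www.google.com/intl/ja/+/project48/").read())
    gidlist = []
    for lisoup in soup.findAll("li"):
        try:
            gidlist.append(lisoup['data-gplusid'])
        except KeyError:
            # This <li> has no data-gplusid attribute, so skip it.
            continue

    return gidlist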
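
To see what the loop extracts without touching the network, here is a small offline sketch. It uses the standard library's HTMLParser instead of BeautifulSoup, and the HTML snippet and IDs are invented for illustration. The only thing taken from the script above is that member entries are li elements carrying a data-gplusid attribute.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class GplusIdParser(HTMLParser):
    """Collect the data-gplusid attribute of every <li> that has one."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != 'li':
            return
        for name, value in attrs:
            if name == 'data-gplusid':
                self.ids.append(value)


# Invented sample markup; the IDs are placeholders, not real accounts.
SAMPLE_HTML = """
<ul>
  <li data-gplusid="111">Member A</li>
  <li class="header">Team</li>
  <li data-gplusid="222">Member B</li>
</ul>
"""

parser = GplusIdParser()
parser.feed(SAMPLE_HTML)
ids = parser.ids
```

The li without the attribute is skipped, which is the same effect the try/except in gidlist_make() has.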

gactprint.py


#!bin/python
# coding: utf-8

import sys
import logging
import re

# gactget and gidlist are the scripts listed above.
import gactget
import gidlist

logging.getLogger().setLevel(getattr(logging, 'ERROR'))


if __name__=='__main__':
    for ggtsid in gidlist.gidlist_make():
        try:
            activities = gactget.gact(ggtsid,5)
        except Exception:
            err = sys.exc_info()[1]
            print(err)
            # Stop completely once the daily quota is used up;
            # every further call would fail the same way.
            if re.search('Daily Limit Exceeded', str(err)):
                print('break,Daily Limit Exceeded')
                print(ggtsid)
                break
            # Any other error: skip this member and go on to the next one.
            continue

        for activity in activities:
            print(activity['published'])
            print(activity['updated'])
            print(activity['actor']['displayName'])
            print(activity['actor']['id'])
            print(activity['object']['content'])
            if 'attachments' in activity['object']:
                print(activity['object']['attachments'][0]['url'])
            # Print the counts whether or not the post has an attachment.
            print(activity['object']['replies']['totalItems'])
            print(activity['object']['plusoners']['totalItems'])
            print(activity['object']['resharers']['totalItems'])
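
All of the printed fields come from one activity dictionary. The sketch below shows the shape the script relies on, using a hand-made sample (not real API output) and a hypothetical helper, summarize(), that tolerates a post without attachments.

```python
def summarize(activity):
    """Return the main fields gactprint.py prints, as a list of values."""
    obj = activity['object']
    attachments = obj.get('attachments', [])
    return [
        activity['published'],
        activity['actor']['displayName'],
        obj['content'],
        attachments[0].get('url') if attachments else None,
        obj['replies']['totalItems'],
        obj['plusoners']['totalItems'],
        obj['resharers']['totalItems'],
    ]


# Hand-made sample with placeholder values.
SAMPLE = {
    'published': '2012-01-01T00:00:00.000Z',
    'updated': '2012-01-01T00:00:00.000Z',
    'actor': {'displayName': 'Member A', 'id': '111'},
    'object': {
        'content': 'Hello',
        'replies': {'totalItems': 3},
        'plusoners': {'totalItems': 10},
        'resharers': {'totalItems': 1},
    },
}

summary = summarize(SAMPLE)
```

Because the sample has no attachments key, the URL slot is None while the three counts are still returned.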

[Design notes]

1. Suspend processing when the API call limit is exceeded
   If you exceed your daily API call limit, you get an error like the one below. In that case the script breaks out of the loop, because every further call would exceed the limit as well.
   <HttpError 403 when requesting https://www.googleapis.com/plus/v1/people/117147321771860727748/activities/public?alt=json&key=AIzaSyDgX5DQ0s8jnuMWKyWI0IfvO-YA8pNpNb4&maxResults=5 returned "Daily Limit Exceeded">

2. Limit the log output level
   Using apiclient.discovery produces warning messages. I added logging.getLogger().setLevel(getattr(logging, 'ERROR')) to keep them from appearing. Suppressing them causes no problem for the processing.

3. Automate ID acquisition
   This is the part I put the most effort into. If you want to add the IDs of people other than members, add code like the following after the for lisoup in soup.findAll("li"): loop in gidlist.py.
   gidlist.append('113474433041552257864') # Yasushi
   gidlist.append('108897254135232129896') # Soot
   gidlist.append('112435502021367429566') # Shinobu
   gidlist.append('113091703821013997975') # Kijima
   gidlist.append('103803814106571203433') # Kitagawa
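
The quota handling described in the list above can be isolated into a small function, which makes it easy to try without calling the API. This is a minimal sketch under stated assumptions: is_daily_limit_error, fetch_all and fake_fetch are hypothetical names, and the fake fetcher raises a plain Exception where the real client raises HttpError.

```python
def is_daily_limit_error(exc):
    """True if the error message says the daily quota is used up."""
    return 'Daily Limit Exceeded' in str(exc)


def fetch_all(user_ids, fetch):
    """Call fetch(uid) per user; skip ordinary failures, stop on quota errors."""
    results = {}
    for uid in user_ids:
        try:
            results[uid] = fetch(uid)
        except Exception as exc:
            if is_daily_limit_error(exc):
                break        # further calls would fail the same way
            continue         # unrelated error: move on to the next user
    return results


def fake_fetch(uid):
    # Stand-in for gactget.gact(); the error texts are simplified.
    if uid == 'bad':
        raise Exception('HttpError 404 "Not Found"')
    if uid == 'over':
        raise Exception('HttpError 403 "Daily Limit Exceeded"')
    return ['post by ' + uid]


results = fetch_all(['a', 'bad', 'b', 'over', 'c'], fake_fetch)
```

User 'c' is never requested, because the loop stops at the first quota error.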
