[PYTHON] Get a list of GA accounts, properties, and views as vertical data using API

This is the first post of a Qiita article. I think there are some points that cannot be reached, including the first time using the API, but thank you.

-[Overview / What you want to do](#Overview / What you want to do) -[Try this API (for non-engineers)](Try using # try-this-api for non-engineers) -[Development environment / module version](#Development environment module version) -[Preliminary work contents](# Preliminary work contents) -[Implementation code](#Implementation code) -[Future outlook](#Future outlook)

Overview / What you want to do

My company uses Google Analytics (GA) to measure the numbers on the site. There were some employees and I had been using it for more than a few years, but I couldn't handle the operation and maintenance properly. Various accounts and properties have been messed up.

On the GA management screen, it is possible to scroll through the list of accounts and properties. The goal of this time is to make the results available for viewing on spreadsheets.

Try this API (for non-engineers)

By using the API below, you can get the GA account data associated with the logged-in Chrome account in JSON format. Management API> Reference> Account: list

Even if you format the execution result, you will be able to do what you want to do to some extent. Convert JSON file to Excel I want to convert a JSON file to a CSV file

The first method could not be executed due to the version of Excel, and when I actually tried the second method, a problem occurred.

  1. The format of the output is not what you intended The structure of GA is Account> Properties> View. The files after csv conversion are output for the number of lines in the account. Also, because the number of properties varies from account to account, personally unintended data was generated.
account Property_1 View_1_1 View_1_2 ・ ・ ・ Property_2 ・ ・ ・
A Property 1 associated with A View 1 associated with property 1 View 2 associated with property 1 ・ ・ ・ Property 2 associated with A ・ ・ ・

What I originally wanted was the following vertical data.

account Property View
A Property 1 associated with A View 1 associated with property 1
A Property 1 associated with A View 2 associated with property 1
A ・ ・ ・ ・ ・ ・
A A Linked property 2 View 1 associated with property 2
A ・ ・ ・ ・ ・ ・
B Property 1 associated with B View 1 associated with property 1
  1. Is there a limit to the number of accounts that can be acquired? There were 59 accounts at runtime in the company, but only 49 were acquired. I messed with the numerical values of the parameters, but I tried to write the code to get the data, so I have not investigated the cause deeply. If you have any idea, I would appreciate it if you could comment.

Development environment / module version

Work contents in advance

I am working after adding the issued service account to the GA account I want to acquire. Originally I wanted to get a GA account linked to an in-house administrator account, but I don't know how to deal with it. ..

I had a lot of accounts and it was hard, so I worked with one hand while singing and survived. Lol

Implementation code

Below is the implementation code for the main subject.

Most of the implementation code is quoted from below. First Analytics API: Python Quickstart for Service Accounts

We use API to create df for each account, property and view, and append to empty df. Management API> Reference> Web Properties: list Management API> Reference> View (old profile): list

qiita.python


from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd
import numpy as np

def get_service(api_name, api_version, scopes, key_file_location):

    credentials = ServiceAccountCredentials.from_json_keyfile_name(
            key_file_location, scopes=scopes)
    service = build(api_name, api_version, credentials=credentials)
    return service

def get_account_propaty_view_info(service):
    accounts = service.management().accounts().list().execute()

    #Store the list of accounts in df
    account_df = pd.DataFrame(columns=['account_name', 'account_id'])

    for account in accounts.get('items', []):  
        account_array = np.array([[account.get('name'), account.get('id')]])
        df_account = pd.DataFrame(data=account_array,columns=account_df.columns)
        account_df = account_df.append(df_account, ignore_index=False)

    #Store a list of properties in df
    property_df = pd.DataFrame(columns=['account_number','account_id', 'property_name', 'property_id'])

    #Get the target property list from the df of the account
    for i in range(len(account_df)):
        properties = service.management().webproperties().list(accountId=account_df['account_id'].iloc[i]).execute()
        for property in properties.get('items', []):
            property_array = np.array([[i, account_df['account_id'].iloc[i], property.get('name'), property.get('id')]])
            df_property = pd.DataFrame(data=property_array,columns=property_df.columns)
            property_df = property_df.append(df_property, ignore_index=False)
        
    #Store the list of views in df
    view_df = pd.DataFrame(columns=['account_id', 'property_id', 'view_name', 'view_id'])

    #Get the target view list from the property df
    for i in range(len(property_df)):
        views  = service.management().profiles().list(accountId=property_df['account_id'].iloc[i],
        webPropertyId=property_df['property_id'].iloc[i] ).execute()
        for view in views.get('items', []):
            view_array = np.array([[property_df['account_id'].iloc[i], property_df['property_id'].iloc[i], view.get('name'), view.get('id')]])
            df_view = pd.DataFrame(data=view_array,columns=view_df.columns)

            view_df = view_df.append(df_view, ignore_index=False)
            
    #Try to combine the list of accounts, the list of properties, and the list of views into one df
    #Account due to processing convenience at the time of joining_If you are concerned about the position of number, please change the order of the columns.
    ga_account_list = pd.merge(pd.merge(account_df, property_df, on='account_id', how='inner'), view_df, on=['account_id', 'property_id'], how='inner')
    ga_account_list.to_csv('Csv file name to create', encoding='utf_8_sig')

def main():
    scope = 'https://www.googleapis.com/auth/analytics.readonly'
    key_file_location = 'JSON key file for the issued service account'

    service = get_service(
            api_name='analytics',
            api_version='v3',
            scopes=[scope],
            key_file_location=key_file_location)

    get_account_propaty_view_info(service)

if __name__ == '__main__':
    main()

Current output example The account ID is not a neat number, so I have an account_number instead.

account_name account_id account_number property_name property_id view_name view_id
A 10 0 property_1 100 view_1 1000
A 10 0 property_1 100 view_2 1001
A ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・
A 10 0 property_2 101 view_1 1100
A ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・ ・
B 20 1 property_1 200 view_1 2000

Future outlook

First of all, we plan to use this result to sort out unnecessary accounts and properties.

Based on that, I would like to create a group for each department in the company. Currently, GA authority is granted to individuals, but it is possible to grant authority for each group.

Mr. A, who is newly joining the company, is in the XX department, so the image is that he should raise the authority of □□. I want to aim for a future that is automated with the touch of a button.

I hope it will be helpful for those who have similar problems.

Recommended Posts

Get a list of GA accounts, properties, and views as vertical data using API
Get data using Ministry of Internal Affairs and Communications API
Twitter API: Get a list of accounts that you follow but are not followed back
Get Salesforce data using REST API
Automation of a research on geographical information such as store network using Python and Web API
Get Amazon data using Keep API # 1 Get data
I tried to get a list of AMI Names using Boto3
Get the column list & data list of CASTable
Get Youtube data in Python using Youtube Data API
[python] Get a list of instance variables
[Python] Get a list of folders only
Awareness of using Aurora Severless Data API
Get a list of articles posted by users with Python 3 Qiita API v2
[Python] Get all comments using Youtube Data API
How to get article data using Qiita API
Get a list of IAM users with Boto3
Get a list of Qiita likes by scraping
Get a large amount of Starbucks Twitter data with python and try data analysis Part 1
I made a list site of Kindle Prime Reading using Scrapy and GitHub Actions
I want to get custom data attributes of html as elements using Python Selenium
I tried to notify the update of "Become a novelist" using "IFTTT" and "Become a novelist API"
Get all songs of Arashi's song information using Spotify API and verify the index
Get the variable name of the variable as a character string.
List of Python libraries for data scientists and data engineers
Generate a vertical image of a novel from text data
Get LEAD data using Marketo's REST API in Python
[Python] Get insight data using Google My Business API
Get comments and subscribers with the YouTube Data API
I tried using the API of the salmon data project
Python: Get a list of methods for an object
Get a list of CloudWatch Metrics and a correspondence table for Unit units with Python boto
Get the trading price of virtual currency and create a chart with API of Zaif exchange