[PYTHON] Get comments and subscribed channels with the YouTube Data API

What I wanted to do

For the group of users who post comments on a specific channel, get the channels that each of those users subscribes to. Could the channel's audience be estimated this way? The assumption is that if several commenters subscribe to the same channel, the audience is likely to be people interested in that channel's topics. (I have not verified whether the audience can actually be estimated like this; if it works out, I will write it up separately.)

What I will explain

  1. How to get the comments attached to the videos posted on a specified channel
  2. How to get the channels that a specified channel subscribes to

Anything I already explained in an earlier post is not explained again here.

Description

1. How to get the comments attached to the videos posted on a specified channel

import requests
import pandas as pd

def get_comment_info(channel_id, pageToken):
    comment_url = "https://www.googleapis.com/youtube/v3/commentThreads"
    param = {
        'key': 'YOUR_BROWSER_KEY'  # replace with your own API key
        , 'allThreadsRelatedToChannelId': channel_id
        , 'part': 'replies,snippet'
        # 'replies' is only needed if you want the reply tree
        , 'maxResults': '50'
        , 'pageToken': pageToken
    }

    req = requests.get(comment_url, params=param)
    return req.json()

channel_id = "TARGET_CHANNEL_ID"  # replace with the channel whose comments you want
req = get_comment_info(channel_id, "")  # fetches the first page only

rows = []
for comment_thread in req.get("items", []):
    video_id = comment_thread["snippet"]["videoId"]

    author_name = comment_thread["snippet"]["topLevelComment"]["snippet"]["authorDisplayName"]
    author_channel = comment_thread["snippet"]["topLevelComment"]["snippet"]["authorChannelId"]["value"]
    rows.append([video_id, author_name, author_channel])

    # Replies exist only when 'replies' is included in 'part'
    if "replies" in comment_thread and "comments" in comment_thread["replies"]:
        for reply in comment_thread["replies"]["comments"]:
            author_name = reply["snippet"]["authorDisplayName"]
            author_channel = reply["snippet"]["authorChannelId"]["value"]
            rows.append([video_id, author_name, author_channel])

# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
comment_df = pd.DataFrame(rows, columns=["video_id", "author_name", "channel_id"])

The points are the following two

Also, the example above only collects the display name and channel_id of each commenter. If you also want the comment text, it can be read from the same snippet object of each comment (the textDisplay field).
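As a minimal sketch of reading the comment text, the function below pulls `textDisplay` out of each top-level comment alongside the author fields used earlier. The function name `extract_comment_rows` and the sample response are made up for illustration; the sample is hand-written to mimic the shape of a commentThreads response and is not real data.

```python
def extract_comment_rows(response):
    """Return (video_id, author_name, author_channel_id, text) per top-level comment.

    `response` is the parsed JSON of a commentThreads request.
    """
    rows = []
    for thread in response.get("items", []):
        snippet = thread["snippet"]["topLevelComment"]["snippet"]
        rows.append((
            thread["snippet"].get("videoId"),
            snippet["authorDisplayName"],
            # authorChannelId can be absent, so fall back to None
            snippet.get("authorChannelId", {}).get("value"),
            snippet["textDisplay"],
        ))
    return rows


# Hand-written sample in the shape of a commentThreads response (not real data)
sample_response = {
    "items": [
        {
            "snippet": {
                "videoId": "VIDEO_1",
                "topLevelComment": {
                    "snippet": {
                        "authorDisplayName": "Alice",
                        "authorChannelId": {"value": "UC_alice"},
                        "textDisplay": "Nice video!",
                    }
                },
            }
        }
    ]
}

comment_rows = extract_comment_rows(sample_response)
print(comment_rows)
```

Replies carry the same `textDisplay` field in their own `snippet`, so the reply loop can be extended in the same way.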

See the documentation for details and for slightly different usages. For example, here channel_id is specified to get the comments attached to all videos posted on that channel, but it is also possible to target a specific video only. https://developers.google.com/youtube/v3/docs/commentThreads?hl=ja
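To target a single video, the request takes a `videoId` parameter in place of `allThreadsRelatedToChannelId`. The sketch below only builds the parameter dictionary; the helper name `build_video_comment_params` and the placeholder values are assumptions for illustration.

```python
COMMENT_THREADS_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_video_comment_params(api_key, video_id, page_token=""):
    """Build query parameters for fetching the comment threads of one video."""
    return {
        "key": api_key,
        "videoId": video_id,          # filter by video instead of by channel
        "part": "replies,snippet",    # drop 'replies' if the reply tree is not needed
        "maxResults": "50",
        "pageToken": page_token,
    }


# Placeholder values; replace them with a real API key and video ID
params = build_video_comment_params("YOUR_BROWSER_KEY", "VIDEO_ID")
print(params)
```

The dictionary can then be passed to `requests.get(COMMENT_THREADS_URL, params=params)` exactly as in get_comment_info.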

2. How to get the channels that a specified channel subscribes to

def get_subscription_info(channel_id, pageToken):
    subscription_url = 'https://www.googleapis.com/youtube/v3/subscriptions'
    param = {
        'key': 'YOUR_BROWSER_KEY'  # replace with your own API key
        , 'channelId': channel_id
        , 'part': 'snippet'
        , 'maxResults': '50'
        , 'pageToken': pageToken
    }

    req = requests.get(subscription_url, params=param)
    return req.json()

# If you ran the code in section 1, the commenters' channel IDs can be taken from comment_df
channel_id_list = comment_df["channel_id"].unique()

rows = []
for channel_id in channel_id_list:

    # Passing an empty string to pageToken is the same as not specifying anything
    pageToken = ""
    while True:
        req = get_subscription_info(channel_id, pageToken)

        # An error response (for example, when subscriptions are private) has no "items"
        if "items" in req:
            for item in req["items"]:
                subscript_channel_name = item["snippet"]["title"]
                subscript_channel_id = item["snippet"]["resourceId"]["channelId"]
                rows.append([channel_id, subscript_channel_name, subscript_channel_id])

        # If the number of remaining items exceeds maxResults, nextPageToken will be returned.
        if "nextPageToken" in req:
            pageToken = req["nextPageToken"]
            print(channel_id, pageToken)
        else:
            break

# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
subscription_df = pd.DataFrame(
    rows, columns=["channel_id", "subscript_channel_name", "subscript_channel_id"])
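Once the subscription rows are collected, the idea from the beginning of this post (finding channels that several commenters subscribe to) can be sketched as a simple count. This is only an illustration under my own assumptions: the function name `count_shared_subscriptions`, the `min_commenters` threshold, and the sample rows are all made up, and whether this really estimates an audience is still unverified.

```python
from collections import defaultdict


def count_shared_subscriptions(rows, min_commenters=2):
    """Count how many distinct commenters subscribe to each channel.

    rows: iterable of (commenter_channel_id, subscribed_name, subscribed_channel_id).
    Returns (subscribed_channel_id, subscribed_name, commenter_count) tuples for
    channels shared by at least `min_commenters` commenters.
    """
    commenters_by_channel = defaultdict(set)
    names = {}
    for commenter_id, sub_name, sub_id in rows:
        commenters_by_channel[sub_id].add(commenter_id)
        names[sub_id] = sub_name

    result = [
        (sub_id, names[sub_id], len(commenters))
        for sub_id, commenters in commenters_by_channel.items()
        if len(commenters) >= min_commenters
    ]
    # Most widely shared channels first; ties are ordered by channel ID
    result.sort(key=lambda r: (-r[2], r[0]))
    return result


# Hand-made sample rows (not real data)
sample_rows = [
    ("UC_user_a", "Cooking Channel", "UC_cook"),
    ("UC_user_b", "Cooking Channel", "UC_cook"),
    ("UC_user_b", "Game Channel", "UC_game"),
    ("UC_user_c", "Cooking Channel", "UC_cook"),
    ("UC_user_c", "Game Channel", "UC_game"),
    ("UC_user_c", "Music Channel", "UC_music"),
]

shared = count_shared_subscriptions(sample_rows)
print(shared)
```

With the DataFrame from this section, the rows can be supplied as `subscription_df.itertuples(index=False, name=None)`, which yields plain tuples in column order.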

The get_subscription_info function uses the Subscriptions API, which appears in the Japanese documentation under the translated name "subcategory". https://developers.google.com/youtube/v3/docs/subscriptions?hl=ja

As for usage, it works much like getting videoId from playlistId, which I explained in an earlier post. https://qiita.com/miyatsuki/items/c221b48830db2b0a9eba#12-%E3%83%97%E3%83%AC%E3%82%A4%E3%83%AA%E3%82%B9%E3%83%88id%E3%81%8B%E3%82%89%E5%8B%95%E7%94%BB%E3%81%AEid%E3%82%92%E5%8F%96%E5%BE%97
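Since the nextPageToken loop looks the same for comments, subscriptions, and playlist items, it can be factored into one small generator. The sketch below is a hypothetical helper (`iter_items` is my own name, not part of any library); it takes a function that fetches one page for a given token, and is demonstrated with a fake two-page response instead of a real API call.

```python
def iter_items(fetch_page):
    """Yield every item across all pages of a paginated response.

    fetch_page(page_token) must return the parsed JSON response as a dict.
    """
    page_token = ""  # an empty token requests the first page
    while True:
        response = fetch_page(page_token)
        # Error responses have no "items", so fall back to an empty list
        yield from response.get("items", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            break


# Fake two-page "API" used only for this demonstration
fake_pages = {
    "": {"items": [{"id": 1}, {"id": 2}], "nextPageToken": "p2"},
    "p2": {"items": [{"id": 3}]},
}

all_items = list(iter_items(lambda token: fake_pages[token]))
print(all_items)
```

With the functions from this post, the same helper could be called as `iter_items(lambda token: get_comment_info(channel_id, token))` or `iter_items(lambda token: get_subscription_info(channel_id, token))`.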

In particular,
