Try hitting the YouTube API in Python

This article is day 25 of the Fukushima National College of Technology Advent Calendar 2020. It is about using an API, so please do not abuse its contents.

Introduction

Recently I've been seeing more articles and videos that stress the importance of APIs, so I decided to try one out as part of studying Python. I'm still a beginner and can't write elegant code yet, so please bear with me.

Overview

To start, I'll call the `YouTube Data API`, which seems to be one of the more widely used APIs, to fetch data about YouTube channels and videos and analyze it. I assume the following preparations have already been made.

Advance preparation

- Obtain an API key for YouTube Data API v3
- Install Python and set up a development environment

Test implementation

First, let's try a few methods to get a rough feel for how the API is used. Note that API usage is limited by a daily quota of 10,000 units, and repeated testing can use it up surprisingly quickly.
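
To get a feel for how quickly the quota runs out, here is a rough estimate. The per-call costs are assumptions taken from the published quota cost table (100 units for `search.list`, 1 unit for `commentThreads.list`), so check the current table before relying on them. The names `QUOTA_COST` and `units_used` are made up for this illustration.

```python
# Rough quota estimate. The costs are assumptions based on the published
# quota cost table; verify them against the current documentation.
DAILY_QUOTA = 10_000
QUOTA_COST = {
    'search.list': 100,
    'commentThreads.list': 1,
}

def units_used(calls):
    """Return total quota units for a mapping of method name -> call count."""
    return sum(QUOTA_COST[method] * count for method, count in calls.items())

# One run of a hypothetical script that searches twice and fetches comments once.
per_run = units_used({'search.list': 2, 'commentThreads.list': 1})
runs_per_day = DAILY_QUOTA // per_run
print(per_run, runs_per_day)  # 201 49
```

Under these assumptions, searches dominate the cost, so caching search results while testing saves far more quota than trimming comment requests.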

Installation of required packages

First, install the package for using the YouTube Data API in Python.

Package installation


$ pip install google-api-python-client

Get YouTube channels matching a search query

As a starting point, implement a search for YouTube channels that match a search word.

getChannel.py


from googleapiclient.discovery import build  # 'apiclient' is a legacy alias of this package

API_KEY = '<API_KEY>' #Obtained API key
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey = API_KEY
)

SEARCH_QUELY = input('Query >> ')
response = youtube.search().list(
    q=SEARCH_QUELY, 
    part='id,snippet', 
    maxResults=10,
    type='channel').execute()

for item in response.get('items', []):
    print(item['snippet']['title'])

Replace `<API_KEY>` with the API key you obtained. When you run the script and enter a keyword, up to 10 matching channels are printed.
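
Hard-coding the key makes it easy to leak when sharing the script. One common alternative, sketched here, is to read it from an environment variable. The variable name `YOUTUBE_API_KEY` and the helper `load_api_key` are assumptions for this example, not something the API requires.

```python
import os

def load_api_key(env_var='YOUTUBE_API_KEY'):
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is unset or empty.
    """
    key = os.environ.get(env_var, '').strip()
    if not key:
        raise RuntimeError(f'Set the {env_var} environment variable to your API key')
    return key
```

With this in place, the script could use `API_KEY = load_api_key()` instead of a literal string.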

response = youtube.search().list(
    q=SEARCH_QUELY, 
    part='id,snippet', 
    maxResults=10,
    type='channel').execute()

This is the key part. You choose what information to fetch by passing parameters to the `search().list()` method. By changing the parameters you can fetch not only channels but also videos and playlists.

for item in response.get('items', []):
    print(item['snippet']['title'])

The data comes back in JSON format and is parsed into a Python dictionary, so `get` is used to pull out the items safely. See the YouTube Data API reference for the detailed format of the parameters and return values.
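
To make the shape of the response concrete, the sketch below runs the same kind of extraction against a hand-written dictionary instead of a live call. The sample values and the helper name `channel_titles` are made up; only the nesting (`items`, `id`, `snippet`) follows the structure used in the script above.

```python
# Hand-written sample shaped like a search.list response (values are made up).
sample_response = {
    'kind': 'youtube#searchListResponse',
    'items': [
        {
            'id': {'kind': 'youtube#channel', 'channelId': 'UC_example_1'},
            'snippet': {'title': 'Example Channel A', 'channelId': 'UC_example_1'},
        },
        {
            'id': {'kind': 'youtube#channel', 'channelId': 'UC_example_2'},
            'snippet': {'title': 'Example Channel B', 'channelId': 'UC_example_2'},
        },
    ],
}

def channel_titles(response):
    """Return (channelId, title) pairs, tolerating a missing 'items' key."""
    return [
        (item['id']['channelId'], item['snippet']['title'])
        for item in response.get('items', [])
    ]

print(channel_titles(sample_response))
print(channel_titles({}))  # no 'items' key, so the result is an empty list
```

Using `response.get('items', [])` rather than `response['items']` means an empty or unexpected response produces no output instead of a `KeyError`.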

Get video data of the specified channel

You can get the video information of a specific channel by specifying the ID of that channel.

getVideos.py


from googleapiclient.discovery import build  # 'apiclient' is a legacy alias of this package

API_KEY = '<API key>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
CHANNEL_ID = '<Channel ID>'

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey=API_KEY
)

response = youtube.search().list(
    part = "snippet",
    channelId = CHANNEL_ID,
    maxResults = 5,
    order = "date",
    type ='video'
    ).execute()

for item in response.get("items", []):
    print(item['snippet']['title'])

This is only a small modification of the previous code, with a few more parameters passed to the `search().list()` method. Specifying `channelId` returns videos from that channel, up to `maxResults` items. `order` controls how the response is sorted; `date` sorts from newest to oldest.
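
`maxResults` only caps a single response. To go beyond one page, the API returns a `nextPageToken` that is sent back as the `pageToken` parameter of the next request. The sketch below shows that loop against a fake `fetch_page` function so that it runs without an API key; `collect_items`, `fake_fetch_page`, and the sample pages are hypothetical. With the real client, `fetch_page` would wrap a `youtube.search().list(...)` call that passes the token along.

```python
def collect_items(fetch_page, max_pages=5):
    """Follow nextPageToken until it disappears or max_pages is reached."""
    items = []
    token = None
    for _ in range(max_pages):
        page = fetch_page(token)
        items.extend(page.get('items', []))
        token = page.get('nextPageToken')
        if token is None:
            break
    return items

# Fake two-page result set standing in for the real API (made-up data).
FAKE_PAGES = {
    None: {'items': ['video-1', 'video-2'], 'nextPageToken': 'PAGE2'},
    'PAGE2': {'items': ['video-3']},
}

def fake_fetch_page(token):
    return FAKE_PAGES[token]

print(collect_items(fake_fetch_page))  # ['video-1', 'video-2', 'video-3']
```

The `max_pages` limit is a deliberate safety valve: every extra page is another request that counts against the daily quota.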

Get comments on a video

Next, fetch the comments on a specific video.

getComments.py


import requests

URL = 'https://www.googleapis.com/youtube/v3/'
API_KEY = '<API_KEY>'
VIDEO_ID = '<Video ID>'

# Query parameters for the commentThreads endpoint
params = {
    'key': API_KEY,
    'part': 'snippet',
    'videoId': VIDEO_ID,
    'order': 'relevance',
    'textFormat': 'plainText',
    'maxResults': 100,
}

response = requests.get(URL + 'commentThreads', params=params)
resource = response.json()

for item in resource['items']:
    name = item['snippet']['topLevelComment']['snippet']['authorDisplayName']
    like_cnt = item['snippet']['topLevelComment']['snippet']['likeCount']
    text = item['snippet']['topLevelComment']['snippet']['textDisplay']
    print('User name: {}\n{}\nLike count: {}\n'.format(name, text, like_cnt))

As with channels, you can get the comments on a specific video by specifying its ID.


response = requests.get(URL + 'commentThreads', params=params)
resource = response.json()

The request is made with `requests.get`, which appends the specified parameters to the URL as a query string.
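
To see roughly what ends up being requested, the sketch below builds an equivalent URL by hand with the standard library. The key and video ID are dummy placeholders, and the exact encoding `requests` produces may differ in small details.

```python
from urllib.parse import urlencode

URL = 'https://www.googleapis.com/youtube/v3/'
params = {
    'key': 'DUMMY_KEY',        # placeholder, not a real key
    'part': 'snippet',
    'videoId': 'DUMMY_VIDEO',  # placeholder video ID
    'maxResults': 100,
}

# Endpoint name, then '?', then the parameters joined as key=value pairs.
full_url = URL + 'commentThreads?' + urlencode(params)
print(full_url)
```

Because the API key travels in the query string, it is worth taking care not to paste such URLs into logs or bug reports.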


for item in resource['items']:
    name = item['snippet']['topLevelComment']['snippet']['authorDisplayName']
    like_cnt = item['snippet']['topLevelComment']['snippet']['likeCount']
    text = item['snippet']['topLevelComment']['snippet']['textDisplay']
    print('User name: {}\n{}\nLike count: {}\n'.format(name, text, like_cnt))

As in the earlier examples, the response is JSON, so the necessary fields are extracted from it. Here the user name, text, and like count of each comment are fetched, but the reply count and child comments can be obtained as well.
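
As a sketch of reading the reply count and child comments, the example below works on a hand-written item shaped like a comment thread requested with `part` set to `snippet,replies`. All values are made up and `summarize_thread` is a hypothetical helper. The embedded replies may be only a subset of a long thread.

```python
# Hand-written sample shaped like one commentThreads item requested with
# part='snippet,replies' (all values are made up).
sample_thread = {
    'snippet': {
        'totalReplyCount': 2,
        'topLevelComment': {
            'snippet': {'authorDisplayName': 'viewer_a', 'likeCount': 5,
                        'textDisplay': 'Great video!'},
        },
    },
    'replies': {
        'comments': [
            {'snippet': {'authorDisplayName': 'viewer_b', 'textDisplay': 'Agreed.'}},
            {'snippet': {'authorDisplayName': 'viewer_c', 'textDisplay': 'Thanks!'}},
        ],
    },
}

def summarize_thread(thread):
    """Return the reply count and (author, text) pairs for the child comments."""
    reply_count = thread['snippet']['totalReplyCount']
    # 'replies' is absent when a thread has no replies or the part was not requested.
    replies = thread.get('replies', {}).get('comments', [])
    return reply_count, [
        (r['snippet']['authorDisplayName'], r['snippet']['textDisplay'])
        for r in replies
    ]

print(summarize_thread(sample_thread))
```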

Data analysis

Now that we understand how to use the API, let's try some light data analysis. The analysis is simple: based on the code above, fetch the comments on a specific video and output them as CSV. Specifically, the process is:

- Enter a search word to get related channels
- Specify a channel to get its videos
- Specify a video to get its comments
- Export to CSV

I'll implement this in youtube_api.py.

Get channel

Enter a search word, then number the titles of the related channels and display them as a list.

youtube_api.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-

import requests
import pandas as pd
from googleapiclient.discovery import build  # 'apiclient' is a legacy alias

URL = 'https://www.googleapis.com/youtube/v3/'
API_KEY = '<API_KEY>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
SEARCH_QUELY =''

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey = API_KEY
)

def getChannel():
    channel_list = []
    num = 0
    search_res = youtube.search().list(
        q=SEARCH_QUELY, 
        part='id,snippet', 
        maxResults=10,
        type='channel',
        order='rating'
    ).execute()

    for item in search_res.get('items', []):
        num += 1
        channel_dict = {'num':str(num),'type':item['id']['kind'],'title':item['snippet']['title'],'channelId':item['snippet']['channelId']}
        channel_list.append(channel_dict)
    
    print('***Channel list***')
    for data in channel_list:
        print("Channel " + data["num"] + " : " + data["title"])
    print('******************')

    return getId(input('Channel Number>> '),channel_list)

Here is a brief explanation. The parameters of `search().list()` are set to fetch up to 10 related channels in descending order of rating. Each resource's title and channel ID are stored in a dictionary, along with `num`, a number used to pick a specific channel from the list. The dictionaries are stored in a list. When you enter the number of the channel you want, the function returns its channel ID.
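
The number-to-ID lookup described above can be sketched on its own as follows. This is a simplified stand-in rather than the article's own helper: `pick_id` and the sample list are hypothetical, and it returns `None` for input that is not a valid number so the caller can ask again.

```python
def pick_id(raw_number, items, id_key):
    """Return items[n-1][id_key] for a 1-based number typed by the user, or None."""
    try:
        index = int(raw_number) - 1
    except ValueError:
        return None
    if 0 <= index < len(items):
        return items[index][id_key]
    return None

# Made-up channel list in display order.
channels = [
    {'title': 'Example Channel A', 'channelId': 'UC_example_1'},
    {'title': 'Example Channel B', 'channelId': 'UC_example_2'},
]
print(pick_id('2', channels, 'channelId'))    # UC_example_2
print(pick_id('9', channels, 'channelId'))    # None
print(pick_id('abc', channels, 'channelId'))  # None
```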

Video acquisition

Next, add code that gets the videos for the specified channel ID and displays them.

youtube_api.py


def getVideos(_channelId):
    video_list = []
    num = 0
    video_res = youtube.search().list(
        part = 'snippet',
        channelId = _channelId,
        maxResults = 50,  # search.list accepts values from 0 to 50
        type = 'video',
        order = 'date'
    ).execute()
    
    for item in video_res.get("items",[]):
        num += 1
        video_dict = {'num':str(num),'type':item['id']['kind'],'title':item['snippet']['title'],'videoId':item['id']['videoId']}
        video_list.append(video_dict)

    print('***Video list***')
    for data in video_list:
        print("Video " + data["num"] + " : " + data["title"])
    print('****************')

    return getId(input('Video Number>> '),video_list)

This does the same thing as the channel step, so I'll skip the explanation.

Get comments

Add more code to get comments from the video.

youtube_api.py


def getComments(_videoId):
    comment_list = []
    params = {
        'key': API_KEY,
        'part': 'snippet',
        'videoId': _videoId,
        'order': 'relevance',
        'textFormat': 'plainText',
        'maxResults': 100,
    }

    response = requests.get(URL + 'commentThreads', params=params)
    resource = response.json()

    # 'items' is missing when the API returns an error (e.g. comments are disabled)
    for item in resource.get('items', []):
        comment = item['snippet']['topLevelComment']['snippet']
        comment_list.append([comment['authorDisplayName'],
                             comment['likeCount'],
                             comment['textDisplay']])
    return comment_list

The user name, like count, and text of up to 100 comments on the video specified by the video ID are stored in the list.
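
When a request fails, for example because comments are disabled on the video, the JSON body carries an `error` object instead of `items`. The sketch below separates that case from a normal response. The helper name `extract_comments` and both sample payloads are made up, and the error fields shown (`code`, `message`) are an assumption about the usual error layout.

```python
def extract_comments(resource):
    """Return [author, like_count, text] rows from a commentThreads response.

    Raises RuntimeError if the payload carries an 'error' object instead of items.
    """
    if 'error' in resource:
        err = resource['error']
        raise RuntimeError(f"API error {err.get('code')}: {err.get('message')}")
    rows = []
    for item in resource.get('items', []):
        top = item['snippet']['topLevelComment']['snippet']
        rows.append([top['authorDisplayName'], top['likeCount'], top['textDisplay']])
    return rows

# Made-up payloads for illustration.
ok_payload = {'items': [{'snippet': {'topLevelComment': {'snippet': {
    'authorDisplayName': 'viewer_a', 'likeCount': 3, 'textDisplay': 'Nice!'}}}}]}
error_payload = {'error': {'code': 403, 'message': 'Comments are disabled (sample message).'}}

print(extract_comments(ok_payload))
try:
    extract_comments(error_payload)
except RuntimeError as exc:
    print(exc)
```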

CSV output

The comments are simply loaded into a DataFrame and written out.

youtube_api.py


def dataList(_comment_list):
    if _comment_list:
        param = ['User name', 'Like count', 'text']
        df = pd.DataFrame(data=_comment_list, columns=param)
        df.to_csv("comments.csv")
        print('Output csv')
    else:
        print('No comments')
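
If pandas is not available, the same export can be approximated with the standard `csv` module. This is a swapped-in alternative, not the method used in this article; `comments_to_csv` and the sample rows are made up. It writes to an in-memory buffer so the result can be inspected without creating a file.

```python
import csv
import io

def comments_to_csv(comment_list, stream):
    """Write [user name, like count, text] rows to a text stream as CSV."""
    writer = csv.writer(stream)
    writer.writerow(['User name', 'Like count', 'text'])
    writer.writerows(comment_list)

# Made-up rows; note the comma and the newline inside the comment text.
rows = [
    ['viewer_a', 3, 'Nice video, thanks!'],
    ['viewer_b', 0, 'First line\nsecond line'],
]
buffer = io.StringIO()
comments_to_csv(rows, buffer)
print(buffer.getvalue())
```

To write a real file instead, pass a file opened with `open('comments.csv', 'w', newline='', encoding='utf-8')` as the stream.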

Whole code

youtube_api.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-

import requests
import pandas as pd
from googleapiclient.discovery import build  # 'apiclient' is a legacy alias

URL = 'https://www.googleapis.com/youtube/v3/'
API_KEY = '<API_KEY>'
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
SEARCH_QUELY =''

youtube = build(
    YOUTUBE_API_SERVICE_NAME,
    YOUTUBE_API_VERSION,
    developerKey = API_KEY
)

def run():
    global SEARCH_QUELY
    SEARCH_QUELY = input('Search word>> ')
    dataList(getComments(getVideos(getChannel())))

def getId(_num,_items):
    for data in _items:
        if data['num'] == _num:
            if data['type'] == 'youtube#channel':
                return data['channelId']
            else:
                return data['videoId']
    return ''

def getChannel():
    channel_list = []
    num = 0
    search_res = youtube.search().list(
        q=SEARCH_QUELY, 
        part='id,snippet', 
        maxResults=10,
        type='channel',
        order='rating'
    ).execute()

    for item in search_res.get('items', []):
        num += 1
        channel_dict = {'num':str(num),'type':item['id']['kind'],'title':item['snippet']['title'],'channelId':item['snippet']['channelId']}
        channel_list.append(channel_dict)
    
    print('***Channel list***')
    for data in channel_list:
        print("Channel " + data["num"] + " : " + data["title"])
    print('******************')

    return getId(input('Channel Number>> '),channel_list)

def getVideos(_channelId):
    video_list = []
    num = 0
    video_res = youtube.search().list(
        part = 'snippet',
        channelId = _channelId,
        maxResults = 50,  # search.list accepts values from 0 to 50
        type = 'video',
        order = 'date'
    ).execute()
    
    for item in video_res.get("items",[]):
        num += 1
        video_dict = {'num':str(num),'type':item['id']['kind'],'title':item['snippet']['title'],'videoId':item['id']['videoId']}
        video_list.append(video_dict)

    print('***Video list***')
    for data in video_list:
        print("Video " + data["num"] + " : " + data["title"])
    print('****************')

    return getId(input('Video Number>> '),video_list)

def getComments(_videoId):
    comment_list = []
    params = {
        'key': API_KEY,
        'part': 'snippet',
        'videoId': _videoId,
        'order': 'relevance',
        'textFormat': 'plainText',
        'maxResults': 100,
    }

    response = requests.get(URL + 'commentThreads', params=params)
    resource = response.json()

    # 'items' is missing when the API returns an error (e.g. comments are disabled)
    for item in resource.get('items', []):
        comment = item['snippet']['topLevelComment']['snippet']
        comment_list.append([comment['authorDisplayName'],
                             comment['likeCount'],
                             comment['textDisplay']])
    return comment_list

def dataList(_comment_list):
    if _comment_list:
        param = ['User name', 'Like count', 'text']
        df = pd.DataFrame(data=_comment_list, columns=param)
        df.to_csv("comments.csv")
        print('Output csv')
    else:
        print('No comments')

#Run
run()

Run

Now let's run it. Run `youtube_api.py` and enter a suitable search word.

When the channel list appears, specify the channel number. Next, specify the video number. If the CSV is written without problems, it worked. Good job.

This time the processing was simply extracting comments, but graphing channel and video data also looks interesting. With another API, it should also be possible to run sentiment analysis on the comment text and detect hostile comments. The code used here is on GitHub, so please refer to it.

Editor's Note

I actually wanted to write about something else, but due to time constraints this ended up as a thin article that only calls the API. Still, I don't think there is any harm in getting better at using APIs. If you have any corrections or advice on this article, I would appreciate hearing them.

References

GitHub https://github.com/Milkly-D/youtube_API.git
