Get YouTube data in Python using the YouTube Data API

Introduction

Recently, I have been investigating what kinds of data can be obtained through APIs. I looked into how to get information such as the view and like counts of videos using the YouTube Data API and tried it out, so I am writing it down here as a memo.

References

I referred to the following when using the Youtube Data API.

- [Youtube video search with Python](https://ossyaritoori.hatenablog.com/entry/2018/01/22/Python%E3%81%A7Youtube%E5%8B%95%E7%94%BB%E6%A4%9C%E7%B4%A2)
- Overview of YouTube Data API
- YouTube Data API Reference

Preparing to use the API

YouTube Data API registration

To use the YouTube Data API, you first need a Google account. Follow the steps below to enable the YouTube Data API and get an API key.

- Go to Google Cloud Platform and create a new project.
- With the newly created project selected, go from "APIs and Services" to "Dashboard".
- From there, open the "API Library", search for "YouTube Data API v3", and open its page.
- Enable "YouTube Data API v3" on that page.
- On the "Authentication Information" screen (labeled "Credentials" in the English console), click the "Create Authentication Information" button to get the API key.

Install the library for calling the API from Python

Next, install the Google API client library for Python. It can be installed with pip as shown below.

pip install google-api-python-client

You are now ready to use the YouTube Data API.

Get video information with the YouTube Data API

In the following, we will get video information using the YouTube Data API. See the YouTube Data API Reference for the full list of fields that can be obtained.

Get videos with titles containing specific keywords in order of number of views

I have been hooked on board games lately, so let's get board game-related videos in descending order of view count.

from googleapiclient.discovery import build

YOUTUBE_API_KEY = 'Enter your API key'

youtube = build('youtube', 'v3', developerKey=YOUTUBE_API_KEY)

search_response = youtube.search().list(
    part='snippet',
    # Search query string
    q='Board game',
    # Sort results in descending order of view count
    order='viewCount',
    type='video',
).execute()

The script above returns video information in JSON format. Let's look at the entry for the most-viewed board game video.

search_response['items'][0]
{'kind': 'youtube#searchResult',
 'etag': '"p4VTdlkQv3HQeTEaXgvLePAydmU/0dlj0cjWp5akSv64R8VxJM--3Ok"',
 'id': {'kind': 'youtube#video', 'videoId': 'ASusE5qjoAg'},
 'snippet': {'publishedAt': '2019-05-31T11:58:15.000Z',
  'channelId': 'UCutJqz56653xV2wwSvut_hQ',
  'title': '[Aim for commercialization] Counter-literature! Gachinko board game making showdown!',
  'description': 'I no longer say "Please subscribe to the channel" or "Thank you for your high evaluation" on Tokai OnAir, but I said after all....',
  'thumbnails': {'default': {'url': 'https://i.ytimg.com/vi/ASusE5qjoAg/default.jpg',
    'width': 120,
    'height': 90},
   'medium': {'url': 'https://i.ytimg.com/vi/ASusE5qjoAg/mqdefault.jpg',
    'width': 320,
    'height': 180},
   'high': {'url': 'https://i.ytimg.com/vi/ASusE5qjoAg/hqdefault.jpg',
    'width': 480,
    'height': 360}},
  'channelTitle': 'Tokai OnAir',
  'liveBroadcastContent': 'none'}}

The top result was a Tokai OnAir video. They really are popular... However, the script above returns only 5 results per request (the default value of maxResults), and the search results do not include the actual view counts.
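
As a side note, the number of results per request can be raised with the maxResults parameter of search().list, which accepts values from 0 to 50. The following is only a minimal sketch and not the code used in this article: the helper names summarize_search_items and search_top_videos are made up for illustration, and youtube is assumed to be the client object created with build() above.

```python
def summarize_search_items(items):
    """Flatten search results into (videoId, channelTitle, title) tuples."""
    return [
        (item['id']['videoId'],
         item['snippet']['channelTitle'],
         item['snippet']['title'])
        for item in items
    ]


def search_top_videos(youtube, q, max_results=50):
    """Run a single search request sorted by view count.

    `youtube` is assumed to be the client returned by build().
    """
    response = youtube.search().list(
        part='snippet',
        q=q,
        order='viewCount',
        type='video',
        maxResults=max_results,  # the API accepts 0 to 50; the default is 5
    ).execute()
    return summarize_search_items(response['items'])
```

For example, search_top_videos(youtube, 'Board game') would return up to 50 tuples from one request instead of 5.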

Get information on many videos at once

I wrote a function that fetches many videos at once, extracts only the necessary fields from the responses, and puts them into a data frame:


import pandas as pd

# Fetch `num` pages of search results (5 items per page by default).
# The other parameters are passed straight to search().list().
def get_video_info(part, q, order, type, num):
    dic_list = []
    search_response = youtube.search().list(part=part, q=q, order=order, type=type)

    # Only 5 items are returned per request, so keep requesting the next page.
    for i in range(num):
        # list_next() returns None when there are no more pages
        if search_response is None:
            break
        output = search_response.execute()
        dic_list = dic_list + output['items']
        search_response = youtube.search().list_next(search_response, output)

    df = pd.DataFrame(dic_list)
    # Get the unique videoId of each video
    df1 = pd.DataFrame(list(df['id']))['videoId']
    # Keep only the snippet fields we need
    df2 = pd.DataFrame(list(df['snippet']))[['channelTitle','publishedAt','channelId','title','description']]
    ddf = pd.concat([df1, df2], axis=1)

    return ddf

Let's run the function above. This time, I will get 100 board game-related videos in descending order of view count.

df = get_video_info(part='snippet', q='Board game', order='viewCount', type='video', num=20)
df

In this way, information on 100 videos was collected into a data frame.


Get the number of views of a video

Next, get the view count of each video. This requires a different method from the one used earlier: videos().list instead of search().list. Fetch the statistics for each video and attach them to the data frame created above.

# Return the statistics (view, like, and comment counts, etc.) for a given videoId.
# Note that the API returns these counts as strings.
def get_statistics(id):
    statistics = youtube.videos().list(part='statistics', id=id).execute()['items'][0]['statistics']
    return statistics

# One request per video; each result becomes one row of statistics
df_static = pd.DataFrame(list(df['videoId'].apply(lambda x: get_statistics(x))))

df_output = pd.concat([df, df_static], axis=1)

df_output

In this way, I was able to get the view, like, and comment counts of each video.
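
Calling videos().list once per video works, but the id parameter also accepts a comma-separated list of video IDs, so the same statistics can be fetched with far fewer requests. Below is a rough sketch of that idea, assuming a limit of 50 IDs per request. The helper names chunked and get_statistics_batch are hypothetical, and youtube is again assumed to be the client created with build().

```python
def chunked(seq, size=50):
    """Yield consecutive slices of `seq` with at most `size` elements each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def get_statistics_batch(youtube, video_ids):
    """Return a {videoId: statistics dict} mapping.

    `youtube` is assumed to be the client returned by build().
    """
    stats = {}
    for chunk in chunked(list(video_ids), 50):
        response = youtube.videos().list(
            part='statistics',
            id=','.join(chunk),  # comma-separated list of video IDs
        ).execute()
        for item in response['items']:
            # In a videos().list response, item['id'] is the videoId string
            stats[item['id']] = item['statistics']
    return stats
```

With the 100 videos collected above, get_statistics_batch(youtube, df['videoId']) would issue 2 requests instead of 100. Videos that are missing from the response simply do not appear in the returned dictionary.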


Visualizing the data

Let's do a simple visualization. I plotted the cumulative view count per channel for the videos in the top 100 by view count.

# The API returns counts as strings, so convert them to numbers before summing
views = pd.to_numeric(df_output['viewCount'])
views.groupby(df_output['channelTitle']).sum().sort_values(ascending=False).plot(kind='bar', figsize=(25, 10), fontsize=20)


As expected, a single video from a popular YouTuber can have an outstanding view count, so the top of the ranking is dominated by well-known channels. Graphing the number of videos each channel has in the top 100 gives a different picture.
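
The code for this second graph is not shown in the article, so the following is only one possible way to count videos per channel with pandas. The function name videos_per_channel and the sample data are made up for illustration.

```python
import pandas as pd


def videos_per_channel(df):
    """Count how many rows (videos) each channel has, most frequent first."""
    return df['channelTitle'].value_counts()


# Tiny stand-in for df_output; only the channelTitle column matters here.
sample = pd.DataFrame({'channelTitle': ['A', 'B', 'A', 'C', 'A', 'B']})
counts = videos_per_channel(sample)
print(counts)

# With the real data, the chart could be drawn with:
# videos_per_channel(df_output).plot(kind='bar', figsize=(25, 10), fontsize=20)
```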


A channel I was not familiar with, "Gorgeous Video", came in first. It appears to be the YouTube channel of the entertainer Gorgeous, who seems to be actively posting board game videos.

Next

As shown here, the YouTube Data API makes it possible to get a variety of interesting data. I would like to keep experimenting with it.
