[Python] I tried searching videos with the YouTube Data API (beginner)

Introduction

Reading is important when studying data analysis, but hands-on practice matters most, so I went looking for good data to practice on. To be honest, I can't judge whether YouTube data is good data. However, I watch YouTube often and it is an area I'm interested in, so I would like to summarize how to use the "YouTube Data API", with the goal of being able to extract data for analysis. I used the following page (the API reference) to learn the API. https://developers.google.com/youtube/v3/docs?hl=ja

Search process

This time, as a starting point, I search for videos under the following conditions and output the results to a CSV file.

- Search for videos by keyword (the keyword is given as the first command-line argument)
- Sort the search results in descending order of view count

In addition, the script counts how many of the returned videos belong to each channel (a frequency distribution) and outputs that to a second CSV file.
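As a rough illustration of the per-channel count described above, here is a minimal sketch that uses only the standard library. The channel names are hypothetical placeholders, and collections.Counter stands in for the pandas value_counts() call that the actual script uses.

```python
from collections import Counter

# Hypothetical channel titles standing in for real search results.
channel_titles = ["Channel A", "Channel B", "Channel A", "Channel C", "Channel A"]

# most_common() returns (title, count) pairs, most frequent first.
channel_counts = Counter(channel_titles).most_common()

for title, count in channel_counts:
    print(f"{title}\t{count}")
```

The script itself gets the same kind of tally from the channelTitle column of its DataFrame.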

Source code

The source is as follows. Set the variable "DEVELOPER_KEY" in the program to your own API key. How to obtain an API key is omitted here.
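If you would rather not write the key into the source file, one option is to read it from an environment variable. The following is only a sketch: the variable name YOUTUBE_API_KEY and the helper load_api_key are names I made up for illustration, not something the API requires.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key
```

With this helper, the assignment in the script could become DEVELOPER_KEY = load_api_key(), which fails early with a clear message when the variable is not set.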

searchKeyword.py


# Import libraries (googleapiclient is the current name of the apiclient package)
from googleapiclient.discovery import build
import argparse
import pandas as pd

# Set YouTube Data API key
DEVELOPER_KEY = "YOUR API KEY!!!"
YOUTUBE_API_SERVICE_NAME = "youtube"
YOUTUBE_API_VERSION = "v3"

def searchKeyword(options):
    #Keyword search process
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, 
                    developerKey=DEVELOPER_KEY)
    searchResults = youtube.search().list(q=options.sw,
                                        type="video",
                                        part="id,snippet",
                                        maxResults=options.max_results,
                                        order="viewCount"
                                        ).execute()
    
    #Search result classification processing
    videos = []
    others = []
    for searchResult in searchResults["items"]:
        if (searchResult["id"]["kind"] == "youtube#video"):
            videos.append(searchResult)
        else :
            others.append(searchResult)

    #Video, channel information formatting, csv file output
    videoTitles = []
    viewCounts = []
    likeCounts = []
    dislikeCounts = []
    favoriteCounts = []
    commentCounts =[]
    videoChannelTitles = []
    stat_list = [viewCounts, likeCounts, dislikeCounts, favoriteCounts, commentCounts]
    stat_keywords = ['viewCount', 'likeCount', 'dislikeCount', 'favoriteCount', 'commentCount']
    for video in videos:
        # Fetch the statistics and snippet for this video (no space in the part list)
        videoDetail = youtube.videos().list(part="statistics,snippet",
                                            id=video["id"]["videoId"]
                                            ).execute()
        channelDetail = youtube.channels().list(part="snippet", 
                                                id=videoDetail["items"][0]["snippet"]["channelId"]
                                                ).execute()
        
        videoTitles.append(videoDetail["items"][0]["snippet"]["title"])
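        # Statistics arrive as strings and a key can be missing from the response,
        # so convert to int and fall back to 0.
        for stat, stat_keyword in zip(stat_list, stat_keywords):
            try:
                stat.append(int(videoDetail["items"][0]["statistics"][stat_keyword]))
            except KeyError:
                stat.append(0)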
        videoChannelTitles.append(channelDetail["items"][0]["snippet"]["title"])

    df_videos = pd.DataFrame({"title":videoTitles, "ViewCount":viewCounts, 
                            "channelTitle":videoChannelTitles,"likeCount":likeCounts,
                            "dislikeCount":dislikeCounts, "favoriteCount":favoriteCounts,
                            "commentCount":commentCounts})
    df_videos.to_csv("Search_result_{}.csv".format(options.sw),encoding="utf-8_sig")
    df_videos_countbyChannel = df_videos["channelTitle"].value_counts()
    df_videos_countbyChannel.to_csv("ChannelTitle_{}.csv".format(options.sw),encoding="utf-8_sig")

    return df_videos, df_videos_countbyChannel
    


if __name__ == "__main__":
    # parse Argument
    parser = argparse.ArgumentParser("search Youtube Program...")
    parser.add_argument("sw", help="search Keyword in Youtube")
    parser.add_argument("--max_results", type=int, help="max of search results",
                        default=50)
    options = parser.parse_args()

    searchKeywordResults = searchKeyword(options)
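The loop in searchKeyword() sends one videos().list request per video. The id parameter of videos().list accepts a comma-separated list of video IDs, so the details could probably be fetched in fewer requests. The sketch below shows one way to do that. The helper names chunk_ids and fetch_video_details are my own, and the batch size of 50 is an assumption about the per-request limit rather than something taken from the script above.

```python
def chunk_ids(video_ids, size=50):
    """Split video IDs into comma-joined strings of at most `size` IDs each."""
    return [",".join(video_ids[i:i + size]) for i in range(0, len(video_ids), size)]


def fetch_video_details(youtube, video_ids, size=50):
    """Fetch statistics and snippet for many videos, one request per chunk.

    `youtube` is the client object returned by build(); `size` is an
    assumed upper bound on the number of IDs per request.
    """
    items = []
    for id_string in chunk_ids(video_ids, size):
        response = youtube.videos().list(part="statistics,snippet",
                                         id=id_string).execute()
        items.extend(response.get("items", []))
    return items
```

Inside searchKeyword(), the list of IDs could be built with [v["id"]["videoId"] for v in videos] and passed to fetch_video_details(youtube, ...). The returned items would then replace the per-video videoDetail lookups.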

Running it

I actually ran it. This time, I specified "Quantum computer" as the search keyword.

$ python searchKeyword.py "Quantum computer"

"Search_result_Quantum computer.csv" and "ChannelTitle_Quantum computer.csv" are created in the directory the script was run from (here, the directory containing "searchKeyword.py"). Let's check the contents of these two files.

- Search_result_Quantum computer.csv (only the first rows are shown)

| No | title | ViewCount | channelTitle | likeCount | dislikeCount | favoriteCount | commentCount |
|---|---|---|---|---|---|---|---|
| 0 | Quantum Computers Explained – Limits of Human Technology | 12915763 | Kurzgesagt – In a Nutshell | 310808 | 3405 | 0 | 16871 |
| 1 | [Mine Craft]Pseudo qubit computer[The fastest in the world in theory?] | 4483432 | Miki Tanabe | 60153 | 2057 | 0 | 9898 |
| 2 | What makes a quantum computer different from a normal computer? [Japanese science information] [Science and technology] | 622469 | Japanese scientific information | 8019 | 435 | 0 | 647 |
| 3 | What is a "quantum computer" that changes the world? Horiemon explains![NewsPicks collaboration] | 232913 | Takafumi Horie Horiemon | 1443 | 121 | 0 | 275 |
| 4 | This world is a simulation⁉ If a quantum computer is completed...【urban legend】 | 211623 | I want to drink milk tea | 2722 | 142 | 0 | 411 |
| 5 | [Amazing] Impact of quantum computer "Unimaginable misunderstanding" | 144126 | Ichizero system | 1898 | 121 | 0 | 199 |
| 6 | Far surpassing supercomputers! Domestic quantum computer announcement(17/11/20) | 121514 | ANNnewsCH | 1085 | 47 | 0 | 0 |
| 7 | [Quantum mechanics] Learn "quantum computer" and "Stern-Gerlach experiment" | 117389 | Ikeda University | 1311 | 214 | 0 | 95 |
| 8 | [Challenge] "Quantum computer" that can be understood in 10 minutes | 110234 | NEX industry | 1579 | 178 | 0 | 187 |
| 9 | [Quantum computer] 1st "superposition with qubit" (10 minutes) | 105738 | Quantum coin | 0 | 0 | 0 | 58 |
| 10 | Bitcoin collapses⁉ What will Google do with quantum computer development? Explanation of blockchain safety, etc. | 99405 | Mofumofu Real Estate | 1675 | 121 | 0 | 192 |

It looks like the video information was retrieved correctly.
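To check the search-result file from Python as well, it can be read back in. The sketch below uses the standard csv module instead of pandas. The helper name load_search_results is hypothetical, and the column names are the ones the script writes. The file is opened as "utf-8-sig" because the script saves it with a byte order mark.

```python
import csv

# Count columns written by searchKeyword.py
COUNT_COLUMNS = ["ViewCount", "likeCount", "dislikeCount",
                 "favoriteCount", "commentCount"]


def load_search_results(path):
    """Read the search-result CSV and convert the count columns to int."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        for column in COUNT_COLUMNS:
            row[column] = int(row[column])
    return rows
```

For example, sorted(load_search_results("Search_result_Quantum computer.csv"), key=lambda r: r["likeCount"], reverse=True) would order the rows by like count. The index column that pandas writes has an empty header, so it appears under the key "" in each row.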

- ChannelTitle_Quantum computer.csv (only the first rows are shown)

| Channel name | Count |
|---|---|
| Quantum coin | 7 |
| Keio University Keio University | 5 |
| DENSO Official Channel | 2 |
| Shino TV | 2 |
| Press SAMURAI | 2 |
| Mofumofu Real Estate | 2 |
| jstsciencechannel | 1 |
| EE Times Japan | 1 |
| I want to drink milk tea | 1 |
| Bright side Bright Side Japan | |

The per-channel counts also appear to have been retrieved correctly.

Finally

Building on this, you can do all sorts of interesting things. I plan to extend the script little by little so that it can do more.
