See the YouTube Terms of Service. Use the code below at your own risk.
Recently there have been stories about impersonation in the YouTube live chat, and about leaks of lists of people who sent Super Chats. As one countermeasure, I describe below how to retrieve the chat using the YouTube API.
Output
[by username 1 channel URL 1]
Comment 1
[by username 2 channel URL 2]
Comment 2
[by username 3 channel URL 3]
Amount from username 3: "Comment 3" ← a Super Chat looks like this
start : 2020-06-07T05:21:49.871000Z ← time of the first message in the retrieved batch (UTC; add 9 hours for Japan time)
end : 2020-06-07T05:22:02.846000Z ← time of the last message in the same batch
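The start and end times above are in UTC (the trailing Z). As a minimal sketch, assuming the timestamp has the same shape as in the sample output, the conversion to Japan time could look like this. The helper name to_jst is made up for this illustration and is not part of record_chat.py.

```python
from datetime import datetime, timedelta, timezone

# Japan Standard Time is UTC+9 with no daylight saving time.
JST = timezone(timedelta(hours=9))


def to_jst(published_at):
    """Convert a UTC timestamp such as '2020-06-07T05:21:49.871000Z' to JST.

    Hypothetical helper for illustration only.
    """
    # Older Python versions do not accept a trailing 'Z' in fromisoformat(),
    # so spell out the UTC offset explicitly before parsing.
    utc_time = datetime.fromisoformat(published_at.replace('Z', '+00:00'))
    return utc_time.astimezone(JST)


print(to_jst('2020-06-07T05:21:49.871000Z').isoformat())
# 2020-06-07T14:21:49.871000+09:00
```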
- Official API
  - Get an API key by following "Get YouTube API key (as of March 25, 2020)"
  - Requests consume what is called quota. The limit is 10,000 quota per day; going beyond that requires applying for an increase.
  - The quota calculator does not list the APIs used in the code below, so the exact cost is unknown.
    - About 2 quota to get the chat ID and 3.5 quota per chat request?
    - When I fetched the chat 240 times at 15-second intervals (slp_time = 15, iter_times = 240 in the main function), the consumption was 842.
    - To check quota consumption: Google Cloud Platform > the project created when you obtained the API key > the menu icon (≡) in the upper left > IAM & Admin > Quotas, then filter Service to YouTube Data API v3
- The chat can also be retrieved before and after the broadcast
- If the retrieval interval is too long, some messages may be missed?
- The code below records every chat message, including Super Chats. Rewrite it to suit your purpose.
  - For a Super Chat, msg = 'Amount from Username: "Message"'
  - Super Chat details go into supChat and Super Sticker details into supStic. Just remove the # and pull out as much as you need.
What the code does
- Get an API key by following "Get YouTube API key (as of March 25, 2020)"
- Get the video ID from the YouTube Live URL
  - In https://www.youtube.com/watch?v=*********** it is the *********** part
- Get the chat ID from the video ID
  - Use this API (videos.list)
    - {'key': YT_API_KEY, 'id': video_id, 'part': 'liveStreamingDetails'}
- Retrieve the chat repeatedly using the chat ID
  - Use this API (liveChatMessages.list)
    - {'key': YT_API_KEY, 'liveChatId': chat_id, 'part': 'id,snippet,authorDetails', 'pageToken': pageToken}
  - Set pageToken to None for the first request (the code then leaves the parameter out)
  - From the second request onward, pass the previous response's nextPageToken as pageToken
    - This way each request returns only the messages added since the previous one
- You can loop for a set length of time, or use while True and keep going until an error is raised (or until Ctrl+C)
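The steps above can be sketched without touching the network. This is only an illustration under stated assumptions, not the script itself: extract_video_id, poll_chat and fake_fetch_page are names invented here, and fetch_page stands in for whatever function actually calls the chat API and returns its JSON as a dict. The real loop would also sleep between requests.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(yt_url):
    """Return the value of the `v` query parameter, or None if it is absent."""
    values = parse_qs(urlparse(yt_url).query).get('v')
    return values[0] if values else None


def poll_chat(fetch_page, chat_id, iterations):
    """Fetch chat pages repeatedly, feeding each nextPageToken into the next call.

    fetch_page is a hypothetical callable (chat_id, page_token) -> dict with
    'items' and 'nextPageToken' keys, like the responses used in this article.
    """
    page_token = None  # first request: no pageToken
    messages = []
    for _ in range(iterations):
        data = fetch_page(chat_id, page_token)
        messages.extend(data.get('items', []))
        # Keep the current token if the response does not carry a new one.
        page_token = data.get('nextPageToken', page_token)
    return messages


def fake_fetch_page(chat_id, page_token):
    """Stand-in for the real API call so the sketch runs offline."""
    pages = {
        None: {'items': ['a', 'b'], 'nextPageToken': 't1'},
        't1': {'items': ['c'], 'nextPageToken': 't2'},
        't2': {'items': [], 'nextPageToken': 't3'},
    }
    return pages[page_token]


print(extract_video_id('https://www.youtube.com/watch?v=abc123&t=10s'))  # abc123
print(poll_chat(fake_fetch_page, 'dummy_chat_id', 3))  # ['a', 'b', 'c']
```

Parsing the query string rather than stripping the URL prefix means extra parameters such as &t=10s do not end up in the video ID.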
record_chat.py
import time
import requests

# YouTube API key obtained in advance
YT_API_KEY = '***************************************'

def get_chat_id(yt_url):
    '''
    https://developers.google.com/youtube/v3/docs/videos/list?hl=ja
    '''
    # Strip the watch-URL prefix to get the video ID
    # (assumes the URL has no other query parameters)
    video_id = yt_url.replace('https://www.youtube.com/watch?v=', '')
    print('video_id : ', video_id)

    url = 'https://www.googleapis.com/youtube/v3/videos'
    params = {'key': YT_API_KEY, 'id': video_id, 'part': 'liveStreamingDetails'}
    data = requests.get(url, params=params).json()

    # 'items' is empty for an unknown video ID, and a video that is not a
    # live stream has no 'liveStreamingDetails'; treat both as "not live"
    items = data.get('items', [])
    liveStreamingDetails = items[0].get('liveStreamingDetails', {}) if items else {}
    if 'activeLiveChatId' in liveStreamingDetails:
        chat_id = liveStreamingDetails['activeLiveChatId']
        print('get_chat_id done!')
    else:
        chat_id = None
        print('NOT live')

    return chat_id

def get_chat(chat_id, pageToken, log_file):
    '''
    https://developers.google.com/youtube/v3/live/docs/liveChatMessages/list
    '''
    url = 'https://www.googleapis.com/youtube/v3/liveChat/messages'
    params = {'key': YT_API_KEY, 'liveChatId': chat_id, 'part': 'id,snippet,authorDetails'}
    # Send pageToken only from the second request onward
    if isinstance(pageToken, str):
        params['pageToken'] = pageToken

    data = requests.get(url, params=params).json()
    # Stop on an API error (for example quota exhausted) instead of hiding it
    if 'error' in data:
        raise RuntimeError(data['error'].get('message', 'API error'))

    items = data.get('items', [])
    with open(log_file, 'a', encoding='utf-8') as f:
        for item in items:
            channelId = item['snippet'].get('authorChannelId', '')
            msg = item['snippet'].get('displayMessage', '')
            usr = item['authorDetails']['displayName']
            # Uncomment as needed; these keys exist only on Super Chat / Super Sticker messages
            #supChat = item['snippet'].get('superChatDetails')
            #supStic = item['snippet'].get('superStickerDetails')
            log_text = '[by {} https://www.youtube.com/channel/{}]\n {}'.format(usr, channelId, msg)
            print(log_text, file=f)
            print(log_text)
    if items:
        print('start : ', items[0]['snippet']['publishedAt'])
        print('end : ', items[-1]['snippet']['publishedAt'])

    # Reuse the current token if the response has no nextPageToken
    return data.get('nextPageToken', pageToken)

def main(yt_url):
    slp_time = 10    # seconds between requests
    iter_times = 90  # number of requests
    take_time = slp_time / 60 * iter_times
    print('Scheduled to finish in {} minutes'.format(take_time))
    print('work on {}'.format(yt_url))

    log_file = yt_url.replace('https://www.youtube.com/watch?v=', '') + '.txt'
    with open(log_file, 'a', encoding='utf-8') as f:
        print('Chat log for {}'.format(yt_url), file=f)

    chat_id = get_chat_id(yt_url)
    if chat_id is None:
        return  # no active live chat, nothing to record

    nextPageToken = None
    for ii in range(iter_times):
    #for jj in [0]:  # single request, for testing
        try:
            print('\n')
            nextPageToken = get_chat(chat_id, nextPageToken, log_file)
            time.sleep(slp_time)
        except KeyboardInterrupt:
            break  # Ctrl+C
        except Exception as e:
            print('stopped: {}'.format(e))
            break

if __name__ == '__main__':
    yt_url = input('Input YouTube URL > ')
    main(yt_url)
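Before changing slp_time and iter_times in main, it may help to check the settings against the daily quota. The sketch below takes the rough figures quoted earlier at face value: about 2 quota for the chat ID lookup and 3.5 quota per chat request. Both are estimates from the measurement described above, not documented costs, and the constant and function names are invented for this illustration.

```python
# Rough per-request costs quoted earlier in this article (estimates, not documented values).
CHAT_ID_COST = 2         # one lookup of the chat ID
CHAT_REQUEST_COST = 3.5  # each chat request
DAILY_LIMIT = 10000      # default quota per day


def estimate_quota(iter_times):
    """Estimated quota for one run: one chat ID lookup plus iter_times chat requests."""
    return CHAT_ID_COST + CHAT_REQUEST_COST * iter_times


def max_requests_per_day():
    """How many chat requests fit into the daily limit after one chat ID lookup."""
    return int((DAILY_LIMIT - CHAT_ID_COST) // CHAT_REQUEST_COST)


print(estimate_quota(240))     # 842.0, matching the measured consumption
print(estimate_quota(90))      # 317.0 for the default settings in main
print(max_requests_per_day())  # 2856
```

Under these assumptions, 2856 requests at a 10-second interval cover a little under 8 hours of recording per day, so longer streams need a longer interval or a quota increase.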