I'm a hardcore VTuber fan. In December, "Virtual to Live" was held at Ryogoku Kokugikan, and it was also streamed on Nico Nama. Sadly, I couldn't go to the venue or even watch it in real time, and I couldn't wait until the disc came out. So I bought an online ticket and watched the timeshift. Well, it killed me. I wanted to save it while the timeshift was still viewable, so I started analyzing Nico Nico with Python in hand.
By the way, my skill level is roughly "What is a websocket?", "HLS?", "Selenium?".
**Note: Information as of December 17, 2019. The specifications of Nico Nama may change in the future.**
I used Chrome's DevTools for the analysis.
You can open it by pressing F12 in Chrome.
First, open any suitable broadcast.
If you pressed F12 after the broadcast page had already loaded, reload the page once.
HLS
The Network tab of DevTools shows the communication log for the website. (If you check Preserve log, the log will not disappear even when the page navigates.)
To find the video delivery, I looked for the requests with the largest Size.
It turned out that a large amount of data was being downloaded from URLs like this:
https://{???}.dmc.nico/hlsarchive/ht2_nicolive/nicolive-hamster-{Delivery ID}_main_{Hexadecimal}/4/ts/{Number}.ts?
The {Number} part seems to increase sequentially.
A similar URL,
https://{???}.dmc.nico/hlsarchive/ht2_nicolive/nicolive-hamster-{Delivery ID}_main_{Hexadecimal}/4/ts/playlist.m3u8?
returned a response like the following.
playlist
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:5
(omitted)
When I looked up .m3u8 and .ts, I learned that these are files used by a protocol called HLS (HTTP Live Streaming).
HLS streams seem to be easy to download with ffmpeg, so all that remains is to get the URL.
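To make the relationship between playlist.m3u8 and the .ts files concrete, here is a minimal sketch of reading segment URLs out of a playlist. The function name `list_segment_urls`, the sample playlist text, and the example.invalid URL are all my own assumptions for illustration; they were not captured from Nico Nama.

```python
from urllib.parse import urljoin


def list_segment_urls(playlist_text, playlist_url):
    """Return absolute URLs of the media segments listed in an HLS playlist."""
    segments = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments; blank lines carry nothing.
        if not line or line.startswith('#'):
            continue
        # Every remaining line is a segment URI, possibly relative to the playlist.
        segments.append(urljoin(playlist_url, line))
    return segments


# Hypothetical playlist for illustration only (not real Nico Nama data).
sample = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:5
#EXTINF:5.0,
1.ts
#EXTINF:5.0,
2.ts
#EXT-X-ENDLIST
"""
urls = list_segment_urls(sample, 'https://example.invalid/4/ts/playlist.m3u8')
print(urls)
```

In practice ffmpeg does this resolution itself when given the playlist URL, so the sketch is only meant to show what the playlist contains.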
Where does this URL come from?
Press Ctrl + F on the Network tab of DevTools and search for
https://{???}.dmc.nico/hlsarchive/ht2_nicolive/nicolive-hamster-{Delivery ID}_main_{Hexadecimal}
Apparently, the only hits are requests sent to that URL; the communication that delivered the URL is not found.
After a lot of trial and error, wondering what was going on, I concluded that communication inside a websocket is not covered by this search.
So, search for wss: instead.
That turns up 3 websockets.
The one named 4012/ seems to exchange binary data in both directions. Set it aside for now.
The one named websocket receives chat information in JSON format from the server. You can set this one aside as well.
The remaining websocket (here, timeshift) is not obvious at first glance, so search its messages for the URL you are looking for.
That produced a hit.
The message turned out to be JSON received from the server, and the URL I was looking for was in the `uri` field of that JSON.
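As a rough sketch of what "finding `uri` in the JSON" looks like in code: the field names (`body`, `command`, `currentstream`, `currentStream`, `uri`) are the ones the script in this article reads, while the helper name `extract_stream_uri` and the sample frame are assumptions I made up for illustration.

```python
import json


def extract_stream_uri(payload_data):
    """Return the stream URI from one websocket text frame, or None."""
    try:
        message = json.loads(payload_data)
    except (ValueError, TypeError):
        return None  # not JSON (for example a binary frame)
    if not isinstance(message, dict):
        return None
    body = message.get('body')
    if not isinstance(body, dict) or body.get('command') != 'currentstream':
        return None
    stream = body.get('currentStream')
    if not isinstance(stream, dict):
        return None
    return stream.get('uri')


# Hypothetical frame; the URL is a placeholder, not a real stream.
frame = json.dumps({
    'body': {
        'command': 'currentstream',
        'currentStream': {'uri': 'https://example.invalid/playlist.m3u8'},
    },
})
print(extract_stream_uri(frame))
```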
Are you still watching?
It's only natural, but once the server decides "this user is no longer watching", the URL seems to become invalid. So "get the URL and walk away" does not work.
So what should I do? The idea I came up with was to "leave it to Nico Nico".
If you keep the broadcast page open in Selenium, the client automatically keeps telling the server "I'm still watching".
In the meantime, ffmpeg does the download. It's that simple.
This method is fine if you just want to download, but I don't think it works when you want to play the stream in your own application. In that case you need to send "I'm still watching" yourself.
Also, Nico Nico seems to allow a broadcast to be viewed in only one window at a time (I'm not familiar with Nico Nico, so I don't know the details).
Please note that if you open the broadcast elsewhere during the download, the download will stop.
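The "keep the page open while something else downloads" idea can be sketched as a small context manager that guarantees the browser is closed afterwards, even if the download fails. This is only an illustration under my own assumptions: `keep_open` and `FakeBrowser` are hypothetical names, and `FakeBrowser` stands in for a Selenium driver so the sketch runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def keep_open(session):
    """Yield the session and guarantee session.quit() runs afterwards."""
    try:
        yield session
    finally:
        session.quit()


class FakeBrowser:
    """Stand-in for a Selenium driver so the sketch runs without a browser."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


browser = FakeBrowser()
events = []
with keep_open(browser) as b:
    # The real download (for example an ffmpeg subprocess) would run here,
    # while the page stays open and keeps the viewing session alive.
    events.append('downloading, browser closed: %s' % b.closed)
events.append('done, browser closed: %s' % browser.closed)
print(events)
```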
shell
python dl.py {live_id} {id} {pass}
Run it with the command above, where {live_id} is the broadcast ID and {id} and {pass} are your login ID and password.
dl.py
from selenium import webdriver
import chromedriver_binary  # importing this adds chromedriver to PATH
import json
import time
import sys
import subprocess

lid = sys.argv[1]  # broadcast ID
id = sys.argv[2]   # login ID
pa = sys.argv[3]   # password

# Headless Chrome with the performance log enabled
options = webdriver.ChromeOptions()
options.add_argument('--headless')
caps = webdriver.DesiredCapabilities.CHROME
caps['goog:loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(
    options=options, desired_capabilities=caps, service_log_path='NUL')

# Log in
driver.get('https://account.nicovideo.jp/login')
fid = driver.find_element_by_xpath('//*[@id="input__mailtel"]')
fpa = driver.find_element_by_xpath('//*[@id="input__password"]')
fid.clear()
fid.send_keys(id)
fpa.clear()
fpa.send_keys(pa)
fpa.submit()

# Open the broadcast page
driver.get('https://live2.nicovideo.jp/watch/' + lid)
# setting_button = driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[3]/button[4]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/div/div/div[2]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/section[2]/ul/div[2]').click()
# time.sleep(3)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/button').click()
time.sleep(3)

# Keep only received websocket frames, then only JSON payloads that have a 'body'
log = [json.loads(i['message']) for i in driver.get_log(
    'performance') if json.loads(i['message'])['message']['method'] == 'Network.webSocketFrameReceived']
log = [json.loads(i['message']['params']['response']['payloadData'])
       for i in log if i['message']['params']['response']['payloadData'][0] == '{']
log = [i['body'] for i in log if 'body' in i.keys()]

# Pick the stream URI whose quality has the smallest index in qualityTypes
uri = ''
quality = 6
for i in log:
    if 'command' in i.keys():
        if i['command'] == 'currentstream':
            # Chained comparison: the index must be non-zero and smaller than the best so far
            if 0 != i['currentStream']['qualityTypes'].index(i['currentStream']['quality']) < quality:
                uri = i['currentStream']['uri']
                quality = i['currentStream']['qualityTypes'].index(
                    i['currentStream']['quality'])
                if quality == 0:
                    break

# Download with ffmpeg while the browser is still open, then close the browser
subprocess.run(['ffmpeg', '-i', uri, '-c', 'copy', 'output.mp4'])
driver.quit()
options = webdriver.ChromeOptions()
options.add_argument('--headless')
caps = webdriver.DesiredCapabilities.CHROME
caps['goog:loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(
    options=options, desired_capabilities=caps, service_log_path='NUL')
Adding the --headless option lets it run without displaying a window.
caps['goog:loggingPrefs'] = {'performance': 'ALL'} is the setting that makes the communication log available.
I'm not sure about the details.
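For reference, here is a small sketch of the shape of the performance log entries that the script reads: each entry has a 'message' field holding a JSON string, and that JSON has a nested 'message' with 'method' and 'params'. The helpers `websocket_payloads` and `make_entry` and the sample entries are hypothetical, written only so the filtering step can be tried without a browser.

```python
import json


def websocket_payloads(performance_entries):
    """Return the payloadData strings of received websocket frames."""
    payloads = []
    for entry in performance_entries:
        event = json.loads(entry['message'])['message']
        if event.get('method') != 'Network.webSocketFrameReceived':
            continue
        payloads.append(event['params']['response']['payloadData'])
    return payloads


def make_entry(method, params):
    """Build a fake log entry shaped like the ones the script reads."""
    return {'message': json.dumps({'message': {'method': method, 'params': params}})}


# Hypothetical entries for illustration only.
entries = [
    make_entry('Network.requestWillBeSent', {'request': {'url': 'https://example.invalid/'}}),
    make_entry('Network.webSocketFrameReceived', {'response': {'payloadData': '{"body": {}}'}}),
]
print(websocket_payloads(entries))
```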
driver.get('https://account.nicovideo.jp/login')
fid = driver.find_element_by_xpath('//*[@id="input__mailtel"]')
fpa = driver.find_element_by_xpath('//*[@id="input__password"]')
fid.clear()
fid.send_keys(id)
fpa.clear()
fpa.send_keys(pa)
fpa.submit()
Nico Nama broadcasts cannot be viewed without logging in, so log in here first.
driver.get('https://live2.nicovideo.jp/watch/' + lid)
# setting_button = driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[3]/button[4]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/div/div/div[2]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/section[2]/ul/div[2]').click()
# time.sleep(3)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/button').click()
time.sleep(3)
log = [json.loads(i['message']) for i in driver.get_log(
    'performance') if json.loads(i['message'])['message']['method'] == 'Network.webSocketFrameReceived']
log = [json.loads(i['message']['params']['response']['payloadData'])
       for i in log if i['message']['params']['response']['payloadData'][0] == '{']
log = [i['body'] for i in log if 'body' in i.keys()]
uri = ''
quality = 6
for i in log:
    if 'command' in i.keys():
        if i['command'] == 'currentstream':
            if i['currentStream']['qualityTypes'].index(i['currentStream']['quality']) < quality:
                uri = i['currentStream']['uri']
                quality = i['currentStream']['qualityTypes'].index(
                    i['currentStream']['quality'])
                if quality == 0:
                    break
You can get the performance log with driver.get_log('performance').
The condition `if json.loads(i['message'])['message']['method'] == 'Network.webSocketFrameReceived'` extracts only the frames received over a websocket. From those, only the messages that are JSON and have a 'body' are kept. The URL is saved with `uri = i['currentStream']['uri']`.
Here, I try to pick the URL with the best quality from the communication.
The commented-out part selects the highest video quality in the player and then stops playback.
I commented it out because it is very likely to break in future updates.
I recommend looking into how to use Selenium and writing that part yourself.
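The quality selection described above can also be written as a small standalone function, which makes it easier to try on sample data. This is a sketch under the same assumption the script makes, namely that a smaller index in qualityTypes means better quality. The name `pick_best_stream`, the quality names 'q0' to 'q2', the 'other' command, and the URLs are hypothetical.

```python
def pick_best_stream(bodies):
    """Return (uri, rank) of the stream with the smallest index in qualityTypes."""
    best_uri, best_rank = None, None
    for body in bodies:
        if body.get('command') != 'currentstream':
            continue
        stream = body['currentStream']
        # Position of the current quality in the list of available qualities
        rank = stream['qualityTypes'].index(stream['quality'])
        if best_rank is None or rank < best_rank:
            best_uri, best_rank = stream['uri'], rank
    return best_uri, best_rank


# Hypothetical message bodies for illustration only.
types = ['q0', 'q1', 'q2']
bodies = [
    {'command': 'currentstream',
     'currentStream': {'uri': 'https://example.invalid/low.m3u8',
                       'quality': 'q2', 'qualityTypes': types}},
    {'command': 'other'},
    {'command': 'currentstream',
     'currentStream': {'uri': 'https://example.invalid/high.m3u8',
                       'quality': 'q1', 'qualityTypes': types}},
]
best = pick_best_stream(bodies)
print(best)
```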
subprocess.run(['ffmpeg', '-i', uri, '-c', 'copy', 'output.mp4'])
driver.quit()
I run ffmpeg with subprocess.run(['ffmpeg', '-i', uri, '-c', 'copy', 'output.mp4']).
ffmpeg must already be installed.
Close Chrome with driver.quit().
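Since the script passes whatever is in uri straight to ffmpeg, a small guard can catch the two obvious failure cases (no URL found, ffmpeg not installed) before anything is launched. The following is a sketch only; `build_ffmpeg_command` and `ffmpeg_available` are hypothetical helper names and the URL is a placeholder.

```python
import shutil


def build_ffmpeg_command(uri, output_path='output.mp4'):
    """Return the argument list for copying an HLS stream into a file."""
    if not uri:
        raise ValueError('no stream URI was found')
    # '-c copy' copies the streams as they are, without re-encoding.
    return ['ffmpeg', '-i', uri, '-c', 'copy', output_path]


def ffmpeg_available():
    """Report whether an ffmpeg executable can be found on PATH."""
    return shutil.which('ffmpeg') is not None


command = build_ffmpeg_command('https://example.invalid/playlist.m3u8')
print(command)
print('ffmpeg found:', ffmpeg_available())
```

The resulting list can be passed to subprocess.run in the same way as in dl.py.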
Even if you analyze a web service and publish the results, they often become unusable when the service's specifications change. So this time, rather than sharing the "current specifications", I wanted to share the idea that "maybe you can analyze the specifications this way", which is why the article is written like this. As I said at the beginning, I'm sharing this as a beginner. If anything here looks wrong, please point it out.