[Python] Analyzing how Nico Nama delivers its broadcasts

Introduction

I'm a hopeless VTuber otaku. In December, "Virtual to Live" was held at Ryogoku Kokugikan, and it was also streamed on Nico Nama. Loser that I am, I could neither go to the venue nor watch it in real time, but I also couldn't wait until the disc came out. So I bought an online ticket and watched the timeshift. Naturally, it killed me. I wanted to save it while the timeshift was still viewable, so I set out to analyze Niconico with Python in hand.

For the record, my skill level is roughly "What is a WebSocket?", "HLS?", "Selenium?".

Inside Nico Nama

Note: This information is as of December 17, 2019. The Nico Nama specifications may change in the future.

I used Chrome's DevTools for the analysis. You can open it by pressing F12 in Chrome. First, open any broadcast. If you pressed F12 after the broadcast page had already loaded, reload the page once.

HLS

The Network tab of DevTools shows the communication log for the site. (If you check Preserve log, the log is kept even when the page navigates.) To find the video delivery, I looked for the requests with the largest Size. It turned out that a large amount of data was being downloaded from URLs like https://{???}.dmc.nico/hlsarchive/ht2_nicolive/nicolive-hamster-{broadcast ID}_main_{hex}/4/ts/{number}.ts?, where {number} appears to increase sequentially. A request to a similar URL, https://{???}.dmc.nico/hlsarchive/ht2_nicolive/nicolive-hamster-{broadcast ID}_main_{hex}/4/ts/playlist.m3u8?, returned a response like the following.

playlist


#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:5

(omitted)

That is the kind of data that came back. Looking up .m3u8 and .ts, I learned that these files are used by a protocol called HLS (HTTP Live Streaming). HLS streams are apparently easy to download with ffmpeg, so all that is left is to get the playlist URL.
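To make the relationship between the playlist and the .ts files concrete, here is a minimal sketch that lists the segment URLs named in an HLS media playlist. The playlist text, the example.invalid host, and the token parameter are made up for illustration and are not real Nico Nama data. The function name list_segment_urls is also my own.

```python
from urllib.parse import urljoin


def list_segment_urls(playlist_text, playlist_url):
    """Return absolute URLs for every media segment in an HLS media playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments; blank lines are skipped
        if not line or line.startswith('#'):
            continue
        # Anything else is a segment URI, possibly relative to the playlist URL
        urls.append(urljoin(playlist_url, line))
    return urls


# Hypothetical playlist for illustration only (not real Nico Nama data)
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:5
#EXTINF:5.0,
1.ts?token=abc
#EXTINF:5.0,
2.ts?token=abc
#EXT-X-ENDLIST
"""

segments = list_segment_urls(
    SAMPLE_PLAYLIST,
    'https://example.invalid/hls/4/ts/playlist.m3u8?token=abc')
for url in segments:
    print(url)
```

In practice ffmpeg does this resolution itself, so the sketch is only meant to show what the playlist contains.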

URL acquisition

So where does this URL come from?

Press Ctrl + F on the Network tab of DevTools and search for https://{???}.dmc.nico/hlsarchive/ht2_nicolive/nicolive-hamster-{broadcast ID}_main_{hex}. Apparently the only hits are requests sent to that URL. No communication that contains the URL in its data turns up.

After a lot of trial and error wondering what was going on, I concluded that this search does not cover the traffic inside a WebSocket.

So, filter by wss: instead. That turns up three WebSockets.

The one listed as 4012/ appears to exchange binary data in both directions. Set it aside for now. The one listed as websocket receives chat information in JSON format from the server. That one can be set aside as well.

The purpose of the remaining WebSocket (here, timeshift) is not obvious at first glance. Searching its messages for the URL I was after produced a hit. This WebSocket appears to receive JSON, and the URL I was looking for was in the uri field of that JSON.
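As a rough illustration of pulling the URL out of such a message, here is a minimal sketch. The field names (body, command, currentstream, currentStream, uri) are the ones the script later in this article relies on. The sample payload itself, including the example.invalid URL, is invented, and the function name find_stream_uri is my own.

```python
import json


def find_stream_uri(payload):
    """Return the stream URI from one WebSocket text frame, or None."""
    try:
        message = json.loads(payload)
    except ValueError:
        return None  # the frame was not JSON
    if not isinstance(message, dict):
        return None
    body = message.get('body')
    if not isinstance(body, dict) or body.get('command') != 'currentstream':
        return None
    stream = body.get('currentStream')
    return stream.get('uri') if isinstance(stream, dict) else None


# Invented payload for illustration; real messages carry more fields
SAMPLE_FRAME = json.dumps({
    'body': {
        'command': 'currentstream',
        'currentStream': {'uri': 'https://example.invalid/playlist.m3u8'},
    }
})

print(find_stream_uri(SAMPLE_FRAME))
```

Frames that are not JSON, or that carry some other command, simply yield None, so the function can be applied to every received frame.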

Are you still watching?

It makes sense when you think about it, but once the server decides "this guy isn't watching anymore", the URL apparently becomes invalid. So "grab the URL and walk away" does not work.

So what to do? The idea I came up with was to "leave it to Niconico". If you keep the broadcast page open in Selenium, the client automatically keeps telling the server "I'm still watching". Meanwhile, ffmpeg does the download; it is as simple as that. This approach is fine if all you want is to download, but it will not work if you want to play the stream in your own application. In that case you need to send the "I'm still watching" message yourself. Also, Niconico only lets you watch a broadcast in one window at a time (I'm not very familiar with Niconico, so I don't know the details). Note that if you open the broadcast elsewhere during the download, the download will stop.

Writing the script in Python

code

shell


python dl.py {live_id} {id} {pass}

Run the script as shown above, passing the broadcast ID, your login ID, and your password.

dl.py


from selenium import webdriver
import chromedriver_binary
import json
import time
import sys
import subprocess

lid = sys.argv[1]
id = sys.argv[2]
pa = sys.argv[3]


options = webdriver.ChromeOptions()
options.add_argument('--headless')
caps = webdriver.DesiredCapabilities.CHROME
caps['goog:loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(
    options=options, desired_capabilities=caps, service_log_path='NUL')

driver.get('https://account.nicovideo.jp/login')
fid = driver.find_element_by_xpath('//*[@id="input__mailtel"]')
fpa = driver.find_element_by_xpath('//*[@id="input__password"]')
fid.clear()
fid.send_keys(id)
fpa.clear()
fpa.send_keys(pa)
fpa.submit()

driver.get('https://live2.nicovideo.jp/watch/' + lid)
# setting_button = driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[3]/button[4]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/div/div/div[2]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/section[2]/ul/div[2]').click()
# time.sleep(3)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/button').click()

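# Give the page a few seconds to open its WebSockets and receive messages
time.sleep(3)
# Keep only the performance-log entries for received WebSocket frames
log = [json.loads(i['message']) for i in driver.get_log(
    'performance') if json.loads(i['message'])['message']['method'] == 'Network.webSocketFrameReceived']
# Keep only frames whose payload looks like a JSON object, and parse them
log = [json.loads(i['message']['params']['response']['payloadData'])
       for i in log if i['message']['params']['response']['payloadData'][0] == '{']
# Keep only the 'body' of messages that have one
log = [i['body'] for i in log if 'body' in i.keys()]

uri = ''
quality = 6
for i in log:
    if 'command' in i.keys():
        if i['command'] == 'currentstream':
            # Chained comparison: the index of this stream's quality in
            # qualityTypes must be non-zero and lower than the best so far
            if 0 != i['currentStream']['qualityTypes'].index(i['currentStream']['quality']) < quality:
                uri = i['currentStream']['uri']
                quality = i['currentStream']['qualityTypes'].index(
                    i['currentStream']['quality'])
            if quality == 0:
                break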

subprocess.run(['ffmpeg', '-i', uri, '-c', 'copy', 'output.mp4'])
driver.quit()

Commentary

Chrome settings

options = webdriver.ChromeOptions()
options.add_argument('--headless')
caps = webdriver.DesiredCapabilities.CHROME
caps['goog:loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(
    options=options, desired_capabilities=caps, service_log_path='NUL')

Adding the --headless option lets Chrome run without showing a window. caps['goog:loggingPrefs'] = {'performance': 'ALL'} is the setting that makes the communication log available. I'm not sure about the details.
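To get a feel for what the performance log contains, the following sketch counts the DevTools event names in a list of log entries. Each entry returned by driver.get_log('performance') is a dict whose 'message' field is itself a JSON string. The entries below are hand-written stand-ins so that the example runs without a browser, and count_methods is a name I made up.

```python
import json
from collections import Counter


def count_methods(entries):
    """Count DevTools event names in a list of performance-log entries."""
    counts = Counter()
    for entry in entries:
        # The 'message' field is a JSON string that wraps the actual event
        event = json.loads(entry['message'])['message']
        counts[event['method']] += 1
    return counts


def fake_entry(method):
    """Build a stand-in for one performance-log entry (illustration only)."""
    return {'message': json.dumps({'message': {'method': method, 'params': {}}})}


fake_entries = [
    fake_entry('Network.webSocketCreated'),
    fake_entry('Network.webSocketFrameReceived'),
    fake_entry('Network.webSocketFrameReceived'),
    fake_entry('Network.requestWillBeSent'),
]

counts = count_methods(fake_entries)
for method, n in counts.most_common():
    print(n, method)
```

With a real driver you would pass driver.get_log('performance') instead of fake_entries.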

Login

driver.get('https://account.nicovideo.jp/login')
fid = driver.find_element_by_xpath('//*[@id="input__mailtel"]')
fpa = driver.find_element_by_xpath('//*[@id="input__password"]')
fid.clear()
fid.send_keys(id)
fpa.clear()
fpa.send_keys(pa)
fpa.submit()

Nico Nama broadcasts cannot be viewed without logging in, so the script logs in here first.

websocket monitoring

driver.get('https://live2.nicovideo.jp/watch/' + lid)
# setting_button = driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[3]/button[4]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/div/div/div[2]').click()
# time.sleep(1)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/div[4]/section[2]/ul/div[2]').click()
# time.sleep(3)
# driver.find_element_by_xpath(
#     '/html/body/div/div/div[4]/div[3]/div/div/div[1]/div[3]/div[1]/div[2]/button').click()

time.sleep(3)
log = [json.loads(i['message']) for i in driver.get_log(
    'performance') if json.loads(i['message'])['message']['method'] == 'Network.webSocketFrameReceived']
log = [json.loads(i['message']['params']['response']['payloadData'])
       for i in log if i['message']['params']['response']['payloadData'][0] == '{']
log = [i['body'] for i in log if 'body' in i.keys()]

uri = ''
quality = 6
for i in log:
    if 'command' in i.keys():
        if i['command'] == 'currentstream':
            if i['currentStream']['qualityTypes'].index(i['currentStream']['quality']) < quality:
                uri = i['currentStream']['uri']
                quality = i['currentStream']['qualityTypes'].index(
                    i['currentStream']['quality'])
            if quality == 0:
                break

You can get the performance log with driver.get_log('performance'). The condition if json.loads(i['message'])['message']['method'] == 'Network.webSocketFrameReceived' extracts only the frames received over a WebSocket. From those, only the messages that are JSON and have a 'body' are kept. The URL is stored by uri = i['currentStream']['uri']. Here I try to pick the URL with the best quality from the traffic. The commented-out part selects the highest video quality in the player and stops playback. I commented it out because it is very likely to break with future updates. I recommend looking up how to use Selenium and writing that part yourself.
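The selection loop can also be written as a small function, which makes it easy to try out without a browser. This is only a sketch of the same idea, assuming (as the loop above does) that a lower index in qualityTypes means better quality. The quality names and URLs in the sample data are invented, and pick_best_stream is my own name.

```python
def pick_best_stream(bodies, quality_limit=6):
    """Return (uri, index) for the stream with the lowest quality index."""
    best_uri, best_index = '', quality_limit
    for body in bodies:
        if body.get('command') != 'currentstream':
            continue
        stream = body['currentStream']
        # Position of this stream's quality within the list of known qualities
        index = stream['qualityTypes'].index(stream['quality'])
        if index < best_index:
            best_uri, best_index = stream['uri'], index
    return best_uri, best_index


# Invented message bodies for illustration only
QUALITIES = ['high', 'normal', 'low']
sample_bodies = [
    {'command': 'statistics'},
    {'command': 'currentstream',
     'currentStream': {'uri': 'https://example.invalid/low.m3u8',
                       'quality': 'low', 'qualityTypes': QUALITIES}},
    {'command': 'currentstream',
     'currentStream': {'uri': 'https://example.invalid/high.m3u8',
                       'quality': 'high', 'qualityTypes': QUALITIES}},
]

best_uri, best_index = pick_best_stream(sample_bodies)
print(best_uri, best_index)
```

If no currentstream message was received, the function returns an empty URI, which the caller should check before starting ffmpeg.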

Save and exit

subprocess.run(['ffmpeg', '-i', uri, '-c', 'copy', 'output.mp4'])
driver.quit()

ffmpeg is run with subprocess.run(['ffmpeg', '-i', uri, '-c', 'copy', 'output.mp4']). ffmpeg must already be installed. Chrome is closed with driver.quit().
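One possible refinement, sketched here under my own assumptions, is to make sure the browser is closed even when no URL was found or ffmpeg fails. The helper names build_ffmpeg_command and download_then_quit are hypothetical, and the runner parameter exists only so the function can be tried without actually launching ffmpeg.

```python
import subprocess


def build_ffmpeg_command(uri, output='output.mp4'):
    """Build the same ffmpeg command line that the script uses."""
    return ['ffmpeg', '-i', uri, '-c', 'copy', output]


def download_then_quit(driver, uri, output='output.mp4', runner=subprocess.run):
    """Run ffmpeg while the browser stays open, then always close the browser."""
    try:
        if not uri:
            raise ValueError('no stream URI was found')
        # The browser keeps the viewing session alive while this call blocks
        return runner(build_ffmpeg_command(uri, output))
    finally:
        driver.quit()


print(build_ffmpeg_command('https://example.invalid/playlist.m3u8'))
```

In the script, this would replace the last two lines with download_then_quit(driver, uri).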

Finally

Even if you analyze a web service and publish the results, they often become unusable when the service changes its specifications. This time, rather than sharing the "current specifications", I wanted to share the idea that "you can probably analyze the specifications this way", which is why the article is written the way it is. As I said at the start, I wrote this without really knowing what I was doing. If something looks off, please point it out.
