[PYTHON] Statistical interpretation of Anigasaki character screen time

Background

Last week, the **first season** of the Nijigasaki Gakuen School Idol Club TV anime ended. While reading other fans' impressions and discoveries on Twitter and message boards, I came across an interesting post in a thread on the Love Live board.

122 A story that comes true without a name (Takoyaki) 2020/12/28 (Mon) 02:09:09.16 ID:s1PSbHRM
Using the data in >>1 as a reference, I converted the times to seconds and put them in a comma-separated CSV. Someone else, please analyze it.

[1] I tried to find out how biased the appearance of Nijigasaki animation is

A dedicated fan had tallied the conversation time of each character across the anime and posted it in CSV format. People in information science and engineering cannot help analyzing any data they see posted as CSV, so I processed the data with Python as usual. This is left as a memo, so there is almost no explanation. (If I feel like it, I may later summarize the principles and features of each analysis method.)

Analysis content

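The script draws four plots: a heatmap, a hierarchically clustered heatmap, a violin plot with an overlaid swarm plot, and a bar plot. The rest should be obvious from the source.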
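For readers unfamiliar with the clustering step, the script below calls `sns.clustermap` with `z_score=0` and `metric='correlation'`. The following is only a rough sketch of those two ideas in plain NumPy, not seaborn's actual implementation. The toy array and the helper `row_zscore` are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption meant to mirror the pandas default.

```python
import numpy as np

# Hypothetical toy data: 3 rows x 4 columns (not the real data.csv).
toy = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # same trend as row 0, doubled
    [4.0, 3.0, 2.0, 1.0],   # reversed trend
])

def row_zscore(a):
    """Standardize each row to mean 0 and sample standard deviation 1."""
    mean = a.mean(axis=1, keepdims=True)
    std = a.std(axis=1, ddof=1, keepdims=True)  # ddof=1 is an assumption
    return (a - mean) / std

z = row_zscore(toy)

# Correlation distance between rows: 1 - Pearson correlation.
dist = 1.0 - np.corrcoef(z)

print(np.round(dist, 3))
```

After standardization, rows 0 and 1 become identical, so their correlation distance is 0 up to floating-point rounding, while rows 0 and 2 are perfectly anti-correlated and end up at distance 2. This is why rows with the same shape cluster together regardless of their absolute scale.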

Implementation

~~It was too much trouble, so~~ I implemented it with Seaborn so that even beginners can follow it easily. The code is as follows.

plot.py


# -*- coding: utf-8 -*-

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import rcParams


# import csv files.
df_data = pd.read_csv('data.csv', index_col=0)
df_clr = pd.read_csv('color.csv')

# convert to minutes.
df_data = df_data/60

# general plot settings
rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = ['Hiragino Maru Gothic Pro', 'Yu Gothic', 'Meiryo',
                               'Takao', 'IPAexGothic', 'IPAPGothic', 'VL PGothic', 'Noto Sans CJK JP']
sns.set_palette(df_clr['Color'])

# heatmap part
plt.figure()
# rows: characters, columns: episodes
g1 = sns.heatmap(df_data.T, square=True, cbar_kws={
                 'label': 'Conversation time (minutes)'}, cmap='viridis')
g1.set(xlabel='Episode', ylabel='Character')
plt.savefig('heatmap.png', format="png", dpi=300, bbox_inches="tight")

# clustering part
g2 = sns.clustermap(df_data, metric='correlation',
                    z_score=0, cmap='viridis', cbar_kws={'label': 'Row Z-score'})
plt.savefig('clustermap.png', format="png", dpi=300, bbox_inches="tight")

# reshape to long format for the violin plot ('variable' = character, 'value' = minutes)
df_melt = df_data.melt()

# violin plot part
# plt.subplots returns (figure, axes), so unpack both and draw on the same axes
fig, ax = plt.subplots(figsize=(12, 9))
sns.violinplot(x='variable', y='value', data=df_melt,
               inner='quartile', color='0.85', ax=ax)
sns.swarmplot(x='variable', y='value', data=df_melt, ax=ax)
ax.set(xlabel='Character', ylabel='Conversation time (minutes)')
plt.savefig('violin.png', format="png", dpi=300, bbox_inches="tight")

# bar plot part
# plt.subplots returns (figure, axes), so unpack both and draw on the same axes
fig, ax = plt.subplots(figsize=(12, 9))
sns.barplot(x='variable', y='value', data=df_melt, capsize=.2, ax=ax)
ax.set(xlabel='Character', ylabel='Conversation time (minutes)')
plt.savefig('bar.png', format="png", dpi=300, bbox_inches="tight")
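
A note on the bar plot: by default, `sns.barplot` draws the mean of each group with error bars showing a bootstrap 95% confidence interval. The sketch below illustrates that idea with a percentile bootstrap in NumPy. It is not seaborn's own code; the function name `bootstrap_ci`, the fixed seed, and the sample values are all hypothetical.

```python
import numpy as np

def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement: one row of indices per bootstrap replicate.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    tail = (100 - level) / 2
    low, high = np.percentile(boot_means, [tail, 100 - tail])
    return low, high

# Hypothetical per-episode conversation times in minutes for one character.
minutes = [1.2, 0.4, 2.5, 0.9, 1.7, 3.1]
low, high = bootstrap_ci(minutes)
print(f"mean = {np.mean(minutes):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

With only about a dozen episodes per character, such intervals tend to be wide, so overlapping error bars between two characters should not be read as a meaningful difference.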

The script reads two CSV files: data.csv, built from the data in the thread, and color.csv, which lists each character's image color. The contents of color.csv are as follows.

No,Member,Color
1,Yu Takasaki,#BCBCBC
2,Ayumu Uehara,#F2A6B7
3,Kasumi Nakasu,#DFD00D
4,Shizuku Osaka,#01B7ED
5,Karin Asaka,#485EC6
6,Ai Miyashita,#FF5800
7,Kanata Konoe,#A664A0
8,Setsuna Yuki,#D81C2F
9,Emma Verde,#84C36E
10,Rina Tennoji,#9CA5B9
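
The contents of data.csv are not shown here. Judging from how plot.py uses it, the file appears to hold one row per episode and one column per character, with values in seconds. The sketch below builds a tiny stand-in table with that assumed layout and shows what the conversion to minutes and `melt()` do to it. The numbers are made up for illustration and are not taken from the thread.

```python
import pandas as pd

# Hypothetical values in seconds; the real data.csv comes from the thread.
toy = pd.DataFrame(
    {"Yu Takasaki": [120, 90], "Ayumu Uehara": [60, 150]},
    index=pd.Index([1, 2], name="Episode"),
)

toy_min = toy / 60         # seconds -> minutes
toy_long = toy_min.melt()  # wide -> long; the episode index is dropped

print(toy_long)
```

The long table has two columns, `variable` (the character name) and `value` (minutes), which is the shape that the violin, swarm, and bar plots in plot.py expect.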

Execution result

The script outputs the following four figures.

[Figures: violin.jpg, bar.jpg, heatmap.jpg, clustermap.jpg]
