This is the day 9 article of the Furukawa Lab Advent Calendar. It was written by a student at Furukawa Lab as part of their studies, so some of the content may be imprecise or the wording slightly off.

Introduction

I wanted to write an article introducing the Beverage Preference Data Set, but the program is not running yet, so I will update this article from time to time.

Beverage Preference Data Set

The Beverage Preference Data Set is a real-world relational dataset published by Furukawa Laboratory. Please refer to the link for the detailed terms.

It contains survey data in which 604 users rated 14 types of beverages in each of 11 situations.

In other words, it is relational data observed for combinations of elements from three sets: (person) x (beverage) x (situation).

Import

The steps to import the Beverage Preference Data Set are as follows. The download_file and zip_extract functions are borrowed from the article "Python Tips: I want to download a zip file from the Internet and use it".

import pandas as pd
import numpy as np

# download_file and zip_extract are the helper functions borrowed from the article mentioned above
filename = download_file('http://www.brain.kyutech.ac.jp/~furukawa/beverage-e/BeveragePreferenceDataset.zip')
zip_extract(filename)
# The file is whitespace-delimited and has no header row
df = pd.read_table('./BeveragePreferenceDataset/Beverage604.txt', header=None, sep=r'\s+')
df.shape
# (8456, 11)
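
The two helper functions are not defined in this article. As a rough sketch only, assuming nothing beyond the Python standard library, they could look like the following. This is not the code from the borrowed article; the `dest_dir` parameter and its default are my own assumptions, chosen so that the calls above work unchanged.

```python
import os
import shutil
import urllib.request
import zipfile


def download_file(url, dest_dir="."):
    """Download url into dest_dir and return the local file path."""
    # Assumption: the last path component of the URL is a usable file name.
    filename = os.path.join(dest_dir, os.path.basename(url))
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
    return filename


def zip_extract(filename, dest_dir="."):
    """Extract every member of the zip archive into dest_dir."""
    with zipfile.ZipFile(filename) as archive:
        archive.extractall(dest_dir)
```

With these defaults, the archive is saved and unpacked in the current directory, which matches the relative path passed to pd.read_table above.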

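Convert this DataFrame into third-order tensor data. The 8456 rows are 604 users x 14 beverages, so each consecutive block of 14 rows belongs to one user.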


X = np.zeros((604, 14, 11))
for i in range(X.shape[0]):
  # Rows 14*i to 14*(i+1)-1 hold the 14 beverages rated by user i
  start = i * 14
  X[i] = df.iloc[start:start + 14].values
X.shape
# (604, 14, 11)
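
The same conversion can probably be written as a single reshape, because NumPy's default row-major order keeps consecutive rows together. The sketch below checks this on synthetic data: `table` is a made-up stand-in for `df.values`, filled with arbitrary integers rather than real ratings.

```python
import numpy as np

n_users, n_beverages, n_situations = 604, 14, 11

# Stand-in for df.values: arbitrary integers with the same (8456, 11) layout.
rng = np.random.default_rng(0)
table = rng.integers(1, 6, size=(n_users * n_beverages, n_situations))

# Loop version: copy each user's block of 14 rows.
X_loop = np.zeros((n_users, n_beverages, n_situations))
for i in range(n_users):
    X_loop[i] = table[i * n_beverages:(i + 1) * n_beverages]

# Reshape version: row-major reshape groups consecutive rows by user.
X_reshape = table.reshape(n_users, n_beverages, n_situations).astype(float)

print(np.array_equal(X_loop, X_reshape))
```

On the real data this would correspond to `df.values.reshape(604, 14, 11)`, assuming the rows really are ordered user by user.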

Tensor decomposition

CP decomposition

An earlier article, "Tensor decomposition with PyTorch (CP decomposition)", already covers CP decomposition, so I will only explain it briefly here.

CP decomposition is a straightforward generalization of matrix factorization: it decomposes the third-order tensor $ X $ into a sum of $ R $ rank-one terms, each the outer product of three vectors, as follows.

$$ X = \sum_{r=1}^R u_r \circ v_r \circ w_r $$
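
To make the formula concrete, here is a minimal NumPy sketch that fits the factors by alternating least squares (ALS). This is only one common way to compute a CP decomposition and is not necessarily the method used for the results in this article. The function names `khatri_rao`, `cp_reconstruct`, and `cp_als` are my own. For small R the fit is generally an approximation of X rather than an exact equality.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I, R) and (J, R) -> (I * J, R)."""
    rank = a.shape[1]
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, rank)


def cp_reconstruct(u, v, w):
    """Sum of R outer products u_r o v_r o w_r as an (I, J, K) tensor."""
    return np.einsum("ir,jr,kr->ijk", u, v, w)


def cp_als(x, rank, n_iter=100, seed=0):
    """Fit factor matrices U (I, R), V (J, R), W (K, R) by ALS."""
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = x.shape
    u = rng.random((n_i, rank))
    v = rng.random((n_j, rank))
    w = rng.random((n_k, rank))
    # Mode-wise unfoldings; column order matches khatri_rao above.
    x1 = x.reshape(n_i, n_j * n_k)
    x2 = np.moveaxis(x, 1, 0).reshape(n_j, n_i * n_k)
    x3 = np.moveaxis(x, 2, 0).reshape(n_k, n_i * n_j)
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem for one factor
        # while the other two are held fixed.
        u = x1 @ np.linalg.pinv(khatri_rao(v, w)).T
        v = x2 @ np.linalg.pinv(khatri_rao(u, w)).T
        w = x3 @ np.linalg.pinv(khatri_rao(u, v)).T
    return u, v, w
```

With the tensor built earlier, `U, V, W = cp_als(X, rank=2)` would give a 604 x 2 matrix for users, a 14 x 2 matrix for beverages, and an 11 x 2 matrix for situations, whose rows can then be drawn as scatter plots.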

R=1

R=2

The points of U (user) are scattered in an elliptical shape, and in V (beverage) only two types appear to differ from the others.

Summary

I'd like to try HOSVD and Tucker decomposition as well, and will do so when I have time. This time I tried a linear tensor decomposition method, but there is also Tensor SOM, which corresponds to a nonlinear tensor decomposition. If you are interested, please try playing with the link below.

TensorSOM3 Viewer (beverage data), Japanese version
