DirectLiNGAM in Python

DirectLiNGAM (with bootstrapping) in Python

Notes & memorandums

Table of contents

◆ Introduction
◆ Environment
◆ Procedure
◆ 3 variables
  -- Preparation
  -- Data generation
  -- Bootstrap
  -- Confirmation of orientation
  -- Confirmation of DAG
◆ 7 variables
  -- Preparation
  -- Data generation
  -- Bootstrap
  -- Confirmation of orientation
  -- Confirmation of DAG
◆ Reference

Introduction

Using the lingam package set up in the previous post, I ran causal discovery on simulated data.

LiNGAM in Python https://qiita.com/kumalpha/items/f05bd031cf9daac464a0

Environment

OS: macOS Mojave (version 10.14.6)
Python: 3.7.6
JupyterLab: 1.2.6

Procedure

  1. Preparation
  2. Data generation
  3. Bootstrap
  4. Confirmation of orientation
  5. Confirmation of DAG

3 variables

Preparation

# DirectLiNGAM
# Imports and settings
import numpy as np
import pandas as pd
import graphviz
import lingam
from lingam.utils import make_dot

print([np.__version__, pd.__version__, graphviz.__version__, lingam.__version__])

np.set_printoptions(precision=3, suppress=True)
np.random.seed(0)
['1.18.1', '1.0.1', '0.13.2', '1.2.1']

Data generation

# Create test data
x0 = np.random.uniform(size=10000)
x1 = 3.0*x0 + np.random.uniform(size=10000)
x2 = 5.0*x0 + 0.5*x1 + np.random.uniform(size=10000)
X = pd.DataFrame(np.array([x0, x1, x2]).T, columns=['x0', 'x1', 'x2'])
X.head()
	x0	x1	x2
0	0.758125	2.643633	5.420089
1	0.503319	1.721282	3.439239
2	0.177017	1.007955	2.377846
3	0.832537	2.579844	6.171695
4	0.516825	1.788134	4.235760
# Visualize the true causal graph (m[i, j] is the coefficient of xj in the equation for xi)
m = np.array([[0.0, 0.0, 0.0],
              [3.0, 0.0, 0.0],
              [5.0, 0.5, 0.0]])
make_dot(m)
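
The matrix m encodes the same structural equations as the data-generation code: x = m x + e, and therefore x = (I - m)^(-1) e. As a small sanity check, the sketch below generates data both ways and confirms that they agree. It uses only NumPy and draws fresh noise with a seed chosen just for this illustration, so the numbers are not those of the table above.

```python
import numpy as np

# True coefficient matrix: m[i, j] is the effect of xj on xi
m = np.array([[0.0, 0.0, 0.0],
              [3.0, 0.0, 0.0],
              [5.0, 0.5, 0.0]])

rng = np.random.default_rng(0)   # seed chosen only for this illustration
e = rng.uniform(size=(3, 1000))  # independent uniform noise, one row per variable

# Sequential generation, as in the data-generation code
x0 = e[0]
x1 = 3.0 * x0 + e[1]
x2 = 5.0 * x0 + 0.5 * x1 + e[2]
x_seq = np.vstack([x0, x1, x2])

# Matrix form: x = m x + e  =>  (I - m) x = e
x_mat = np.linalg.solve(np.eye(3) - m, e)

print(np.allclose(x_seq, x_mat))
```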

Next, I use the bootstrap method to check how reliably DirectLiNGAM recovers this causal structure.

Bootstrap

model = lingam.DirectLiNGAM()
# Refit the model on 3000 bootstrap resamples of X
result = model.bootstrap(X, 3000)
# Keep at most 10 directions, ignoring effects whose absolute value is below 0.1
cdc = result.get_causal_direction_counts(n_directions=10, min_causal_effect=0.1)
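
Conceptually, the bootstrap resamples the rows of X with replacement, refits the model on each resample, and keeps every estimated adjacency matrix. The sketch below shows only that loop. The helper names `bootstrap_matrices` and `ols_in_known_order` are hypothetical, and the stand-in estimator is plain least squares with the causal order assumed to be known, so that the example runs with NumPy alone. It is not how lingam is implemented.

```python
import numpy as np

def ols_in_known_order(data):
    """Stand-in estimator: regress each column on all earlier columns.

    Assumes the columns are already in causal order, which DirectLiNGAM
    would have to estimate from the data itself.
    """
    n, p = data.shape
    B = np.zeros((p, p))
    for i in range(1, p):
        # Design matrix: all preceding variables plus an intercept column
        A = np.column_stack([data[:, :i], np.ones(n)])
        coef, *_ = np.linalg.lstsq(A, data[:, i], rcond=None)
        B[i, :i] = coef[:i]
    return B

def bootstrap_matrices(data, estimator, n_sampling, seed=0):
    """Refit `estimator` on n_sampling resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    results = []
    for _ in range(n_sampling):
        idx = rng.integers(0, n, size=n)  # row indices, with replacement
        results.append(estimator(data[idx]))
    return np.stack(results)

# Small data set with the same structure as the 3-variable example
rng = np.random.default_rng(1)
x0 = rng.uniform(size=2000)
x1 = 3.0 * x0 + rng.uniform(size=2000)
x2 = 5.0 * x0 + 0.5 * x1 + rng.uniform(size=2000)
data = np.column_stack([x0, x1, x2])

mats = bootstrap_matrices(data, ols_in_known_order, n_sampling=50)
print(mats.shape)         # (50, 3, 3): one adjacency matrix per resample
print(mats.mean(axis=0))  # average estimate over the resamples
```

With lingam installed, fitting `lingam.DirectLiNGAM()` on each resample and reading its `adjacency_matrix_` attribute would take the place of the stand-in estimator.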

Confirmation of orientation

from lingam.utils import print_causal_directions
print_causal_directions(cdc, 3000)
x1 <--- x0  (100.0%)
x2 <--- x0  (100.0%)
x2 <--- x1  (100.0%)

Because this data set is simple, every causal direction is already recovered in 100% of the bootstrap samples at this point.

Confirmation of DAG

The previous step looked only at pairwise relationships between variables. This step combines them and counts how often each complete DAG appears across the bootstrap samples.

dagc = result.get_directed_acyclic_graph_counts(n_dags=5, min_causal_effect=0.1)
from lingam.utils import print_dagc
print_dagc(dagc, 3000)
DAG[0]: 100.0%
	x1 <--- x0 
	x2 <--- x0 
	x2 <--- x1 

The directions were estimated correctly. I asked for up to 5 DAG candidates, but only one was printed, presumably because the same DAG was found in 100.0% of the bootstrap samples.
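
One way to picture the DAG counts: threshold each bootstrap adjacency matrix at min_causal_effect, treat the resulting edge set as the graph, and count identical edge sets. The NumPy sketch below does this on four hand-made matrices. The matrices and the helper name `count_dags` are made up for illustration, and lingam's internals may differ.

```python
import numpy as np
from collections import Counter

def count_dags(matrices, min_causal_effect=0.1):
    """Count identical edge sets across a list of adjacency matrices."""
    counter = Counter()
    for B in matrices:
        edges = np.abs(B) > min_causal_effect
        # Hashable key: (from, to) pairs, where edges[to, from] is True
        key = tuple((int(j), int(i)) for i, j in zip(*np.nonzero(edges)))
        counter[key] += 1
    total = len(matrices)
    return [(key, count / total) for key, count in counter.most_common()]

# Three matrices with the true structure, one where x2 <--- x1 is too weak
a = np.array([[0, 0, 0], [3.0, 0, 0], [5.0, 0.5, 0]])
b = np.array([[0, 0, 0], [2.9, 0, 0], [5.1, 0.4, 0]])
c = np.array([[0, 0, 0], [3.1, 0, 0], [4.9, 0.6, 0]])
d = np.array([[0, 0, 0], [3.0, 0, 0], [5.5, 0.05, 0]])

result_counts = count_dags([a, b, c, d])
for edges, share in result_counts:
    print(f"{share:.1%}", [f"x{to} <--- x{frm}" for frm, to in edges])
```

If every matrix yields the same edge set, the list has a single entry with a share of 1.0, which is consistent with only DAG[0] being printed.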

# Get the bootstrap probability of each edge
prob = result.get_probabilities(min_causal_effect=0.1)
print(prob)
[[0. 0. 0.]
 [1. 0. 0.]
 [1. 1. 0.]]

Every true edge also has a bootstrap probability of 1.
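
As I understand it, each entry of this matrix is the share of bootstrap samples in which the corresponding effect exceeded min_causal_effect in absolute value. A minimal NumPy sketch under that assumption follows; the helper `edge_probabilities` and the four example matrices are hypothetical.

```python
import numpy as np

def edge_probabilities(matrices, min_causal_effect=0.1):
    """Fraction of matrices in which |B[i, j]| exceeds the threshold."""
    stack = np.asarray(matrices, dtype=float)
    return (np.abs(stack) > min_causal_effect).mean(axis=0)

true_b = np.array([[0.0, 0.0, 0.0],
                   [3.0, 0.0, 0.0],
                   [5.0, 0.5, 0.0]])

# One of four made-up estimates has a weak x2 <--- x1 effect
weak_b = true_b.copy()
weak_b[2, 1] = 0.05

prob_sketch = edge_probabilities([true_b, true_b, true_b, weak_b])
print(prob_sketch)
```

In this toy case the entry for x2 <--- x1 drops to 0.75 while the other two edges stay at 1.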

7 variables

Next, I estimate a somewhat more complex causal structure.

Preparation

# DirectLiNGAM
# Imports and settings
import numpy as np
import pandas as pd
import graphviz
import lingam
from lingam.utils import make_dot

print([np.__version__, pd.__version__, graphviz.__version__, lingam.__version__])

np.set_printoptions(precision=3, suppress=True)
np.random.seed(0)
['1.18.1', '1.0.1', '0.13.2', '1.2.1']

Data generation

# Create test data
x0 = np.random.uniform(size=10000)
x6 = np.random.uniform(size=10000)
x1 = -5.0*x0 + np.random.uniform(size=10000)
x2 = -2.5*x0 + 3.0*x1 + np.random.uniform(size=10000)
x5 = 6.0*x6 + np.random.uniform(size=10000)
x3 = 4.0*x2 + 7.0*x5 + np.random.uniform(size=10000)
x4 = 1.0*x1 + 2.0*x2 + 8.0*x6 + np.random.uniform(size=10000)

X = pd.DataFrame(np.array([x0, x1, x2, x3, x4, x5, x6]).T, columns=['x0', 'x1', 'x2', 'x3', 'x4', 'x5', 'x6'])
X.head()
	x0	x1	x2	x3	x4	x5	x6
0	0.548814	-2.351895	-7.669592	3.641327	-10.776980	4.858864	0.748268
1	0.715189	-3.534790	-11.889026	-38.446302	-24.968283	1.292542	0.180203
2	0.602763	-2.090516	-7.601441	-9.739672	-13.753596	2.811044	0.389023
3	0.544883	-2.318181	-7.484214	-27.062919	-16.475002	0.307835	0.037600
4	0.423655	-1.173992	-4.064288	-13.340880	-8.625065	0.308386	0.011788
# Visualize the true causal graph (m[i, j] is the coefficient of xj in the equation for xi)
m = np.array([[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [-5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [-2.5, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [ 0.0, 0.0, 4.0, 0.0, 0.0, 7.0, 0.0],
              [ 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 8.0],
              [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 6.0],
              [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
make_dot(m)

As before, I use the bootstrap method to check how reliably DirectLiNGAM recovers this causal structure.

Bootstrap

model = lingam.DirectLiNGAM()
# Refit the model on 3000 bootstrap resamples of X
result = model.bootstrap(X, 3000)
# Keep at most 10 directions, ignoring effects whose absolute value is below 0.1
cdc = result.get_causal_direction_counts(n_directions=10, min_causal_effect=0.1)

Confirmation of orientation

from lingam.utils import print_causal_directions
print_causal_directions(cdc, 3000)
x1 <--- x0  (100.0%)
x2 <--- x0  (100.0%)
x2 <--- x1  (100.0%)
x3 <--- x5  (100.0%)
x4 <--- x1  (100.0%)
x4 <--- x2  (100.0%)
x5 <--- x6  (100.0%)
x4 <--- x6  (97.4%)
x3 <--- x2  (96.1%)
x3 <--- x4  (4.8%)

Confirmation of DAG

dagc = result.get_directed_acyclic_graph_counts(n_dags=5, min_causal_effect=0.1)
from lingam.utils import print_dagc
print_dagc(dagc, 3000)
DAG[0]: 88.5%
	x1 <--- x0 
	x2 <--- x0 
	x2 <--- x1 
	x3 <--- x2 
	x3 <--- x5 
	x4 <--- x1 
	x4 <--- x2 
	x4 <--- x6 
	x5 <--- x6 
DAG[1]: 3.9%
	x1 <--- x0 
	x2 <--- x0 
	x2 <--- x1 
	x3 <--- x4 
	x3 <--- x5 
	x4 <--- x1 
	x4 <--- x2 
	x4 <--- x6 
	x5 <--- x6 
DAG[2]: 2.7%
	x1 <--- x0 
	x2 <--- x0 
	x2 <--- x1 
	x3 <--- x2 
	x3 <--- x5 
	x4 <--- x0 
	x4 <--- x1 
	x4 <--- x2 
	x4 <--- x6 
	x5 <--- x6 
DAG[3]: 2.6%
	x1 <--- x0 
	x2 <--- x0 
	x2 <--- x1 
	x3 <--- x2 
	x3 <--- x5 
	x4 <--- x1 
	x4 <--- x2 
	x4 <--- x3 
	x5 <--- x6 
DAG[4]: 0.9%
	x1 <--- x0 
	x2 <--- x0 
	x2 <--- x1 
	x3 <--- x2 
	x3 <--- x4 
	x3 <--- x5 
	x4 <--- x1 
	x4 <--- x2 
	x4 <--- x6 
	x5 <--- x6 

The correct DAG (DAG[0], which matches the data-generating structure) was recovered in 88.5% of the bootstrap samples.
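
That DAG[0] matches the data-generating structure can be checked mechanically. The sketch below rebuilds the true edge set from the coefficient matrix m and compares it with the DAG[0] edges, copied by hand from the printed output. The helper `edges_from_matrix` exists only for this illustration.

```python
import numpy as np

# True coefficient matrix from the data-generation step
m = np.array([[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [-5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [-2.5, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [ 0.0, 0.0, 4.0, 0.0, 0.0, 7.0, 0.0],
              [ 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 8.0],
              [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 6.0],
              [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])

def edges_from_matrix(B):
    """Return the set of (to, from) index pairs with a nonzero coefficient."""
    return {(int(i), int(j)) for i, j in zip(*np.nonzero(B))}

# DAG[0] as printed: "x1 <--- x0" becomes (1, 0), and so on
dag0 = {(1, 0), (2, 0), (2, 1), (3, 2), (3, 5),
        (4, 1), (4, 2), (4, 6), (5, 6)}

true_edges = edges_from_matrix(m)
print(true_edges == dag0)
print("missing from DAG[0]:", sorted(true_edges - dag0))
print("extra in DAG[0]:", sorted(dag0 - true_edges))
```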

prob = result.get_probabilities(min_causal_effect=0.1)
print(prob)
[[0.    0.    0.    0.    0.    0.    0.   ]
 [1.    0.    0.    0.    0.    0.    0.   ]
 [1.    1.    0.    0.    0.    0.    0.   ]
 [0.006 0.    0.961 0.    0.048 1.    0.007]
 [0.027 1.    1.    0.026 0.    0.    0.974]
 [0.    0.    0.    0.    0.    0.    1.   ]
 [0.    0.    0.    0.    0.    0.    0.   ]]
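
Comparing this matrix with the true coefficient matrix separates two kinds of error: true edges that were missed in some bootstrap samples, and spurious edges that appeared in some samples. The sketch below does the comparison with the probabilities copied by hand from the printed output; the variable names are my own.

```python
import numpy as np

# True coefficient matrix from the data-generation step
m = np.array([[ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [-5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [-2.5, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
              [ 0.0, 0.0, 4.0, 0.0, 0.0, 7.0, 0.0],
              [ 0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 8.0],
              [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 6.0],
              [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])

# Bootstrap probabilities copied from the printed output
prob = np.array([[0.0,   0.0, 0.0,   0.0,   0.0,   0.0, 0.0],
                 [1.0,   0.0, 0.0,   0.0,   0.0,   0.0, 0.0],
                 [1.0,   1.0, 0.0,   0.0,   0.0,   0.0, 0.0],
                 [0.006, 0.0, 0.961, 0.0,   0.048, 1.0, 0.007],
                 [0.027, 1.0, 1.0,   0.026, 0.0,   0.0, 0.974],
                 [0.0,   0.0, 0.0,   0.0,   0.0,   0.0, 1.0],
                 [0.0,   0.0, 0.0,   0.0,   0.0,   0.0, 0.0]])

is_true_edge = m != 0

# True edges that were not found in every bootstrap sample
missed = [(int(i), int(j), float(prob[i, j]))
          for i, j in zip(*np.nonzero(is_true_edge)) if prob[i, j] < 1.0]

# Edges that appeared in some samples although the true coefficient is zero
spurious = [(int(i), int(j), float(prob[i, j]))
            for i, j in zip(*np.nonzero(prob)) if not is_true_edge[i, j]]

for label, edges in [("missed at least once", missed), ("spurious", spurious)]:
    for i, j, p in edges:
        print(f"{label}: x{i} <--- x{j}  ({p:.1%})")
```

With these numbers, the two true edges that were sometimes missed are x3 <--- x2 and x4 <--- x6, and every spurious edge points into x3 or x4.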

Reference

・ LiNGAM (ICA version) to understand with mathematical formulas and Python https://qiita.com/k-kotera/items/6d7f5598464e18afaa7c
・ Causal reasoning by structural equation model: Recent developments in causal structure search https://www.slideshare.net/sshimizu2006/bsj2012-tutorial-finalweb
・ LiNGAM in Python https://qiita.com/kumalpha/items/f05bd031cf9daac464a0
・ LiNGAM docs https://lingam.readthedocs.io/en/latest/index.html
・ Lingam GitHub (examples) https://github.com/cdt15/lingam/tree/master/examples
