Lognormal probability plot with Python, matplotlib

Introduction

The report contains the following data.

Non-exceedance
probbility
Return period
(years)
Peak discharge
(m3/s)
0.50 2 950
0.80 51650
0.90 102190
0.95 202780
0.98 503610
0.99 1004330
0.995 2005090
0.998 5006190
0.99910007090
Note: based on 2-parameter lognormal distribution.

The stochastic flood discharge seems to be calculated by a two-variable lognormal distribution. Using this data, we will try to do the following.

It will be extrapolated to the data, but it is better than giving the numbers by intuition in the limited information. Well, it's a program practice, so keep an eye out for statistical rigor.

Flow of examination

Preparing for regression analysis

In preparation for regression analysis, we calculate the percentage points corresponding to the non-exceeding probabilities in the standard normal distribution and the common logarithmic value of the flow rate.

ppp=norm.ppf(Pin, loc=0, scale=1)
qqq=np.log10(Qin)

regression analysis

Next, linear regression is performed with the objective variable as the logarithmic value of the flow rate and the explanatory variable as the percentage point of the non-exceeding probability, and the flood flow rate corresponding to the 10,000-year return period is estimated.

Calculation of regression coefficient and correlation coefficient by linear regression using scipy: optimize.leastsq

parameter0 = [0.0,0.0]
result = optimize.leastsq(func,parameter0,args=(ppp,qqq))
aa=result[0][0]
bb=result[0][1]
rr=np.corrcoef(ppp,qqq)

Function for optimize.leastsq (expressed in a format that is equal to 0)

def func(parameter,x,y):
    a = parameter[0]
    b = parameter[1]
    residual = y-(a*x+b)
    return residual

Estimating the 1000-year return period using regression results

pp=0.9999
xq=norm.ppf(pp, loc=0, scale=1)
Qpp=10**(aa*xq+bb)

Plot on lognormal probability paper

Plot on lognormal probability paper to check if it follows the lognormal distribution. Here, the horizontal axis of the graph is the flood flow rate (logarial value), and the vertical axis is the probability value. Therefore, the default axis scale is deleted, and both the horizontal axis and the vertical axis are set independently.

Remove axis tick marks and tick line

plt.tick_params(labelbottom='off')
plt.tick_params(labelleft='off')
plt.tick_params(which='both', width=0)

The graph is saved as a png image, but the margins are deleted in consideration of pasting it on Qiita etc.

plt.savefig('fig_flood_p.png', bbox_inches="tight", pad_inches=0.2)

program

py_qq.py


import numpy as np
from scipy.stats import norm # normal distribution
from scipy import optimize
import matplotlib.pyplot as plt


# function for optimize.leastsq
def func(parameter,x,y):
    a = parameter[0]
    b = parameter[1]
    residual = y-(a*x+b)
    return residual

#==============================
# data analysis
#==============================
# flood discharge data
Qin=np.array([950,1650,2190,2780,3610,4330,5090,6190,7090])
# non-exceedance probability data
Pin=np.array([0.50,0.80,0.90,0.95,0.98,0.99,0.995,0.998,0.999])

ppp=norm.ppf(Pin, loc=0, scale=1)
qqq=np.log10(Qin)

# least square method
parameter0 = [0.0,0.0]
result = optimize.leastsq(func,parameter0,args=(ppp,qqq))
aa=result[0][0]
bb=result[0][1]
rr=np.corrcoef(ppp,qqq)

# estimation of 10,000 years return period flood
pp=0.9999
xq=norm.ppf(pp, loc=0, scale=1)
Qpp=10**(aa*xq+bb)

print('log10(Q) = a * x + b')
print('a={aa:10.6f} b={bb:10.6f} r={rr[0][1]:10.6f}'.format(**locals()))
print('Q={Qpp:10.3f} (pp={pp:0.4f})'.format(**locals()))


#==============================
# plot by matplotlib
#==============================
fig = plt.figure(figsize=(7,8))
xmin=np.log10(100)
xmax=np.log10(20000)
ymin=norm.ppf(0.0001, loc=0, scale=1)
ymax=norm.ppf(0.9999, loc=0, scale=1)
plt.xlim([xmin,xmax])
plt.ylim([ymin,ymax])
plt.tick_params(labelbottom='off')
plt.tick_params(labelleft='off')
plt.tick_params(which='both', width=0)

# data plot
plt.plot(qqq,ppp,'o')
# straight line by regression analysis
plt.plot([xmin, xmax], [(xmin-bb)/aa, (xmax-bb)/aa], color='k', linestyle='-', linewidth=1)

# setting of x-y axes
_dy=np.array([0.0001,0.001,0.01,0.1,0.5,0.9,0.99,0.999,0.9999])
dy=norm.ppf(_dy, loc=0, scale=1)
plt.hlines(dy, xmin, xmax, color='grey')
_dx=np.array([100,200,300,400,500,600,700,800,900,1000,2000,3000,4000,5000,6000,7000,8000,9000,10000,20000])
dx=np.log10(_dx)
plt.vlines(dx, ymin, ymax, color='grey')
fs=16
for i in range(2,5):
    plt.text(float(i), ymin-0.1, str(10**i), ha = 'center', va = 'top', fontsize=fs)
for i in range(0,9):
    plt.text(xmin-0.01, dy[i], str(_dy[i]), ha = 'right', va = 'center', fontsize=fs)
plt.text(0.5*(xmin+xmax), ymin-0.5, 'Flood discharge (m$^3$/s)', ha = 'center', va = 'center', fontsize=fs)
plt.text(xmin-0.25,0.5*(ymin+ymax),'Non-exceedance probability', ha = 'center', va = 'center', fontsize=fs, rotation=90)

# image saving and image showing
plt.savefig('fig_flood_p.png', bbox_inches="tight", pad_inches=0.2)
plt.show()

Output result

$ python3 py_qq.py
log10(Q) = a * x + b
a=  0.282460 b=  2.978647 r=  0.999996
Q= 10693.521 (pp=0.9999)

fig_flood_p.png

Reference site

Scipy statistical functions http://kaisk.hatenadiary.com/entry/2015/02/17/192955

Linear regression case with Scipy http://www2.kaiyodai.ac.jp/~kentaro/materials/new_HP/python/15fit_data3.html

Remove image margins with matplotlib https://mzmttks.blogspot.my/2012/01/pylab-2.html

Recommended Posts

Lognormal probability plot with Python, matplotlib
Create plot animation with Python + Matplotlib
2-axis plot with Matplotlib
Heatmap with Python + matplotlib
Create 3D scatter plot with SciPy + matplotlib (Python)
Stackable bar plot with matplotlib
[Python] How to draw a scatter plot with Matplotlib
A python graphing manual with Matplotlib.
[Python] font family and font with matplotlib
Heatmap with Dendrogram in Python + matplotlib
Continuously color with matplotlib scatter plot
Draw Lyapunov Fractal with Python, matplotlib
When matplotlib doesn't work with python2.7
[Python] Let's make matplotlib compatible with Japanese
FizzBuzz with Python3
Scraping with Python
Statistics with python
Scraping with Python
#Python basics (#matplotlib)
Twilio with Python
Integrate with Python
Play with 2016-Python
AES256 with python
python starts with ()
Animation with matplotlib
My matplotlib (python)
Japanese with matplotlib
Plot CSV of time series data with unixtime value in Python (matplotlib)
Bingo with python
Zundokokiyoshi with python
Animation with matplotlib
Histogram with matplotlib
Animate with matplotlib
Excel with Python
Microcomputer with Python
Cast with python
[Python] limit axis of 3D graph with Matplotlib
Read Python csv data with Pandas ⇒ Graph with Matplotlib
Plot ROC Curve for Binary Classification with Matplotlib
1. Statistics learned with Python 2-1. Probability distribution [discrete variable]
Visualize grib2 on a map with python (matplotlib)
[Python] Customize Colormap when drawing graphs with matplotlib
[Scientific / technical calculation by Python] Plot, visualize, matplotlib 2D data with error bars
Draw a line / scatter plot on the CSV file (2 columns) with python matplotlib
Serial communication with Python
Django 1.11 started with Python3.6
Primality test with Python
Python with eclipse + PyDev.
Socket communication with Python
Data analysis with python 2
Scraping with Python (preparation)
Try scraping with Python.
Learning Python with ChemTHEATER 03
"Object-oriented" learning with python
Run Python with VBA
Handling yaml with python
Solve AtCoder 167 with python
Serial communication with python
[Python] Use JSON with Python
Learning Python with ChemTHEATER 05-1
Learn Python with ChemTHEATER