[Python] Analysis of measurement data ② -Histograms and fitting, a recommendation for lmfit-

Preface

At a meeting one day. Professor: "T-kun, can't you fit this graph a little better? Can't you change the fit range, or add weighting?" T: "... I'll give it a try."

So the fitting needs to be revised. Requests to change the analysis procedure on a whim come up often, and they are hard to deal with. I want to be able to handle them on the spot.

This time, instead of scipy.optimize.curve_fit, I will introduce a library called lmfit. Fitting with scipy was covered in Analysis of measurement data ① -A memorandum on scipy fitting-. The coding is done on Jupyter, and Jupyter is recommended.

I hope this serves as a decent introduction to lmfit.

The notebook and sample data are also available on GitHub. From here

About lmfit

lmfit is developed as an extension of scipy.optimize. It lets you constrain parameters and weight the residuals when fitting. The workflow is to attach various pieces of information to objects and finally run the fit command to optimize. Once you get used to it, I find it more convenient than scipy.

This experiment

Pulse data were obtained from a measuring instrument. First, we make a histogram of the pulse peak values. Then we fit it and evaluate the sharpness of the spectrum. Finally, we try to improve the sharpness using lmfit's weighting option.

Data reading

Read the pulse data dumped from the instrument with pandas. As before, use tkinter to get the file path.

.ipynb


import tkinter
from tkinter import filedialog
import numpy as np
import pandas as pd

root = tkinter.Tk()
root.withdraw()
file = filedialog.askopenfilename(
    title='file path_',
    filetypes=[("csv", "*.csv")])  # pattern must be "*.csv" to match csv files
if file:
    pulse = pd.read_csv(file, engine='python', header=0, index_col=0)

pulse.iloc[:,[0,1,2]].plot()

(Figure: plot of three pulse waveforms.) Only three of the pulses are plotted here. The data were recorded so that the pulse peak falls near the 50th sample.

Histogram creation

Create a histogram of the maximum peak values. When creating a histogram, you need to choose the bin width of the horizontal axis. This time, following Sturges' rule, the number of bins is set to k = 1 + log2(n).

* Note that np.histogram returns the bin **edges**, one more of them than the counts, so they must be collapsed to bin centers before plotting against the counts.
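As a quick illustration of this note (using synthetic peak values, not the measured data), the edge/count length mismatch and the bin-center trick look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(70, 3, 1000)       # synthetic peak values for illustration

k = int(1 + np.log2(data.size))      # Sturges' rule: k = 1 + log2(n)
count, edges = np.histogram(data, bins=k)

# np.histogram returns k counts but k + 1 bin edges,
# so collapse adjacent edges to bin centers before plotting
centers = (edges[:-1] + edges[1:]) / 2

print(k, count.size, edges.size, centers.size)  # → 10 10 11 10
```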

.ipynb


bins = int(1 + np.log2(pulse.shape[1]))  # Sturges' rule
count, level = np.histogram(pulse.max(), bins=bins)
# np.histogram returns bin edges (one more than the counts),
# so collapse them to bin centers
level = (level[:-1] + level[1:]) / 2

import matplotlib.pyplot as plt
plt.plot(level, count, marker='.', linestyle='None')
plt.show()

image.png

There is a small bump on the right side. Let's separate out the data with peak values above 50 and rebuild the distribution for that right-hand bump.

.ipynb


mask = pulse.max() > 50              # boolean mask over the pulses
pulse = pulse.loc[:, mask]           # keep only the masked pulse data

bins = int(1 + np.log2(pulse.shape[1]))  # Sturges' rule
count, level = np.histogram(pulse.max(), bins=bins)
level = (level[:-1] + level[1:]) / 2     # bin centers
plt.plot(level, count, marker='o', linestyle='None')
plt.show()

Something like a normal distribution has appeared.

Fitting with lmfit

As the imports show, lmfit uses several class objects for fitting. The fitting function is a Gaussian, given below: the peak center is $\mu$, the spread of the distribution is $\sigma$, and $amp$ is the amplitude term.

$$f\left(x\right)=\frac{amp}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\left(x-\mu \right)^2}{2\sigma^2}\right)$$

First, pass the fitting function to the Model class. Next, create a Parameters object and give it initial values. The Model class has a .make_params() method, which automatically builds a Parameters object from the arguments of the given function after the first one (the independent variable).
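Before handing the Gaussian to lmfit, a quick numerical sanity check (plain numpy, independent of lmfit, with arbitrarily chosen parameter values) confirms that $amp$ is the area under the curve rather than the peak height:

```python
import numpy as np

def gaussian(x, amp, mu, sigma):
    return amp / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(x - mu)**2 / (2 * sigma**2))

# Numerically integrate with a simple Riemann sum over a wide range
x = np.linspace(-50, 50, 100001)
dx = x[1] - x[0]
area = np.sum(gaussian(x, amp=3.0, mu=5.0, sigma=2.0)) * dx
print(area)  # ~3.0: the integral recovers amp
```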

.ipynb


from lmfit import Model, Parameters, Parameter

def gaussian(x, amp, mu, sigma):
    out = amp / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(x - mu)**2 / (2 * sigma**2))
    return out

model = Model(gaussian)     # build a Model object from the fitting function
pars = model.make_params()  # generate the Parameters object
pars

The created Parameters object is a set of Parameter objects, and each parameter can be accessed like a dictionary entry. From there, the .set method lets you adjust, per parameter:

・ the initial value: value
・ upper and lower bounds: min, max (default -np.inf to np.inf)
・ whether the value may vary during the fit (constraint): vary=True or False

.ipynb


pars['mu'].set(value=70)
pars['amp'].set(value=1)
pars['sigma'].set(value=1)
pars

It's very rough, but I gave the initial values. Each parameter can be configured in detail, and the set is displayed as a DataFrame-like table. It's cool ...

Perform fitting

.ipynb


result = model.fit(data=count,params=pars,x=level)
result

The result is displayed as a fit report. Executing .fit creates a new ModelResult object, which has a .plot_fit() method; visualize the fit with it.

You can retrieve the optimized parameters as a dictionary from the .best_values attribute.

.ipynb


opt = result.best_values
opt['mu']/opt['sigma']

Sharpness was evaluated as $\mu/\sigma$. It came to 23.67.

I tried changing the bin width of the horizontal axis

・ Scott's rule $h = \frac{3.49\sigma}{ n^\frac{1}{3} }$ I tried defining the bin width with this formula. Since we went to the trouble of fitting, the fitted value is used for $\sigma$.

.ipynb


h = 3.49 * opt['sigma'] / (pulse.shape[1] ** (1 / 3))  # Scott's rule
r = pulse.max().max() - pulse.max().min()              # range of the peak values
bins = int(r / h)

count, level = np.histogram(pulse.max(), bins=bins)
level = (level[:-1] + level[1:]) / 2                   # bin centers
plt.plot(level, count, marker='o', linestyle='None')
plt.show()
plt.show()

image.png

The bins are narrower and the maximum count is smaller, but the shape of the distribution looks fine. Refit by swapping in only the new data and x values. Sharpness is ... 24.44, an improvement of 0.77.
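For intuition on why Scott's rule produced more bins here, a synthetic comparison (made-up normal data, not the measured pulses, and the sample standard deviation standing in for the fitted $\sigma$):

```python
import numpy as np

rng = np.random.default_rng(1)
peaks = rng.normal(70, 3, 1000)     # synthetic peak values

# Sturges' rule: bin count from the sample size alone
k_sturges = int(1 + np.log2(peaks.size))

# Scott's rule: bin width from the spread, converted to a bin count
h = 3.49 * peaks.std() / peaks.size ** (1 / 3)
k_scott = int((peaks.max() - peaks.min()) / h)

print(k_sturges, k_scott)  # Scott's rule gives noticeably more bins here
```

Sturges depends only on n, so it saturates for large samples, while Scott's width shrinks with n and adapts to the spread of the data.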

I tried to weight the error

This is where the actual request comes in.

He wanted to emphasize the range $\mu \pm \sigma$ and set the weights below 63 to 0. Weighting is done by passing the weights option to .fit.
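To sketch what weights do in a least-squares fit before applying them to the histogram, here is a plain numpy example (using np.polyfit's w argument as a stand-in for lmfit's weights; both multiply the unsquared residuals, so a weight of 0 removes a point from the fit entirely):

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[7] = 50.0                    # inject one outlier (true value would be 15)

w = np.ones_like(x)
w[7] = 0.0                     # weight 0 -> the outlier is ignored by the fit

slope_plain, _ = np.polyfit(x, y, 1)
slope_weighted, intercept_weighted = np.polyfit(x, y, 1, w=w)

print(round(slope_weighted, 6), round(intercept_weighted, 6))  # → 2.0 1.0
```

The unweighted fit is dragged toward the outlier, while the weighted one recovers the underlying line exactly.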

.ipynb


s = (level > opt['mu'] - opt['sigma']) * (level < opt['mu'] + opt['sigma'])  # inside mu +/- sigma
weight = (np.ones(level.size) + 4 * s) * (level > 63)  # 5x weight inside the band, 0 below 63
result3 = model.fit(data=count, params=pars, x=level, weights=weight)

↓ Weights & result. It doesn't look much different, but looking closely the curve is pulled down slightly. The sharpness is ... 25.50, sharper by 1.06.

The fitted curve can be retrieved as an array from the .best_fit attribute, which was convenient for writing out a csv. Problem safely solved ...

Summary

This time the parameters were generated automatically with Model.make_params(), but you can also create an empty Parameters() object directly and add variables to it with the .add method.

I was confused at first because fitting involves several objects, and it was easy to lose track of which method belongs to which. But as you get used to it, you come to appreciate Parameter(s).

The scarcity of sites explaining lmfit in Japanese was another reason I wrote this article. Is there a better library ...? I'd appreciate any pointers. That way, future juniors won't struggle. It was the first time I wrote code while reading the official reference (laughs). With research results on the line, there was no choice but to do it.

Going forward, I'd like to learn a bit more technology that seems connected to everyday life, and to apply the skills I have across various fields. I want to be creative.

That's all for the python-related progress. I couldn't report it to my professor (no one else there uses python), so I'd like to take this opportunity to do so. I am deeply grateful to the professor for waiting patiently for my data analysis.
