RegressionAutomaticSearch
I wrote a program that runs regression analysis with several machine learning models while varying their parameters, so that the best model and parameter combination can be found. This time, I will try to predict house prices in Boston. The models used are LinearRegression, DecisionTree, RandomForest, and AdaBoost.
Create venv
C:\RegressionAutomaticSearch>py -m venv venv
Activate venv
C:\RegressionAutomaticSearch>.\venv\Scripts\activate.bat
(venv) C:\RegressionAutomaticSearch>
Upgrade pip
(venv) C:\RegressionAutomaticSearch>python -m pip install --upgrade pip
Install the required packages in bulk
(venv) C:\RegressionAutomaticSearch>pip install -r requirements.txt
Change the path to point to the dataset you want to run the regression on.
###########################
# read datasets
#
# If the leftmost column is an index
df = pd.read_csv('./datasets/boston_datasets.csv', index_col=0)
The contents look like this. The program reads a single file that contains both the explanatory variables and the objective variable.
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT MONEY
0 0.00632 18.0 2.31 0.0 0.538 6.575 65.2 4.0900 1.0 296.0 15.3 396.90 4.98 24.0
1 0.02731 0.0 7.07 0.0 0.469 6.421 78.9 4.9671 2.0 242.0 17.8 396.90 9.14 21.6
2 0.02729 0.0 7.07 0.0 0.469 7.185 61.1 4.9671 2.0 242.0 17.8 392.83 4.03 34.7
3 0.03237 0.0 2.18 0.0 0.458 6.998 45.8 6.0622 3.0 222.0 18.7 394.63 2.94 33.4
4 0.06905 0.0 2.18 0.0 0.458 7.147 54.2 6.0622 3.0 222.0 18.7 396.90 5.33 36.2
Drop the objective variable from the data frame to get the explanatory variables.
# Use everything except MONEY as the explanatory variables
boston_X = df.drop("MONEY", axis=1)
X = boston_X.values
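As a minimal sketch of the two steps above, the following reads a few of the sample rows from an in-memory CSV and splits them into explanatory and objective variables. The inline CSV text and the name y are assumptions for illustration. The excerpt above only shows how X is built, so taking MONEY as y is my reading of the column layout rather than the author's confirmed code.

```python
import io

import pandas as pd

# Three rows copied from the sample above. The first (unnamed) column is the
# index, which is why index_col=0 is passed to read_csv.
csv_text = """,CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,MONEY
0,0.00632,18.0,2.31,0.0,0.538,6.575,65.2,4.0900,1.0,296.0,15.3,396.90,4.98,24.0
1,0.02731,0.0,7.07,0.0,0.469,6.421,78.9,4.9671,2.0,242.0,17.8,396.90,9.14,21.6
2,0.02729,0.0,7.07,0.0,0.469,7.185,61.1,4.9671,2.0,242.0,17.8,392.83,4.03,34.7
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Explanatory variables: every column except MONEY.
X = df.drop("MONEY", axis=1).values

# Objective variable: the MONEY column (assumed, not shown in the article).
y = df["MONEY"].values

print(X.shape, y.shape)
```

If the file has no index column on the far left, index_col=0 should be omitted, otherwise the first explanatory variable would be consumed as the index.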
Adjust the parameter lists passed to each model.
def model_import(self):
    models_names = [ self.LinearRegression(),
                     self.DecisionTreeRegressor(list(range(2, 30, 2))),
                     self.RandomForestRegressor(list(range(2, 30, 2)), list(range(20, 200, 20))),
                     self.AdaBoostRegressor(list(range(20, 200, 20)))]
    models = []
    names = []
    for model_, name_ in models_names:
        if isinstance(model_, list):
            models.extend(model_)
            names.extend(name_)
        else:
            models.append(model_)
            names.append(name_)
    return models, names
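To see how those parameter lists turn into a flat list of candidates, here is a rough standalone sketch of the same flattening pattern. Everything in it is hypothetical: the helper functions stand in for the article's factory methods, each candidate is a plain dict of parameters instead of a real estimator, and I am assuming that the lists map to max_depth and n_estimators and that RandomForest combines its two lists as a Cartesian product. The actual repository may do this differently.

```python
from itertools import product

# Hypothetical stand-ins for the article's factory methods. Each returns either
# one (candidate, name) pair or a (list of candidates, list of names) pair.
# A candidate here is just a dict of constructor parameters.

def linear_regression():
    return {}, "LinearRegression"

def decision_tree(max_depths):
    params = [{"max_depth": d} for d in max_depths]
    names = [f"DecisionTree_depth{d}" for d in max_depths]
    return params, names

def random_forest(max_depths, n_estimators):
    # Assumption: the two lists are combined as a Cartesian product.
    grid = list(product(max_depths, n_estimators))
    params = [{"max_depth": d, "n_estimators": n} for d, n in grid]
    names = [f"RandomForest_depth{d}_n{n}" for d, n in grid]
    return params, names

def ada_boost(n_estimators):
    params = [{"n_estimators": n} for n in n_estimators]
    names = [f"AdaBoost_n{n}" for n in n_estimators]
    return params, names

def collect_candidates():
    depths = list(range(2, 30, 2))          # 2, 4, ..., 28
    estimators = list(range(20, 200, 20))   # 20, 40, ..., 180
    results = [
        linear_regression(),
        decision_tree(depths),
        random_forest(depths, estimators),
        ada_boost(estimators),
    ]
    candidates, names = [], []
    for cand, name in results:
        if isinstance(cand, list):
            # A factory that returned many candidates: splice them in.
            candidates.extend(cand)
            names.extend(name)
        else:
            # A factory that returned a single candidate.
            candidates.append(cand)
            names.append(name)
    return candidates, names

candidates, names = collect_candidates()
print(len(candidates), names[:3])
```

Under these assumptions the search space grows quickly, because the RandomForest grid alone multiplies the two list lengths. Narrowing either range is the simplest way to shorten the run.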
Execute main.py.
(venv) C:\RegressionAutomaticSearch>python main.py
The result images and a CSV file of the errors and coefficients of determination are written to the result folder.
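For reference, the two numbers in that CSV can be computed as in the sketch below. This is only an illustration under assumptions: I am taking "error" to mean mean squared error, the column names and the predicted values are made up, and the CSV is written to an in-memory buffer instead of the result folder. The true values are the MONEY column of the sample rows shown earlier.

```python
import csv
import io

def mse(y_true, y_pred):
    # Mean squared error: average of the squared residuals.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# MONEY values from the five sample rows.
y_true = [24.0, 21.6, 34.7, 33.4, 36.2]

# Hypothetical predictions from two imaginary models, for illustration only.
predictions = {
    "model_a": [25.0, 22.0, 33.0, 34.0, 35.0],
    "model_b": [30.0, 30.0, 30.0, 30.0, 30.0],
}

# Write one row per model. The column names are assumptions.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["name", "mse", "r2"])
for name, y_pred in predictions.items():
    writer.writerow([name, mse(y_true, y_pred), r2(y_true, y_pred)])

print(buffer.getvalue())
```

A coefficient of determination close to 1 means the predictions track the true values closely, while a value at or below 0 means the model does no better than always predicting the mean.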
The source code is on GitHub at the link below. There are still areas to adjust, so I will update it from time to time.
https://github.com/upamasaki/RegressionAutomaticSearch