[PYTHON] Kaggle House Prices ③ ~ Forecast / Submission ~

Use the models created in the previous article, Kaggle House Prices ② ~ Model Creation ~, to predict on the test data and create a submission file.

Loading the libraries

# sklearn.externals.joblib was removed in scikit-learn 0.23; use the standalone joblib package
import joblib
import numpy as np
import pandas as pd

Loading the data

def load_x_test() -> pd.DataFrame:
    """Load the test-data features created in advance

    :return: Test data features
    """
    return joblib.load('test_x.pkl')

def load_model(i_fold):
    """Load a previously trained model

    :param i_fold: Index of the fold whose model should be loaded
    :return: Model of the target fold
    """
    return joblib.load(f'model-{i_fold}.pkl')

def load_pred_test():
    """Load the test-data predictions saved in advance

    :return: Predictions for the test data
    """
    return joblib.load('pred-test.pkl')
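
The loaders above rely on files that were written earlier with joblib.dump. As a minimal sketch of that round trip, the following saves a small array to a temporary directory and reads it back; the file name pred-demo.pkl and the values are hypothetical and used only for illustration.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np

# Hypothetical values standing in for saved predictions
original = np.array([11.5, 12.0, 12.5])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, written to a throwaway directory
    path = str(Path(tmp_dir) / 'pred-demo.pkl')
    joblib.dump(original, path)    # serialize the array to disk
    restored = joblib.load(path)   # read it back into memory

print(np.array_equal(original, restored))
```

The same pattern applies to DataFrames and fitted models, provided the libraries used to save an object are also installed when it is loaded.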

Predicting on the test data

# Predict on the test data by averaging the models trained on each cross-validation fold
test_x = load_x_test()
preds = []
n_fold = 4

# Predict with each fold's model
for i_fold in range(n_fold):
    print(f'start prediction fold:{i_fold}')
    model = load_model(i_fold)
    pred = model.predict(test_x)
    preds.append(pred)
    print(f'end prediction fold:{i_fold}')

# Average the predictions across folds (one value per test row)
pred_avg = np.mean(preds, axis=0)

# Save the averaged predictions
joblib.dump(pred_avg, 'pred-test.pkl')
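
To make the averaging step concrete, here is a small sketch with made-up numbers: four hypothetical fold predictions for three test rows. np.mean with axis=0 averages across the folds and returns one value per test row.

```python
import numpy as np

# Hypothetical predictions from 4 fold models for 3 test rows
fold_preds = [
    np.array([12.0, 11.5, 12.4]),
    np.array([12.2, 11.7, 12.2]),
    np.array([11.8, 11.5, 12.6]),
    np.array([12.0, 11.3, 12.4]),
]

# axis=0 averages over the list of folds, keeping one value per row
avg = np.mean(fold_preds, axis=0)
print(avg.shape)  # (3,)
```

With axis=1 the result would instead be one mean per fold, which is not what the submission needs.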

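Creating the submission file

pred = load_pred_test()
# Sanity check: number of predictions and the test features
print(len(pred))
print(load_x_test())
# Take the Id column from the original test data
submission = pd.read_csv('test.csv', usecols=['Id'])
# np.exp converts the predictions back to the price scale
# (this presumes SalePrice was log-transformed when the models were trained)
submission['SalePrice'] = np.exp(pred)
submission.to_csv(
    'submission.csv',
    index=False
)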
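
The call to np.exp suggests that the models were trained on the logarithm of SalePrice; that is an assumption here, since the training code is in the previous article. Under that assumption, the sketch below shows the round trip on made-up data and checks the layout of the resulting file. The two-row CSV text and the prices are hypothetical stand-ins for test.csv and real predictions.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for test.csv (only two rows)
test_csv = io.StringIO("Id,LotArea\n1461,11622\n1462,14267\n")
ids = pd.read_csv(test_csv, usecols=['Id'])['Id']

# Hypothetical prices; np.log mimics the assumed target transform
prices = np.array([150000.0, 200000.0])
log_pred = np.log(prices)

# np.exp undoes np.log, returning values on the price scale
demo = pd.DataFrame({'Id': ids, 'SalePrice': np.exp(log_pred)})

# Write to an in-memory buffer instead of a file to inspect the header
buffer = io.StringIO()
demo.to_csv(buffer, index=False)
header = buffer.getvalue().splitlines()[0]
print(header)
```

If the target had been transformed with np.log1p instead, np.expm1 would be the matching inverse.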

Submission result

[Image: screenshot of the submission result]
