Python hand play (Pandas / DataFrame beginning)

What is this article?

As I wrote in the previous article, I have started using Pandas. If CSV files serve as the persistent storage that survives a shutdown, a DataFrame makes a convenient in-memory store. So let's note down the DataFrame syntax that corresponds to SELECT and WHERE in SQL. Along the way, I will also give an example that uses PLS (partial least squares regression).
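As a rough illustration of the SELECT / WHERE correspondence, here is a minimal sketch. The small table is built in memory and only borrows the column names (y, x1, x2) from the CSV used later; the variable names `selected`, `filtered`, and `combined` are made up for this example.

```python
import pandas as pd

# Small in-memory table; column names follow the article's CSV (y, x1, x2).
df = pd.DataFrame({
    "y": [5, 11, 15, 16, 11],
    "x1": [1, 3, 3, 2, 1],
    "x2": [2, 4, 6, 7, 5],
})

# SELECT x1, x2 FROM df
selected = df[["x1", "x2"]]

# SELECT * FROM df WHERE x1 >= 3
filtered = df[df["x1"] >= 3]

# SELECT y FROM df WHERE x1 >= 3 AND x2 > 4
# Each condition needs its own parentheses because & binds tighter than >= and >.
combined = df.loc[(df["x1"] >= 3) & (df["x2"] > 4), ["y"]]

print(selected)
print(filtered)
print(combined)
```

Label-based selection like this is an alternative to the position-based `iloc` slicing used in the scripts below; which one reads better depends on whether the column names or the column positions are the stable part of the data.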

A word before the code ...

This is only a syntax check, so I had plain code with no twists in mind, but I added one small thing. I often use the following setup as an easy way to confirm that a prediction is working.

(formula) y = f(x1, x2) = x1 + x2 * 2

(data)

#7 rows x 3 columns
csvpath = 'SimplePrediction.csv'
# ,y,x1,x2
# 1,5,1,2
# 2,11,3,4
# 3,15,3,6
# 4,16,2,7
# 5,11,1,5
# 6,0,5,2
# 7,0,1,2

You might ask, "Is this really machine learning?" But tracing by hand how an NN or SVM arrived at its result for a given example is tedious, isn't it? With this data the check is easy, which makes it useful.
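To show how easy the check is, the sketch below recomputes the formula for every row and compares it with the y column. The CSV text is inlined so that nothing has to exist on disk, and `index_col=0` is my own choice for this sketch (the scripts below do not use it). Rows 6 and 7 have y = 0 in the data; I am assuming these are placeholders for values still to be predicted.

```python
import io

import pandas as pd

# The article's CSV, inlined so the sketch runs without a file on disk.
csv_text = """,y,x1,x2
1,5,1,2
2,11,3,4
3,15,3,6
4,16,2,7
5,11,1,5
6,0,5,2
7,0,1,2
"""

# index_col=0 uses the unnamed first column as the row label (an assumption of this sketch).
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# y = f(x1, x2) = x1 + x2 * 2
expected = df["x1"] + df["x2"] * 2
matches = df["y"] == expected

# Show the stored y, the recomputed value, and whether they agree, side by side.
print(df.assign(expected=expected, matches=matches))
```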

Code

This is the first script: an example of how to handle a DataFrame.

import pandas as pd

#7 rows x 3 columns
csvpath = 'SimplePrediction.csv'
# ,y,x1,x2
# 1,5,1,2
# 2,11,3,4
# 3,15,3,6
# 4,16,2,7
# 5,11,1,5
# 6,0,5,2
# 7,0,1,2


def main():
    df = pd.read_csv(csvpath)

    print('--Original shape--')
    print(df)

    print('--First 5 lines--')
    print(df[:5])

    print('--Last 2 lines--')
    print(df[-2:])

    # read_csv keeps the CSV's unnamed first column as column 0, so position 1 is y
    print('--2nd column only--')
    print(df.iloc[:, 1:2])

    print('--Last 2 columns--')
    print(df.iloc[:, -2:])

    print('--Save the first 5 rows and the last 2 columns--')
    print(df.iloc[:5, -2:])
    df.iloc[:5, -2:].to_csv('X.csv', index=False)

    print('--Save only the second column of the first 5 rows--')
    print(df.iloc[:5, 1:2])
    df.iloc[:5, 1:2].to_csv('y.csv', index=False)


if __name__ == '__main__':
    main()
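
One detail of the script above that is easy to miss: the slice `1:2` is used instead of the plain integer `1`. A minimal sketch of the difference, on a made-up three-row table, is below. A slice keeps the result a DataFrame, so `to_csv` writes it as a one-column table with a header, whereas an integer returns a Series.

```python
import pandas as pd

# Made-up table with the same column names as the article's data.
df = pd.DataFrame({"y": [5, 11, 15], "x1": [1, 3, 3], "x2": [2, 4, 6]})

as_frame = df.iloc[:, 0:1]   # slice of positions -> DataFrame with one column
as_series = df.iloc[:, 0]    # single integer position -> Series

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```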


This is the second script. It builds a model from the files saved above, saves the model to a file, then loads it back and makes a prediction.

import pandas as pd

#5 rows x 2 columns
Xpath = 'X.csv'
# x1,x2
# 1,2
# 3,4
# 3,6
# 2,7
# 1,5

#5 rows x 1 column
ypath = 'y.csv'
# y
# 5
# 11
# 15
# 16
# 11


def get_xy():
    X = pd.read_csv(Xpath)
    y = pd.read_csv(ypath)

    return X, y


def save_model():
    X, y = get_xy()

    #Modeling
    from sklearn.cross_decomposition import PLSRegression
    model = PLSRegression(n_components=2)
    model.fit(X, y)

    #Save
    # sklearn.externals.joblib was removed in scikit-learn 0.23; use the standalone joblib package
    import joblib
    joblib.dump(model, 'pls.pickle')


def use_model():

    X, y = get_xy()

    #Read
    import joblib  # needed here too: the import inside save_model() is local to that function
    pls = joblib.load('pls.pickle')

    y_pred = pls.predict(X)

    # The data were prepared with y = f(x1, x2) = x1 + x2 * 2, so confirm that the PLS predictions match exactly
    print(y_pred)


def main():
    # Uncomment on the first run to create pls.pickle
    # save_model()
    use_model()


if __name__ == '__main__':
    main()
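
As a cross-check that does not depend on scikit-learn, the sketch below fits the same five rows with ordinary least squares via `numpy.linalg.lstsq`. Note that this is a swapped-in technique, not the `PLSRegression` model used above. It then predicts the two remaining rows of the original CSV, (x1, x2) = (5, 2) and (1, 2). The helper `add_intercept` and the array names are hypothetical, introduced only for this illustration.

```python
import numpy as np

# Training rows: the first five rows of the article's data.
X_train = np.array([[1, 2], [3, 4], [3, 6], [2, 7], [1, 5]], dtype=float)
y_train = np.array([5, 11, 15, 16, 11], dtype=float)

# Rows 6 and 7 of the original CSV, whose y is stored as 0.
X_new = np.array([[5, 2], [1, 2]], dtype=float)


def add_intercept(X):
    """Append a column of ones so the fit includes a constant term."""
    return np.column_stack([X, np.ones(len(X))])


# Least-squares fit of y = a * x1 + b * x2 + c
coef, *_ = np.linalg.lstsq(add_intercept(X_train), y_train, rcond=None)
y_new = add_intercept(X_new) @ coef

print(coef)    # fitted a, b, c
print(y_new)   # predictions for the two held-out rows
```

Because the training data follow the formula exactly, the fitted coefficients should come out close to a = 1, b = 2, c = 0, up to floating-point error. Comparing `y_new` with what the loaded PLS model returns for the same two rows would be one way to confirm that both agree with x1 + x2 * 2.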


Impressions

Well, nothing in particular.
