[PYTHON] For Pandas users to practice SQL easily

Background

I am used to Pandas but get stuck when I have to write SQL at work ⇨ I want an environment where I can practice easily (local, Python) ⇨ pandasql

Remarks

Install package

pip install pandasql

Code example

Just use the DataFrame's variable name as the table name and write your SQL. This lets you run SQL against the same DataFrames you normally work with in Pandas.

import pandas as pd
from pandasql import sqldf, load_meat, load_births


# get data
df_meat = load_meat()
#df_births = load_births()


# check data (optional): set the condition to True to print a quick overview
if False:
    print(df_meat.shape)
    print(df_meat.head(2).T)
    print(df_meat.dtypes)
    print(df_meat.duplicated().sum())
    print(df_meat.isnull().sum())
    print(df_meat.nunique())
    desc = df_meat.describe().T
    print(desc[['min','25%','50%','75%','max']])
    print(desc[['mean','std']])


# SQL query 1: first 10 rows
sql = '''
    SELECT
        *
    FROM  
        df_meat
    LIMIT 
        10;
'''
# execute sql 1
res = sqldf(sql, locals())
res


# SQL query 2: average beef per other_chicken value, top 10
sql = '''
    SELECT
        other_chicken,
        avg(beef) as avg_beef
    FROM  
        df_meat
    GROUP BY
        other_chicken 
    ORDER BY
        avg_beef DESC
    LIMIT
        10
    ;
'''
# execute sql 2
res = sqldf(sql, locals())
res
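
When practicing, it can help to write the same query in Pandas and compare the two results. The following is a minimal sketch of a Pandas counterpart to the second query's shape (GROUP BY, average, ORDER BY ... DESC, LIMIT). The frame `df_toy`, its `category` column, and its values are made up for illustration and are not part of the pandasql sample data.

```python
import pandas as pd

# Hypothetical toy data standing in for df_meat (values are made up).
df_toy = pd.DataFrame({
    "category": ["a", "a", "b", "b", "c"],
    "beef": [10.0, 20.0, 30.0, 50.0, 5.0],
})

# SQL being mirrored:
#   SELECT category, avg(beef) AS avg_beef
#   FROM df_toy
#   GROUP BY category
#   ORDER BY avg_beef DESC
#   LIMIT 2;
res_pd = (
    df_toy.groupby("category", as_index=False)   # GROUP BY category
    .agg(avg_beef=("beef", "mean"))              # avg(beef) AS avg_beef
    .sort_values("avg_beef", ascending=False)    # ORDER BY avg_beef DESC
    .head(2)                                     # LIMIT 2
    .reset_index(drop=True)
)
print(res_pd)
```

Passing the SQL in the comment to `sqldf` with `df_toy` in scope should give the same rows for data like this. Results can differ once the data contains missing values or ties in the sort key, since SQL and Pandas do not treat those identically.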

Reference link

pandasql on PyPI