I usually study only machine learning by myself, but I often see people who need knowledge of sql on twitter etc. So, by learning little by little in parallel, I would like to make a function from the point of extracting data first.
In this article we will:
I'm sorry if you already have it. SQL is a complete beginner so please take a look (-_-;)
import sqlite3
import pandas as pd
import pandas.io.sql as sql
Create a trial DB using pandas. I would like to create a dataframe as follows. The value is appropriate.
city = ["tokyo","kyoto","oosaka"]
num = [30,20,28]
df_1 = pd.DataFrame({"city":city,"number_of_city":num})
path = "tamesi.db"
conn = sqlite3.connect(path)
cur = conn.cursor()
sql.to_sql(df_1,"study",conn,if_exists="replace",index=None)
cur.close()
con.close()
This time, I created it in a database called tamesi with a table called study.
I made.
def ret_column(db,table,column=""):
assert bool(column), "no column is spcified"
conn= sqlite3.connect(db)
cur = conn.cursor()
cur.execute("select {} from {}".format(column,table))
val = cur.fetchall()
cur.close()
conn.close()
return val
Put what you want to take out in the column and take it out. I tried to return an error if there is nothing in the column. All I'm doing is simply calling the sql statement with db and table specified.
Let's create a function that retrieves all the data. I made.
def ret_all(db,table):
conn = sqlite3.connect(db)
cur = conn.cursor()
cur.execute("select * from {}".format(table))
val = cur.fetchall()
cur.close()
conn.close()
return val
This was easier because there was no column designation.
path = "tamesi.db"
table="study"
val_1 = ret_column(path,table,column="city")
val_2 = ret_all(path,table)
print(val_1)
print(val_2)
#Output below
[('tokyo',), ('kyoto',), ('oosaka',)]
[('tokyo', 30), ('kyoto', 20), ('oosaka', 28)]
I was able to confirm that it was taken out properly.
I tried to implement it easily with class.
class ret_from_db():
def __init__(self,db):
self.db = db
def ret_column(self,table,column=""):
assert bool(column), "no column is spcified"
conn= sqlite3.connect(self.db)
cur = conn.cursor()
cur.execute("select {} from {}".format(column,table))
val = cur.fetchall()
cur.close()
conn.close()
return val
def ret_all(self,table):
conn = sqlite3.connect(self.db)
cur = conn.cursor()
cur.execute("select * from {}".format(table))
val = cur.fetchall()
cur.close()
conn.close()
return val
I just stuck the top two together, nothing special.
get_db = ret_from_db(db="tamesi.db")
val_1 = get_db.ret_column(table="study",column="city")
val_2 = get_db.ret_all(table="study")
print(val_1)
print(val_2)
#Output below
[('tokyo',), ('kyoto',), ('oosaka',)]
[('tokyo', 30), ('kyoto', 20), ('oosaka', 28)]
I was able to do it properly.
I don't know if this is practical because I haven't used it (-_-;) Next time, if it can be read by pandas, it will be easier to use for sklean etc., so I would like to make it.
Recommended Posts