[PYTHON] DB table insertion process using sqlalchemy

ex1) Insert multiple records into an arbitrary table (1)

Reluctant at the beginning, but foolish? The record was inserted into the table one row at a time by repeating the process of setting the value of each column in myobject and session.add (myobject) in a for loop.

sql_insert_old.py



Base = sqlalchemy.ext.declarative.declarative_base()

class Hogega_tmp(Base):
    __tablename__ = 'hogega_tmps'
    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    y = sqlalchemy.Column(sqlalchemy.String(10))
    m = sqlalchemy.Column(sqlalchemy.String(10))
    d = sqlalchemy.Column(sqlalchemy.String(10))
    dim4 = sqlalchemy.Column(sqlalchemy.String(100))
    pv = sqlalchemy.Column(sqlalchemy.String(50))
    rsf = sqlalchemy.Column(sqlalchemy.String(10))
    fld = sqlalchemy.Column(sqlalchemy.String(10))

def orm(data,fld):
    
    #SQLAlchemy initialization
    CONNECT_INFO = 'mssql+pyodbc://hogehoge'
    engine = sqlalchemy.create_engine(CONNECT_INFO, encoding='utf-8')

    #Create session
    Session = sqlalchemy.orm.sessionmaker(bind=engine)
    session = Session()

    #With or without random sampling function
    if data["containsSampledData"]:
       rsf=1
    else:
       rsf=0
    cnt=data["totalResults"]
    print(ref)

    #Insert records into the table one by one by loop processing
    if cnt!=0: 
       for i,n in enumerate(data["rows"]): 
           print(i)
           if fld.find("source")!= -1: 
               myobject = Hogega_tmp(y=n[0],m=n[1],d=n[2],dim4=n[3],pv=n[4]
                                     rsf=rsf,fld=fld)
       
           session.add(myobject) #Insert row
           if fld.find("source")!= -1 and i>=24: #Limited number of insertions
                  break
       session.commit() #commit
    session.close() #Session close

Note) In the above, for example, it is assumed that the result of executing the API of Google Analytics is stored in SQL Server.

ex2) Insert multiple records into an arbitrary table (Part 2)

It is not cool to repeat line insertion one by one, or even with SQL, multiple data can be inserted together with one Insert statement, so the following script searched for something that could not be realized with Sqlalchemy.

sql_insert_new.py



def orm2(data,fld):
    
    #SQLAlchemy initialization
    CONNECT_INFO = 'mssql+pyodbc://hogehoge'
    engine = sqlalchemy.create_engine(CONNECT_INFO, encoding='utf-8')

    #Create session
    Session = sqlalchemy.orm.sessionmaker(bind=engine)
    session = Session()

    #With or without random sampling function
    if data["containsSampledData"]:
       rsf=1
    else:
       rsf=0


    clm =["y","m","d","dim4","pv","rsf","fld"]
    ex=[rsf,fld]
    cnt=data["totalResults"]

    if cnt!=0: 
       rws = data["rows"]   
       if fld.find("source")!= -1: 
          rws = rws[:25] #Limited number of insertions

       #Insert row
       lst = [dict(zip(clm,r+ex)) for r in rws ] #value clause generation
       ins_statement = Hogega_tmp.__table__.insert().values(lst)
       session.execute(ins_statement)   
       session.commit()
       session.close()

Supplement

If you write the following "list containing multiple dictionaries" in the argument (lst) part of the Insert statement, you can batch multiple data without repeating the insertion process one by one in the for loop. It seems that you can insert a table row with. (That is, it corresponds to the Values clause of "Insert Into ... Values ..." in SQL)

sql_insert_apdx1.py


[{y:2016,m:5,d:1,dim4:yahoo,pv:100},
 {y:2016,m:5,d:1,dim4:google,pv:200},
    ...
 {y:2016,m:5,d:1,dim4:smartnews,pv:300}]

In the above sql_insert_new.py, lst in list comprehension becomes the following image when the meaning of processing is broken down.

sql_insert_apdx2.py


lst = list()
for r in rws:
    vle = r+ex
    y = dict(zip(clm,vle)
    lst.append(y)

(Vle corresponds to the value stored in the table, clm corresponds to the table column name, and those two sequentials are listed by ZIP () and made into dictionary type y, and the process of sequentially adding to the list lst is repeated in a for loop. Meaning)

Recommended Posts

DB table insertion process using sqlalchemy
DB settings when using Django + SQLAlchemy + Alembic
How to read dynamically generated table definitions using SQLAlchemy
Regression using Gaussian process
Table definition in SQLAlchemy
sqlalchemy table definition tips
Create a dictionary by searching the table using sqlalchemy
Generate SQLAlchemy table definition from existing MySQL server using sqlacodegen
Gaussian process regression using GPy
Get table dynamically with sqlalchemy
SQLAlchemy + DB migration by Alembic
Try using SQLAlchemy + MySQL (Part 1)
Try using SQLAlchemy + MySQL (Part 2)
Process on GPU using chainer.cuda.elementwise