Python hand play (CSV is applied with Pandas DataFrame, but only full-column Insert from CSV to DB?)

What is this article?

It is said that CSV is applied with Pandas DataFrame, but only full column Insert is applied from CSV to DB? So, I made a function to update (UPDATE) DataFrame to DB.

For the first time before, I started using Pandas DataFrame seriously. Then, I knew that to_csv could suddenly convert the value of DB to CSV, and said, "Oh, awesome!". .. ..

Then, what is the opposite? For example, what if you want to export from DB to CSV, and some external tool edits the CSV value and reflects it again?

Well, if you think about it, what do you think about Primary Key? With that in mind, the common specifications that are made into a library by OSS do not support it. Well, maybe it's a story of honestly applying SQL.

So I made a niche process.

In other words, what did you make?

Process to update by giving Pandas DataFrame, primary key, and columns to be updated.

A list type that has one character string type as the primary key and multiple character strings as the update target column group. It feels like creating and committing SQL that specifies the big win record with the character string specified as the primary key and makes the other columns the target column of SET of UPDATE.

Well ... that's it. Well, it's a defeated version, so there are various restrictions.

Restrictions

・ Primary key consists of one column ・ Various columns are fixed (primary key is character, others are float) ・ DB is PostgreSQL ・ Other ...

Completion



def get_connection_string():
    hostname = localhost
    port = '5432'
    database = 'db01'
    username = 'user01'
    password = 'manager'
    return f'postgresql://{username}:{password}@{hostname}:{port}/{database}'


def get_cur():
    conn = psycopg2.connect(get_connection_string())
    cur = conn.cursor()
    return conn, cur
    
#update
# I :Data to be updated
#Primary key column
#Columns to be updated
def update_data(df, column_key, columns):
    import pandas as pd
    conn, cur = get_cur()

    for i in range(len(df)):
        paras = list()
        col_sqls = list()
        for column in columns:
            paras.append(float(df.at[i, column]))
            col_sqls.append(f' {column} = %s ')
        sql = f'''
            UPDATE
                    RDKIT
            SET
                    {','.join(col_sqls)}
            WHERE
                    SampleName = '{df.at[i, 'samplename']}'
        '''
        print(sql)
        print(paras)
        cur.execute(sql, paras)

    conn.commit()


#Write from CSV to DB
def test05():
    import pandas as pd
    df = pd.read_csv('abc.csv')
    # df = pd.read_csv('edited.csv')

    column_key = 'samplename'
    columns = ['maxestateindex', 'maxabsestateindex', 'heavyatommolwt', 'exactmolwt', 'qed', 'molwt']
    update_data(df, column_key, columns)


#Export from DB to CSV
def test06():
    import pandas as pd

    sql = f'''
        SELECT
                SampleName
              , MaxEStateIndex, MaxAbsEStateIndex, HeavyAtomMolWt, ExactMolWt, qed, MolWt
        FROM
                RDKIT
    '''
    conn, cur = get_cur()
    df = pd.read_sql(sql=sql, con=conn)
    df.to_csv('exported.csv')


def main():
    test05()
    # test06()


if __name__ == '__main__':
    main()

Impressions

Hmm ... not good enough. There is a feeling of defeat. Let's think a little more at a later date. .. ..

Recommended Posts

Python hand play (CSV is applied with Pandas DataFrame, but only full-column Insert from CSV to DB?)
Python hand play (Pandas / DataFrame beginning)
Python hand play (RDKit descriptor calculation: SDF to CSV using Pandas)
Read CSV file with Python and convert it to DataFrame as it is
[python] Create table from pandas DataFrame to postgres
Generate an insert statement from CSV with Python.
How to convert JSON file to CSV file with Python Pandas
[Python] A memo to write CSV vertically with Pandas
Python hand play (get column names from CSV file)
Convert from Pandas DataFrame to System.Data.DataTable using Python for .NET
Read csv with python pandas
[Python] Write to csv file with Python
Create folders from '01' to '12' with python
Output to csv file with Python
Convert list to DataFrame with python
Bulk Insert Pandas DataFrame with psycopg2
[Data science basics] I tried saving from csv to mysql with python
[Python / Pandas] A bug occurs when trying to replace a DataFrame with `None` with` replace`
[Python] How to convert db file to csv
[Python] What is pandas Series and DataFrame?
Python to remember only with hello, worlds
Convert from PDF to CSV with pdfplumber
Fractal to make and play with Python
I want to play with aws with python
[Python] Add total rows to Pandas DataFrame
[Part1] Scraping with Python → Organize to csv!
Python> Output numbers from 1 to 100, 501 to 600> For csv
Load csv with pandas and play with Index
How to batch insert from CSV file to Tableau hyper file with PostgreSQL-like COPY command