[PYTHON] Extract the Azure SQL Server data table with pyodbc and try to make it numpy array / pandas dataframe

From the place where you can pull out the data table in Azure SQL Server with the python library called pyodbc, you can convert the table to numpy's ʻarray format or pandas's DataFrame` format. This is a memo.

Definition of library, environment and function is written here. ,, So please refer to that first.

Preparation for array / DataFrame format

## libraries
import pyodbc
import numpy as np
import pandas as pd

## initial setting
##Please change the set value as appropriate
server = 'abc_server.database.windows.net'  
database = 'abc_database'  
username = 'abc_user'  
password = 'abc_password'  

##Function definition
###Define DB connection
def db_connection(sv=server, db=database, un=username, pw=password):    
    cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER='+sv+';DATABASE='+db+';UID='+un+';PWD='+ pw)
    return cnxn.cursor()

###Issue SQL
def query_output(sql):
    cursor.execute(sql)
    row = cursor.fetchone()
    while row:  
        print row[0]  
        row = cursor.fetchone()

→ For the meaning of the settings, see Trial Premise Etc.

Case 2: Push the contents of the table into the pandas data frame

###DB connection
cursor = db_connection()

###SQL settings
sql = 'select * from SalesLT.Customer;'

cursor.execute(sql)
rows = cursor.fetchall()

Check the contents of rows around here:

>>> rows

[(1, False, u'Mr.', u'Orlando', u'N.', u'Gee', None, u'A Bike Store', u'adventure-works\\pamela0', u'[email protected]', u'245-555-0173', u'L/Rlwxzp4w7RWmEgXX+/A7cXaePEPcp+KwQhl2fJL7w=', u'1KjXYs4=', u'3F5AE95E-B87D-4AED-95B4-C3797AFCB74F', datetime.datetime(2005, 8, 1, 0, 0)),
 (2, False, u'Mr.', u'Keith', None, u'Harris', None, u'Progressive Sports', u'adventure-works\\david8', u'[email protected]', u'170-555-0127', u'YPdtRdvqeAhj6wyxEsFdshBDNXxkCXn+CRgbvJItknw=', u'fs1ZGhY=', u'E552F657-A9AF-4A7D-A645-C429D6E02491', datetime.datetime(2006, 8, 1, 0, 0)),
 (3, False, u'Ms.', u'Donna', u'F.', u'Carreras', None, u'Advanced Bike Components', u'adventure-works\\jillian0', u'[email protected]', u'279-555-0130', u'LNoK27abGQo48gGue3EBV/UrlYSToV0/s87dCRV7uJk=', u'YTNH5Rw=', u'130774B1-DB21-4EF3-98C8-C104BCD6ED6D', datetime.datetime(2005, 9, 1, 0, 0)),
 :

It's like an array inside an array (likely, in the exact format pyodbc.Row), so I will try plunging into numpy's array

array_format = np.array(rows)
dataframe_format = pd.DataFrame(array_format)

Then the numpy ʻarray` format is ...


>>>array_format
array([[1, False, u'Mr.', ..., u'1KjXYs4=',
        u'3F5AE95E-B87D-4AED-95B4-C3797AFCB74F',
        datetime.datetime(2005, 8, 1, 0, 0)],
       [2, False, u'Mr.', ..., u'fs1ZGhY=',
        u'E552F657-A9AF-4A7D-A645-C429D6E02491',
        datetime.datetime(2006, 8, 1, 0, 0)],
       [3, False, u'Ms.', ..., u'YTNH5Rw=',
        u'130774B1-DB21-4EF3-98C8-C104BCD6ED6D',
        datetime.datetime(2005, 9, 1, 0, 0)],
       ..., 
       [30116, False, u'Ms.', ..., u'HDCU1Bk=',
        u'EC409609-D25D-41B8-9D15-A1AA6E89FC77',
        datetime.datetime(2007, 7, 1, 0, 0)],
       [30117, False, u'Mr.', ..., u'iES3IZA=',
        u'6F08E2FB-1CD3-4F6E-A2E6-385669598B19',
        datetime.datetime(2005, 8, 1, 0, 0)],
       [30118, False, u'Ms.', ..., u'QhHP+y8=',
        u'2495B4EB-FE8B-459E-A1B6-DBA25C04E626',
        datetime.datetime(2006, 9, 1, 0, 0)]], dtype=object)
>>> array_format.shape
(847L, 15L)

Also, the DataFrame of pandas


>>>dataframe_format.head()
  0      1    2        3     4           5     6                           7   \
0  1  False  Mr.  Orlando    N.         Gee  None                A Bike Store   
1  2  False  Mr.    Keith  None      Harris  None          Progressive Sports   
2  3  False  Ms.    Donna    F.    Carreras  None    Advanced Bike Components   
3  4  False  Ms.    Janet    M.       Gates  None       Modular Cycle Systems   
4  5  False  Mr.     Lucy  None  Harrington  None  Metropolitan Sports Supply   

                         8                             9             10  \
0   adventure-works\pamela0  [email protected]  245-555-0173   
1    adventure-works\david8    [email protected]  170-555-0127   
2  adventure-works\jillian0    [email protected]  279-555-0130   
3  adventure-works\jillian0    [email protected]  710-555-0173   
4      adventure-works\shu0     [email protected]  828-555-0186   

                                             11        12  \
0  L/Rlwxzp4w7RWmEgXX+/A7cXaePEPcp+KwQhl2fJL7w=  1KjXYs4=   
1  YPdtRdvqeAhj6wyxEsFdshBDNXxkCXn+CRgbvJItknw=  fs1ZGhY=   
2  LNoK27abGQo48gGue3EBV/UrlYSToV0/s87dCRV7uJk=  YTNH5Rw=   
3  ElzTpSNbUW1Ut+L5cWlfR7MF6nBZia8WpmGaQPjLOJA=  nm7D5e4=   
4  KJqV15wsX3PG8TS5GSddp6LFFVdd3CoRftZM/tP0+R4=  cNFKU4w=   

                                     13                   14  
0  3F5AE95E-B87D-4AED-95B4-C3797AFCB74F  2005-08-01 00:00:00  
1  E552F657-A9AF-4A7D-A645-C429D6E02491  2006-08-01 00:00:00  
2  130774B1-DB21-4EF3-98C8-C104BCD6ED6D  2005-09-01 00:00:00  
3  FF862851-1DAA-4044-BE7C-3E85583C054D  2006-07-01 00:00:00  
4  83905BDC-6F5E-4F71-B162-C98DA069F38A  2006-09-01 00:00:00  

>>> dataframe_format.shape
(847, 15)

It looks like it's okay.

Reference -Try to extract Azure SQL Server data table with pyodbc

Recommended Posts

Extract the Azure SQL Server data table with pyodbc and try to make it numpy array / pandas dataframe
Try to extract Azure SQL Server data table with pyodbc
Try to extract the features of the sensor data with CNN
[Personal memo] Get data on the Web and make it a DataFrame
Try converting to tidy data with pandas
(Diary 1) How to create, reference, and register data in the SQL database of Microsoft Azure service with python
Try to aggregate doujin music data with pandas
Try to extract Azure document DB document with pydocumentdb
Try to create a battle record table with matplotlib from the data of "Schedule-kun"
I just wanted to extract the data of the desired date and time with Django
It is easy to execute SQL with Python and output the result in Excel
How to extract null values and non-null values with pandas
How to extract non-missing value nan data with pandas
Extract the maximum value with pandas and change that value
How to extract non-missing value nan data with pandas
Return the image data with Flask of Python and draw it to the canvas element of HTML
Try to make it using GUI and PyQt in Python
[pyqtgraph] Add region to the graph and link it with the graph region
Access the host SQL Server with python27 / pyodbc on the container
Put the lists together in pandas to make a DataFrame
Build a Python environment and transfer data to the server
Extract array elements and indexes in descending order with numpy
Tokyo Corona: Try to make a simple prediction from open data with the exponential function curve_fit
[Python] How to add rows and columns to a table (pandas DataFrame)
Try to solve the shortest path with Python + NetworkX + social data
Read the data of the NFC reader connected to Raspberry Pi 3 with Python and send it to openFrameworks with OSC
Upload data to s3 of aws with a command and update it, and delete the used data (on the way)