[PYTHON] Display and analyze only some columns in CASTable

SAS Viya is an AI platform that can be used from languages such as Python, Java, and R. Tables in SAS Viya are represented by an object called CASTable (CAS stands for Cloud Analytic Services). This article shows how to select only some of a CASTable's columns and display information about them.

Get a table from the database

First, connect to SAS Viya.

import swat
conn = swat.CAS('server-name.mycompany.com', 5570, 'username', 'password')
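
The connection line above hard-codes the host and credentials. As a minimal sketch, the same four arguments could instead be read from environment variables. The variable names CAS_HOST, CAS_PORT, CAS_USER and CAS_PASSWORD and the helper cas_connection_args are hypothetical, not part of swat.

```python
import os


def cas_connection_args(env=os.environ):
    """Collect the four positional arguments passed to swat.CAS in this article.

    The environment variable names are hypothetical; use whatever your
    deployment defines.
    """
    host = env["CAS_HOST"]
    port = int(env.get("CAS_PORT", "5570"))  # 5570 is the port used in this article
    user = env["CAS_USER"]
    password = env["CAS_PASSWORD"]
    return host, port, user, password


# Usage, assuming the swat package is installed and a CAS server is reachable:
#   import swat
#   conn = swat.CAS(*cas_connection_args())
```

This keeps the password out of the script itself, which matters if the script is shared or committed.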

Then get the CASTable. This article uses a CSV file of the iris data set.

tbl = conn.loadtable('data/iris.csv', caslib='casuser').casTable

Get a column

You can retrieve a single column by indexing tbl with the column name, just as you would look up a key in a dict.

col = tbl['sepal_width']
col

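The output looks like this. The trailing sort_values(...) presumably reflects a sort order already applied to the table in this session (sepal_length descending, then sepal_width ascending); it is not produced by selecting the column.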

CASColumn('DATA.IRIS', caslib='CASUSER(username)')['sepal_width'].sort_values(['sepal_length', 'sepal_width'], ascending=[False, True])

View data

Calling the head method on a column outputs only that column's values.

col.head()
0    3.8
1    2.6
2    2.8
3    3.0
4    3.8
Name: sepal_width, dtype: float64

Get multiple columns

Similarly, if you pass a list of column names as the key, you get multiple columns.

widths = tbl[['sepal_width', 'petal_width', 'species']]

The first rows of widths (for example, from widths.head()) look like this.

   sepal_width  petal_width    species
0          3.8          2.0  virginica
1          2.6          2.3  virginica
2          2.8          2.0  virginica
3          3.0          2.3  virginica
4          3.8          2.2  virginica
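
When the list of names is built by hand, a misspelled column is easy to miss. The following is a minimal sketch of a guard that checks the requested names before indexing. It assumes the table exposes a pandas-style columns attribute; pick_columns is a hypothetical helper, and FakeTable is a hypothetical stand-in so the sketch runs without a CAS server.

```python
def pick_columns(table, wanted):
    """Return table[wanted] after checking every name against table.columns."""
    available = list(table.columns)
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"columns not in table: {missing}; available: {available}")
    return table[list(wanted)]


class FakeTable:
    """Hypothetical stand-in for a CASTable; it only tracks column names."""

    def __init__(self, columns):
        self.columns = list(columns)

    def __getitem__(self, names):
        # Selecting a list of names yields a narrower table, as in the article.
        return FakeTable(names)


iris = FakeTable(["sepal_length", "sepal_width", "petal_length", "petal_width", "species"])
widths = pick_columns(iris, ["sepal_width", "petal_width", "species"])
print(widths.columns)  # ['sepal_width', 'petal_width', 'species']
```

With a real connection, the same helper would be called as pick_columns(tbl, [...]), provided the assumption about the columns attribute holds.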

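You can also get summary statistics for the selected columns. Only the numeric columns appear in the result; species, a string column, is left out.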

widths.describe()
       sepal_width  petal_width
count   150.000000   150.000000
mean      3.054000     1.198667
std       0.433594     0.763161
min       2.000000     0.100000
25%       2.800000     0.300000
50%       3.000000     1.300000
75%       3.300000     1.800000
max       4.400000     2.500000
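
To make the rows of this summary concrete, here is a small sketch that recomputes count, mean, std, min and max in plain Python for the five sepal_width values shown by col.head() earlier. It assumes std follows the pandas convention of a sample standard deviation (dividing by n - 1). The numbers differ from the table above because only five of the 150 rows are used.

```python
import statistics

# The five sepal_width values shown by col.head() earlier in this article.
sample = [3.8, 2.6, 2.8, 3.0, 3.8]

summary = {
    "count": len(sample),
    "mean": statistics.fmean(sample),
    "std": statistics.stdev(sample),  # sample standard deviation (divides by n - 1)
    "min": min(sample),
    "max": max(sample),
}

# Print one statistic per line, in the same six-decimal style as describe().
for name, value in summary.items():
    print(f"{name:<5} {value:.6f}")
```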

Column information for the selected columns can be displayed in the same way, with the columninfo method.

widths.columninfo()
        Column  ID     Type  RawLength  FormattedLength  NFL  NFD
0  sepal_width   2   double          8               12    0    0
1  petal_width   4   double          8               12    0    0
2      species   5  varchar         10               10    0    0
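
The Type column explains why the summary statistics covered only two of the three columns: species is varchar, not numeric. As a rough sketch, the name and type pairs above can be split into numeric and non-numeric columns. The list of dicts is copied by hand from the output above, and the set of type names treated as numeric is an assumption to adjust for your data.

```python
# Column name and type pairs copied by hand from the columninfo() output above.
column_info = [
    {"Column": "sepal_width", "Type": "double"},
    {"Column": "petal_width", "Type": "double"},
    {"Column": "species", "Type": "varchar"},
]

# Assumption: the type names to treat as numeric; extend as needed.
NUMERIC_TYPES = {"double", "int32", "int64"}


def split_by_type(info):
    """Return (numeric_names, other_names), preserving the original order."""
    numeric = [row["Column"] for row in info if row["Type"] in NUMERIC_TYPES]
    other = [row["Column"] for row in info if row["Type"] not in NUMERIC_TYPES]
    return numeric, other


numeric, other = split_by_type(column_info)
print(numeric)  # ['sepal_width', 'petal_width']
print(other)    # ['species']
```

The numeric list could then be used as the key for a further column selection before running statistics.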

Summary

By selecting only some columns, you can narrow the analysis to the ones you need, even when the table has many columns. This is useful when there is so much numerical data that you do not know where to start.

