SAS Viya is an AI platform that can be used from languages such as Python, Java, and R. It works with a table object called CASTable (CAS stands for Cloud Analytic Services). In this post, I will get the column information of a CASTable in several ways.
First, connect to SAS Viya.
import swat
conn = swat.CAS('server-name.mycompany.com', 5570, 'username', 'password')
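Hard-coding a password in a script makes it easy to leak. As a minimal sketch, the connection arguments could be read from environment variables instead. The variable names `CAS_DEMO_HOST`, `CAS_DEMO_PORT`, `CAS_DEMO_USER`, and `CAS_DEMO_PASSWORD` and the helper `cas_args_from_env` are hypothetical, chosen only for this illustration; they are not something SWAT defines.

```python
import os


def cas_args_from_env(env=None):
    """Build (host, port, user, password) for swat.CAS.

    The environment variable names are hypothetical, for illustration only.
    """
    env = os.environ if env is None else env
    host = env.get('CAS_DEMO_HOST', 'server-name.mycompany.com')
    port = int(env.get('CAS_DEMO_PORT', '5570'))
    user = env['CAS_DEMO_USER']          # raises KeyError if unset
    password = env['CAS_DEMO_PASSWORD']  # raises KeyError if unset
    return host, port, user, password


# Demonstrated with a plain dict standing in for the real environment.
example_env = {'CAS_DEMO_USER': 'username', 'CAS_DEMO_PASSWORD': 'password'}
args = cas_args_from_env(example_env)
print(args[:3])  # the password is deliberately not printed

# With a reachable server, the tuple can be unpacked into the call used above:
# conn = swat.CAS(*args)
```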
Then get a CASTable. Here I load the Iris data from a CSV file.
tbl = conn.loadtable('data/iris.csv', caslib='casuser').casTable
To get only the column names, iterate over the table with `for ... in`.
for col in tbl:
    print(col)
The output is as follows.
sepal_length
sepal_width
petal_length
petal_width
species
If you want the index in addition to the column name, use the `enumerate` function.
for i, col in enumerate(tbl):
    print(i, col)
The output is as follows.
0 sepal_length
1 sepal_width
2 petal_length
3 petal_width
4 species
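`enumerate` also accepts a `start` argument, and the same pairing can be turned into a name-to-position lookup. The following is a small sketch that uses a plain list of the column names shown above as a stand-in for the table, an assumption made so that it runs without a CAS server.

```python
# Column names copied from the output above; a stand-in for iterating over tbl.
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']

# Number the columns from 1 instead of 0.
for i, col in enumerate(columns, start=1):
    print(i, col)

# Map each column name to its zero-based position.
position = {col: i for i, col in enumerate(columns)}
print(position['petal_width'])  # 3
```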
Use the `zip` function to get the type in addition to the column name.
for col, dtype in zip(tbl, tbl.dtypes):
    print(col, dtype)
The output is as follows.
sepal_length double
sepal_width double
petal_length double
petal_width double
species varchar
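The same pairing can be kept as a dictionary, which makes it easy to pick out columns of one type. This sketch uses plain lists copied from the output above in place of `tbl` and `tbl.dtypes` (an assumption so that it runs locally); with a live connection, the two lists could presumably be replaced by those objects.

```python
# Names and types copied from the output above; stand-ins for tbl and tbl.dtypes.
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
dtypes = ['double', 'double', 'double', 'double', 'varchar']

# Pair each column name with its type.
col_types = dict(zip(columns, dtypes))

# Keep only the numeric (double) columns, preserving their order.
numeric_cols = [col for col, dtype in col_types.items() if dtype == 'double']
print(numeric_cols)
```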
Use the `iteritems` method to get each column as a `CASColumn`, from which you can get more detailed information.
for col, obj in tbl.iteritems():
    print(col, obj)
    print('')
The output is as follows.
sepal_length CASColumn('DATA.IRIS', caslib='CASUSER(username)')['sepal_length'].sort_values(['sepal_length', 'sepal_width'], ascending=[False, True])
sepal_width CASColumn('DATA.IRIS', caslib='CASUSER(username)')['sepal_width'].sort_values(['sepal_length', 'sepal_width'], ascending=[False, True])
petal_length CASColumn('DATA.IRIS', caslib='CASUSER(username)')['petal_length'].sort_values(['sepal_length', 'sepal_width'], ascending=[False, True])
petal_width CASColumn('DATA.IRIS', caslib='CASUSER(username)')['petal_width'].sort_values(['sepal_length', 'sepal_width'], ascending=[False, True])
species CASColumn('DATA.IRIS', caslib='CASUSER(username)')['species'].sort_values(['sepal_length', 'sepal_width'], ascending=[False, True])
Next is how to get the data row by row. The first way is the `iterrows` method.
for row in tbl.iterrows():
    print(row)
The output is as follows. There are 150 rows in total.
(0, sepal_length 7.9
sepal_width 3.8
petal_length 6.4
petal_width 2
species virginica
Name: 0, dtype: object)
(1, sepal_length 7.7
sepal_width 2.6
petal_length 6.9
petal_width 2.3
species virginica
:
Name: 148, dtype: object)
(149, sepal_length 4.3
sepal_width 3
petal_length 1.1
petal_width 0.1
species setosa
Name: 149, dtype: object)
The second way is the `itertuples` method.
for row in tbl.itertuples():
    print(row)
The result is as follows; only the values are returned.
(0, 7.9000000000000004, 3.7999999999999998, 6.4000000000000004, 2.0, 'virginica')
(1, 7.7000000000000002, 2.6000000000000001, 6.9000000000000004, 2.2999999999999998, 'virginica')
:
(148, 4.4000000000000004, 3.2000000000000002, 1.3, 0.20000000000000001, 'setosa')
(149, 4.2999999999999998, 3.0, 1.1000000000000001, 0.10000000000000001, 'setosa')
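Because the tuples carry no column names, one option is to wrap them in a `namedtuple` so that values can be read by name. This is a minimal sketch using two rows copied from the output above as plain tuples; the `Row` type is a hypothetical helper, not part of SWAT. It also shows that the long decimals appear to be the same doubles printed with 17 significant digits.

```python
from collections import namedtuple

# Column names from the table, preceded by the row index.
columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
Row = namedtuple('Row', ['Index'] + columns)  # hypothetical helper type

# Two rows copied from the output above (written in their short form).
raw_rows = [
    (0, 7.9, 3.8, 6.4, 2.0, 'virginica'),
    (149, 4.3, 3.0, 1.1, 0.1, 'setosa'),
]

# Attach names to each value tuple.
rows = [Row._make(raw) for raw in raw_rows]
for row in rows:
    print(row.Index, row.species, row.sepal_length)

# Formatting with 17 significant digits reproduces the long form shown above.
long_form = format(rows[0].sepal_length, '.17g')
print(long_form)  # 7.9000000000000004
```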
There are various ways to get column and row information from a CASTable. Choose the one that fits your needs.