The Field Reorder node changes the order of columns in SPSS Modeler. Let's rewrite this with Python pandas.
This is done using the following time-series sensor data.
■ COND2n.csv
Time: Uptime
Power: Power
Temperature: Temperature
Pressure: Pressure
Uptime: Uptime
Status: Status code
Outcome: Error code
The column order of "COND2n.csv" is changed
from: Time,Power,Temperature,Pressure,Uptime,Status,Outcome
to: Time,Uptime,Power,Temperature,Pressure,Outcome,Status
In the Field Reorder node, select Time and Uptime and move them to the top two positions. Then move Outcome and Status to the bottom two positions.
The key point is the item "----- [Other fields] ------": all other fields are placed there automatically, keeping the column order of the original table.
This is often used when you want to handle data of the same type, such as Power, Temperature, and Pressure, as a set. This time there are only 3 such columns, but when there are many, it is very convenient because you do not have to specify every column one by one. Also, if a new sensor column is added, the node does not need to be modified.
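The "[Other fields]" behavior described above can be imitated in pandas with a small helper. This is only a minimal sketch, not how Modeler implements it: the function name `reorder_with_others` and the one-row sample frame are hypothetical and made up for illustration. Only the column names come from COND2n.csv.

```python
import pandas as pd


def reorder_with_others(df, first, last):
    """Put `first` columns at the front and `last` columns at the end.

    All remaining columns stay in between, in their original order
    (hypothetical helper imitating the "[Other fields]" item).
    """
    fixed = set(first) | set(last)
    others = [c for c in df.columns if c not in fixed]
    return df[list(first) + others + list(last)]


# Made-up one-row sample; only the column names match COND2n.csv
sample = pd.DataFrame(
    [[1, 100, 20, 5, 10, 0, 0]],
    columns=["Time", "Power", "Temperature", "Pressure",
             "Uptime", "Status", "Outcome"],
)

reordered = reorder_with_others(sample, ["Time", "Uptime"], ["Outcome", "Status"])
print(reordered.columns.tolist())
```

With a helper like this, a newly added sensor column would land in the middle group without any change to the call.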
In pandas, there are several possible ways to do this. The first, simplest way is to give a list of all columns in the new order.
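import pandas as pd
# Assumption: df1 is loaded from the sample data linked at the end of this article
df1=pd.read_csv('https://raw.githubusercontent.com/hkwd/200611Modeler2Python/master/data/COND2n.csv')
#Swap columns 1: List all columns
df1_1=df1[['Time','Uptime','Power','Temperature','Pressure','Outcome','Status']]
df1_1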
However, this method becomes tedious when there are many columns, and it is less readable. Let's consider how to treat some columns as a set, as Modeler does.
The second method is to divide the dataframe into three parts. The key point is that df1.loc[:,'Power':'Pressure'] creates a dataframe with the 3 columns 'Power', 'Temperature', and 'Pressure'.
#Swap columns 2: Make 3 DFs and connect them
df1_2=pd.concat([df1[['Time','Uptime']],
df1.loc[:,'Power':'Pressure'],
df1[['Outcome','Status']]], axis=1, join='inner')
df1_2
However, this method creates several intermediate dataframes, so I think it is quite inefficient when the amount of data is large.
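The second method depends on label-based slicing with .loc including both endpoints, unlike position-based slicing. The following minimal sketch illustrates the difference. The one-row sample frame is made up for illustration; only the column names match COND2n.csv.

```python
import pandas as pd

# Made-up one-row sample; only the column names match COND2n.csv
sample = pd.DataFrame(
    [[1, 100, 20, 5, 10, 0, 0]],
    columns=["Time", "Power", "Temperature", "Pressure",
             "Uptime", "Status", "Outcome"],
)

# Label-based slicing with .loc includes BOTH the start and the end label
by_label = sample.loc[:, "Power":"Pressure"].columns.tolist()

# Position-based slicing with .iloc excludes the stop position
by_position = sample.iloc[:, 1:3].columns.tolist()

print(by_label)     # ['Power', 'Temperature', 'Pressure']
print(by_position)  # ['Power', 'Temperature']
```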
The third method first creates a list of column names with columns.tolist(). Then collist[collist.index('Power'):collist.index('Pressure')+1] yields the list of column names 'Power','Temperature','Pressure'.
#Swap columns 3: Create a list object of columns
collist=df1.columns.tolist()
df1_3=df1[['Time', 'Uptime']+
collist[collist.index('Power'):collist.index('Pressure')+1]+
['Outcome','Status']]
df1_3
This method is lighter than the second method. However, the code is not very readable.
The fourth method is a combination of the second and third methods. df1.loc[0:0,'Power':'Pressure'] creates a dataframe with only one record and only the columns 'Power','Temperature','Pressure', and columns.tolist() extracts just the column names as a list.
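One possible way to ease the readability concern of the third method is to give the list slicing a name. The sketch below wraps the same list.index technique in a helper. The name `columns_between` and the sample frame are hypothetical and made up for illustration.

```python
import pandas as pd


def columns_between(df, start, end):
    """Return column names from `start` through `end`, inclusive (hypothetical helper)."""
    collist = df.columns.tolist()
    return collist[collist.index(start):collist.index(end) + 1]


# Made-up one-row sample; only the column names match COND2n.csv
sample = pd.DataFrame(
    [[1, 100, 20, 5, 10, 0, 0]],
    columns=["Time", "Power", "Temperature", "Pressure",
             "Uptime", "Status", "Outcome"],
)

sensor_cols = columns_between(sample, "Power", "Pressure")
result = sample[["Time", "Uptime"] + sensor_cols + ["Outcome", "Status"]]
print(result.columns.tolist())
```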
#Swap columns 4: Create a column list object from the cut out DF
collist= df1.loc[0:0,'Power':'Pressure'].columns.tolist()
df1_4=df1[['Time', 'Uptime']+
collist+
['Outcome','Status']]
df1_4
It is still not very readable, but I think it is lighter than the second method and easier to read than the third method.
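As a quick sanity check, the four methods can be run side by side on a small frame to confirm that they give the same result. This is a sketch under assumptions: the two-row sample is made up (only the column names match COND2n.csv), and it uses the default integer index, which the loc[0:0, ...] slice in the fourth method relies on.

```python
import pandas as pd

# Made-up two-row sample with a default integer index
sample = pd.DataFrame(
    [[1, 100, 20, 5, 10, 0, 0],
     [2, 110, 21, 6, 11, 1, 3]],
    columns=["Time", "Power", "Temperature", "Pressure",
             "Uptime", "Status", "Outcome"],
)

# Method 1: list all columns
m1 = sample[["Time", "Uptime", "Power", "Temperature", "Pressure",
             "Outcome", "Status"]]

# Method 2: make 3 dataframes and concatenate them column-wise
m2 = pd.concat([sample[["Time", "Uptime"]],
                sample.loc[:, "Power":"Pressure"],
                sample[["Outcome", "Status"]]], axis=1, join="inner")

# Method 3: slice the list of column names
collist = sample.columns.tolist()
m3 = sample[["Time", "Uptime"]
            + collist[collist.index("Power"):collist.index("Pressure") + 1]
            + ["Outcome", "Status"]]

# Method 4: take the column names from a one-row cut of the dataframe
cut = sample.loc[0:0, "Power":"Pressure"].columns.tolist()
m4 = sample[["Time", "Uptime"] + cut + ["Outcome", "Status"]]

all_equal = m1.equals(m2) and m1.equals(m3) and m1.equals(m4)
print(all_equal)
```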
The samples are placed below.
stream: https://github.com/hkwd/200611Modeler2Python/raw/master/fieldreorder/fieldreorder.str
notebook: https://github.com/hkwd/200611Modeler2Python/blob/master/fieldreorder/fieldreorder.ipynb
data: https://raw.githubusercontent.com/hkwd/200611Modeler2Python/master/data/COND2n.csv
■ Test environment
Modeler 18.2.2
Windows 10 64bit
Python 3.7.9
pandas 1.0.5
Field Reorder node https://www.ibm.com/support/knowledgecenter/ja/SS3RA7_18.2.2/modeler_mainhelp_client_ddita/clementine/reorder_overview.html