Rewrite the record addition node of SPSS Modeler with Python.

The record addition node (the Append node) in SPSS Modeler stacks datasets vertically, adding the rows of one dataset below the rows of another. This corresponds to UNION ALL in SQL. Let's rewrite it with Python pandas.

0. Original data

The examples use the following two sets of time-series sensor data. They contain similar items, but some column names differ and some columns exist in only one of the two files.

■ Data 1: Cond4n_e104.csv
M_CD: Machine code
UP_TIME: Uptime
POWER: Power
TEMP: Temperature
ERR_CD: Error code


■ Data 2: COND2n.csv
Time: Uptime
Power: Power
Temperature: Temperature
Pressure: Pressure
Uptime: Uptime
Status: Status code
Outcome: Error code


1m. Record addition: Modeler version

Append data 2 (COND2n.csv) to data 1 (Cond4n_e104.csv), using the columns of data 1 as the layout.

First, use a filter node to rename the columns of data 2 so that they match the column names of data 1, and to exclude the columns that data 1 does not have.


Then connect the record addition node. Because COND2n.csv (data 2) has no column corresponding to M_CD, that field is filled with NULL for the rows that come from data 2.

Data 2 has now been appended below data 1.

Incidentally, the record addition node matches fields by "name" by default, but it can also match them by column position, so that data can be appended even when the names differ. If you also want to keep fields such as Pressure that exist only in the appended data 2, select "All datasets" as the field input source. The node can also add a tag field that indicates which dataset each record came from.
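For readers heading to the pandas version, the options just described have rough pandas counterparts. The following is only a sketch on made-up toy frames: the values, the two-column layout of `df1`, and the `SOURCE` column name are assumptions for illustration, and the mapping to the Modeler options is approximate rather than exact.

```python
import pandas as pd

# Toy stand-ins for data 1 and data 2; the values are invented.
df1 = pd.DataFrame({"UP_TIME": [0, 1], "POWER": [950, 960]})
df2 = pd.DataFrame({"Time": [0, 1], "Power": [1000, 1010], "Pressure": [50, 51]})

# (a) Match by position: give the first columns of data 2 the names of
#     data 1, regardless of what they were called.
df2_pos = df2.iloc[:, :len(df1.columns)].copy()
df2_pos.columns = df1.columns
by_position = pd.concat([df1, df2_pos], ignore_index=True)

# (b) Match by name and keep every column found in any input
#     (similar to "All datasets"): concat does this by default (join="outer").
df2_named = df2.rename(columns={"Time": "UP_TIME", "Power": "POWER"})
all_columns = pd.concat([df1, df2_named], ignore_index=True)

# (c) Keep only the columns of the main dataset.
main_only = all_columns[df1.columns]

# (d) Tag each row with the dataset it came from (SOURCE is a made-up name).
tagged = pd.concat(
    [df1.assign(SOURCE="data1"), df2_named.assign(SOURCE="data2")],
    ignore_index=True,
)

print(all_columns)
print(tagged)
```

In (b), Pressure is missing (NaN) for the rows that came from `df1`, which mirrors the NULL that Modeler enters for fields absent from one of the inputs.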

1p. Record addition: pandas version

Use rename and drop to do what the filter node does: rename aligns the column names with data 1, and drop deletes the columns that are not needed.

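# Align the column names of data 2 with those of data 1,
# then drop the columns that data 1 does not have.
# df2 is assumed to hold COND2n.csv, already loaded into a DataFrame.
df2_1 = df2.rename(columns={'Time': 'UP_TIME', 'Power': 'POWER', 'Temperature': 'TEMP', 'Outcome': 'ERR_CD'})\
    .drop(['Pressure', 'Uptime', 'Status'], axis=1)
df2_1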
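To try this step without downloading the CSV files, here is a minimal sketch on a one-row toy frame. The column names follow the description of COND2n.csv above, but the values are invented, and the `data1_columns` list is typed in by hand from the description of data 1 rather than read from the file.

```python
import pandas as pd

# One-row stand-in for COND2n.csv; values are invented for illustration.
df2 = pd.DataFrame({
    "Time": [0], "Power": [1000], "Temperature": [70],
    "Pressure": [50], "Uptime": [100], "Status": [0], "Outcome": [0],
})

# Rename the columns that have a counterpart in data 1 ...
df2_1 = df2.rename(columns={
    "Time": "UP_TIME", "Power": "POWER",
    "Temperature": "TEMP", "Outcome": "ERR_CD",
})
# ... and drop the ones that do not.
# drop(columns=[...]) is equivalent to drop([...], axis=1).
df2_1 = df2_1.drop(columns=["Pressure", "Uptime", "Status"])

# Columns of data 1 that data 2 still lacks after the rename.
data1_columns = ["M_CD", "UP_TIME", "POWER", "TEMP", "ERR_CD"]
missing = [c for c in data1_columns if c not in df2_1.columns]

print(list(df2_1.columns))
print(missing)
```

A check like `missing` is a cheap way to see in advance which columns will end up empty for the appended rows.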


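Next, perform the step that corresponds to the record addition node itself. There are two methods, append and concat, and both give the same result. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so it only works on older versions such as the pandas 1.0.5 used here; concat works on both. I also think concat is easier to read when combining three or more datasets.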

# Using append (pandas versions before 2.0 only)
df1.append(df2_1)
# Using concat
pd.concat([df1, df2_1])
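The behavior of concat can be checked on small frames. The sketch below uses invented values, and `df3` is a hypothetical third dataset added only to show the three-input case; it is not part of the sample data.

```python
import pandas as pd

# Toy frames; values are invented for illustration.
df1 = pd.DataFrame({"M_CD": [1204, 1204], "UP_TIME": [0, 1], "POWER": [950, 960]})
df2_1 = pd.DataFrame({"UP_TIME": [0, 1], "POWER": [1000, 1010]})
df3 = pd.DataFrame({"UP_TIME": [0], "POWER": [900]})  # hypothetical third dataset

# Columns are matched by name; M_CD is absent from df2_1,
# so the rows that come from it get NaN in that column.
appended = pd.concat([df1, df2_1])

# The original row labels are kept, so 0 and 1 each appear twice.
# ignore_index=True renumbers the rows from 0.
renumbered = pd.concat([df1, df2_1], ignore_index=True)

# Three or more frames simply go into the same list.
three = pd.concat([df1, df2_1, df3], ignore_index=True)

print(appended)
print(renumbered.index.tolist())
print(len(three))
```

Like UNION ALL, concat keeps duplicate rows. If a UNION-style result is wanted instead, drop_duplicates() can be applied to the combined frame.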


2. Sample

The sample files are available at the following locations.

stream
https://github.com/hkwd/200611Modeler2Python/raw/master/append/append.str
notebook
https://github.com/hkwd/200611Modeler2Python/blob/master/append/append.ipynb
data
https://raw.githubusercontent.com/hkwd/200611Modeler2Python/master/data/Cond4n_e104.csv
https://raw.githubusercontent.com/hkwd/200611Modeler2Python/master/data/COND2n.csv

■ Test environment
Modeler 18.2.2
Windows 10 64bit
Python 3.7.9
pandas 1.0.5

3. Reference information

Duplicate record node https://www.ibm.com/support/knowledgecenter/ja/SS3RA7_18.2.1/modeler_mainhelp_client_ddita/clementine/distinct_settingstab.html
