[PYTHON] Is R's do.call() a classical higher-order function? Learn how to use it

While browsing Kaggle Kernels, where Kaggle participants publish their code, I came across R code that makes heavy use of do.call(). Since do.call() was almost new to me, I looked it up and found that it is a fairly classical function and not difficult to use. I am writing these notes so that I do not forget it.

Overview of do.call()

First, I will quote from the CRAN manual.

do.call - Execute a Function Call

Description

do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.

Usage

do.call(what, args, quote = FALSE, envir = parent.frame())

Arguments

  • what: either a function or a non-empty character string naming the function to be called.
  • args: a list of arguments to the function call. The names attribute of args gives the argument names.
  • quote: a logical value indicating whether to quote the arguments.
  • envir: an environment within which to evaluate the call. This will be most useful if what is a character string and the arguments are symbols or quoted expressions.

As the manual title says, do.call() executes a function call. R has a rich family of apply functions, which are the better known ones, but do.call() is also used in some cases. It takes the four arguments listed above, of which the first two are required: the function "what" and the arguments "args" to pass to it. "args" must be a list, and each element of the list is passed to the function as a separate argument.
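For readers coming from Python, the closest built-in mechanism I am aware of is argument unpacking. The sketch below is only an approximation: do_call is a hypothetical helper name (not part of any library), and it assumes positional arguments arrive as a list and named arguments as a dict, whereas R keeps both in a single list.

```python
def do_call(what, args=(), named=None):
    """Rough analogue of R's do.call(what, args); hypothetical helper."""
    # * unpacks positional arguments, ** unpacks named arguments.
    return what(*args, **(named or {}))


# Each element of the list becomes one separate argument of the call.
largest = do_call(max, [3, 9, 4])                       # same as max(3, 9, 4)
descending = do_call(sorted, [[3, 9, 4]], {"reverse": True})
# same as sorted([3, 9, 4], reverse=True)

print(largest)
print(descending)
```

The point of the sketch is that the function and its arguments are both ordinary values, so the call can be assembled at run time.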

Here are some usage examples.

First, define the function.

# define my own function
myrange <- function (larg) {
    nv <- unlist(larg)
    rg <- max(nv) - min(nv)
    return(rg)
}

Here, we use the "iris" dataset, which is available in R without any extra setup.

# Data.Frame example
head(iris)

Table 1. Iris Dataset

Call the defined function "myrange" through do.call().

do.call(myrange, list(iris$Sepal.Length))
# Out: 3.6

As expected, the range of Sepal.Length (maximum minus minimum), 3.6, was output. As a check, R's built-in range() returns 4.3 and 7.9 (minimum and maximum), which agrees with the result above (7.9 - 4.3 = 3.6).
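Note that the column is wrapped in list(), so the whole vector reaches myrange as one argument. A minimal Python sketch of the same idea follows, using `*` unpacking in place of do.call(). The sample values are hypothetical; only the minimum (4.3) and maximum (7.9) are taken from the text above.

```python
def myrange(larg):
    """Return max - min of a sequence of numbers (mirrors the R myrange)."""
    values = list(larg)
    return max(values) - min(values)


# Hypothetical sample; only min 4.3 and max 7.9 match the iris column.
sepal_length = [5.1, 4.3, 7.9, 6.4]

# One-element argument list: the whole sequence is the single argument.
result = myrange(*[sepal_length])
print(round(result, 1))

# Unpacking the sequence itself passes four arguments, which fails.
try:
    myrange(*sepal_length)
except TypeError as err:
    print("TypeError:", err)
```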

Let's check another example. First, define a function that normalizes numerical values, then prepare sample input data and run do.call() as follows.

normalize <- function(x, m=mean(x), s=sd(x)) {
    (x - m) /s
}

myseq = list(c(1, 3, 6, 10, 15))
do.call(normalize, myseq)

# -1.0690449676497 -0.712696645099798 -0.17817416127495 0.534522483824849 1.4253932901996

The mean and standard deviation of the normalized output are

mean of normalized =
[1] -5.572799e-18
standard deviation =
[1] 1

Since these values are approximately 0 and 1, the normalization works as expected.
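The same check can be sketched in Python with only the standard library. This is an approximation under two assumptions: None stands in for R's defaults that refer to x, and statistics.stdev is used because it is the sample standard deviation (n - 1 denominator), like R's sd(). The last call illustrates how named list elements in R correspond to keyword arguments.

```python
import statistics


def normalize(x, m=None, s=None):
    """Standardize x; m and s default to the sample mean and sd of x."""
    # Python defaults cannot refer to x, so None marks "compute from x".
    m = statistics.mean(x) if m is None else m
    s = statistics.stdev(x) if s is None else s
    return [(v - m) / s for v in x]


myseq = [[1, 3, 6, 10, 15]]        # one positional argument, as in the R code
z = normalize(*myseq)
print(z)
print(statistics.mean(z), statistics.stdev(z))

# Named arguments are supplied through a dict and ** unpacking.
unchanged = normalize(*myseq, **{"m": 0, "s": 1})
print(unchanged)
```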

Comparison with apply() in Python pandas

R's do.call() seems similar to the Python built-in function map(), but I don't use map() much personally, so this time I will compare it with pandas' apply(). (Reference: "Python for Data Analysis", O'Reilly Media.) First, prepare sample data.

# Sample Data
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(4,3), columns=list('bde'),
                    index=['Utah', 'Ohio', 'Texas', 'Oregon'])
frame

Table 2. Data Example

Prepare a function that calculates the range (maximum minus minimum) and apply() it to the pd.DataFrame.

# define lambda function
f = lambda x: x.max() - x.min()
frame[['d']].apply(f)
# frame['d'].apply(f) raises an error: Series.apply() calls f on each scalar element, which has no max()/min()

This is the expected behavior.

Out: d    4.016529
dtype: float64

To specify the column by position, use iloc[] as follows.

frame.iloc[:, [2]].apply(f)

# Out: e    2.160329
# dtype: float64

Note that since we want to pass the whole column to the function, we have to select columns with a list, as in frame[['d']] or frame.iloc[:, [2]], so that the result is a pd.DataFrame. (With frame['d'] or frame.iloc[:, 2], the result is a pd.Series, apply() processes each scalar element, and an error is raised.)
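The difference between the two selections can be checked on a small fixed frame. The values below are hypothetical and chosen only to make the result reproducible. As a sketch of an alternative for a single column, f can also be called directly on the pd.Series, which avoids apply() altogether.

```python
import pandas as pd

# Small fixed frame (hypothetical values) so the result is reproducible.
frame = pd.DataFrame(
    {
        "b": [0.5, -1.0, 2.0, 0.0],
        "d": [1.5, -2.5, 0.5, 1.0],
        "e": [-0.5, 0.25, 1.0, 0.75],
    },
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

f = lambda x: x.max() - x.min()

# DataFrame.apply hands each selected column to f as a whole Series.
by_column = frame[["d"]].apply(f)
print(by_column)

# Calling f on the Series directly gives the same number for one column.
direct = f(frame["d"])
print(direct)
```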

With this, the same operation as in the R do.call() example was reproduced.

Summary

do.call() is a rare function (only for me?), but it seems to be used in situations such as "process pieces of a data.frame and then put them together". The apply functions are more convenient, though, and code using do.call() seems to be written in a "classical" style. Personally, I don't intend to use do.call() actively, but when I see it in other people's code, I want to understand it properly without panicking.
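In R, the "process pieces, then put them together" pattern is commonly written as do.call(rbind, pieces), where the list of pieces supplies the arguments of rbind. I am assuming that this is the usage meant here. A rough Python sketch of the same shape, with plain lists standing in for data.frame pieces and itertools.chain standing in for rbind:

```python
from itertools import chain

# Hypothetical per-group results, e.g. produced by processing each piece.
pieces = [[1, 2], [3], [4, 5, 6]]

# chain(*pieces) is chain([1, 2], [3], [4, 5, 6]):
# the list of pieces supplies the arguments of the call.
combined = list(chain(*pieces))
print(combined)
```

The number of pieces does not need to be known when the code is written, which is what makes this style of call useful.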

I could not find a function in Python that corresponds directly to do.call(), but the desired operation can be achieved with pandas' apply() or with a list comprehension (with the data split beforehand).

(Versions used: R 3.3.1 and Python 3.5.2, both on Jupyter Notebook.)
