Recently, I explained basic data plotting and file input/output in R.
There is a trend toward doing all data analysis in Python, but R's huge body of existing assets is still attractive and not easily abandoned.
A common case is wanting to use R for part of a data analysis while writing the overall program in Python. You may also want to use R only for plotting. In such cases, it is convenient to link Python and R and solve the problem in one go.
A library called RPy2 seems to have been used for this in the past, but recently [PypeR](http://www.webarray.org/softwares/PypeR/) appears to have become the mainstream choice.
Installation is easy. Install it with the pip package manager.
pip install pyper
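PypeR talks to R by launching the R executable, so R itself also has to be installed and reachable on the PATH. As a rough pre-flight check, the following sketch reports whether the module and the executable can be found. The helper name `check_environment` is made up for this illustration and is not part of PypeR.

```python
import importlib.util
import shutil


def check_environment(module_name="pyper", r_command="R"):
    """Report whether a Python module and the R executable can be located.

    Hypothetical helper for illustration; it only looks things up and does
    not import the module or start R.
    """
    return {
        "module_found": importlib.util.find_spec(module_name) is not None,
        "r_found": shutil.which(r_command) is not None,
    }


status = check_environment()
print(status)
```

If either value is False, the later examples will fail before any data reaches R.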
Consider the following R code (scatter.R).
png("image.png", width = 480, height = 480,
pointsize = 12, bg = "white", res = NA)
plot(data$WRAIN, data$LPRICE2, pch=16,
xlab="Rainfall from October to March of the previous year of harvest",
ylab="Wine price")
dev.off()
This is a simple R script that takes two columns from data, plots one against the other, and writes the plot to a .png file.
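If you would rather keep everything in one Python file, the same R script can be generated from Python as a string. This is only a sketch: the function `build_scatter_script` and the template are invented here for illustration, and the column names and labels are the ones from scatter.R above.

```python
# Template mirroring scatter.R; the {placeholders} are filled in from Python.
R_TEMPLATE = """png("{out_path}", width = 480, height = 480,
    pointsize = 12, bg = "white", res = NA)
plot(data${x_col}, data${y_col}, pch = 16,
     xlab = "{x_label}",
     ylab = "{y_label}")
dev.off()
"""


def build_scatter_script(x_col, y_col, x_label, y_label, out_path="image.png"):
    """Return R source that plots data$x_col against data$y_col to a PNG.

    Hypothetical helper; values are inserted as-is, so labels containing
    double quotes would produce invalid R code.
    """
    return R_TEMPLATE.format(
        out_path=out_path,
        x_col=x_col,
        y_col=y_col,
        x_label=x_label,
        y_label=y_label,
    )


script = build_scatter_script(
    "WRAIN",
    "LPRICE2",
    "Rainfall from October to March of the previous year of harvest",
    "Wine price",
)
print(script)
```

The resulting string can be saved as scatter.R, for example with `pathlib.Path("scatter.R").write_text(script)`.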
Let's pass data from Python to this R script, and then retrieve objects from R back into Python. The original data is a CSV file of wine prices.
import pyper
import pandas as pd
# Read the CSV data with Python
wine = pd.read_csv("wine.csv")
# Create an R instance (use_pandas takes a boolean, not the string 'True')
r = pyper.R(use_pandas=True)
# Pass the Python object to R under the name "data"
r.assign("data", wine)
# Run the R source code
r("source(file='scatter.R')")
The contents of wine.csv, read in Python, have now been passed to R and plotted successfully.
Conversely, you may want to retrieve the results of R processing in Python. In that case, you can fetch an R object with the r.get method.
# Execute R code
r("res1 = cor.test(data$WRAIN, data$LPRICE2)")
r("data1 = subset(data, LPRICE2 < 0)")
r("res2 = cor.test(data1$WRAIN, data1$LPRICE2)")
# Read the R objects in Python
res1 = pd.Series(r.get("res1"))
res2 = pd.Series(r.get("res2"))
print(res1)
#=>
#alternative two.sided
#conf.int [-0.258366126613384, 0.489798400688013]
#data.name data$WRAIN and data$LPRICE2
#estimate 0.1348919
#method Pearson's product-moment correlation
#null.value 0
#p.value 0.5023297
#parameter 25
#statistic 0.6806807
#dtype: object
print(res2)
#=>
#alternative two.sided
#conf.int [-0.409535600260672, 0.364710477639889]
#data.name data1$WRAIN and data1$LPRICE2
#estimate -0.02636626
#method Pearson's product-moment correlation
#null.value 0
#p.value 0.8982662
#parameter 24
#statistic -0.1292127
#dtype: object
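The run-then-get pattern used above can be wrapped in a small function so that each test is a single call. This is a sketch under an assumption: `r` is expected to behave like the pyper.R instance in this article, meaning it can be called with a string of R code and has a get method. The name `cor_test` is made up for this example.

```python
def cor_test(r, r_data_name, x_col, y_col, result_name="res"):
    """Run R's cor.test on two columns and return the result to Python.

    `r` is assumed to act like a pyper.R session: r(code) executes R code
    and r.get(name) fetches an R object. Names are pasted into the R code
    unchanged, so they must be valid R identifiers.
    """
    r(f"{result_name} = cor.test({r_data_name}${x_col}, {r_data_name}${y_col})")
    return r.get(result_name)


# Intended usage with the session created earlier (requires PypeR and R):
# res1 = cor_test(r, "data", "WRAIN", "LPRICE2", result_name="res1")
# res2 = cor_test(r, "data1", "WRAIN", "LPRICE2", result_name="res2")
```

Keeping the R variable name as a parameter means the result also stays available inside the R session for later R code.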
This time I used pandas, but it is not required. Still, being able to exchange pandas objects, which resemble R data frames, directly with R is very helpful.
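As a pandas-free illustration, the test results can also be handled as ordinary dictionaries. The sketch below assumes that r.get returns a dict keyed by the R list names (as the printed output above suggests), and it uses values copied by hand from that output rather than a live R session. The function `summarize_cor_test` is invented for this example.

```python
def summarize_cor_test(result, alpha=0.05):
    """Format the correlation estimate and p-value from a cor.test result.

    `result` is assumed to be a dict with "estimate" and "p.value" keys,
    matching the names shown in the printed output.
    """
    estimate = float(result["estimate"])
    p_value = float(result["p.value"])
    verdict = "significant" if p_value < alpha else "not significant"
    return f"r = {estimate:.3f}, p = {p_value:.3f} ({verdict} at alpha = {alpha})"


# Values copied from the output printed earlier, not fetched from R here.
res1 = {"estimate": 0.1348919, "p.value": 0.5023297}
res2 = {"estimate": -0.02636626, "p.value": 0.8982662}

print(summarize_cor_test(res1))  # r = 0.135, p = 0.502 (not significant at alpha = 0.05)
print(summarize_cor_test(res2))  # r = -0.026, p = 0.898 (not significant at alpha = 0.05)
```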
You can pass data to R only when you need it and get the result back in Python as an object. This is much more convenient than writing the data to an external file and running R separately. Being able to use R's assets from Python makes Python ever more useful as a glue language.
Try using R from Python with Python + PypeR http://mia-0032.hatenablog.jp/entry/2013/08/30/000000
I want to use R from Python-but RPy2 is no good- http://d.hatena.ne.jp/dichika/20130213/1360718736