(However, the Beaker server is required for actual execution, so refer to the procedure below.)
In Part 1, I introduced a concrete example of creating an embedded custom JS widget with Jupyter and Beaker Notebook. This time, I would like to walk through the procedure for actually using Beaker.
In practice, "plain Python / R / JS" alone is rarely enough. For example, when preparing data you will often want pandas, and for plotting you will often want matplotlib. __Notebook-type applications are good at preserving the process of the work, but they cannot record the environment in which the work was performed.__
This is where container technologies such as Docker come in: used together with notebook-type applications, they further improve reproducibility. As mentioned above, the problem with multilingual visualization and analysis work is that it is difficult to keep a record of __what kind of environment the work was done in__. Virtual machine images are one option, of course, but with Docker you can record one container environment per notebook, and you can inherit from that image to build a new working environment easily. In this respect, Docker is a good fit for this kind of work.
The Beaker Notebook project also publishes a Docker image in which the supported languages, such as Python, Node.js, R, and Julia, are ready to use immediately. This time, we will extend that image. Since learning Docker is not the topic here, I will omit the details, but since the release of Docker Toolbox it has become quite easy to use even for those with almost no prior knowledge, so please give it a try.
The official image comes with a set of programming languages preinstalled, but let's add some extra packages here. Basically, you just call the commands required for installation with `RUN`.
Dockerfile to be used this time
FROM beakernotebook/beaker
# Add Extra library for data analysis
# R Libraries
RUN Rscript -e "install.packages('igraph',,'http://cran.us.r-project.org')"
# Python 3 libraries
RUN /home/beaker/py3k/bin/pip install requests networkx py2cytoscape
# Add a new directory for user's notebooks
RUN mkdir /home/beaker/notebooks
This example installs igraph, which is often used for graph analysis in R; NetworkX, its counterpart in Python; and py2cytoscape, which includes a utility for converting data to Cytoscape.js format.
Beaker, like Jupyter, manages notebooks as JSON-formatted files. For this reason, it is very convenient to keep the Dockerfile and notebooks on your local file system and run the container with that directory mounted.
Build the image
docker build -t keiono/beaker .
Run the image
docker run -p 8800:8800 -v $PWD:/home/beaker/notebooks -t keiono/beaker
The notebook server will now run with the current directory mounted in the container.
After this, the login password is displayed in the terminal; use it to access the server over HTTPS. If the Docker host is 192.168.99.100, opening the following address in a browser prompts you for a password, so enter the one displayed in the terminal.
https://192.168.99.100:8800
The Dockerfile used in this example is also available in the repository, so feel free to fork it and add commands to create the environment your work requires.
Open the sample notebook (graph-final.bkr) from the repository and run it. Cells are executed the same way as in Jupyter, with Ctrl + Return. If you run the cells in order, the last cell will look like this:
This means that the actual visualization cell is running, so if you go back up one cell, you should see something like this.
If you click on a node, the display changes so that only the nodes directly connected to it are shown.
This interactive cell was created in a notebook with Cytoscape.js embedded. In this example we created a custom network rendering cell, but you can use whatever library you like. D3.js is available out of the box, so you can start writing code in JavaScript cells right away.
From now on, let's take a closer look at the actual process.
In this example, data is exchanged between three languages (R, Python 3, and JavaScript) to highlight Autotranslation, Beaker's main feature. This feature is unique to Beaker, and it lets you __exchange data directly between multiple languages__.
In this example, the first cell uses R to read the edge list into a data frame.
df1 <- read.table('/home/beaker/notebooks/fbnet.edges', header=FALSE, sep=' ')
colnames(df1) <- c('source', 'target')
# Save it as a shared beaker object
beaker::set('df1', df1)
The notation on the last line may be unfamiliar, but it is the heart of Beaker's Autotranslation. In short, when you pass an object or value from any language to this globally exposed beaker object, it is automatically translated into JSON, so data can be exchanged between cells.
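To make the idea of exchanging values as JSON more concrete, here is a minimal sketch in plain Python. It is not Beaker's implementation or API: the `SharedNamespace` class and its `set` / `get` methods are hypothetical names invented for illustration, and the row-record layout of the table is an assumption.

```python
import json


class SharedNamespace:
    """Toy stand-in for a notebook-wide shared object (hypothetical, not Beaker's API)."""

    def __init__(self):
        # Values are kept only as JSON text, as if they had crossed a language boundary.
        self._store = {}

    def set(self, name, value):
        # Serialize on the way in; values that JSON cannot represent raise TypeError.
        self._store[name] = json.dumps(value)

    def get(self, name):
        # Deserialize on the way out; the reader gets a fresh copy, not the original object.
        return json.loads(self._store[name])


# A small table written as a list of row records (an assumed layout for this sketch).
edges = [
    {"source": 1, "target": 2},
    {"source": 1, "target": 3},
]

shared = SharedNamespace()
shared.set("df1", edges)

# Another cell (conceptually, another language) reads the value back.
df1_copy = shared.get("df1")
sources = [row["source"] for row in df1_copy]
```

One consequence of going through JSON in a sketch like this is that language-specific types are flattened to what JSON can express. For example, a Python tuple stored with `shared.set` comes back from `shared.get` as a list.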
If you want to access this data from Python, you can use the automatically converted pandas DataFrame.
R DataFrame → Pandas DataFrame → NetworkX Graph → Cytoscape.js JSON
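# Imports this cell needs (assumed here: nxutil is py2cytoscape's util module)
import networkx as nx
from py2cytoscape import util as nxutil
# Reading the shared object is enough to get a pandas DataFrame
df2 = beaker.df1
df2[["source", "target"]] = df2[["source", "target"]].astype('str')
# Convert the DataFrame to a NetworkX graph
# (NetworkX 1.x API; NetworkX 2.x provides from_pandas_edgelist instead)
g2 = nx.from_pandas_dataframe(df2, "source", "target")
# Calculate betweenness centrality of the graph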
cent_1 = nx.betweenness_centrality(g2)
deg = nx.degree(g2)
nx.set_node_attributes(g2, 'b-cent', cent_1)
nx.set_node_attributes(g2, 'deg', deg)
# Convert networkX data into Cytoscape.js JSON
net1 = nxutil.from_networkx(g2)
# Save the graph, now including degree and centrality, as a shared Beaker object
beaker.net1 = net1
# Save the degree range
beaker.net1_deg_range = (min(deg.values()), max(deg.values()))
In the last step, I used a utility I wrote earlier to convert the NetworkX object into a form that Cytoscape.js can render. Now you have data that can be used for visualization.
Once you have the data, create an HTML cell to draw it in. Usually this only needs some simple CSS and the tags you want to draw into:
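For readers who have not seen the Cytoscape.js data format, the following is a rough sketch of the kind of structure such a conversion produces. It uses only the standard library and is not the py2cytoscape implementation. The function name `edges_to_cyjs` and the sample edge list are hypothetical, and the exact attributes that py2cytoscape emits may differ.

```python
import json


def edges_to_cyjs(edges):
    """Build a Cytoscape.js-style elements dict from (source, target) pairs."""
    # Collect node ids in first-seen order.
    node_ids = []
    seen = set()
    for source, target in edges:
        for node_id in (source, target):
            if node_id not in seen:
                seen.add(node_id)
                node_ids.append(node_id)

    # Count how many edge endpoints touch each node (its degree).
    degree = {node_id: 0 for node_id in node_ids}
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1

    # Each element wraps its attributes in a "data" object.
    nodes = [{"data": {"id": n, "deg": degree[n]}} for n in node_ids]
    cy_edges = [{"data": {"source": s, "target": t}} for s, t in edges]
    return {"elements": {"nodes": nodes, "edges": cy_edges}}


# Hypothetical sample data, not taken from the Facebook data set.
sample_edges = [("1", "2"), ("1", "3"), ("2", "3"), ("3", "4")]
net = edges_to_cyjs(sample_edges)

# The same kind of (min, max) degree range that the notebook stores alongside the graph.
deg_values = [node["data"]["deg"] for node in net["elements"]["nodes"]]
deg_range = (min(deg_values), max(deg_values))

# The whole structure is plain JSON, so it can be handed to a JavaScript cell as-is.
as_json = json.dumps(net)
```

The top-level `elements` key mirrors how the JavaScript cell reads the data (`beaker.net1["elements"]`). Node attributes such as the degree live under each node's `data` object, where Cytoscape.js style mappings can refer to them.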
<style>
#cyjs {
width: 100%;
height: 600px;
background-color: #000000;
}
h1 {
color: #555555;
}
</style>
<h1>Anonymized Facebook Network Data from Stanford Network Analysis Project</h1>
<div id="cyjs"></div>
<h3>(Cytoscape.js Demo)</h3>
From here, all that remains is to build the visualization through trial and error. The details of Cytoscape.js are beyond the scope of this article, so I won't go into them, but the key point is this:
elements: beaker.net1["elements"]
The data converted to Cytoscape.js format in Python is read directly through the beaker object. So even if you want to include new data in the visualization, you can reprocess the data in the preceding Python or R cell, go straight back to the JS cell, and continue coding the visualization. This is something that Jupyter can't do at the moment, and I think it's Beaker's biggest advantage.
By using Docker and Beaker Notebook in this way, you can record everything from data processing to visualization in a reproducible form, in a single notebook, while using multiple languages. When you build a visualization in a browser and an editor, you often have to go and modify the data in another application, but with this approach you can do it all within the notebook.
There are still few usage examples, but if you want to use multiple languages or build visualizations with complex JS components, please give it a try.