This article is the Day 1 entry of the Cloud Analytics advent calendar.
This calendar covers analysis, machine learning, AI, and related topics under the theme of analytics on the cloud. To kick it off, we first set up an analysis environment. The environment below is free for 30 days, so please try it out as you follow the calendar. It is also worth a look if you are currently setting up a data science team.
Today, I will give an overview of the environment to be used and create the first notebook.
Data Science Experience (written DataScienceExperience below) is a data science platform on the cloud provided by IBM. It offers a complete set of the tools needed for data science, including Jupyter Notebook, and adds team development features to promote data science in the enterprise.
DataScienceExperience currently offers Jupyter Notebook and RStudio.
Below is the Jupyter Notebook.
Below is RStudio.
The interface is the same as the notebook and RStudio that you usually use.
Next, Data Sources. How you get your data is important when you start doing data science. DataScienceExperience comes with 5 GB of Object Storage for free. It can also be connected to the storage services on Bluemix through a GUI, and it works especially well with Cloudant (CouchDB) and dashDB. Below is the connection creation screen.
Other sources such as S3 and Impala can also be used as Data Sources, although you need to supply their connection information.
In DataScienceExperience, you create a project and then create notebooks inside it. By adding other users to your project, you can easily share your notebooks and Data Sources with them.
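As a rough illustration of what "connection information" can look like in code, the sketch below builds an Object Storage path. It assumes the data is read through the Hadoop OpenStack Swift connector, which addresses objects as `swift://<container>.<service>/<object>`; whether your environment uses that connector is an assumption, and the helper name `swift_uri` and the container, service, and file names are all hypothetical.

```python
def swift_uri(container, service, object_name):
    """Build a swift://<container>.<service>/<object> URI.

    All three arguments are hypothetical placeholders; take the real values
    from your own Object Storage connection information.
    """
    for label, value in (("container", container), ("service", service)):
        # Container and service names form the host part, so reject separators.
        if not value or "." in value or "/" in value:
            raise ValueError("invalid {0} name: {1!r}".format(label, value))
    return "swift://{0}.{1}/{2}".format(container, service, object_name.lstrip("/"))


# Example with made-up names.
print(swift_uri("notebooks", "spark", "data/sales.csv"))
```

Keeping the path construction in one small function makes it easy to swap in a different container or file without touching the rest of the notebook.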
The following is the edit screen of Collaborator.
You can assign roles such as Admin, Editor, and Viewer.
Notebooks and Data Sources can also be shared for collaborative editing.
First, create a project.
The image below shows a few projects that already exist; here we will create a new one. Click the create project button at the upper right to go to the project creation screen.
The image below is the project creation screen.
About the Spark Service and Object Storage fields: here you select the Spark Service and Object Storage that the project connects to. You need to create a Spark Service only the first time. For Object Storage, you can select either the one that comes with the Spark Service when you create it or an Object Storage instance on Bluemix.
You have now created a brand new project!
Next, we will create a notebook and run some code. Click the add notebooks button on the project screen you just created to go to the notebook creation screen.
You can choose Spark 2.0 or 1.6. Here, we select Python 2 and Spark 1.6.
About the notebook name: there currently seems to be a bug where the preview does not work properly when the Name field is entered in Japanese. I have reported the issue, so I expect it will be fixed, but for now enter the name in English.
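Once the notebook is open, you may want to confirm that it really runs the versions you picked. The following is a minimal sketch, assuming the notebook predefines a SparkContext named `sc` (common in Spark-backed notebooks, but not confirmed here); `describe_runtime` is a hypothetical helper, and it simply skips the Spark version when no context is available.

```python
import sys


def describe_runtime(spark_context=None):
    """Return the Python version and, if a SparkContext is given, the Spark version."""
    info = {"python": "{0}.{1}".format(sys.version_info[0], sys.version_info[1])}
    if spark_context is not None:
        # SparkContext.version is a standard PySpark attribute.
        info["spark"] = spark_context.version
    return info


# Assumption: `sc` exists in a Spark-backed notebook; fall back to None otherwise.
print(describe_runtime(globals().get("sc")))
```

With the selections made above, you would expect the Python entry to start with 2 and the Spark entry to start with 1.6.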
You now have a brand new Notebook!
Let's try running the Python code!
hello = "Hello Data Scientist!"
print(hello)
Paste the above code into a cell of the notebook you created and press the run button. The code is executed and the result is displayed.
You can also run a cell by pressing Shift + Enter.
Now you are ready for data science! In the following posts, we will look at analysis using notebooks, Object Storage, and other Data Sources.
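As a small follow-up cell, the sketch below shows the two ways a notebook cell produces output: an explicit `print`, and the value of the last expression in the cell, which Jupyter displays automatically. The `greet` helper is hypothetical and only there for illustration; the code is written so that it runs on both Python 2 and Python 3.

```python
def greet(role):
    """Return a greeting for the given role (hypothetical helper for illustration)."""
    return "Hello {0}!".format(role)


message = greet("Data Scientist")
print(message)  # explicit output, works in any Python environment
message         # in a notebook, the value of a cell's last expression is also displayed
```

Returning the string from a function instead of printing it directly makes the value easy to reuse in later cells.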