[PYTHON] I installed DSX Desktop and tried it

Install DSX Desktop

IBM's analytics software Data Science Experience has a desktop edition, DSX Desktop (beta). It has been upgraded, so I installed it. (I tried it briefly right after the beta started, but that was a few months ago.) The download is a little over 9 GB, which is on the large side, but it is nice that it ships as a Docker image from the start.

Download it from https://datascience.ibm.com/desktop. I installed the Mac version.

When you run the downloaded file, you will see this screen. install1.png

Drag and drop DSX Desktop onto the folder in the window. This creates "IBM DSX Desktop" in the Applications folder, so launch it.

(I forgot to take a screenshot.) You will be prompted to choose options, such as whether to install both Notebook and R Studio, and whether to use Spark with Notebook. The download was about 6 GB without Spark, about 9 GB with Spark (roughly 3 GB more), and about 11 GB with R Studio as well. It seems that R Studio can be added later, so I took the opportunity to install everything up to Spark.

Download

As the installation proceeded, a long download started. It was about 9 GB, and I think it took about 5 hours on my home LAN. It failed once partway through and I had to retry. The failure seemed to happen because the download stalled midway. For the retry I temporarily turned off the Mac's screen saver and simply let it run. After the download completes, an Extract step runs for about 5 minutes and the installation is done. (The download was the only part of the installation where I got stuck. The rest went smoothly.)

Fast! Lightweight!

When I ran it, it felt very light, and I was impressed. It may not be as good as the DSX offered as SaaS in the cloud. After startup, click the "≡" icon at the upper left to display the screen for creating a notebook. dsk1.png

(This screenshot was taken after I had already created various things, but the notebook creation screen looks like this. Click add notebook to create one.) dsx2.png

Of course, notebooks created with Jupyter also work. (However, as described later, the directory structure is the one inside the Docker container, not the Mac's, so code that depends on file paths will not work as is.) dsx3.png

When I ran a notebook as a trial right after installation, pandas read_excel raised an execution error. The cause was that xlrd was not included. I added it by running !pip install xlrd in my notebook, and now it runs normally.
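The check described above can be sketched as a small helper that reports which packages the notebook kernel cannot import before read_excel is called. This is only an illustration: the helper names missing_packages and pip_install_command are made up for this example and are not part of DSX Desktop or pandas.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the subset of top-level package names this kernel cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(names):
    """Build a pip command that targets the interpreter running this kernel."""
    return [sys.executable, "-m", "pip", "install", *names]


# xlrd is the package that was missing in the author's environment.
needed = missing_packages(["xlrd"])
if needed:
    # In a notebook cell the equivalent shortcut is: !pip install xlrd
    print("Missing packages:", ", ".join(needed))
    print("Suggested command:", " ".join(pip_install_command(needed)))
else:
    print("xlrd is already available")
```

Calling pip through sys.executable is a common way to make sure the package lands in the same environment the notebook kernel uses.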

Data used for analysis

Unlike the cloud version, DSX Desktop seems to accept data only as files, at least in the beta. Press the add dataset button, or press the icon at the upper right (a button that looks like an "n = 2 identity matrix" made of 1s and 0s) to register a file. On import, the local file is copied into the Docker container.

dsx4.png

Storage location of registered data files

It seems to be stored under /opt/notebooks/assets. (In the screenshot above, I ran pwd to check the default working directory and listed the assets folder containing the registered files.)
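The same check can be done from a notebook cell in Python instead of shell commands. The following is a minimal sketch that assumes the /opt/notebooks/assets path reported above; list_assets is a name invented for this example, and outside the container it simply returns an empty list.

```python
from pathlib import Path

# Path reported in the article; it exists only inside the DSX Desktop container.
ASSETS_DIR = Path("/opt/notebooks/assets")


def list_assets(assets_dir=ASSETS_DIR):
    """Return sorted file names in the assets directory, or [] if it is absent."""
    directory = Path(assets_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


print(Path.cwd())      # same information as running pwd
print(list_assets())   # registered data files, if any
```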

File registration method (registering from the command line)

DSX Desktop runs on Docker and I know the directory, so I registered a file from the Mac terminal on the command line. The list.txt in the screenshot above was registered that way. dsx5.png
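The article does not show the exact command used, so the following is only a guess at one way to do it: docker cp copies a host file into a running container. The container name dsx-desktop is a placeholder (check the real name with docker ps), and docker_cp_command and register_file are hypothetical helper names.

```python
import subprocess

# Directory reported in the article as the storage location for registered files.
ASSETS_DIR = "/opt/notebooks/assets"


def docker_cp_command(local_path, container, dest_dir=ASSETS_DIR):
    """Build the `docker cp` argument list that copies a host file into a container."""
    return ["docker", "cp", local_path, f"{container}:{dest_dir}/"]


def register_file(local_path, container):
    """Copy the file into the container; raises CalledProcessError on failure."""
    subprocess.run(docker_cp_command(local_path, container), check=True)


# "dsx-desktop" is a placeholder; look up the real container name with `docker ps`.
print(" ".join(docker_cp_command("list.txt", "dsx-desktop")))
```

The sketch only prints the command. Calling register_file("list.txt", "<container name>") on the Mac would actually perform the copy.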

Check the Docker image

When I checked, the image is called anaconda_with_spark. (The second and subsequent images may not have been installed this time.) dsx6.png

If you run a shell in the container where DSX Desktop is running (using docker exec), you can see what is going on inside. I don't know whether customizing it is allowed, but you may be able to commit the container and create your own image in your local environment. dsx7.png
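As a rough sketch of the steps mentioned above, the commands involved are docker ps (find the container), docker exec (open a shell in it), and docker commit (save its current state as a new image). The container name dsx-desktop and the tag my-dsx:custom are placeholders, the helper names are invented for this example, and whether a committed image works with DSX Desktop is untested here.

```python
import shlex


def docker_exec_shell(container, shell="bash"):
    """Argument list for an interactive shell in a running container."""
    return ["docker", "exec", "-it", container, shell]


def docker_commit(container, image_tag):
    """Argument list that snapshots a container's current state as a new image."""
    return ["docker", "commit", container, image_tag]


# Placeholder names; replace them with values from `docker ps` on your machine.
for cmd in (
    ["docker", "ps"],
    docker_exec_shell("dsx-desktop"),
    docker_commit("dsx-desktop", "my-dsx:custom"),
):
    print(shlex.join(cmd))
```

The printed lines can be pasted into the Mac terminal one at a time.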
