[Environment construction] Procedure for building the Python environment for Rabbit Challenge, a JDLA-certified E qualification preparation course, on Databricks

Overview

I built the learning environment for Rabbit Challenge, a preparation course for the Deep Learning qualification test (E qualification) of the Japan Deep Learning Association, on Databricks Community Edition, which can be used free of charge in a browser, so I am sharing the procedure here.

What is Rabbit Challenge?

It is an inexpensive exam preparation course provided by Study-AI Co., Ltd. for 3,000 yen per month, offered on the premise that learners study the material independently.

Source: Rabbit ★ Challenge Deep Learning (ai999.careers)

What is Databricks Community Edition?

It is the free edition of Databricks, a unified data platform (Lakehouse) service that supports big data processing with Spark and data analysis in Python and R.

Source: Databricks - Unified Data Analytics

Why use Databricks

Because what you learn on Databricks can be applied directly to real business work. Databricks is available on multiple clouds such as AWS and Azure and can be deployed inside a virtual network, so it can meet enterprise-level security requirements.

Anaconda became paid for commercial use and was difficult to adopt, and Google Colab could not be used for business work for security reasons.

- The Anaconda package repository became paid for "large-scale" commercial use - Qiita

Required environment

The following environment is required: standalone Keras is not installed in the latest Databricks Runtime, so it seems better to use Databricks Runtime 6.4 ML.

For the libraries installed in Databricks Runtime 6.4 ML, refer to the following documentation.

- Databricks Runtime 6.4 for Machine Learning - Azure Databricks Workspace | Microsoft Docs
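
As a quick sanity check, a cell like the following can be run in a notebook attached to a 6.4 ML cluster to confirm that the expected libraries, including standalone Keras, are importable. This is a minimal sketch; the library names are my assumptions based on typical course code, not something listed in this article.

```python
# Minimal sanity check of the runtime: confirm that the libraries typically used
# in the course notebooks are importable and print their versions.
# The library names here (NumPy, Matplotlib, standalone Keras, TensorFlow) are assumptions.
import importlib

for name in ["numpy", "matplotlib", "keras", "tensorflow"]:
    try:
        module = importlib.import_module(name)
        print(name, getattr(module, "__version__", "version unknown"))
    except ImportError:
        print(name, "NOT installed")
```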

Databricks environment construction

Apply for Databricks Community Edition.

Apply from Try Databricks.

Source: Try Databricks

Select "COMMUNITY EDITION" below.

Source: Try Databricks

Open the link in the email you received.


Set a password.


Make sure you can connect to Databricks.


Procedure for studying the course

Import the files you want to use. They cannot be imported as a folder from the GUI, so it may be easier to import everything at once from the command line.
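
As one way to do this, the following is a minimal sketch that calls the legacy databricks-cli from Python via subprocess. It assumes the CLI has been installed with `pip install databricks-cli` and already configured with `databricks configure` for your workspace; both paths are placeholders you would replace.

```python
# A minimal sketch: upload a whole folder of course notebooks at once with the
# legacy databricks-cli, instead of importing files one by one in the GUI.
# Assumptions: the CLI is installed and configured; both paths below are placeholders.
import subprocess

local_dir = "./rabbit_challenge_code"                    # hypothetical local folder with the course files
workspace_dir = "/Users/your-account/rabbit_challenge"   # hypothetical target folder in the workspace

subprocess.run(
    ["databricks", "workspace", "import_dir", local_dir, workspace_dir],
    check=True,  # raise if the CLI returns a non-zero exit code
)
```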


Right-click on "Clustres", enter an appropriate name in "Cluster Name", enter "Databricks Runtime Version" in "Databricks Runtime 6.4 ML (Scala 2.11 Spark 2.4.5)", and select "Create Cluster".


Open the notebook, attach the cluster, and run the notebook.
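
Once the cluster is attached, a quick check like the following confirms that the notebook is running on the intended runtime. This is a sketch that relies on `spark`, the SparkSession that Databricks notebooks provide automatically.

```python
# Run in a notebook cell after attaching the cluster.
# `spark` is predefined in Databricks notebooks; on Databricks Runtime 6.4 ML
# the bundled Spark version should be 2.4.5.
print("Spark version:", spark.version)

import sys
print("Python version:", sys.version)
```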


Precautions when learning with this procedure

  1. GPUs cannot be used in Community Edition (see the check after this list).
  2. The cluster has to be cloned (re-created) each time before studying, because Community Edition clusters are terminated after a period of inactivity and cannot be restarted.
  3. When importing the learning code through the GUI, it has to be done folder by folder.
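
Regarding the first point, a small check like the following makes the limitation visible; it is a sketch assuming TensorFlow is present, as it is in Databricks Runtime 6.4 ML, and on a Community Edition cluster it is expected to report that no GPU is available.

```python
# Check whether TensorFlow can see a GPU on the attached cluster.
# On a Databricks Community Edition cluster this is expected to print False.
import tensorflow as tf

print("GPU available:", tf.test.is_gpu_available())
```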
