[PYTHON] A memo on how to use the AIST supercomputer ABCI

Introduction

This is a memo from my time using "ABCI", AIST's GPU-equipped supercomputer, which I used until the end of last year. It should be especially useful for those who want to run an Anaconda virtual environment on ABCI's compute nodes.

Reference materials, etc.

- ABCI Web Page: https://abci.ai/

Login to interactive node

On Windows, use Windows PowerShell, PowerShell, or WSL from Windows Terminal. I used Windows Terminal + WSL.

shell-window-1


$ ssh -i ./your_rsa_key -L 10022:es:22 -l $YOUR_ABCI_ID as.abci.ai

After you enter the correct password, the following message appears:

shell-window-1


Welcome to ABCI access server.
Please press any key if you disconnect this session.

Leave shell-window-1 open as it is. While it stays open, open another window and enter the following.

shell-window-2


$ ssh -i ./your_rsa_key -p 10022 -l $YOUR_ABCI_ID localhost

After entering the correct password, if the following message appears, the login is successful.

shell-window-2


--------------------------------------------------------------------------------
  ABCI Information                                           Date: Oct 04, 2019
--------------------------------------------------------------------------------

  Welcome to ABCI system

  - How to use
    Please see below for ABCI Users Guide:

    - https://docs.abci.ai/en/ (In English)
    - https://docs.abci.ai/ja/ (In Japanese)

  If you have any questions or need for further assistance,
  please refer to the following URL and contact us:

    - https://abci.ai/en/how_to_use/user_support.html (In English)
    - https://abci.ai/ja/how_to_use/user_support.html (In Japanese)

[xxxxxxxxxx@es1 ~]$ 
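
The ssh command in shell-window-2 can only connect while the tunnel opened in shell-window-1 is alive. As a minimal sketch, the following Python snippet checks from the local PC whether anything is listening on the forwarded port. The function name `tunnel_is_up` is hypothetical, and the check only confirms that the port accepts TCP connections, not that the listener is really the ABCI tunnel.

```python
import socket


def tunnel_is_up(host="127.0.0.1", port=10022, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if tunnel_is_up():
        print("Port 10022 is open; the ssh command for shell-window-2 should connect.")
    else:
        print("Port 10022 is closed; run the ssh command for shell-window-1 first.")
```

Port 10022 is the local port chosen with -L in shell-window-1; if you pick a different one there, pass it as `port`.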

If you want environment variables that persist after you log out, write them in ~/.bash_profile. It is convenient to set your user ID, group ID, email address, and group directory as environment variables.

shell-window-2


[xxxxxxxxxx@es1 ~]$ cat .bash_profile
# .bash_profile

## Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

## User specific environment and startup programs
export PATH=$PATH:$HOME/.local/bin:$HOME/bin

## Original
export ID_USER=xxxxxxxxxx
export ID_GROUP=xxxxxxxx
export EMAIL=your_address@example.com
export DIR_GROUP=/groups1/$ID_GROUP

After installing Anaconda, enter the following command to enable the conda command.

shell-window-2


[xxxxxxxxxx@es1 ~]$ export PATH=~/anaconda3/bin:$PATH

Use Jupyter lab with interactive nodes

Start Jupyter lab on the server without opening a browser there, then check the server address.

shell-window-2


[xxxxxxxxxx@es1 ~]$ jupyter lab --no-browser --ip=`hostname` >> jupyter.log 2>&1 &
[xxxxxxxxxx@es1 ~]$ jupyter notebook list
Currently running servers:
http://es1.abci.local:8888/?token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX :: /home/xxxxxxxxxx

Open a new shell window and set up port forwarding: enter the following command, then the password. The host and port are not always "es1" and "8888", so check them against the running server address shown above every time.

shell-window-3


$ ssh -L 18888:es1:8888 -l $YOUR_ABCI_ID -i ./your_rsa_key -p 10022 localhost

Keep shell-window-3 open, start a browser on your local PC, and open the following URL:

http://localhost:18888/?token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

For the value after "token=", copy the one displayed in shell-window-2.
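
Because the host, port, and token change from run to run, it can help to derive the forwarding argument and the local URL from the `jupyter notebook list` output instead of retyping them. The following is only a sketch: it assumes the server line has exactly the format shown in shell-window-2, that the short host name (for example "es1") is what ssh -L needs, and that 18888 is the local port. The function name `forwarding_info` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def forwarding_info(server_line, local_port=18888):
    """Turn one 'URL :: directory' line into (ssh -L argument, local URL)."""
    url = server_line.split(" :: ")[0].strip()
    parts = urlsplit(url)
    host = parts.hostname.split(".")[0]  # "es1.abci.local" -> "es1"
    token = parse_qs(parts.query)["token"][0]
    forward = f"{local_port}:{host}:{parts.port}"
    local_url = f"http://localhost:{local_port}/?token={token}"
    return forward, local_url


# Example with a shortened placeholder token.
line = "http://es1.abci.local:8888/?token=XXXX :: /home/xxxxxxxxxx"
forward, local_url = forwarding_info(line)
print(f"ssh -L {forward} -l $YOUR_ABCI_ID -i ./your_rsa_key -p 10022 localhost")
print(local_url)
```

The first printed line is the command for shell-window-3 and the second is the address to open in the browser.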

Use compute node in on-demand mode

Log in to a compute node with the qrsh command.

shell-window-2


[xxxxxxxxxx@es1 ~]$ qrsh -l rt_G.small=1 -g $ID_GROUP -l h_rt=00:30:00

If the node name after @ starts with g, the login succeeded. Note that you land in your home directory regardless of which directory you were in on the interactive node.

shell-window-2


[xxxxxxxxxx@g0001 ~]$ 

Use Jupyter lab on compute node

The procedure is basically the same as using Jupyter lab on an interactive node. When port forwarding, replace the interactive node name (es...) with the compute node name (g...).

Use compute node in Spot mode

You do not need to log in to a compute node in advance.

shell-window-2


[xxxxxxxxxx@es1 ~]$ qsub -j y -cwd -l rt_G.small=1 -l h_rt=24:00:00 -g $ID_GROUP -M $EMAIL -m besa -o $DIR_LOG -e $DIR_LOG ./run.sh

Set DIR_LOG to an appropriate directory in advance. If the current directory is fine, the following is enough:

shell-window-2


[xxxxxxxxxx@es1 ~]$ export DIR_LOG=$(pwd)
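
The qsub command above depends on ID_GROUP, EMAIL, and DIR_LOG being set; if one of them is empty, the option that follows it is read as its value. As a rough sketch, a Python helper can refuse to build the command when a variable is missing. The function name `build_qsub_command` and the `REQUIRED` tuple are hypothetical; the options are copied from the command above.

```python
import os
import shlex

REQUIRED = ("ID_GROUP", "EMAIL", "DIR_LOG")


def build_qsub_command(env, script="./run.sh"):
    """Return the qsub argument list, or raise if a variable is unset or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("unset environment variables: " + ", ".join(missing))
    return [
        "qsub", "-j", "y", "-cwd",
        "-l", "rt_G.small=1",
        "-l", "h_rt=24:00:00",
        "-g", env["ID_GROUP"],
        "-M", env["EMAIL"],
        "-m", "besa",
        "-o", env["DIR_LOG"],
        "-e", env["DIR_LOG"],
        script,
    ]


if __name__ == "__main__":
    try:
        # Only print the command for review; nothing is submitted here.
        print(" ".join(shlex.quote(arg) for arg in build_qsub_command(os.environ)))
    except ValueError as err:
        print(err)
```

The sketch prints the command rather than running it, so you can inspect it before submitting the job yourself.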

If you want to use the virtual environment created by Anaconda, the contents of the above run.sh should be as follows, for example:

run.sh


#!/bin/bash

## >>> conda init >>>

__conda_setup="$(CONDA_REPORT_ERRORS=false "$HOME/anaconda3/bin/conda" shell.bash hook 2> /dev/null)"

if [ $? -eq 0 ]; then
    \eval "$__conda_setup"
else
    if [ -f "$HOME/anaconda3/etc/profile.d/conda.sh" ]; then
        . "$HOME/anaconda3/etc/profile.d/conda.sh"
        CONDA_CHANGEPS1=false conda activate base
    else
        \export PATH="$PATH:$HOME/anaconda3/bin"
    fi
fi
unset __conda_setup
## <<< conda init <<< 

## Activation
conda activate myenv1

## Print these for confirmation
conda info -e
date
hostname

## Run the Python code
python main.py arg1 arg2 arg3
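
The post does not show main.py itself. Purely as an illustration, a hypothetical main.py that accepts the three arguments passed by run.sh could look like the sketch below. It records the host name so that the job log shows whether the code really ran on a compute node (a name starting with g). The names `describe_run` and `main` are my own, and the actual computation is left out.

```python
import socket
import sys
from datetime import datetime


def describe_run(args):
    """Return a one-line summary of the host and the received arguments."""
    return f"host={socket.gethostname()} args={' '.join(args)}"


def main(argv):
    if len(argv) != 3:
        print("usage: python main.py arg1 arg2 arg3")
        return 1
    # This line ends up in the job log written to DIR_LOG.
    print(datetime.now().isoformat(timespec="seconds"), describe_run(argv))
    # The actual computation would go here.
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```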

Use compute node in Reserved mode

[Updated on May 22, 2020]

Reservation and deletion of reservation

The official manual is easy to follow, so please refer to it: https://docs.abci.ai/ja/03/#reservation

Use the qrsub command to make a reservation.

- -a: Specify the job start date in YYYYMMDD format. The job starts at 10 am on the start date. If you specify the current day after 10 am, you will get an error.
- -d: Specify the job execution period in days. Execution stops at 9:30 am N days after the start date specified by -a. Specify either this -d or -e below.
- -e: Specify the stop date of the job in YYYYMMDD format. Execution stops at 9:30 am on the specified day. Specify either this -e or -d above.
- -N: Specify the reservation name as a character string. It cannot start with a digit.
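
The start and stop rules for -a and -d can be checked with a little date arithmetic. The sketch below assumes exactly the times stated above (start at 10:00 on the start date, stop at 9:30 N days later); the function name `reservation_window` is hypothetical and is not part of any ABCI tool.

```python
from datetime import datetime, timedelta


def reservation_window(start_yyyymmdd, days):
    """Return (start, end) datetimes for qrsub -a START -d DAYS."""
    start_day = datetime.strptime(start_yyyymmdd, "%Y%m%d")
    start = start_day.replace(hour=10, minute=0)
    end = (start_day + timedelta(days=days)).replace(hour=9, minute=30)
    return start, end


start, end = reservation_window("20180705", 7)
print("start:", start)
print("end:  ", end)
```

With -a 20180705 -d 7, this gives a window from 10:00 on July 5, 2018 to 9:30 on July 12, 2018.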

shell-window-2


[xxxxxxxxxx@es1 ~]$ qrsub -a 20180705 -d 7 -g grpname -n 4 -N "Reserve_for_AI"
Your advance reservation 12345 has been granted

Use the qrdel command to delete a reservation. Specify the ID issued when making a reservation ("12345" in the above execution example) as an argument.

shell-window-2


[xxxxxxxxxx@es1 ~]$ qrdel 12345

Run the job with the reserved resource

Pass the ID issued when you made the reservation to the qsub command with the -ar option.

Execution example:

shell-window-2


[xxxxxxxxxx@es1 ~]$ qsub -j y -cwd -l rt_F=1 -g $ID_GROUP -M $EMAIL -m besa -o $DIR_LOG -e $DIR_LOG -ar 12345 ./run.sh

In conclusion

I apologize that this does not cover ABCI usage comprehensively; I simply ported a memo that I made for my own purposes. If you need more detailed information, please refer to the materials listed in "Reference materials, etc."
