Run GPU-required batch processing on AWS

Summary

Background

Why AWS Batch?

How to use AWS Batch

  1. Push the image of the container you want to run to ECR (Elastic Container Registry)
  2. Create an AWS Batch compute environment
     - Select the Managed type
     - Instance settings -> Spot
     - Percentage of On-Demand price -> Set it according to your budget. At 50%, I didn't have to wait very long.
     - Allowed instance types -> Select an instance type that has a GPU. g4dn.xlarge is the cheapest, so it is a reasonable one to start with and see how it goes.
     - Image type under EC2 configuration -> Amazon Linux 2 (GPU)
  3. Create an AWS Batch job queue
     - Select the compute environment created in step 2.
  4. Create an AWS Batch job definition
     - Platform -> EC2
     - Image -> The image pushed in step 1
     - Command -> Write what you want the batch to execute, in a form that can be used as the arguments of CMD. This overrides the CMD of the original image, so you need to set it again here even if it is already defined on the image side.
     - Memory -> 2 GB; increase it if that is not enough
     - Number of GPUs -> 1
  5. Submit an AWS Batch job
     - Click Create New Job on the Jobs screen
     - Select the job definition and job queue created in steps 4 and 3.
     - Press Submit at the end and the job will be submitted
  6. Monitor the AWS Batch job
     - Check the job execution status on the dashboard screen.
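
The same console settings can also be expressed as AWS Batch API requests. The following is a minimal sketch in Python, assuming boto3 is used to send them; it is not taken from the original post. The function names, the resource names, `maxvCpus=4`, `priority=1`, one vCPU per job, and the `SPOT_CAPACITY_OPTIMIZED` allocation strategy are all illustrative assumptions, and `ECS_AL2_NVIDIA` is assumed to be the API value behind the "Amazon Linux 2 (GPU)" choice in the console. The subnets, security groups, and instance role must come from your own account.

```python
import time


def compute_environment_request(name, subnets, security_group_ids,
                                instance_role, bid_percentage=50):
    """Step 2: managed Spot environment restricted to a GPU instance type."""
    return {
        "computeEnvironmentName": name,
        "type": "MANAGED",
        "state": "ENABLED",
        "computeResources": {
            "type": "SPOT",
            # Assumption: chosen here so that no Spot Fleet role is needed.
            "allocationStrategy": "SPOT_CAPACITY_OPTIMIZED",
            "bidPercentage": bid_percentage,  # max % of the On-Demand price
            "minvCpus": 0,                    # scale to zero when idle
            "maxvCpus": 4,                    # one g4dn.xlarge (4 vCPUs)
            "instanceTypes": ["g4dn.xlarge"],
            "ec2Configuration": [{"imageType": "ECS_AL2_NVIDIA"}],
            "subnets": list(subnets),
            "securityGroupIds": list(security_group_ids),
            "instanceRole": instance_role,
        },
    }


def job_queue_request(name, compute_environment):
    """Step 3: queue that feeds the compute environment from step 2."""
    return {
        "jobQueueName": name,
        "state": "ENABLED",
        "priority": 1,
        "computeEnvironmentOrder": [
            {"order": 1, "computeEnvironment": compute_environment},
        ],
    }


def job_definition_request(name, image, command,
                           memory_mib=2048, gpus=1, vcpus=1):
    """Step 4: EC2 job definition; `command` replaces the image's CMD."""
    return {
        "jobDefinitionName": name,
        "type": "container",
        "platformCapabilities": ["EC2"],
        "containerProperties": {
            "image": image,
            "command": list(command),
            "resourceRequirements": [
                {"type": "VCPU", "value": str(vcpus)},
                {"type": "MEMORY", "value": str(memory_mib)},
                {"type": "GPU", "value": str(gpus)},
            ],
        },
    }


def submit_and_wait(batch, job_name, job_queue, job_definition,
                    poll_seconds=30):
    """Steps 5 and 6: submit a job, then poll until it succeeds or fails."""
    job_id = batch.submit_job(
        jobName=job_name,
        jobQueue=job_queue,
        jobDefinition=job_definition,
    )["jobId"]
    while True:
        status = batch.describe_jobs(jobs=[job_id])["jobs"][0]["status"]
        if status in ("SUCCEEDED", "FAILED"):
            return status
        time.sleep(poll_seconds)
```

With `batch = boto3.client("batch")`, each dictionary would be passed as keyword arguments, for example `batch.create_compute_environment(**compute_environment_request(...))`, then `batch.create_job_queue(**...)` and `batch.register_job_definition(**...)`, followed by `submit_and_wait(batch, ...)`. The compute environment likely needs to reach the `VALID` status before the job queue can be created, so a real script would wait between those two calls.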

About container image

Even if the container side is GPU-ready, software in the image that does not support the GPU will end up running on the CPU. The points I personally think you should watch out for are as follows.
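
One simple guard against a silent CPU fallback, sketched here as an assumption rather than taken from the original post, is to have the job check at startup whether an NVIDIA GPU is visible inside the container and stop if it is not. The helper names `visible_gpus` and `require_gpu` are hypothetical, and the sketch assumes the `nvidia-smi` command is available in the container when a GPU is attached. It only confirms that a GPU is visible; whether your framework actually uses it still has to be checked with that framework's own tools.

```python
import shutil
import subprocess


def visible_gpus():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA tooling on PATH, so treat it as "no GPU"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []  # the command exists but could not talk to a driver
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def require_gpu():
    """Raise instead of letting the job quietly run on the CPU."""
    gpus = visible_gpus()
    if not gpus:
        raise RuntimeError("no GPU visible in this container")
    return gpus
```

Calling `require_gpu()` as the first line of the batch entry point makes a misconfigured job fail quickly, which shows up as a FAILED job in AWS Batch rather than a slow run that appears to succeed.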
