[PYTHON] How to call PyTorch from Julia


Create a virtual environment with Conda, install PyTorch into it, point PyCall at that environment's Python, and then call PyTorch from Julia.

The procedure is as follows.

  1. Install Conda.
  2. Create a virtual environment with Conda.
  3. Install PyTorch into that virtual environment.
  4. Install Julia.
  5. Install PyCall and configure it to use the Python in your virtual environment.


Installing Conda itself is not covered here. Create the environment and install PyTorch with the following commands.

conda create -n my_env python=3.8
conda activate my_env
conda install -c pytorch pytorch

Running these commands produces a log like the following.

paalon at paalon-mac in ~
↪ conda create -n my_env python=3.8                                      (base)
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/paalon/conda/envs/my_env

  added / updated specs:
    - python=3.8

The following NEW packages will be INSTALLED:

  ca-certificates    pkgs/main/osx-64::ca-certificates-2020.1.1-0
  certifi            pkgs/main/osx-64::certifi-2019.11.28-py38_0
  libcxx             pkgs/main/osx-64::libcxx-4.0.1-hcfea43d_1
  libcxxabi          pkgs/main/osx-64::libcxxabi-4.0.1-hcfea43d_1
  libedit            pkgs/main/osx-64::libedit-3.1.20181209-hb402a30_0
  libffi             pkgs/main/osx-64::libffi-3.2.1-h475c297_4
  ncurses            pkgs/main/osx-64::ncurses-6.2-h0a44026_0
  openssl            pkgs/main/osx-64::openssl-1.1.1d-h1de35cc_4
  pip                pkgs/main/osx-64::pip-20.0.2-py38_1
  python             pkgs/main/osx-64::python-3.8.1-h359304d_1
  readline           pkgs/main/osx-64::readline-7.0-h1de35cc_5
  setuptools         pkgs/main/osx-64::setuptools-45.2.0-py38_0
  sqlite             pkgs/main/osx-64::sqlite-3.31.1-ha441bb4_0
  tk                 pkgs/main/osx-64::tk-8.6.8-ha441bb4_0
  wheel              pkgs/main/osx-64::wheel-0.34.2-py38_0
  xz                 pkgs/main/osx-64::xz-5.2.4-h1de35cc_4
  zlib               pkgs/main/osx-64::zlib-1.2.11-h1de35cc_3

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
# To activate this environment, use
#     $ conda activate my_env
# To deactivate an active environment, use
#     $ conda deactivate

paalon at paalon-mac in ~
↪ conda activate my_env                                                  (base)
paalon at paalon-mac in ~
↪ conda install -c pytorch pytorch                                     (my_env)
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/paalon/conda/envs/my_env

  added / updated specs:
    - pytorch

The following NEW packages will be INSTALLED:

  blas               pkgs/main/osx-64::blas-1.0-mkl
  intel-openmp       pkgs/main/osx-64::intel-openmp-2019.4-233
  libgfortran        pkgs/main/osx-64::libgfortran-3.0.1-h93005f0_2
  mkl                pkgs/main/osx-64::mkl-2019.4-233
  mkl-service        pkgs/main/osx-64::mkl-service-2.3.0-py38hfbe908c_0
  mkl_fft            pkgs/main/osx-64::mkl_fft-1.0.15-py38h5e564d8_0
  mkl_random         pkgs/main/osx-64::mkl_random-1.1.0-py38h6440ff4_0
  ninja              pkgs/main/osx-64::ninja-1.9.0-py38h04f5b5a_0
  numpy              pkgs/main/osx-64::numpy-1.18.1-py38h7241aed_0
  numpy-base         pkgs/main/osx-64::numpy-base-1.18.1-py38h6575580_1
  pytorch            pytorch/osx-64::pytorch-1.4.0-py3.8_0
  six                pkgs/main/osx-64::six-1.14.0-py38_0

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
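
Before starting Julia, it can help to confirm that the `python` found on PATH really belongs to the new environment and that the `torch` module is visible from it. The following is a minimal sketch using only the Python standard library; the helper name `describe_python_env` is hypothetical and not part of Conda, PyTorch, or PyCall. Run it with the environment's Python while my_env is active.

```python
import importlib.util
import shutil
import sys


def describe_python_env(module_name="torch"):
    """Collect basic facts about the current interpreter (hypothetical helper)."""
    return {
        # Path of the interpreter running this script
        "executable": sys.executable,
        # First `python` found on PATH, analogous to Julia's Sys.which("python")
        "python_on_path": shutil.which("python"),
        # Root directory of the environment this interpreter belongs to
        "prefix": sys.prefix,
        # True if the module can be located, checked without importing it
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_python_env().items():
        print(f"{key}: {value}")
```

If `python_on_path` does not point into the my_env directory, or `module_found` is False, Julia's Sys.which("python") would most likely pick up a Python that cannot import torch.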

Start Julia from the shell in which my_env is active.

paalon at paalon-mac in ~
↪ julia                                                                (my_env)
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.3.1 (2019-12-30)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

Add PyCall, set ENV["PYCALL_JL_RUNTIME_PYTHON"] and ENV["PYTHON"] to Sys.which("python") so that PyCall uses the Python from the Conda environment, and then build PyCall to make it available.

(v1.3) pkg> add PyCall
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
  Updating `~/.julia/environments/v1.3/Project.toml`
  [438e738f] + PyCall v1.91.4
  Updating `~/.julia/environments/v1.3/Manifest.toml`
  [8f4d0f93] + Conda v1.4.1
  [438e738f] + PyCall v1.91.4
  [81def892] + VersionParsing v1.2.0

julia> ENV["PYCALL_JL_RUNTIME_PYTHON"] = Sys.which("python")

julia> ENV["PYTHON"] = Sys.which("python")

(v1.3) pkg> build PyCall
  Building Conda ─→ `~/.julia/packages/Conda/3rPhK/deps/build.log`
  Building PyCall → `~/.julia/packages/PyCall/zqDXB/deps/build.log`

julia> using PyCall
[ Info: Precompiling PyCall [438e738f-606a-5dbb-bf0a-cddfbfd45ab0]

julia> torch = pyimport("torch")
PyObject <module 'torch' from '/Users/paalon/conda/envs/my_env/lib/python3.8/site-packages/torch/__init__.py'>

julia> x = torch.tensor([1, 2, 3])
PyObject tensor([1, 2, 3])


The following is the example from the official PyTorch tutorial.

import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold input and outputs.
# Setting requires_grad=False indicates that we do not need to compute gradients
# with respect to these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for weights.
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=True)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=True)

learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y using operations on Tensors; these
    # are exactly the same operations we used to compute the forward pass using
    # Tensors, but we do not need to keep references to intermediate values since
    # we are not implementing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute and print loss using operations on Tensors.
    # Now loss is a Tensor of shape (1,)
    # loss.item() gets the scalar value held in the loss.
    loss = (y_pred - y).pow(2).sum()
    if t % 100 == 99:
        print(t, loss.item())

    # Use autograd to compute the backward pass. This call will compute the
    # gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call w1.grad and w2.grad will be Tensors holding the gradient
    # of the loss with respect to w1 and w2 respectively.
    loss.backward()

    # Manually update weights using gradient descent. Wrap in torch.no_grad()
    # because weights have requires_grad=True, but we don't need to track this
    # in autograd.
    # An alternative way is to operate on weight.data and weight.grad.data.
    # Recall that tensor.data gives a tensor that shares the storage with
    # tensor, but doesn't track history.
    # You can also use torch.optim.SGD to achieve this.
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad

        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
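
One small detail in the loop above matters when translating it to a 1-based language such as Julia: the counter runs over `range(500)`, so the loss is printed when `t % 100 == 99`. A loop that counts from 1 to 500 would test `t % 100 == 0` to print on the same iterations. The sketch below only checks that correspondence; the variable names are illustrative.

```python
# Counter values at which the 0-based loop (t = 0..499) prints
zero_based = [t for t in range(500) if t % 100 == 99]

# Counter values at which a 1-based loop (t = 1..500) prints with `t % 100 == 0`
one_based = [t for t in range(1, 501) if t % 100 == 0]

# Shift the 0-based counters by one to compare them as iteration positions
same_iterations = [t + 1 for t in zero_based] == one_based

print(zero_based)       # [99, 199, 299, 399, 499]
print(one_based)        # [100, 200, 300, 400, 500]
print(same_iterations)  # True
```

Both variants print on every 100th iteration; only the printed counter value differs by one.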

Ported to Julia almost as is, the example looks like this:

ENV["PYCALL_JL_RUNTIME_PYTHON"] = Sys.which("python")
ENV["PYTHON"] = Sys.which("python")
# After changing the Python configuration, run the following line to rebuild PyCall.
# using Pkg; Pkg.build("PyCall")
using PyCall

torch = pyimport("torch")

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0") # Uncomment this to run on GPU

# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10

# Create random Tensors to hold input and outputs.
# Setting requires_grad=False indicates that we do not need to compute gradients
# with respect to these Tensors during the backward pass.
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)

# Create random Tensors for weights.
# Setting requires_grad=True indicates that we want to compute gradients with
# respect to these Tensors during the backward pass.
w1 = torch.randn(D_in, H, device=device, dtype=dtype, requires_grad=true)
w2 = torch.randn(H, D_out, device=device, dtype=dtype, requires_grad=true)

learning_rate = 1e-6
for t in 1:500
    # Forward pass: compute predicted y using operations on Tensors; these
    # are exactly the same operations we used to compute the forward pass using
    # Tensors, but we do not need to keep references to intermediate values since
    # we are not implementing the backward pass by hand.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute and print loss using operations on Tensors.
    # Now loss is a Tensor of shape (1,)
    # loss.item() gets the scalar value held in the loss.
    loss = (y_pred - y).pow(2).sum()
    if t % 100 == 0
        println("$(t) $(loss.item())")
    end

    # Use autograd to compute the backward pass. This call will compute the
    # gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call w1.grad and w2.grad will be Tensors holding the gradient
    # of the loss with respect to w1 and w2 respectively.
    loss.backward()

    # Manually update weights using gradient descent. Wrap in torch.no_grad()
    # because weights have requires_grad=True, but we don't need to track this
    # in autograd.
    # An alternative way is to operate on weight.data and weight.grad.data.
    # Recall that tensor.data gives a tensor that shares the storage with
    # tensor, but doesn't track history.
    # You can also use torch.optim.SGD to achieve this.
    @pywith torch.no_grad() begin
        # In Julia, `w1 -= x` means `w1 = w1 - x`, which rebinds w1 to a new
        # tensor instead of updating it in place, so do not use it here.
        # w1 -= learning_rate * w1.grad
        # w2 -= learning_rate * w2.grad
        w1.sub_(learning_rate * w1.grad)
        w2.sub_(learning_rate * w2.grad)
        # Manually zero the gradients after updating weights
        w1.grad.zero_()
        w2.grad.zero_()
    end
end

The code above is the resulting Julia port.
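
The Julia code calls `sub_` rather than writing `w1 -= ...` because the two languages treat `-=` differently. In Python, `w1 -= x` on a tensor dispatches to `__isub__` and modifies the tensor in place, whereas in Julia `w1 -= x` is shorthand for `w1 = w1 - x` and binds the name to a new object. The sketch below illustrates the difference in plain Python with a toy class; `Box` is a hypothetical stand-in for a tensor, not a PyTorch class.

```python
class Box:
    """Toy stand-in for a tensor (hypothetical, not part of PyTorch)."""

    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        # Out-of-place subtraction: returns a brand-new object
        return Box(self.value - other)

    def __isub__(self, other):
        # In-place subtraction: mutates and returns the same object
        self.value -= other
        return self

    def sub_(self, other):
        # Explicit in-place method, named after Tensor.sub_
        self.value -= other
        return self


w = Box(10.0)
alias = w  # second reference to the same object

w -= 1.0  # Python semantics: in place via __isub__, so alias sees the change
in_place_shared = alias is w and alias.value == 9.0

w = w - 1.0  # what Julia's `w -= 1.0` means: w is rebound to a new object
rebound = alias is not w and alias.value == 9.0 and w.value == 8.0

alias.sub_(1.0)  # explicit in-place update, as used in the Julia port
print(in_place_shared, rebound, alias.value)  # True True 8.0
```

After the rebinding step the original object is left untouched, which is why an update written that way would never reach the tensor that autograd is tracking.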

