Is WebAssembly really fast? [Numerical calculation]

It is often said that rewriting a JS function in WebAssembly makes the calculation faster, but I have also seen claims that this is not the case, so I will briefly verify it. Here I focus only on the speed of pure numerical calculation; aspects such as file size are out of scope. The title says "[Numerical calculation]", but there are no plans to write a sequel at this time.

Previous research:

Following the article "Comparison of calculation speeds in various languages", I use the evaluation of the Leibniz series as the benchmark, although I don't know how often it is actually computed in practice. Since WebAssembly is advertised as running at close to native speed, I also compare it with native code.
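For reference, this is the series that every program below sums, in the usual notation. The loops run from k = 0 to 10^9 inclusive, so N = 10^9 + 1 terms are added. The error bound is the standard alternating-series estimate and is not stated in the original article.

```latex
\frac{\pi}{4} \;=\; \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1}
            \;=\; 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots
\qquad
\left|\, \pi - 4 \sum_{k=0}^{N-1} \frac{(-1)^k}{2k+1} \,\right| \;\le\; \frac{4}{2N+1}
```

With N around 10^9 the bound is about 2e-9, so only the first eight or so decimal places of the printed value are expected to agree with π.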

[04-22 19:40] Corrected a code error and at the same time increased the number of loops to 1e9.

Native execution

The environment is Debian 10.3 on Win10 (WSL), Intel Core i7-6700 CPU @ 3.40GHz.

C language

When benchmarking numerical calculation, I think the speed of native C is still the reference point, so I will start with C.

leibniz.c


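#include <stdio.h>

int main(void) {
  int i;
  double sum = 0.0;    /* running partial sum of the series */
  int signum = 1;      /* sign of the current term: +1, -1, +1, ... */
  double denom = 1.0;  /* denominator of the current term: 1, 3, 5, ... */

  /* Terms k = 0 .. 1e9 inclusive; 1e9 still fits in a 32-bit int. */
  for (i = 0; i <= 1e9; i++) {
    sum += signum / denom;
    signum *= -1;
    denom += 2.0;
  }
  printf("Ans:%.16f\n", 4.0*sum);
  return 0;
}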

Rust

leibniz.rs


fn main() {
    let mut sum: f64 = 0.0;
    let mut signum = 1;
    let mut denom = 1.0;

    for _ in 0..=1_000_000_000 {
        sum += signum as f64 / denom;
        signum *= -1;
        denom += 2.0;
    }
    println!("Ans:{:.16}", 4.*sum);
}

JS (Node.js)

leibniz.js


var sum = 0.0;
var signum = 1;
var denom = 1.0;

for (var i = 0; i<=1e9; i++) {
    sum += signum / denom;
    signum *= -1;
    denom += 2.0;
}
console.log('%d',4.0*sum);

Results

Each time shown below is from a single run, but I repeated every measurement several times and confirmed that only the last digit fluctuated.

$ gcc --version
gcc (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc leibniz.c -o leibniz
$ time ./leibniz
real    0m2.596s
user    0m2.578s
sys     0m0.000s
$ gcc -O3 leibniz.c -o leibniz
$ time ./leibniz
Ans:3.1415926545880506

real    0m1.136s
user    0m1.125s
sys     0m0.016s
$ gcc -O3 -march=native leibniz.c -o leibniz
$ time ./leibniz
Ans:3.1415926545880506

real    0m1.133s
user    0m1.125s
sys     0m0.000s
$ rustc --version
rustc 1.42.0 (b8cedc004 2020-03-09)
$ rustc leibniz.rs
$ time ./leibniz
Ans:3.1415926545880506

real    0m36.107s
user    0m36.094s
sys     0m0.016s
$ rustc -C opt-level=3 leibniz.rs
$ time ./leibniz
Ans:3.1415926545880506

real    0m1.124s
user    0m1.094s
sys     0m0.031s
$ rustc -C opt-level=3 -C target-cpu=native leibniz.rs
$ time ./leibniz
Ans:3.1415926545880506

real    0m1.181s
user    0m1.156s
sys     0m0.031s
$ node --version
v12.16.2
$ time node leibniz.js
3.1415926545880506

real    0m1.989s
user    0m1.953s
sys     0m0.047s

~~Even with optimization, C is unexpectedly slow ... Rust is about 4 times faster, but I wonder if I made a mistake. Rust without optimization was the slowest, and Rust with optimization was the fastest.~~

With optimization enabled, both C and Rust take about 1.13 sec. Node.js takes about 1.99 sec, which is about 75% slower.
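The timings above come from single runs of `time`, repeated by hand. Here is a minimal sketch of how the same repeat-and-compare idea could be automated inside Node.js. The helper names `timeRuns` and `median` are hypothetical and not part of the original article, and the iteration count is reduced so that the sketch finishes quickly.

```javascript
// Timer: performance.now() where it exists (browsers, recent Node.js),
// otherwise fall back to the coarser Date.now().
const now = typeof performance !== 'undefined'
  ? () => performance.now()
  : () => Date.now();

// Same loop as leibniz.js, with the iteration count as a parameter.
function leibniz(n) {
  let sum = 0.0;
  let signum = 1;
  let denom = 1.0;
  for (let i = 0; i <= n; i++) {
    sum += signum / denom;
    signum *= -1;
    denom += 2.0;
  }
  return 4.0 * sum;
}

// Run fn() `runs` times and return the elapsed times in ms, sorted ascending.
function timeRuns(fn, runs) {
  const times = [];
  for (let r = 0; r < runs; r++) {
    const start = now();
    fn();
    times.push(now() - start);
  }
  return times.sort((a, b) => a - b);
}

// Median of an already sorted, non-empty array.
function median(sorted) {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumption: 1e6 iterations and 5 runs, chosen only to keep the sketch fast.
const times = timeRuns(() => leibniz(1e6), 5);
console.log('median ms:', median(times).toFixed(3));
```

Reporting the median rather than a single run makes an occasional slow outlier (for example from JIT warm-up) less likely to distort the comparison.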

Run on browser

I don't have Google Chrome right now, so I tried it only with Mozilla Firefox 75.0 on Win10.

JS

I'm not interested in the time taken by HTML rendering and the like, so the calculation starts when the button is clicked.

leibniz.html


<!DOCTYPE html>
<html>
  <head>
    <title>Leibniz Series</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <button id="start">Start!</button>
    
    <script>
      function leibniz() {
          var sum = 0.0;
          var signum = 1;
          var denom = 1.0;

          for (var i = 0; i<=1e9; i++) {
              sum += signum / denom;
              signum *= -1;
              denom += 2.0;
          }
          console.log('%f',4.0*sum);
      }

      let button = document.getElementById("start");
      button.addEventListener("click", () => {
          const startTime = performance.now()
          leibniz();
          const endTime = performance.now();
          console.log(endTime - startTime); 
      });
    </script>
  </body>
</html>

The result was ~~0.168 sec~~ 1.660 sec. That is only about ~~30%~~ 45% slower than optimized C. It's scarily fast ...

Rust (wasm)

src/lib.rs


use wasm_bindgen::prelude::*;

#[wasm_bindgen]
extern "C" {
    #[wasm_bindgen(js_namespace=console)]
    fn log(s: String);
}

#[wasm_bindgen]
pub fn leibniz() {
    let mut sum: f64 = 0.0;
    let mut signum = 1;
    let mut denom = 1.0;

    for _ in 0..=1_000_000_000 {
        sum += signum as f64 / denom;
        signum *= -1;
        denom += 2.0;
    }
    log(format!("Ans:{:.16}", 4.*sum));
}

index.html


<!DOCTYPE html>
<html>
  <head>
    <title>Leibniz Series</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <button id="start">Start!</button>

    <script type="module">
        import init, {leibniz} from '/pkg/rust_wasm.js';

        let button = document.getElementById("start");
        button.addEventListener("click", async () => {
            const startTime = performance.now()
            await init();
            leibniz();
            const endTime = performance.now();
            console.log(endTime - startTime); 
        });
    </script>
  </body>
</html>

The result was ~~0.035 sec~~ 1.462 sec. ~~It's about the same speed as Rust after optimization (about 10% slower).~~ That is faster than JS and about 30% slower than native.
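The percentages quoted in this article can be reproduced from the measured times with a few lines of arithmetic. The times below are copied from the measurements above; the helper name `percentSlower` is hypothetical.

```javascript
// Times in seconds, copied from the measurements in this article.
const nativeC = 1.133;   // gcc -O3 -march=native
const node = 1.989;      // node leibniz.js
const browserJs = 1.660; // Firefox, plain JS
const wasm = 1.462;      // Firefox, Rust compiled to wasm (including init)

// How much slower `t` is than `base`, in percent.
function percentSlower(t, base) {
  return (t / base - 1) * 100;
}

console.log(percentSlower(node, nativeC).toFixed(1));      // 75.6
console.log(percentSlower(browserJs, nativeC).toFixed(1)); // 46.5
console.log(percentSlower(wasm, nativeC).toFixed(1));      // 29.0
console.log(percentSlower(browserJs, wasm).toFixed(1));    // 13.5
```

The last line compares JS against wasm directly: in this measurement, JS in the browser takes roughly 13.5% longer than the wasm build.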

The measurement also includes the call to the initialization function `init()`, but the native runs likewise measure everything from loading the binary to the end of the process, so this keeps the comparison as fair as possible. For reference, when I measured only the time after `init` had been called, it was about ~~0.015 sec~~ 1.381 sec.
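Here is a minimal sketch of how the initialization and the calculation could be timed separately in one run. The function `timePhases` and the stand-ins `fakeInit` and `fakeCompute` are hypothetical. In index.html the arguments would be the `init` and `leibniz` imported from the wasm-bindgen package.

```javascript
// Timer: performance.now() where it exists (browsers, recent Node.js),
// otherwise fall back to the coarser Date.now().
const now = typeof performance !== 'undefined'
  ? () => performance.now()
  : () => Date.now();

// Time an async initialization step and a synchronous computation separately.
async function timePhases(init, compute) {
  const t0 = now();
  await init();
  const t1 = now();
  compute();
  const t2 = now();
  return { initMs: t1 - t0, computeMs: t2 - t1, totalMs: t2 - t0 };
}

// Stand-ins so the sketch runs without a wasm module (assumptions, not real APIs).
const fakeInit = () => new Promise((resolve) => setTimeout(resolve, 20));
const fakeCompute = () => {
  let s = 0;
  for (let i = 0; i < 1e6; i++) s += i;
  return s;
};

timePhases(fakeInit, fakeCompute).then((t) => {
  console.log(`init: ${t.initMs.toFixed(1)} ms, compute: ${t.computeMs.toFixed(1)} ms, ` +
              `total: ${t.totalMs.toFixed(1)} ms`);
});
```

The two figures reported above (1.462 sec including `init`, 1.381 sec after it) came from separate runs, so their difference of about 0.08 sec is only a rough estimate of the initialization cost.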

Conclusion

It was confirmed that, for pure numerical calculation, WebAssembly is indeed faster than JS, by ~~about 5 times~~ about 15%, but it is not as fast as native. Since the code uses only the four basic arithmetic operations on integers and floating-point numbers, the results may well differ for other kinds of processing (such as exponentiation), function calls, and memory access. In the previous research, wasm does not seem to beat JS by a wide margin.

By the way, what happens if you call wasm from Node.js? I thought about trying it, but I'm exhausted, so I'll leave it for another time.
