[GO] Bit operation speedup memo

Introduction

The original story is Hacker's Fun (original title: Hacker's Delight).

Bitmap loop

01011000
↓
f(00001000)
f(00010000)
f(01000000)

When you want to scan from the right and process only when it is 1. You don't have to loop for the length of the bit, you can just loop one number.

for (int x = 0b01011000; x > 0; x &= x - 1) {
  f(x & -x);
}

By repeating x & -x and x & = x -1 in this way, you can walk only the part of 1.

x & -x

01011000 = x 10100111 = not(x) 10101000 = not(x)+1 = -x In other words 00001000 = x & -x Then, ** only the rightmost 1 remains. ** **

x & x-1

01011000 = x 01010111 = x-1 In other words 01010000 = x & x-1 Then, ** 1 at the right end disappears. ** **

Bit count

00101110 10100000 11100101 11001011
↓
16

When you want to count the number of 1 It can be obtained by bit operation log (number of bits) times without looping by the number of bits.

x = 0b00101110101000001110010111001011;
x = (x & 0x55555555) + ((x >>>  1) & 0x55555555)
x = (x & 0x33333333) + ((x >>>  2) & 0x33333333)
x = (x & 0x0F0F0F0F) + ((x >>>  4) & 0x0F0F0F0F)
x = (x & 0x00FF00FF) + ((x >>>  8) & 0x00FF00FF)
x = (x & 0x0000FFFF) + ((x >>> 16) & 0x0000FFFF)

What we are doing is divide and rule by writing the partial sum to x itself.

procedure

1st substitution

Divide the data into 2 bits, find the number of 1s for each 2 bits, and overwrite x. 00 10 11 10 10 10 00 00 11 10 01 01 11 00 10 11 = x ↓ _0 _1 _2 _1 _1 _1 _0 _0 _2 _1 _1 _1 _2 _0 _1 _2 (decimal notation) ↓ 00 01 10 01 01 01 00 00 10 01 01 01 10 00 01 10 = New x


00 10 11 10 10 10 00 00 11 10 01 01 11 00 10 11 = x _0 _0 _1 _0 _0 _0 _0 _0 _1 _0 _1 _1 _1 _0 _0 _1 = x & 0x55555555 _0 _1 _1 _1 _1 _1 _0 _0 _1 _1 _0 _0 _1 _0 _1 _1 = ((x >>> 1) & 0x55555555) Add the upper and lower digits at the same time when divided by 2 bits 00 01 10 01 01 01 00 00 10 01 01 01 10 00 01 10 = (x & 0x55555555) + ((x >>> 1) & 0x55555555) What we are doing is 2-bit vector operation.

Second substitution

Divide the data into 4 bits, find the number of 1s for each 4 bits, and overwrite x. 0001 1001 0101 0000 1001 0101 1000 0110 = x ↓ _0_1 _2_1 _1_1 _0_0 _2_1 _1_1 _2_0 _1_2 (decimal notation) ↓ ___ 1 ___ 3 ___ 2 ___ 0 ___ 3 ___ 2 ___ 2 ___ 3 (decimal notation) ↓ 0001 0011 0010 0000 0011 0010 0010 0011 = new x


0001 1001 0101 0000 1001 0101 1000 0110 = x __01 __01 __01 __00 __01 __01 __00 __10 = x & 0x33333333 __00 __10 __01 __00 __10 __01 __10 __01 = (x >>> 1) & 0x33333333 Add the upper 2 bits and the lower 2 bits at the same time when divided by 4 bits. 0001 0011 0010 0000 0011 0010 0010 0011 = (x & 0x33333333) + ((x >>> 2) & 0x33333333) What we are doing is 4-bit vector operation.

3rd substitution

Divide the data into 8 bits, find the number of 1s for each 8 bits, and overwrite x. 00010011 00100000 00110010 00100011 = x ↓ ___1___3 ___2___0 ___3___2 ___2___3 (decimal notation) ↓ _______ 4 _______2 _______ 5 _______ 5 (decimal notation) ↓ 00000100 00000010 00000101 00000101 = new x

4th substitution

Divide the data into 16 bits, find the number of 1s for each 16 bits, and overwrite x. 0000010000000010 0000010100000101 = x ↓ _______ 4 _______2 _______ 5 _______ 5 (decimal notation) ↓ _______________6 ______________ 10 (decimal notation) ↓ 0000000000000110 0000000000001010 = new x

5th substitution

Find the number of 1s in the entire data and overwrite x. 0000000000000110 0000000000001010 = new x ↓ _______________6 ______________ 10 (decimal notation) ↓ _______________________________ 16 (decimal notation) ↓ 0000000000000000 0000000000010000 = new x ↓ 16

By the way

It may be implemented in the standard library. I only know Java and Go examples, Java : Integer.bitCount()、Long.bitCount() Go: bits package OnesCountXX () (More optimized than the code above).

Furthermore, since it has been implemented as a CPU extension instruction (SSE4.2 POPCNT for Intel) about 10 years ago, In the first place, if you really want to speed up, you should use a language that can directly call POPCNT.

Regarding Java, JIT seems to replace bitCount () with POPCNT, What about Go?

reference

wikipedia: SSE4, Hamming weight https://en.wikipedia.org/wiki/SSE4 https://en.wikipedia.org/wiki/Hamming_weight

Long class (source code) of openjdk8 https://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/Long.java

Go bits package (source code) https://golang.org/src/math/bits/bits.go?s=3946:3972#L103

Bit reverse

00101110 10100000 11100101 11001011
↓
11010011 10100111 00000101 01110100

Divide and rule can also be used when you want to reverse the bits.

x = 0b00101110101000001110010111001011;
x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >>> 16;
x = (x & 0x00FF00FF) <<  8 | (x & 0xFF00FF00) >>> 8;
x = (x & 0x0F0F0F0F) <<  4 | (x & 0xF0F0F0F0) >>> 4;
x = (x & 0x33333333) <<  2 | (x & 0xCCCCCCCC) >>> 2;
x = (x & 0x55555555) <<  1 | (x & 0xAAAAAAAA) >>> 1;

procedure

1st substitution

Divide the data into 16 bits and replace them. 0010111010100000 1110010111001011 = x ↓ 1110010111001011 0010111010100000 = new x


0010111010100000 1110010111001011 = x 1110010111001011 ________________ = (x & 0x0000FFFF) << 16 ________________ 0010111010100000 = (x & 0xFFFF0000) >>> 16 1110010111001011 0010111010100000 = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >>> 16

Second substitution

Divide the data into 8 bits and swap them next to each other. 11100101 11001011 00101110 10100000 = x ↓ 11001011 11100101 10100000 00101110 = new x


11100101 11001011 00101110 10100000 = x 11001011 ________ 10100000 ________ = (x & 0x00FF00FF) << 8 ________ 11100101 ________ 00101110 = (x & 0xFF00FF00) >>> 8 11001011 11100101 10100000 00101110 = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >>> 8

3rd substitution

Divide the data into 4 bits and swap them next to each other. 1100 1011 1110 0101 1010 0000 0010 1110 = x ↓ 1011 1100 0101 1110 0000 1010 1110 0010 = new x

4th substitution

Divide the data into 2 bits and swap them next to each other. 10 11 11 00 01 01 11 10 00 00 10 10 11 10 00 10 = x ↓ 11 10 00 11 01 01 10 11 00 00 10 10 10 11 10 00 = New x

5th substitution

Divide the data into 1 bit at a time and replace them next to each other. 1 1 1 0 0 0 1 1 0 1 0 1 1 0 1 1 0 0 0 0 1 0 1 0 1 0 1 1 1 0 0 0 = x ↓ 1 1 0 1 0 0 1 1 1 0 1 0 0 1 1 1 0 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 = New x

11010011101001110000010101110100 = Original x 00101110101000001110010111001011 = New x And it is in the correct reverse order.

By the way

This may also be implemented in the standard library. Java -> Integer.reverseBytes()、Long.reverseBytes() Go-> bits package ReverseXX ()

Recommended Posts

Bit operation speedup memo
Docker operation memo
Introduction to bit operation
Lombok's @Getter @Setter operation memo