Update (2018/04/20): added a case about membership checks on lists.

Actually, I wanted to put a number in place of 〇〇 in the title to make it look like a business book, but I left it as 〇〇 because I have not settled on the number and I plan to add cases from time to time :innocent:.

Caution

This article is for beginners who are just starting out with Python, or who have only recently learned about numpy or scipy. If you are already familiar with Python, I would rather receive your advice. I would be grateful if you could tell me about other cases or better ways :smiley:.

Also, the execution environment was **Python 3.5.3**, so please be careful if you are using the **Python 2 series**, because the return values of map and filter are different (lists in Python 2, lazy iterators in Python 3).
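To make the note about map and filter concrete, here is a minimal sketch assuming Python 3. The list `numbers` and the variable names are made up for illustration only.

```python
# Python 3: map and filter return lazy iterators, not lists.
numbers = [1, 2, 3, 4, 5, 6]

doubled = map(lambda v: v * 2, numbers)
multiples_of_3 = filter(lambda v: v % 3 == 0, numbers)

print(type(doubled).__name__)         # map
print(type(multiples_of_3).__name__)  # filter

# Convert explicitly when an actual list is needed
# (in Python 2 these calls already returned lists).
doubled_list = list(doubled)
multiples_of_3_list = list(multiples_of_3)
print(doubled_list)         # [2, 4, 6, 8, 10, 12]
print(multiples_of_3_list)  # [3, 6]
```

This is why the samples in this article wrap filter and map in list().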

Overview

With the recent machine learning boom, many people have probably started learning Python. Especially when you do simple numerical processing or handle real data, you may come across libraries such as numpy.

However, once you handle data of a certain size, you realize that execution can take a long time unless you are careful about how you write the code. (Personally, I think this is an easy trap to fall into if Python is your first programming language, or if you used a statically typed language before Python.)

Especially while learning, you want to try different things, and you cannot do that if every run takes a long time. You might end up hating programming :angry:.

So here I will introduce cases where code can be sped up "relatively easily" by using **numpy** and similar tools.

Tools such as Cython have a higher barrier to entry (adding types to individual variables, and so on), so I will not cover them this time.

Rough policy

These are the points I personally watch for. If your code is slow because of how it is written, you are probably running into one of the following.

If you already follow these points, you do not need the examples below.

- **Avoid for statements** ← Important
    - Python for statements are not fast
    - It makes me sad when I end up writing a triple loop :cry:
- Be careful not to allocate memory dynamically
    - Adding to a list with append and the like tends to be slow
- Consider whether the data can be processed all at once
    - In many cases you can do it without a for statement
- Use existing functions ← Super important
    - Fast (often optimized, for example implemented in C)
    - Easy to understand (when read by others)
    - Major functions are generally well understood (the map function, etc.)
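As a rough illustration of the policy above, the sketch below computes the same sum of squares three ways: with an explicit loop, with the built-in sum, and with numpy. The data and variable names are assumptions chosen for the example, not taken from the cases that follow.

```python
import numpy as np

values = list(range(1000))

# 1. Explicit for statement with manual accumulation
total_loop = 0
for v in values:
    total_loop += v * v

# 2. Existing function: built-in sum over a generator expression
total_builtin = sum(v * v for v in values)

# 3. numpy: process the whole array at once
arr = np.array(values, dtype=np.int64)
total_numpy = int((arr * arr).sum())

print(total_loop, total_builtin, total_numpy)
```

All three give the same answer. The concrete cases below measure how much the way of writing affects the execution time.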

Concrete example

Preparation

The following libraries are imported in advance.

import numpy as np
import pandas as pd
import scipy as sp

Case 1: I want to store the results of a for statement in a list

`Sample.py`


def func1(n):
    a = []
    for i in range(n):
        a.append(i)
    return a

def func2(n):
    a = [0 for i in range(n)]  #A list of length n initialized with 0
    for i in range(n):
        a[i] = i
    return a

def func3(n):
    a = [i for i in range(n)]  #First initialized by comprehension
    return a

def func4(n):
    return [i for i in range(n)]  #Define directly and return

%time a = func1(10000000)
%time b = func2(10000000)
%time c = func3(10000000)
%time d = func4(10000000)

`result`


CPU times: user 660 ms, sys: 100 ms, total: 760 ms
Wall time: 762 ms
CPU times: user 690 ms, sys: 60 ms, total: 750 ms
Wall time: 760 ms
CPU times: user 290 ms, sys: 90 ms, total: 380 ms
Wall time: 388 ms
CPU times: user 320 ms, sys: 90 ms, total: 410 ms
Wall time: 413 ms

If you know the length of the list to be returned in advance, a list comprehension is faster. In fact, this alone roughly halves the execution time compared with append (func1) or with filling a pre-allocated list (func2). It is a good idea to keep this in mind, especially when you run a for statement over a long list.
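`%time` only works in IPython or Jupyter. If you want to repeat the comparison in a plain Python script, a sketch with the standard timeit module could look like this. The function names and the size `n` are my own choices, and `list(range(n))` is added as one more existing-function variant that the measurements above do not cover.

```python
import timeit

def with_append(n):
    a = []
    for i in range(n):
        a.append(i)
    return a

def with_comprehension(n):
    return [i for i in range(n)]

def with_list_range(n):
    # When the elements are just the range itself, list() is enough
    return list(range(n))

n = 100000
for f in (with_append, with_comprehension, with_list_range):
    seconds = timeit.timeit(lambda: f(n), number=20)
    print(f.__name__, round(seconds, 4))
```

The printed times depend on your machine, so no expected values are shown here.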

Case 2: I want to apply an arithmetic operation with the same value to every element of a vector

Here, it is assumed that the following vectors are defined in advance.

a = np.array([i for i in range(10000000)])

Consider a function that doubles every element of this vector and returns the result.

`Sample.py`


def func1(x):
    y = x.copy()
    for i in range(len(y)):
        y[i] *= 2
    return y

def func2(a):
    return a * 2

%time b = func1(a)
%time c = func2(a)

`result`


CPU times: user 2.33 s, sys: 0 ns, total: 2.33 s
Wall time: 2.33 s
CPU times: user 10 ms, sys: 10 ms, total: 20 ms
Wall time: 13 ms

In this way, numpy can apply the four arithmetic operations to a whole vector at once, so be careful not to loop over the elements with a for statement.
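The same applies to the other arithmetic operations, not only multiplication. A small sketch with a made-up four-element vector:

```python
import numpy as np

v = np.array([1, 2, 3, 4])

added = v + 10       # add 10 to every element
subtracted = v - 1   # subtract 1 from every element
multiplied = v * 2   # double every element
divided = v / 2      # true division, so the result is float

print(added.tolist())       # [11, 12, 13, 14]
print(subtracted.tolist())  # [0, 1, 2, 3]
print(multiplied.tolist())  # [2, 4, 6, 8]
print(divided.tolist())     # [0.5, 1.0, 1.5, 2.0]
```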

Case 4: I want to extract only some elements of a vector

Use the same vector as above. For example, suppose you want to extract only the multiples of 3 from it. You might think, "I have no choice but to use an if statement inside a for statement!", but you can also write it as in func2 below.

`Sample.py`


def func1(a):
    ans  = []
    for i in range(len(a)):
        if a[i] % 3 == 0:
            ans.append(a[i])
    return np.array(ans)

def func2(a):
    return a[a % 3 == 0]

%time b = func1(a)
%time c = func2(a)

`result`


CPU times: user 3.44 s, sys: 10 ms, total: 3.45 s
Wall time: 3.45 s
CPU times: user 120 ms, sys: 10 ms, total: 130 ms
Wall time: 131 ms
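To see why `a[a % 3 == 0]` works, it may help to look at the intermediate value. The sketch below uses a short vector made with np.arange (my choice for illustration) and prints the boolean mask before it is used as an index.

```python
import numpy as np

a = np.arange(10)

# The comparison is applied element by element and yields a boolean array
mask = (a % 3 == 0)
print(mask.tolist())
# [True, False, False, True, False, False, True, False, False, True]

# Indexing with the boolean array keeps only the positions that are True
selected = a[mask]
print(selected.tolist())  # [0, 3, 6, 9]
```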

Postscript

If you want to extract from a list instead of a vector, you can use the **filter** function. Consider this if you cannot or do not want to use **numpy**.

You can think of `lambda x: y` in the sample as an anonymous function that takes x as an argument and returns y.

`Sample.py`


x = [i for i in range(10000000)]
%time y = list(filter(lambda x: x % 3 == 0, x))

`result`


CPU times: user 1.67 s, sys: 10 ms, total: 1.68 s
Wall time: 1.68 s

It's slower than using **numpy**, but faster than appending with a for statement!
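As a side note, a list comprehension with a condition gives the same result as filter and avoids the lambda. This is only a sketch on a short made-up list. It was not timed here, so measure both on your own data if speed matters.

```python
x = list(range(20))

# filter with an anonymous function, as in the sample above
by_filter = list(filter(lambda v: v % 3 == 0, x))

# Equivalent list comprehension with an if clause
by_comprehension = [v for v in x if v % 3 == 0]

print(by_filter)         # [0, 3, 6, 9, 12, 15, 18]
print(by_comprehension)  # [0, 3, 6, 9, 12, 15, 18]
```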

Case 5: I want to apply a function to each element of a vector

Now consider applying a function to each element of a list. This section introduces the **map** function, which applies the specified function to each element of the list and returns the results (as a map object in Python 3).

The func below is a function that returns $x^2 + 2x + 1$.

`Sample.py`


a = np.array([i for i in range(10000000)])
def func(x):
    return x**2 + 2*x + 1

def func1(a):
    return np.array([func(i) for i in a])

def func2(a):
    return np.array(list(map(func, a.tolist())))

%time b = func1(a)
%time c = func2(a)
%time d = a**2 + 2*a + 1

`result`


CPU times: user 5.14 s, sys: 90 ms, total: 5.23 s
Wall time: 5.23 s
CPU times: user 4.95 s, sys: 170 ms, total: 5.12 s
Wall time: 5.11 s
CPU times: user 20 ms, sys: 30 ms, total: 50 ms
Wall time: 51.2 ms

I wanted to introduce the map function, but it turned out to be not much different from the comprehension :cry:. As you may have noticed along the way, func in this example is a simple function, so applying the vector operations directly is overwhelmingly faster!
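Because func uses only arithmetic operators, it can also be called on the array itself. The sketch below checks on a tiny array (made with np.arange for illustration) that this gives the same values as the map version.

```python
import numpy as np

def func(x):
    return x**2 + 2*x + 1

a = np.arange(5)

# Element by element through map
by_map = np.array(list(map(func, a.tolist())))

# The whole array at once: each operator works element-wise
direct = func(a)

print(by_map.tolist())  # [1, 4, 9, 16, 25]
print(direct.tolist())  # [1, 4, 9, 16, 25]
```

This only works when every operation inside the function is one that numpy can apply element-wise. A function containing, say, an if statement on x would need a different approach.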

Case 6: I want to convert each element (numerical value) of a matrix to an arbitrary score (discrete value)

So far, we have dealt with one-dimensional arrays (vectors). The next example deals with a two-dimensional array (matrix).

The scenario here is that you want to convert each numerical value into a score, for example as preprocessing for machine learning. First, define the following matrix.

a = np.array([[i % 100 for i in range(1000)] for j in range(10000)])

Next, prepare a list of thresholds for the conversion. With the list below, suppose you want to convert each number in the matrix to 0 if it is less than 20, 1 if it is 20 or more and less than 50, and so on up to 4 if it is 90 or more.

scores = [20, 50, 70, 90]

First, let me switch off my brain and implement it in the most straightforward way.

`Sample.py`


def func1(x):
    y = np.zeros(x.shape)
    for s in scores:
        for i in range(x.shape[0]):
            for j in range(x.shape[1]):
                if x[i, j] >= s:
                    y[i, j] += 1
    return y

%time b = func1(a)

The result is a splendid triple loop :innocent:. (Deep loops are not only slow but also hard to read, because the loop variables are hard to follow. For the sake of the humans reading your code, do not nest loops too deeply.)

For each element of the matrix, the function adds 1 to the output for every threshold that the element is greater than or equal to.

`result1`


CPU times: user 14 s, sys: 10 ms, total: 14 s
Wall time: 14 s

As expected, the execution time exceeded **10 seconds** :cry:.

Next, here is an improved version.

`Sample2.py`


def func2(x):
    y = np.zeros(x.shape)
    for s in scores:
        y += (x >= s)
    return y

%time c = func2(a)

Here is what it does:

- Prepare an all-zero matrix y with the same shape (numbers of rows and columns) as x
- For each threshold s, add (x >= s) to y
    - x >= s is a **matrix** that holds **True** where the element of x is >= s and **False** where it is not
    - For matrices n and m of the same shape, n + m adds them element by element
    - y holds numbers, and when **True** or **False** is added to a number it is treated as **1** or **0**

As you can see, the code is short but packs in several ideas. In exchange, the for statements no longer spin through every element, and the function is **more than 100 times** faster :smile:.

`result`


CPU times: user 90 ms, sys: 20 ms, total: 110 ms
Wall time: 111 ms
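To follow what func2 does step by step, here is a small worked sketch on a made-up 2 x 3 matrix (the values are my own, chosen to hit each score band). It prints the boolean matrix for each threshold and the running total.

```python
import numpy as np

scores = [20, 50, 70, 90]
x = np.array([[10, 20, 49],
              [50, 89, 90]])

y = np.zeros(x.shape)
for s in scores:
    step = (x >= s)   # boolean matrix: True where the element reaches s
    y += step         # True counts as 1, False as 0
    print(s, step.tolist(), y.tolist())

print(y.tolist())  # [[0.0, 1.0, 1.0], [2.0, 3.0, 4.0]]
```

Each element ends up with the number of thresholds it reaches, which is exactly the score.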

Postscript (2017/08/30)

At this point you may feel like "I want to wipe out every **for** statement before it is even born :angry:". So I tried writing it that way.

`Sample3.py`


def func3(x):
    len_score = len(scores)
    y = x * np.array([[np.ones(len_score)]]).T
    s = np.array(scores).reshape(len_score, 1, 1)
    z = (y >= s)
    return z.sum(axis=0)

`result`


CPU times: user 200 ms, sys: 30 ms, total: 230 ms
Wall time: 235 ms

... It is slow :cry: (maybe because of how I wrote it). This version is slower, needs a lot of memory (because everything is expanded first), and above all is harder to understand, so I learned that forcibly eliminating the for statement is not a good idea.

A 0-dimensional array is called a scalar, a 1-dimensional array a vector, and a 2-dimensional array a matrix, while an array with 3 dimensions is called a tensor.
So is the implementation above a tensor calculation? Please let me know if there is a smarter way to write it :innocent:
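One possible answer to that question, offered as a sketch rather than as the definitive way: numpy has np.digitize, which returns for each element the index of the bin it falls into, and with increasing thresholds this index equals the score defined above. The small matrix is made up for illustration, and the speed of this version was not measured here.

```python
import numpy as np

scores = [20, 50, 70, 90]
x = np.array([[10, 20, 49],
              [50, 89, 90]])

# Existing function: bin index i satisfies scores[i-1] <= value < scores[i]
by_digitize = np.digitize(x, scores)

# For comparison: the for-free broadcasting idea written with np.newaxis
thresholds = np.array(scores)
by_broadcast = (x[np.newaxis, :, :] >= thresholds[:, np.newaxis, np.newaxis]).sum(axis=0)

print(by_digitize.tolist())   # [[0, 1, 1], [2, 3, 4]]
print(by_broadcast.tolist())  # [[0, 1, 1], [2, 3, 4]]
```

Note that np.digitize assumes the thresholds are sorted, which holds for the scores list used in this case.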

Case 7: Membership checks on list elements (added 2018/04/20)

I was reminded of this by a recent article, so here is a note.

In Python you can use `in` to check whether an element is in a list.

But on a list this check is $O(n)$ for a list of length $n$, so careless use can lead to accidents.

If you check membership repeatedly, it is better to convert the list to a set or similar, as shown below.

`Confirmed in Jupyter Notebook (Google Colaboratory)`


L = 100000
x = list(range(L))

def sample1(list_tmp):
    j = 0
    for i in list_tmp:
        if i in list_tmp:
            j += 1
    print("sample1 j: ", j)


def sample2(list_tmp):
    j = 0
    set_tmp = set(list_tmp)  #Convert to set
    for i in list_tmp:
        if i in set_tmp:     #Check if it is in set
            j += 1
    print("sample2 j: ", j)
    
%time sample1(x)
print("----------------------------------------")
%time sample2(x)

`result`


sample1 j:  100000
CPU times: user 1min 7s, sys: 16 ms, total: 1min 7s
Wall time: 1min 7s
----------------------------------------
sample2 j:  100000
CPU times: user 8 ms, sys: 6 ms, total: 14 ms
Wall time: 14 ms
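The difference comes from how the lookup is done: a list is scanned from the front, while a set uses hashing. The sketch below isolates a single lookup of the last element (the worst case for the list) using timeit. The size `n` and the names are assumptions for illustration, and no timings are quoted because they depend on the machine.

```python
import timeit

n = 2000
items = list(range(n))
items_set = set(items)   # one-time conversion cost, paid before the lookups

target = n - 1           # last element: the list has to scan everything

list_time = timeit.timeit(lambda: target in items, number=1000)
set_time = timeit.timeit(lambda: target in items_set, number=1000)

print("list:", list_time)
print("set: ", set_time)
```

Keep in mind that building the set is itself $O(n)$, so the conversion pays off when the same collection is checked many times, as in sample2.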

Extra 1 "I still want to use the for statement"

I said above that you should not use for statements too much. Even so, I think there are situations where you have to use one, or where it is easier to understand.

In that case, make no apologies and use *numba*. *numba* is a small just-in-time compiler.

"What, a compiler? Do I have to declare every variable? Do I have to type a compile command?"

You might think so, but don't worry. You only add one line (two if you count the `import`).

Let's see an actual usage example.


import numba

def sample1(n):
    ans = 0
    for i in range(n):
        ans += i
    return ans

@numba.jit
def sample2(n):
    ans = 0
    for i in range(n):
        ans += i
    return ans

@numba.jit('i8(i8)', nopython=True)
def sample3(n):
    ans = 0
    for i in range(n):
        ans += i
    return ans

%time a = sample1(100000000)  # Without doing anything
%time b = sample2(100000000)  # With jit
%time c = sample3(100000000)  # With jit and an explicit type signature

From top to bottom, the functions are "did nothing", "used numba", and "used numba (with a type signature)". Each function adds up the integers from 0 to $n - 1$ and returns the total.

For how to specify types, see "Python acceleration: Numba introduction 2" on tkm2261's blog.

The execution times are as follows. Doing nothing takes about 5 seconds, but with "numba (type signature)" it takes about 5.5 microseconds. That is orders of magnitude different (in this example, **about 940,000 times faster** :innocent:).

CPU times: user 5.16 s, sys: 0 ns, total: 5.16 s
Wall time: 5.16 s
CPU times: user 30 ms, sys: 0 ns, total: 30 ms
Wall time: 25.9 ms
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 5.48 µs

These are the results of the first execution after the function definitions. Once sample2 has been compiled, it runs in about 6 μs from the next call onward. Even so, sample3 was slightly faster. I think sample2 is slow on the first run probably because type inference at compile time takes a while.
Note that numba supports numpy and, to some extent, scipy, but not pandas.
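To observe the first-call cost yourself outside Jupyter, a sketch like the following times two consecutive calls with time.perf_counter. The names `total` and `timed` are made up, and the try/except fallback is my own addition so that the script still runs (without any speed-up) when numba is not installed.

```python
import time

try:
    from numba import njit          # shorthand for jit in nopython mode
except ImportError:
    def njit(f):                    # fallback: leave the function unchanged
        return f

@njit
def total(n):
    ans = 0
    for i in range(n):
        ans += i
    return ans

def timed(n):
    start = time.perf_counter()
    result = total(n)
    return result, time.perf_counter() - start

first_result, first_time = timed(1000000)    # includes compilation with numba
second_result, second_time = timed(1000000)  # reuses the compiled code

print(first_result, first_time)
print(second_result, second_time)
```

With numba installed, the first call is expected to be noticeably slower than the second, for the reason described above.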

In conclusion

I feel like I wrote a lot, but in the cases above it mostly came down to "do not use for statements". In the future, I would like to cover more topics such as **scipy** and **pandas**.

How to make Python faster for beginners [numpy]
