I tried to explain what a Python generator is for as easily as possible.

Introduction

There are other pages that explain Python generators, but many of them are in English and hard for me to follow, so I wrote up generators in my own way, partly to organize my own thoughts. I am still learning, so I may have gotten something wrong. If you spot a mistake, please leave a comment; it will help me study. Also, the uses of generators introduced here are only a few examples, and they are by no means the only ones.

What is a generator?

There are other great articles for detailed and accurate explanations, so please refer to them: Python Iterators and Generators

To put it briefly, think of a generator as "a normal function with the return statement replaced by yield". (That is not quite accurate, but please overlook it for now.) If you write yield instead of return, **just calling the function does not return a value**. Instead, when it is used in a for statement or similar, it **returns values one after another**. It's probably easier to understand with an example.

example.py


def count_up():
    x = 0
    while True:
        yield x
        x += 1

Sure enough, it uses yield, not return. Notice also that something happens after the yield. In a normal function nothing runs after a return statement, so you can see right away that this is more than a function with return swapped out. That's the rough explanation. For now, this generator is called like this, for example:

>>> count_up()
<generator object count_up at 0x101fa0468>
>>> for i in count_up():
...     print(i)
...     if i == 5:
...         break
0
1
2
3
4
5

Sure enough, just calling it does not return a value; it returns a generator object. When it is used in a for statement, it returns values one after another. What happens is that each time the for statement asks for the next value, count_up resumes from where it stopped, increments x, and yields the new value of x. That is what the code after yield is for. Now you know what a generator is (don't worry if you're not sure yet; more examples follow). But what you are probably thinking at this point is **where would I use this?** So here I will explain what generators are for, in my own way.
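
To see the pausing more directly, the generator can also be driven by hand with the built-in next(), which is what a for statement does behind the scenes. The following is only a minimal sketch; count_up is redefined so the snippet runs on its own, and the variable names are just for illustration.

```python
def count_up():
    x = 0
    while True:
        yield x   # pause here and hand x to the caller
        x += 1    # runs only when the caller asks for the next value


gen = count_up()     # nothing in the body has run yet
first = next(gen)    # runs up to the first yield
second = next(gen)   # resumes after the yield, increments x, yields again
third = next(gen)
print(first, second, third)
```

Each next() call runs the body only as far as the next yield, which is why the infinite while loop does not hang the program.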

Difference between generator and function

As you can see from the previous example, generators and functions are **completely different**. A generator is far closer to a **class** than to a function. For example, the call in the previous example,

>>> count_up()

is easiest to understand as creating an instance of a class. So the following code also works:

>>> y = count_up()
>>> for i in y:
...     print(i)
...     if i == 5:
...         break
0
1
2
3
4
5

As a test, if you run the following process after this,

>>> for i in y:
...     print(i)
...     if i == 10:
...         break
6
7
8
9
10

The values 0 to 5 are gone! The reason is that the generator **remembers where it left off** each time it is resumed. In other words, it has **state**. This is the biggest difference from a function. In the example above, y is paused at the yield with x = 5 when the first for statement breaks; the next for statement resumes it from there, so x becomes 6 and counting continues.
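
To make the comparison with a class concrete, here is a rough sketch of a hand-written iterator class that behaves like count_up. The class name CountUp is made up for this illustration; the point is only that the local variable x of the generator plays the role of an instance attribute.

```python
class CountUp:
    """Hand-written iterator that behaves like the count_up generator."""

    def __init__(self):
        self.x = 0  # the state the generator keeps in its local variable x

    def __iter__(self):
        return self

    def __next__(self):
        value = self.x
        self.x += 1
        return value


y = CountUp()

first_pass = []
for i in y:
    first_pass.append(i)
    if i == 5:
        break

second_pass = []
for i in y:  # continues from the saved state instead of starting over
    second_pass.append(i)
    if i == 10:
        break

print(first_pass)
print(second_pass)
```

The generator version does the same thing in five lines, because Python saves and restores the local variables for you.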

So why does this matter? Consider, for example, computing everyone's favorite Fibonacci sequence. Here is a commonly shown way to compute it with recursion:

fibonacci_recursive.py


def fibonacci(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibonacci(n-1) + fibonacci(n-2)

This is fine when n is small, but the number of calls grows exponentially with n, and once the recursion depth exceeds the interpreter's limit (1000 by default in CPython) a RecursionError is raised. With a generator, on the other hand:

fibonacci_generator.py


def fibonacci():
    # a and b hold the last two terms; they are the generator's state
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a+b

By keeping the last two terms of the sequence as state in this way, the Fibonacci sequence can be computed without using up the stack.
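
Because the generator never ends, you need some way to take only the part you want. One option is itertools.islice from the standard library. The sketch below redefines fibonacci so it runs on its own; nth_fibonacci is a helper name invented for this example, and it counts the yielded values from 1.

```python
from itertools import islice


def fibonacci():
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b


def nth_fibonacci(n):
    """Return the n-th value yielded by fibonacci(), counting from 1 (illustrative helper)."""
    # islice skips the first n - 1 values lazily; next() takes the one after them
    return next(islice(fibonacci(), n - 1, None))


first_ten = list(islice(fibonacci(), 10))
print(first_ten)
print(nth_fibonacci(2000))  # far deeper than the recursive version could go
```

No recursion takes place here, so the recursion limit is never an issue however large n is.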

Difference between generator and List

Looking only at the examples so far, you are probably thinking **wouldn't a list do?** Certainly, if you only want to print numbers starting from 0, a list is fine. A list falls short **when holding the entire list in memory is difficult or unnecessary**. For example, consider computing the Nth element of the Fibonacci sequence with a list:

fibonacci_list.py


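N = 10  # index of the term we want (example value)
f = [0, 1]
n = 2
while n <= N:
    f.append(f[n-1] + f[n-2])
    n += 1
print(f[N])  # the Nth term; every earlier term is still held in f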

If you only want the Nth value, every term earlier than the two most recent ones is a waste of memory. In other words, a generator shines **when you need to return values one after another and keep some state, but do not need to keep the whole list**. Picture a finite-state automaton that emits output symbols. Conversely, a generator is a poor fit when you need no state, or when you do need to keep the list. An example of the former is computing a hash function of an input number; an example of the latter is building a list of prime numbers. Below, we'll look at some more practical examples.
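
As a rough illustration of the memory point, the snippet below compares a list with a generator expression producing the same values. This is only a sketch: sys.getsizeof reports the size of the container object itself (not the integers inside it), the exact numbers depend on the Python version, and N is an arbitrary example value.

```python
import sys

N = 100_000  # arbitrary example size

squares_list = [i * i for i in range(N)]  # every element exists at once
squares_gen = (i * i for i in range(N))   # elements are produced on demand

list_size = sys.getsizeof(squares_list)   # grows with N
gen_size = sys.getsizeof(squares_gen)     # stays small regardless of N

print(list_size, gen_size)
```

The generator still yields all N values if you iterate over it; it simply never holds them at the same time.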

Practical use of the generator

Lexical analysis

One place where generators are actually used is the Python standard library's lexical analyzer, tokenize: http://docs.python.jp/2/library/tokenize.html Lexical analysis is a step commonly performed when compiling a program. It reads the program text character by character and splits it into meaningful units called tokens. For example, def f(hoge, foo): would be split into the token sequence def, f, (, hoge, a comma, foo, ), and :. From the string def you can tell that this is a function definition, that f is the function name, and that the comma-separated names between the "(" and ")" that follow are the arguments. Lexical analysis attaches this kind of information to each token and passes the token sequence on to the next stage (this may be slightly inaccurate; see a compiler text for details).

What matters here is that when you analyze the text one character at a time, how you handle a character such as ")" depends on whether you have already seen a "(". That is, you must keep some **state**. However, once this function definition has been processed, **the information about it no longer matters**. Keeping information about the entire program around would be a waste of memory. That is why lexical analysis is a good place for generators to shine.
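
You can try this out with the tokenize module itself. The sketch below uses the Python 3 version of the module (the link above points to the Python 2 documentation); tokenize.generate_tokens takes a readline callable and returns a generator of tokens. A pass statement is added to the sample line so that it is a complete statement.

```python
import io
import tokenize

source = "def f(hoge, foo): pass\n"

# generate_tokens returns a generator; tokens are produced only as they are requested
tokens = tokenize.generate_tokens(io.StringIO(source).readline)

# keep names and operators, skipping bookkeeping tokens such as NEWLINE and ENDMARKER
strings = [tok.string for tok in tokens if tok.type in (tokenize.NAME, tokenize.OP)]
print(strings)
```

Each token also carries its type and its position in the source, which is the extra information mentioned above.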

Pipeline

Details are explained on the following site. https://brett.is/writing/about/generator-pipelines-in-python/

Simply put, a generator pipeline is a process built by **chaining** several generators together. Picture items moving along a factory conveyor belt, being processed at one station after another. The example presented there is a pipeline that takes the even elements of a given list of integers, triples them, converts them to strings, and returns them.

pipeline.py


def even_filter(nums):
    for num in nums:
        if num % 2 == 0:
            yield num

def multiply_by_three(nums):
    for num in nums:
        yield num * 3

def convert_to_string(nums):
    for num in nums:
        yield 'The Number: %s' % num

nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# Each stage pulls one element at a time from the stage inside it
pipeline = convert_to_string(multiply_by_three(even_filter(nums)))
for num in pipeline:
    print(num)

In this example the individual generators hold no state, yet generators are still being used effectively. This is a situation where you don't need to store the whole list but do need to process it element by element. You might think ordinary functions would do, but functions that each build and return a full list would have to create an intermediate list at every stage. With generators, each element flows through all the stages one at a time, and elements dropped by even_filter (the odd numbers here) never reach the later stages. That is why a generator, which applies the processing to one element at a time, is convenient.
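
One way to convince yourself that the pipeline really works element by element is to feed it an endless input. The sketch below repeats the three stages so it runs on its own, and uses itertools.count and itertools.islice from the standard library; the infinite input is my own variation, not part of the article linked above.

```python
from itertools import count, islice


def even_filter(nums):
    for num in nums:
        if num % 2 == 0:
            yield num


def multiply_by_three(nums):
    for num in nums:
        yield num * 3


def convert_to_string(nums):
    for num in nums:
        yield 'The Number: %s' % num


# count(1) yields 1, 2, 3, ... forever, so a version that builds lists could never finish
pipeline = convert_to_string(multiply_by_three(even_filter(count(1))))

first_three = list(islice(pipeline, 3))  # pulls only as many inputs as needed
print(first_three)
```

Only the numbers 1 through 6 are ever taken from the input, because the pipeline stops as soon as three results have been produced.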

Recursion

The following sites have detailed explanations and examples: http://www.unixuser.org/~euske/doc/python/recursion-j.html

As I explained earlier, the big advantage of generators is that they can process items one after another without holding the entire list. In other words, they go well with problems that can be broken into smaller pieces and solved by repeating the same processing. You can see that this is very close to the idea of solving problems with recursion. The Fibonacci example replaced recursion with a generator, but of course a generator itself can also be used recursively. The advantage of using a generator instead of a function is that once a value has been produced and consumed, it can be discarded right away. For example, code that recursively walks a tree structure using a generator looks like this:

tree.py


class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def traverse(node):
    # In-order traversal: left subtree, this node, then right subtree
    if node is not None:
        for x in traverse(node.left):
            yield x
        yield node.data
        for x in traverse(node.right):
            yield x

Incidentally, with the yield from expression added in Python 3.3, the traverse function becomes even cleaner:

def traverse(node):
    if node is not None:
        yield from traverse(node.left)
        yield node.data
        yield from traverse(node.right)
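
To try the traversal, you need an actual tree. The sketch below builds a small one and walks it. For brevity it assumes a Node constructor that also accepts left and right children, which differs from the Node class above, and the tree shape is just an example.

```python
class Node:
    # Assumption for this sketch: children can be passed to the constructor
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def traverse(node):
    """In-order traversal: left subtree, the node itself, then right subtree."""
    if node is not None:
        yield from traverse(node.left)
        yield node.data
        yield from traverse(node.right)


#       4
#      / \
#     2   6
#    / \
#   1   3
root = Node(4, Node(2, Node(1), Node(3)), Node(6))

values = list(traverse(root))
print(values)
```

If you only need the first few nodes, you can stop iterating early and the rest of the tree is never visited.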

Realization of coroutines

The following sources have detailed instructions: http://masnun.com/2015/11/13/python-generators-coroutines-native-coroutines-and-async-await.html

A coroutine is a subroutine whose execution can be suspended partway through and resumed later. A coroutine can not only hand values out but also receive them. Take the following example:

coroutine.py


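def coro():
    hello = yield "Hello"   # yields "Hello", then receives the value passed to send()
    yield hello


c = coro()
print(next(c))          # runs to the first yield
print(c.send("World"))  # resumes coro with hello = "World"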

Here the yield appears on the right-hand side of an assignment. The send method then sends in the string "World", which becomes the value of that yield expression, so the program prints Hello and then World. You can see that this relies on the fact that generators keep state. Going into coroutines in detail would fill another article, so I will only introduce them here. If I feel like it, I may write an article about coroutines as well. Note that Python 3.5 added native coroutines (async def and await), so coroutines can now be written without using generators.
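
A slightly more useful example of the same mechanism is a coroutine that keeps a running average of the numbers sent to it. This is a common textbook-style sketch, not taken from the article linked above, and the name running_average is made up for the illustration.

```python
def running_average():
    """Coroutine: receives numbers through send() and yields the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # hand back the current average, wait for the next number
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)  # advance to the first yield so the coroutine can receive values
results = [avg.send(v) for v in (10, 20, 60)]
print(results)
```

The initial next() call is needed because a value can only be sent to a generator that is already paused at a yield.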

Summary

Generators are a hard concept to grasp, but I think they are surprisingly simple once you think of them as a tool that keeps state and returns values in order, doing only a part of the work at a time without holding the entire list. Let's all use generators and write cool programs! (Though I haven't mastered them myself yet.)
