Many of the existing pages that explain Python generators are in English, and I was never quite comfortable with them, so I will try explaining generators in my own way, partly to organize my own thoughts. Since I am still somewhat confused myself, I may get things wrong; if you spot a mistake, please leave a comment, as it will help my own study too. Also, the uses of generators introduced here are only a few examples, and it does not mean there are no others.
There are other great articles for detailed and accurate explanations, so please refer to them: Python Iterators and Generators
To put it briefly, think of a generator as "a normal function with the return statement replaced by yield". (That is not quite accurate, but please overlook it for the time being.) If you write yield instead of return, **just calling the function will not return a value**. Instead, it is typically used with a for statement and **returns values one after another**. It is probably easier to understand with an example.
example.py
def count_up():
    x = 0
    while True:
        yield x
        x += 1
Notice that it is yield, not return, and that something happens after the yield. Since nothing can execute after a return statement, you can see immediately that this is different from simply swapping return for yield in a normal function. That, in broad strokes, is the difference. Now, let's call this generator:
>>> count_up()
<generator object count_up at 0x101fa0468>
>>> for i in count_up():
...     print(i)
...     if i == 5:
...         break
0
1
2
3
4
5
Indeed, merely calling it returns no value. Instead, calling it in a for statement makes it return values one after another. What is happening here is that count_up returns the value of x each time it is asked for a value, and also increments x each time. That is the job of the code that comes after yield. Now you know what a generator is (don't worry, I'll give more examples if you're not sure yet). But what you are probably thinking at this point is: **where would you ever use this?** So this time I will explain the purpose of generators in my own way.
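The same sequencing can also be seen by advancing the generator manually with the built-in next() (a small sketch; count_up is repeated here so the snippet stands alone):

```python
def count_up():
    x = 0
    while True:
        yield x
        x += 1

g = count_up()   # creates a generator object; no body code runs yet
print(next(g))   # runs until the first yield -> 0
print(next(g))   # resumes after the yield, increments x -> 1
print(next(g))   # -> 2
```

A for statement over a generator is essentially just calling next() repeatedly until the generator is exhausted.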
As you can see from the previous example, generators and functions are **completely different** things. A generator is far closer to a **class** than to a function. For example, the call in the previous example
>>> count_up()
can be understood clearly by interpreting it as creating an instance of a class. So the following code also works:
>>> y = count_up()
>>> for i in y:
...     print(i)
...     if i == 5:
...         break
0
1
2
3
4
5
Now, as a test, run the following right after that:
>>> for i in y:
...     print(i)
...     if i == 10:
...         break
6
7
8
9
10
The values 0 to 5 are gone! The reason is that a generator **remembers what it has done** across calls. In other words, it has **state**. This is the biggest difference from a function. In the example above, x inside y has reached 6 by the time the first for statement ends, and that state carries over into the next for statement.
So why does this matter? As an example, consider computing everyone's favorite Fibonacci sequence. Here is a commonly introduced method using recursion:
fibonacci_recursive.py
def fibonacci(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibonacci(n-1) + fibonacci(n-2)
This is fine while n is small, but once n grows, the recursion piles up until an error is thrown (and the doubled recursive calls make it exponentially slow besides). With a generator, on the other hand:
fibonacci_generator.py
def fibonacci():
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b
By holding the last two values of the sequence as state in this way, the Fibonacci sequence can be computed without burning through the stack.
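For instance, you can pull just the first ten values out of this generator with itertools.islice (a sketch; fibonacci is repeated here so the snippet stands alone):

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield b
        a, b = b, a + b

# take only the first 10 values; nothing beyond them is ever computed
print(list(islice(fibonacci(), 10)))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

islice is the usual way to take a finite slice of an infinite generator, since a plain slice would require a list.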
Looking only at the examples so far, you may be thinking: **wouldn't a list do just as well?** Certainly, if you just want to print numbers from 0, a list would do. The case where a list is *not* good is **when holding the entire list in memory is expensive and unnecessary**. For example, consider computing the Nth element of the Fibonacci sequence using a list:
fibonacci_list.py
N = 10  # the index we want, given in advance
f = [0, 1]
n = 2
while n < N:
    f.append(f[n-1] + f[n-2])
    n += 1
If all you want is the Nth value, the entries at index N-3 and earlier are a waste of memory. In other words, the place where generators shine is **where you need to return values in sequence and carry some state, but do not need to keep the entire list**. Picture a finite state automaton with output symbols. Conversely, generators are not a good fit when you need no state at all, or when you actually do need to keep the whole list; an example of the former would be computing a hash of an input number, and an example of the latter would be building a list of prime numbers. Below, we'll look at some more practical examples.
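The memory difference is easy to observe with sys.getsizeof: a generator object stays tiny no matter how long the sequence it describes is (a small sketch comparing a list comprehension with the equivalent generator expression):

```python
import sys

big_list = [n * n for n in range(1_000_000)]  # materializes a million values
lazy = (n * n for n in range(1_000_000))      # a generator: computes on demand

print(sys.getsizeof(big_list))  # several megabytes
print(sys.getsizeof(lazy))      # a couple hundred bytes
```

Note that getsizeof measures only the container itself, but that is exactly the point: the generator holds no elements, only its state.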
An example of where generators are actually used is Python's standard library for lexical analysis:
http://docs.python.jp/2/library/tokenize.html
Lexical analysis is a process that is often performed when compiling a program. It looks at the program character by character and divides the character string of the program into important parts (called tokens). For example
def f(hoge, foo):
would probably be split into a token sequence like this:
def, f, (, hoge, foo, ), :
You can tell from the string def that this is a function definition, that f is the function name, and that the comma-separated names between the "(" and ")" that follow are the arguments. Lexical analysis attaches this kind of information to each token and passes the token sequence on to the next stage (this may be a little imprecise; see a compiler textbook for details).
What matters here is that, as you examine each character, the handling of a ")" changes depending on whether you have already seen a "(". That is, you must carry some **state**. However, once the function definition has been read to the end, **the information about that function no longer matters**. Saving information about the entire program piece by piece would waste memory. That is why lexical analysis is a place where generators really shine.
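You can try this directly: tokenize.generate_tokens in the standard library is itself a generator that yields tokens one at a time (a small Python 3 sketch; the source string is just an illustration):

```python
import io
import tokenize

src = "def f(hoge, foo):\n    pass\n"
# generate_tokens is a generator: tokens are produced one by one, on demand
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tok.type, repr(tok.string))
```

Each yielded TokenInfo carries the token type, its string, and its position, exactly the "token plus information" described above.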
Details are explained on the following site. https://brett.is/writing/about/generator-pipelines-in-python/
Simply put, a generator pipeline **connects** several generators. Picture items being processed one after another as they move along a factory conveyor belt. The example presented there is a pipeline that takes a list of integers, triples the even elements, converts them to strings, and returns them.
pipeline.py
def even_filter(nums):
    for num in nums:
        if num % 2 == 0:
            yield num

def multiply_by_three(nums):
    for num in nums:
        yield num * 3

def convert_to_string(nums):
    for num in nums:
        yield 'The Number: %s' % num

nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pipeline = convert_to_string(multiply_by_three(even_filter(nums)))
for num in pipeline:
    print(num)
In this example the individual generators carry no state, yet generators are still being used effectively: it is a situation where you do not need to keep the whole list, but you do need to process elements in sequence. You might think plain functions would be fine, but then processing the odd values, for example, would be wasted computation in the example above. That is why generators, which can apply processing to each element one by one, are convenient.
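The laziness also means the pipeline never builds an intermediate list: even with an absurdly large input range, asking for just the first element returns instantly (a sketch reusing the three generators above, repeated here so it runs on its own):

```python
def even_filter(nums):
    for num in nums:
        if num % 2 == 0:
            yield num

def multiply_by_three(nums):
    for num in nums:
        yield num * 3

def convert_to_string(nums):
    for num in nums:
        yield 'The Number: %s' % num

# a billion inputs, but only enough of them flow through to produce one result
pipeline = convert_to_string(multiply_by_three(even_filter(range(10**9))))
print(next(pipeline))  # The Number: 0
```

Each next() pulls exactly one value through every stage of the pipeline and no more.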
The following sites have detailed explanations and examples: http://www.unixuser.org/~euske/doc/python/recursion-j.html
As explained earlier, the big advantage of generators is that they can process sequentially without holding the entire list. In other words, they pair well with problems that can be broken into pieces and solved iteratively. You can see that this is very close to the idea behind solving problems with recursion. The Fibonacci example replaced recursion with a generator, but of course you can also use a generator recursively. The advantage of using a generator instead of a function is that each value, once produced, can be discarded on the spot. For example, code that scans a tree structure recursively using a generator looks like this:
tree.py
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def traverse(node):
    if node is not None:
        for x in traverse(node.left):
            yield x
        yield node.data
        for x in traverse(node.right):
            yield x
By the way, if you use the yield from statement added in Python 3.3, the traverse function becomes even cleaner:

def traverse(node):
    if node is not None:
        yield from traverse(node.left)
        yield node.data
        yield from traverse(node.right)
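As a quick check, here is the traverse generator in action on a small tree (the Node class and traverse are repeated here so the snippet stands alone):

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def traverse(node):
    if node is not None:
        yield from traverse(node.left)
        yield node.data
        yield from traverse(node.right)

# build the tree:    2
#                   / \
#                  1   3
root = Node(2)
root.left = Node(1)
root.right = Node(3)

print(list(traverse(root)))  # [1, 2, 3] -- an in-order walk
```

Values are yielded one at a time, so even a huge tree can be walked without materializing the whole flattened list.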
The following sources have detailed instructions: http://masnun.com/2015/11/13/python-generators-coroutines-native-coroutines-and-async-await.html
A coroutine is a subroutine whose execution can be suspended and later resumed partway through. Coroutines can not only produce values but also receive them. Take the following example:
coroutine.py
def coro():
    hello = yield "Hello"
    yield hello

c = coro()
print(next(c))
print(c.send("World"))
Here you can see that the yield appears on the right-hand side of an assignment. The send method sends the string "World" into the generator, where it becomes the value of that yield expression, so the second print outputs "World". This makes good use of the fact that generators carry state.
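A common textbook illustration of send (not from the article above) is a running-average coroutine, which shows the state and two-way communication at once:

```python
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # hand back the current average, wait for a new value
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)            # "prime" the coroutine: run up to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
```

The priming next() call is required: send can only deliver a value to a coroutine that is already paused at a yield.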
A detailed discussion of coroutines would warrant a separate article, so I'll stop at this introduction. If I feel like it, I may post an article about coroutines as well.
Since native coroutines (async/await) were introduced in Python 3.5, the same kind of processing can now be achieved without using a generator.
Generators are a tricky concept to grasp, but they become surprisingly easy to understand if you think of them as a tool that carries state and returns values in sequence, doing only as much work as needed rather than holding the entire list. Let's all use generators and write cool programs! (Though I haven't mastered them myself yet.)