[PYTHON] What is an iterator?

Aim

Understand, at least roughly, how the code below works.

for i in range(5):
    print(i)
# 0
# 1
# 2
# 3
# 4

What is an iterator

Python's for statement works on iterators.

That sentence alone probably won't mean much, so let me explain why such a concept is needed.

Reproducing it with a while statement

Consider the following sample for statement.

l = ['Alpha', 'Beta', 'Charlie']

for name in l:
    print(name)

Anyone who has learned Python will know what this outputs:

Alpha
Beta
Charlie

Yes, it takes each name out of the list and prints it. With a while statement, the same thing can be written like this.

l = ['Alpha', 'Beta', 'Charlie']
i = 0
while True:
    if i == len(l):
        break
    print(l[i])
    i += 1

Doing the same for a set

The output is the same. So is Python's for statement just a convenience for performing this kind of index-based operation on list-like objects? Consider the following example, which uses a set instead of a list. A set represents a mathematical set: even if you put the same value in more than once, it is stored only once.

s = {1, 2, 2, 3, 1, 4}
↓
{1, 2, 3, 4}

Let's try to operate on this object the same way as before.

s = {1, 2, 3, 4, 5}

i = 0
while True:
    if i == len(s):
        break
    print(s[i])
    i += 1

Running this program raises an error.

Traceback (most recent call last):
  File "a.py", line 7, in <module>
    print(s[i])
TypeError: 'set' object is not subscriptable

Why? Because objects such as set and dict do not store their elements in a single row the way a list does. They are implemented with a structure called a hash table.

Figure: hash table, 1200px-HASHTB08.svg.png (image from Wikipedia)

Some data structures are instead represented as trees. A tree is easier to picture, so let's use one as the example here.

Figure: binary tree, Binary_tree.png (image from Wikipedia)

With an object structured like this, an instruction such as "get the 5th element!" cannot be answered immediately. You would have to walk through the nodes in order, starting from the root (the 2 at the top of the figure). That is why access by subscript is not supported. In exchange, such structures can answer an instruction like "find the one called hogehoge!" very quickly. The dictionary type takes advantage of exactly this property.
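A small sketch of that trade-off, using only built-ins. The variable names and the sample dictionary are made up for illustration.

```python
# Hypothetical sample data, for illustration only.
s = {1, 2, 3, 4, 5}
ages = {'Alpha': 30, 'Beta': 25}

# "Is 3 in there?" is answered directly by the hash table.
has_three = 3 in s
print(has_three)          # True

# A dict finds a value by its key the same way.
print(ages['Beta'])       # 25

# "Give me element number 2" is not supported by a set.
try:
    s[2]
except TypeError as e:
    error_message = str(e)
    print(error_message)  # 'set' object is not subscriptable
```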

So how do you reproduce the same behavior with a while statement? The following is an example.

s = {1, 2, 3, 4, 5}

while True:
    if s == set():
        break
    print(s.pop())

I'll skip the details, but note how different this is from the list version: there is no index, and pop() removes each element as it returns it, so the set is empty once the loop finishes. Even so, the for statement works on a set just as it does on a list.

s = {1, 2, 3, 4, 5}

for num in s:
    print(num)

Why? This is the essence of "Python's for statement drives an iterator". Both list and set know how to produce an iterator, and the for statement asks the object for that iterator and then takes values from it one by one. That is why the loop below, which creates the iterator explicitly with iter(), behaves the same way.

s = {1, 2, 3, 4, 5}

a = iter(s)
for num in a:
    print(num)
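To make that hidden step visible, the sketch below calls iter() by hand on both a list and a set. The type names in the comments are what CPython reports; treat them as an implementation detail rather than something to rely on.

```python
l = ['Alpha', 'Beta', 'Charlie']
s = {1, 2, 3, 4, 5}

list_it = iter(l)
set_it = iter(s)

# Different containers hand back different iterator types...
print(type(list_it).__name__)  # list_iterator (in CPython)
print(type(set_it).__name__)   # set_iterator (in CPython)

# ...but asking an iterator for its iterator returns the same object,
# which is why "for num in a" works when a is already an iterator.
print(iter(set_it) is set_it)  # True
```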

Image of iterator operation

Every iterator object implements __next__(), a method that returns the value following the iterator's current position.

So the following still gives the same output:

s = {1, 2, 3, 4, 5}

a = iter(s)

print(next(a)) # calls a.__next__() from the outside
print(next(a))
print(next(a))
print(next(a))
print(next(a))

This is illustrated below.

By now you have probably got a feel for why iterators are convenient. An iterator only has to be able to produce the next value, so it does not need to hold all the data at once.
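Putting iter() and next() together, a for statement can be approximated by the while loop below. This is a sketch of the behavior rather than what the interpreter literally executes: when the iterator has nothing left, next() raises StopIteration, and that is the signal to stop. The seen list is an addition used only to check the result.

```python
s = {1, 2, 3, 4, 5}

seen = []
a = iter(s)              # what the for statement does first
while True:
    try:
        num = next(a)    # ask for the next value
    except StopIteration:
        break            # nothing left: leave the loop
    print(num)
    seen.append(num)     # kept only so the result can be checked
```

Unlike the pop() version shown earlier, this loop leaves s untouched.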

Understanding the aim

Take a look at the following code. It is the code from the beginning of this article.

for i in range(5):
    print(i)
# 0
# 1
# 2
# 3
# 4

You can also think of this code as:

a = [0, 1, 2, 3, 4]

for i in a:
    print(i)
# 0
# 1
# 2
# 3
# 4

What if this were 10000 instead of 5? Or 1000000? Does Python build a list from 0 to 999999? No. All it has to do is keep a single number and increase it each time.
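That idea can be sketched as a small class. CountUp is a hypothetical name, not part of Python, and the real range is implemented differently (it also supports start and step); the point is only that the object stores one current number and a limit.

```python
class CountUp:
    """Hypothetical, simplified stand-in for iterating over range(stop)."""

    def __init__(self, stop):
        self.current = 0   # the only number we keep
        self.stop = stop

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration  # tells the for statement to stop
        value = self.current
        self.current += 1
        return value


for i in CountUp(5):
    print(i)
# 0
# 1
# 2
# 3
# 4
```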

This way there is no need to actually create a list from 0 to 999999, which saves memory. And because the for statement only ever asks an iterator for the next value, it can work efficiently with many different kinds of objects.
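The difference can be checked with sys.getsizeof. The exact byte counts depend on the Python version and platform, so none are shown here. Note also that getsizeof on a list counts only the list's own array of references, not the integer objects it points to, so the real gap is larger still.

```python
import sys

n = 1_000_000

lazy = range(n)          # stores only start, stop and step
eager = list(range(n))   # stores one reference per element

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)

print(lazy_size)
print(eager_size)
print(lazy_size < eager_size)  # True
```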

Edit history

2020-06-19 Fixed the source code that was calling next from the start, as pointed out by shiracamus.
2020-06-21 Fixed an error in the description of the set data structure, as pointed out by https://github.com/zerokpr.
