[Python] Why slicing does not raise an index error

Introduction

It is a well-known fact (one I often forget) that a Python slice never raises an index error, no matter how large or small the specified indexes are. The usual explanation for why is "**it is handled nicely internally**", and I could not find a Japanese article that explained anything beyond that, so I decided to write one to fill the niche.

In this article I explain the topic from two perspectives: the **idea** and the **implementation**. I hope it clears up the nagging feeling that people new to Python may have: "**an index error is raised when I extract a specific element, but not when I take a slice**".

Also, since I am still inexperienced myself, I would appreciate it if you could point out any inaccuracies so that I can learn from them.

What does a slice (in the first place) do?

This section covers the idea behind slicing. Frustratingly, even the official documentation does not (as far as I can tell) state clearly what a slice is, but it does give the following explanation.

- Note on s[i:j]

> The slice of s from i to j is defined as the sequence of items with index k such that i <= k < j.

- Note on s[i:j:k]

> The slice of s from i to j with step k is defined as the sequence of items with index x = i + n*k such that 0 <= n < (j-i)/k.

Built-in Types — Python 3.8.5 documentation

You can follow it if you read carefully, but it is a bit of a chore.

Extracting only what we need here, I would say slicing is an operation that "creates a **new sequence** consisting of the **elements at the specified indexes** of the original sequence".
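To make the two definitions above concrete, here is a minimal sketch that builds a slice directly from the index formula and compares it with the built-in result. The function name `slice_by_definition` is my own invention for illustration, and the sketch assumes a positive step and in-range `i` and `j`.

```python
def slice_by_definition(s, i, j, k=1):
    """Hypothetical helper: build s[i:j:k] from the documented formula.

    Assumes k > 0 and 0 <= i <= j <= len(s).
    """
    indexes = []
    n = 0
    # x = i + n*k for every integer n with 0 <= n < (j - i) / k
    while n < (j - i) / k:
        indexes.append(i + n * k)
        n += 1
    return [s[x] for x in indexes]


letters = ['P', 'y', 't', 'h', 'o', 'n']
print(slice_by_definition(letters, 1, 5), letters[1:5])
print(slice_by_definition(letters, 0, 6, 2), letters[0:6:2])
```

Each line prints the same list twice, once from the formula and once from the built-in slice.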

(A sequence is a data type such as list, tuple, range, str, or bytes.)
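As a small illustration of "creates a new sequence", the sketch below slices one value of each built-in sequence type listed above. The variable names are just for this example.

```python
samples = [[0, 1, 2, 3], (0, 1, 2, 3), range(4), "0123", b"0123"]

# Slice every sample with the same bounds.
results = [s[1:3] for s in samples]

for original, piece in zip(samples, results):
    # The slice has the same type as the sequence it came from.
    print(type(original).__name__, type(piece).__name__, repr(piece))
```

In every case the result is a sequence of the same type as the original, never a single element.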

Here's a quote from Stack Overflow about why this is important.

Indexing returns a single item, but slicing returns a subsequence of items. So when you try to index a nonexistent value, there's nothing to return. But when you slice a sequence outside of bounds, you can still return an empty sequence.

https://stackoverflow.com/a/9490148

In other words, it says:

  1. Indexing returns a single element, whereas slicing returns a sequence.
  2. If you index a position that does not exist, there is nothing to return, but a slice can still return an empty sequence.

Point 1 was covered above, so it needs no further explanation, but what about point 2?

In short, `[0, 1, 2][3]` raises **an error because there is nothing to return**, whereas `[0, 1, 2][3:]` **can return the empty list `[]` when no element has an index of 3 or higher, so no error is needed**.

That said, some Marie Antoinettes reading this (the author included) may think, "**If a return value is all that is needed, why not have `[0, 1, 2][3]` return `None`?**"

In that case, however, it becomes **hard to tell whether `None` came from `[0, 1, 2][3]` or from `[0, 1, 2, None][3]`**, so the index error is still needed (the continuation of the Stack Overflow answer quoted above kindly adds this point).

The explanation may have been a bit long-winded, but the conclusion is that **a slice can return an empty sequence even when no element matches, so no error is raised**.
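The difference is easy to check in the interpreter. The following is a minimal sketch; the variable names are only for illustration.

```python
numbers = [0, 1, 2]

# Indexing: there is no element at position 3, so there is nothing to return.
try:
    value = numbers[3]
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# Slicing: no element has an index of 3 or higher, so the result is empty.
empty_tail = numbers[3:]
print(empty_tail)
```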
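To see the ambiguity, imagine a hypothetical lookup that returns `None` instead of raising. `get_or_none` below is not a real Python feature, just a sketch of that idea.

```python
def get_or_none(seq, index):
    """Hypothetical indexing that returns None instead of raising IndexError."""
    try:
        return seq[index]
    except IndexError:
        return None


missing = get_or_none([0, 1, 2], 3)        # no element at index 3
stored = get_or_none([0, 1, 2, None], 3)   # a real None stored at index 3

# Both calls give None, so the caller cannot tell the two cases apart.
print(missing is None, stored is None)
```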

(Aside: I am tempted to use terms like "subset" or "empty set", but saying "set" would ignore the ordering that sequence types have. In the end there is no choice but to explain it the way the official documentation does.)

What "handles it nicely" really means

This section covers the implementation of slicing.

How does Python "nicely" handle an operation like `['P', 'y', 't', 'h', 'o', 'n'][100:200]`? The answer lies in the continuation of the passage quoted from the official documentation in the previous section.

- Note on s[i:j]

> If i or j is greater than len(s), use len(s). If i is omitted or None, use 0. If j is omitted or None, use len(s). If i is greater than or equal to j, the slice is empty.

In other words, in `['P', 'y', 't', 'h', 'o', 'n'][100:200]`, both `i` and `j` are greater than `len(s)`, so `len(s)` is used for each. Then `i (= len(s)) >= j (= len(s))` holds, so an empty sequence is returned.
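The clamping described above can be observed from Python itself with the built-in `slice.indices()` method, which reports the start, stop, and step that would be used for a sequence of a given length. A minimal sketch:

```python
letters = ['P', 'y', 't', 'h', 'o', 'n']

# Both bounds exceed len(letters), so both are replaced by len(letters).
clamped = slice(100, 200).indices(len(letters))
print(clamped)            # (6, 6, 1)
print(letters[100:200])   # []

# Omitted bounds are filled in the same way: 0 and len(letters).
defaults = slice(None, None).indices(len(letters))
print(defaults)           # (0, 6, 1)
```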

[Slices are also resolved to indexes internally](https://qiita.com/tanuk1647/items/276d2be36f5abb8ea52e#How slices are converted to indexes), so numbers greater than `len(s)` are converted beforehand. When a step is specified, essentially the same processing is performed.

Finally, for reference, here is how this processing is implemented in CPython. I am not strong in C either, so it is enough to read it as "yes, that is indeed what it says" (the source code itself carries the comment `/* this is harder to get right than you might think */`).

(This section explained the case without a step, but the quoted code handles the case with a step. Please keep that in mind.)

sliceobject.c


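/* Default stop when none is given: -1 for a negative step, otherwise the length. */
defstop = *step < 0 ? -1 : length;
...
/* An omitted stop (None) falls back to that default. */
if (r->stop == Py_None) {
    *stop = defstop;
}
...
/* No index falls inside the range, so the slice is empty. */
if ((*step < 0 && *stop >= *start)
    || (*step > 0 && *start >= *stop)) {
    *slicelength = 0;
}
...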

cpython: 3a1db0d2747e Objects/sliceobject.c
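For readers who, like me, find C hard to follow, here is a rough Python sketch of the same idea. It is not a translation of the CPython source: `clamp_slice` and `adjust` are hypothetical names, and the clamping rules for out-of-range values are my own assumption, chosen to be consistent with the documentation quoted earlier and checked against the built-in `slice.indices()`.

```python
def clamp_slice(start, stop, step, length):
    """Hypothetical sketch: return (start, stop, step, is_empty) for a slice."""
    if step is None:
        step = 1
    if step == 0:
        raise ValueError("slice step cannot be zero")

    # Defaults used when a bound is omitted (None), as in the excerpt.
    defstart = length - 1 if step < 0 else 0
    defstop = -1 if step < 0 else length

    def adjust(value, default):
        # Assumed clamping rule: pull any out-of-range value back to the edge.
        if value is None:
            return default
        if value < 0:
            value += length
        if value < 0:
            return -1 if step < 0 else 0
        if value >= length:
            return length - 1 if step < 0 else length
        return value

    start = adjust(start, defstart)
    stop = adjust(stop, defstop)

    # Same emptiness test as the last condition in the excerpt.
    is_empty = (step < 0 and stop >= start) or (step > 0 and start >= stop)
    return start, stop, step, is_empty


letters = ['P', 'y', 't', 'h', 'o', 'n']
print(clamp_slice(100, 200, None, len(letters)))   # (6, 6, 1, True)
print(slice(100, 200).indices(len(letters)))       # (6, 6, 1)
```

The first three values agree with `slice.indices()`, and the flag is `True` exactly when the built-in slice comes back empty.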

Conclusion

Slicing is convenient.

Thank you for reading to the end.

References

- Built-in Types — Python 3.8.5 documentation
- [Python] Summary of slice operation - Qiita
- python - Why does substring slicing with index out of range work? - Stack Overflow
- string - Why python's list slicing doesn't produce index out of bound error? - Stack Overflow
