Compare the Python array-like guys

This article is the day 14 entry of the NCC Advent Calendar 2019.

Introduction

Arrays are important when writing programs. However, Python has a **too many array-like types** problem. The standard features alone include list, dict, set, and tuple. Once you start using libraries such as numpy, you will find even more similar ones.

So in this article I will explain each of them from the following four viewpoints, so that you can choose the right one.

  1. Overview
  2. Good points
  3. Bad points
  4. Impressions

Since this article is about **choosing the right one**, I will not go into detailed usage. I also assume that you have a rough idea of what an array is. I will use simple words as much as possible.

Target audience

- People who have started using Python but are struggling because there are too many array types
- People who have worked in another language but don't know where each array type stands in Python
- People who can't decide which one to use when implementing
- People who have started using libraries such as numpy

What is covered

Python standard features

- list
- dict
- set
- tuple

Library

- numpy (numpy.array / numpy.ndarray, numpy.matrix)

Main part

list

Overview

The list is simply **the most basic one**. It corresponds to the ordinary array of other languages, and in Python it is generally called a "list". The symbol is [], so if something is enclosed in [] with several , inside, think of it as a list.

In Python, a single list can hold values of different types. You can also put in the same value as many times as you like. In addition, it has the functions generally expected of an array.
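
As a minimal sketch of the points above (the variable names are made up for illustration), a list can mix types, hold the same value repeatedly, and supports the usual array operations:

```python
# A list may hold values of different types at the same time.
mixed = [1, "two", 3.0, None]

# The same value may appear any number of times.
repeated = [0, 1, 2, 1, 0]

# Typical array operations: indexing from 0, appending, length, slicing.
first = mixed[0]           # element at index 0
repeated.append(3)         # add one element to the end
length = len(repeated)     # number of elements
head = repeated[:2]        # first two elements as a new list

print(first, length, head)  # out: 1 6 [0, 1]
```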

Later, when I explain dict and the others, I will often compare them with list, so it would be good to grasp the nature of list as you read. (Since it is the ordinary one, there is not much to explain.)

Good points

- Easy to handle
- You can implement things with just list, without thinking about anything difficult

Bad points

- It becomes hard to understand as the number of dimensions increases
- You cannot see the length of each dimension at a glance (there is nothing like numpy's shape)
- Elements can only be referenced by numbers starting from 0
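
To illustrate the second point, here is a small sketch with a hypothetical nested list named `grid`. A list only knows its own length, so the length of each level has to be checked separately, and nothing prevents the inner lists from having different lengths:

```python
# A hypothetical 2-level nested list whose rows have different lengths.
grid = [[1, 2, 3], [4, 5], [6]]

# len() only reports the outermost level.
outer_length = len(grid)

# The inner lengths must be collected one by one.
inner_lengths = [len(row) for row in grid]

print(outer_length)   # out: 3
print(inner_lengths)  # out: [3, 2, 1]

# Elements are referenced only by position, counting from 0.
print(grid[1][0])     # out: 4
```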

Impressions

For better or worse, this is the starting point. Even numpy and pandas data can be converted to list and used that way. list is also the choice when the data is one-dimensional with an indefinite length, or when the order matters but the index numbers themselves do not. However, it is not well suited to multidimensional data. (You shouldn't design with too many dimensions in the first place.) For numerical data, combining it with numpy keeps things clean, and for complex data you can combine it with dict.

dict

Overview

It's a so-called **associative array**. In Python it is called a dictionary (dict for short). The symbol is {}, and each element is joined with : as in {key: value}. I will not explain it in detail with examples, so please see the official reference.

Any hashable value can be used as a dict key (a list, for example, cannot), but it is easier to understand if you use a string or an integer. Also, when there are many entries, it is easier to read if you start a new line for each key and align the indentation (like JSON). Here is an example.


ncc = { 'name': 'ncc',
        'full name': 'nakano computer club',
        'estimate': 2015,
        'web site': 'https://meiji-ncc.tech/'
        }

Good points

- Easy to understand
- Easy to use with JSON
- A string can be used to reference a value

Bad points

- If you are not used to it, it causes errors
- Values are hard to take out when the nesting gets deep
  - It ends up like dic['first']['second']['third']
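
The errors mentioned above are typically `KeyError`s raised when a key does not exist. The following sketch uses a made-up nested dict `dic` to show the deep access pattern and one way to soften it with `dict.get`, which returns a default instead of raising:

```python
# A made-up nested dict for illustration.
dic = {"first": {"second": {"third": 42}}}

# Deep access chains one [] per level.
value = dic["first"]["second"]["third"]
print(value)  # out: 42

# Accessing a missing key with [] raises KeyError.
try:
    dic["missing"]
except KeyError as error:
    print("KeyError:", error)  # out: KeyError: 'missing'

# dict.get returns a default (None unless specified) instead of raising.
fallback = dic.get("missing", {}).get("second", "not found")
print(fallback)  # out: not found
```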

Impressions

This is the one to use when values and their names matter! dict is also easier to handle when the keys are numbers at irregular intervals. Also, data is often gathered in a dict first before being converted to a pandas (library) DataFrame, because that is easier to handle. Think of it as something that ties a key to a value. It is also convenient for working with JSON. (It is easy to read and write using the json library.)
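
As a rough sketch of the JSON point, the standard `json` module converts a dict to a JSON string and back. The dict below reuses part of the earlier example, and the variable names are only for illustration:

```python
import json

ncc = {"name": "ncc", "full name": "nakano computer club"}

# dict -> JSON string
text = json.dumps(ncc)
print(text)  # out: {"name": "ncc", "full name": "nakano computer club"}

# JSON string -> dict
restored = json.loads(text)
print(restored == ncc)  # out: True
```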

set

Overview

The simple way to describe set is **a list without duplicates**. The symbol is {}, the same as dict, but it does not use :. As with list, the values are separated by ,. So if something is enclosed in {} with several , inside, it is a set. Strictly speaking, it represents a mathematical set, so you can also do set operations. (I won't go into them here.)

A list can hold the same value more than once, as in [0, 1, 2, 1, 0], but a set cannot. Converting the above list to set gives {0, 1, 2}. The same applies to values other than numbers.

Conversely, if you want to remove duplicates, you can convert to set. If you then want to treat the result as a list again, you need to convert the set back to list. Here is an example.

list_duplicate = [0, 1, 2, 2, 1, 0, 3]
list_non_duplicate = list(set(list_duplicate))
print(list_non_duplicate) # out: [0, 1, 2, 3] (the order is not guaranteed)

Good points

- Holds values without duplicates
- Set operations are possible

Bad points

- Loose about the order of elements
  - (When you convert from list, duplicates are removed and the remaining elements are packed together)
- The symbol is (slightly) hard to tell apart from dict
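
If the original order matters, one common alternative (not from the original article) is `dict.fromkeys`, which relies on dicts keeping insertion order in Python 3.7 and later. A minimal sketch:

```python
list_duplicate = [3, 0, 2, 2, 0, 1]

# set removes duplicates but makes no promise about order.
as_set = set(list_duplicate)

# dict keys are unique and keep first-seen order (Python 3.7+),
# so this removes duplicates while preserving the original order.
ordered_unique = list(dict.fromkeys(list_duplicate))

print(ordered_unique)  # out: [3, 0, 2, 1]
print(as_set == set(ordered_unique))  # out: True
```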

Impressions

It is rare to start out with a set. More often you convert from list when you want to remove duplicates or when you want to extract the elements common to several lists. So think of it as something you use when you take a set-based approach during implementation.
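
For example, extracting the elements common to two lists could look like the sketch below. The list names are hypothetical, and `sorted` is used only to get a predictable order back:

```python
list_a = [0, 1, 2, 3, 3]
list_b = [2, 3, 4, 5]

# & gives the intersection, | the union, - the difference.
common = set(list_a) & set(list_b)
either = set(list_a) | set(list_b)
only_a = set(list_a) - set(list_b)

# Convert back to list; sorted() returns a list in ascending order.
print(sorted(common))  # out: [2, 3]
print(sorted(either))  # out: [0, 1, 2, 3, 4, 5]
print(sorted(only_a))  # out: [0, 1]
```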

tuple

Overview

Should tuple be called **a slightly rigid list**? It has a few quirks. The symbol is ().

Basically it is like list, but different. Roughly speaking, **you cannot modify what you have made**. You can append another tuple to it (1), although strictly speaking this builds a new tuple and assigns it to the same name. You can also replace the tuple itself with something else entirely (2). However, its elements cannot be rewritten. It is a little tricky, so here is an example.

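t = (0, 1, 2)
# Append a tuple to the end (this builds a new tuple and rebinds t)
t += (3, 4) # OK (1)
# Replace the tuple itself
t = (0, 1, 2) # OK (2)
# Rewrite an element
t[0] = 1 # Error (TypeError: 'tuple' object does not support item assignment)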

There are no methods for adding or deleting elements. Also, since the elements cannot be rewritten, the order cannot be changed in place. To do these things, you need to convert it to list.
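
A minimal sketch of that conversion, with illustrative variable names: turn the tuple into a list, change it, and turn it back into a tuple if needed.

```python
t = (2, 0, 1)

# Convert to list so that the elements can be changed.
work = list(t)
work[0] = 5       # rewrite an element
work.append(3)    # add an element
work.sort()       # change the order in place

# Convert back to tuple; the original tuple t is untouched.
t_new = tuple(work)

print(t)      # out: (2, 0, 1)
print(t_new)  # out: (0, 1, 3, 5)
```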

Good points

- Once made, it cannot be rewritten
- The order at creation time is always guaranteed
- Behavior can be fixed
- Can be used like a constant

Bad points

- No flexibility
- Hard to handle
- A source of errors

Impressions

Python has no way to declare a constant (like const in JS), so you can use tuple as a pseudo-constant. However, I don't use it much because nothing dynamic can be done with it. Library functions sometimes return a tuple, so that is where you will use it.
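
Two small illustrations of these uses follow. `WEEKDAYS` is a made-up name for a pseudo-constant, and the built-in `divmod` stands in here for any function that returns a tuple:

```python
# A tuple used as a pseudo-constant: its elements cannot be reassigned.
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")

# The built-in divmod returns a (quotient, remainder) tuple.
result = divmod(17, 5)
print(result)  # out: (3, 2)

# A returned tuple can be unpacked into separate names.
quotient, remainder = result
print(quotient, remainder)  # out: 3 2
```

Note that the name `WEEKDAYS` itself can still be reassigned, which is why this is only a pseudo-constant.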


Next, I will move on to the array types in the Python library numpy. Before that, let's take a quick look at numpy itself.

(Supplement) What is numpy?

numpy is a library for the matrix operations used in linear algebra. It can add and subtract arrays element by element, and it is handy when you want to multiply a whole array by a number. It can do more advanced things too, but it is enough to think of it as **something that makes numerical calculation on arrays convenient**.

numpy.array/numpy.ndarray

Overview

Arrays in numpy are created with numpy.array(), and the resulting object is of type numpy.ndarray whether it is one-dimensional or multidimensional. The handling does not change much between one dimension and many. Since this comes from a library, there is no dedicated symbol for it. If you wrap a list or similar in numpy.array(), it is converted.

import numpy as np

# Define a regular list
list_num0 = [0, 1, 2, 3, 4]

# Convert it to a numpy array
np_num0 = np.array(list_num0)
print(np_num0) # out: [0 1 2 3 4]

# Create a numpy array directly
np_num1 = np.array([5, 6, 7, 8, 9])
print(np_num1) # out: [5 6 7 8 9]

# Convert a numpy array to a list (tolist() yields plain Python ints)
list_num1 = np_num1.tolist()
print(list_num1) # out: [5, 6, 7, 8, 9]

# Multiply the list and the numpy array by 2
list_num0_twice = 2*list_num0
print(list_num0_twice) # out: [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
np_num0_twice = 2*np_num0
print(np_num0_twice) # out: [0 2 4 6 8]

# Add list + list and numpy array + numpy array
list_num_add = list_num0 + list_num1
print(list_num_add) # out: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
np_num_add = np_num0 + np_num1
print(np_num_add) # out: [ 5  7  9 11 13]

In this way, numpy arrays make it easy to perform operations on the elements of an array.
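
As a further sketch (the array names are illustrative), `np.array` returns an object of type `numpy.ndarray` whether the input is flat or nested, and the `shape` attribute shows the length of every dimension at once:

```python
import numpy as np

# np.array builds a numpy.ndarray from a flat or a nested list.
one_dim = np.array([0, 1, 2, 3])
two_dim = np.array([[0, 1, 2], [3, 4, 5]])

print(type(one_dim) is np.ndarray)  # out: True
print(type(two_dim) is np.ndarray)  # out: True

# shape gives the length of each dimension as a tuple.
print(one_dim.shape)  # out: (4,)
print(two_dim.shape)  # out: (2, 3)

# Element-wise arithmetic works the same way in two dimensions.
print((two_dim * 2).tolist())  # out: [[0, 2, 4], [6, 8, 10]]
```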

Good points

- Easy operations between array elements
- Easy to handle even in multiple dimensions
- Calculation is (often) fast

Bad points

- Hard to handle until you get used to it
- Not suited to multidimensional arrays whose rows have different lengths
- Hard to use for anything other than numbers

Impressions

If you want to do mathematical things, numpy is definitely the choice. It goes well with scipy, a library for applied calculations (integration, etc.). If you find yourself doing a lot of calculation, give it a try. It is easy to use once you understand how it differs from list.

numpy.matrix

- I have never used it
- I looked into it, and it seems you should use ndarray instead
- It seems convenient when working with m × n matrices

Summary

Here is a summary of each in one word.

- list: the basic one
- dict: strong tie between names and values
- set: a list without duplicates
- tuple: a list that cannot be rewritten
- numpy.array / numpy.ndarray: a list specialized for numerical calculation
- numpy.matrix: don't use it, because it gets mixed up with numpy.ndarray

Flowchart

Here is a diagram that roughly shows how I personally decide which one to use.

(Figure: flowchart.png)

It is actually a little more complicated, but until you get used to it, thinking along these lines should be fine. I don't use tuple, so it is not included. (Actually, I don't use set much either.)

In closing

This time I compared the Python array-like guys. There are more if you include the finer details and other libraries. However, these are the basics, so once you understand them, I think everything else will be easier to understand.
