Python application: Pandas Part 2: Series

Continuing from the previous post.

This post covers Series, the Pandas data structure that handles one-dimensional data.

Series generation

As with NumPy, import Pandas under a short alias before using it.

import pandas as pd
#import loads pandas
#as pd lets you refer to pandas as pd

One of Pandas' data structures, Series, can be treated like a one-dimensional array.

You can generate a Series by passing a dictionary, in the form pd.Series(dictionary). The keys become the index and the values become the data.

You can also generate a Series by specifying the data and the index associated with it.

pd.Series(data, index=index) #data: array of values, index: array of index values

If no index is specified, integers are automatically added as indexes in ascending order from 0.
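As a minimal sketch of the two forms described above, the following builds one Series with an explicit index and one without. The values and labels are made up for illustration.

```python
import pandas as pd

# Illustrative data; any list of values and matching list of labels works.
data = [11, 4, 10]
labels = ["soccer", "tennis", "basketball"]

# Explicit index: each value is associated with the label at the same position.
with_index = pd.Series(data, index=labels)

# No index given: pandas assigns 0, 1, 2, ... automatically.
without_index = pd.Series(data)

print(with_index)
print(without_index)
```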

When you print a Series, the last line looks like this:

dtype: int64

This indicates that the values stored in the Series are of the data type "int64". dtype stands for "data type". (Integer data gets an "int" type, data with a decimal point gets a "float" type, and so on.)

int64 is a 64-bit integer type that can hold integers from -2^63 to 2^63 - 1.

There are other dtypes, such as int32, that are also integers but differ in size. There is also a bool type, which holds only True or False.
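The short sketch below shows how these dtypes might appear in practice. The sample values are made up, and astype is used here only to illustrate converting to a different integer size.

```python
import numpy as np
import pandas as pd

ints = pd.Series([3, 2])           # whole numbers
floats = pd.Series([3.5, 2.0])     # numbers with a decimal point
flags = pd.Series([True, False])   # boolean values

print(ints.dtype)
print(floats.dtype)
print(flags.dtype)

# astype returns a new Series with the requested dtype.
small = ints.astype("int32")
print(small.dtype)

# Range of a 64-bit signed integer, as reported by NumPy.
info = np.iinfo(np.int64)
print(info.min, info.max)
```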

import pandas as pd

fruits = {"banana": 3, "orange": 2}
print(pd.Series(fruits))
#Output result
banana    3
orange    2
dtype: int64

Series (uppercase)

Note that the class name Series starts with an uppercase S: write pd.Series, not pd.series.

Referencing elements

There are two ways to reference elements of a Series: by index number (position) or by index value (label).

As with list slice notation, you can retrieve any range by writing something like series[:3].

To retrieve several elements by index value, pass the desired index values together in one list. If you specify a single value instead of a list, only the element that corresponds to it is returned.

In the code below, the index numbers are specified with a slice and the data is retrieved.

import pandas as pd
fruits = {"banana": 3, "orange": 4, "grape": 1, "peach": 5}
series = pd.Series(fruits)
print(series[0:2])

#Insertion order is preserved when a dictionary is converted to a Series,
#so orange comes after banana.
#(In pandas versions before 0.23.0 the keys were sorted in ascending order,
#so grape came after banana in alphabetical order.)

#Output result
banana    3
orange    4

#The code below retrieves the data by specifying the index values in a list.

import pandas as pd
fruits = {"banana": 3, "orange": 4, "grape": 1, "peach": 5}
series = pd.Series(fruits)
print(series[["orange", "peach"]])
#Output result
orange    4
peach     5

#The code below specifies the index number and retrieves a single piece of data.

import pandas as pd
fruits = {"banana": 3, "orange": 4, "grape": 1, "peach": 5}
series = pd.Series(fruits)
print(series.iloc[3]) #.iloc selects by position; plain series[3] on a labeled Series is deprecated in recent pandas
#Output result
5
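
The original text does not cover them, but pandas also offers the explicit accessors .loc (by index value) and .iloc (by index number). The sketch below repeats the lookups above with those accessors, as one possible way to make the intent unambiguous.

```python
import pandas as pd

fruits = {"banana": 3, "orange": 4, "grape": 1, "peach": 5}
series = pd.Series(fruits)

by_label = series.loc["peach"]               # look up by index value
by_position = series.iloc[3]                 # look up by 0-based position
label_slice = series.loc["banana":"grape"]   # label slices include the end label
position_slice = series.iloc[0:2]            # position slices exclude the end position

print(by_label, by_position)
print(label_slice)
print(position_slice)
```

Note the difference between the two slices: slicing by label keeps the end label, while slicing by position stops before the end position, as with a Python list.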

Note: List index values

When specifying several index values, pass them as a list. The list goes inside the indexing brackets, which results in double brackets as shown below.

[[]]

#Example
import pandas as pd
fruits = {"banana": 3, "orange": 4, "grape": 1, "peach": 5}
series = pd.Series(fruits)
print(series[["orange", "peach"]]) #Pay attention here

Extract data and index

A Series has attributes for retrieving only its data values or only its index.

Data values (series.values)

import pandas as pd

index = ["soccer", "tennis", "basketball"]
data = [11, 4, 10]
series = pd.Series(data, index=index)
print(series.values)
#Output result
[11  4 10]

Index reference (series.index)

To retrieve only the index, use series.index.

import pandas as pd

index = ["soccer", "tennis", "basketball"]
data = [11, 4, 10]
series = pd.Series(data, index=index)
print(series.index)
#Output result
Index(['soccer', 'tennis', 'basketball'], dtype='object')
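
As a small sketch of how the two attributes can be used together, the following converts the index to a plain list and pairs each index value with its data value. The variable names labels and values are chosen here for illustration.

```python
import numpy as np
import pandas as pd

series = pd.Series([11, 4, 10], index=["soccer", "tennis", "basketball"])

values = series.values          # NumPy array holding the data
labels = series.index.tolist()  # plain Python list of the index values

print(type(values))
print(labels)

# Walk through the index values and data values side by side.
for label, value in zip(labels, values):
    print(label, value)
```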

Add an element

When adding an element to a Series, the element to be added must also be a Series.

Convert the element you want to add to a Series first, then join the two with pd.concat([series, new_series]). (Older tutorials use series.append(), but that method was deprecated in pandas 1.4 and removed in pandas 2.0.)

import pandas as pd

fruits = {"banana": 3, "orange": 2}
series = pd.Series(fruits)
print(series)
#Output result
banana    3
orange    2
dtype: int64

#Continuing with the series above
grape = {"grape": 3}
series = pd.concat([series, pd.Series(grape)])
print(series)
#Output result
banana    3
orange    2
grape     3
dtype: int64

#Instead of the dictionary form above, you can also build the Series to add from a list and an index.

series = pd.concat([series, pd.Series([3], index=["grape"])])

Drop an element (.drop)

You can delete an element by specifying its index value.

When the Series variable is series:

series.drop("Index value")
#This returns a new Series without the element that has the specified index value.

import pandas as pd

fruits = {"banana": 3, "orange": 2}
series = pd.Series(fruits)
series = series.drop("banana")
print(series)
#Output result
orange    2
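
The sketch below illustrates two details of drop that are easy to miss: it returns a new Series and leaves the original unchanged, and it also accepts a list of index values. The sample data is made up.

```python
import pandas as pd

series = pd.Series({"banana": 3, "orange": 2, "grape": 1})

# drop returns a new Series; the original keeps all three elements.
without_banana = series.drop("banana")

# A list of index values removes several elements at once.
without_two = series.drop(["banana", "grape"])

print(series)
print(without_banana)
print(without_two)
```

This is why the example above assigns the result back with series = series.drop("banana").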

Filtering

You may want to extract only the elements of a Series that match a condition.

In Pandas, if you index a Series with a bool sequence, only the elements where the sequence is True are extracted.

A sequence here means an ordered collection of values, such as a list.

import pandas as pd

index = ["apple", "orange", "banana", "strawberry", "kiwifruit"]
data = [10, 5, 8, 12, 3]
series = pd.Series(data, index=index)

conditions = [True, True, False, False, False]
print(series[conditions])
#Output result
apple     10
orange     5

Here the bool sequence was written by hand, but in Pandas a conditional expression on a Series (or DataFrame) produces a bool sequence for you.

Conditional element only

series[series >= 5]

Specified as above, only the elements with a value of 5 or greater are returned.
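
To make the intermediate step visible, this sketch stores the result of the conditional expression in a variable (named mask here for illustration) before using it to filter. The data is the same as in the example above.

```python
import pandas as pd

index = ["apple", "orange", "banana", "strawberry", "kiwifruit"]
data = [10, 5, 8, 12, 3]
series = pd.Series(data, index=index)

# The comparison is evaluated element by element and yields a bool Series.
mask = series >= 5
print(mask)

# Indexing with the bool Series keeps only the True positions.
filtered = series[mask]
print(filtered)
```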

AND condition

#In case of AND condition
series[Condition 1][Condition 2] #Apply an AND condition by chaining multiple [ ]
series[(Condition 1) & (Condition 2)] #Or join the parenthesized conditions with &

OR condition

#For OR conditions
series[(Condition 1) | (Condition 2)]
#Join the parenthesized conditions with | (vertical bar).
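
A runnable sketch of the AND and OR forms, using the same fruit data as above. The thresholds 5 and 10 are arbitrary values chosen for illustration.

```python
import pandas as pd

index = ["apple", "orange", "banana", "strawberry", "kiwifruit"]
data = [10, 5, 8, 12, 3]
series = pd.Series(data, index=index)

# AND: both conditions must hold (5 or greater and less than 10).
both = series[(series >= 5) & (series < 10)]

# The same AND result by filtering in two steps.
step1 = series[series >= 5]
chained = step1[step1 < 10]

# OR: at least one condition must hold (less than 5 or greater than 10).
either = series[(series < 5) | (series > 10)]

print(both)
print(chained)
print(either)
```

The parentheses around each condition are required, because & and | bind more tightly than comparison operators such as >= in Python.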

Sorting

In a Series, you can sort by index or by data. When the Series variable is series:

series.sort_index() #Sort by index
series.sort_values() #Sort by data

Unless otherwise specified, the sort is in ascending order. Passing the following argument makes it descending.

ascending=False #Passing this sorts in descending order.

#Sort by index in ascending order
import pandas as pd

index = ["apple", "orange", "banana", "strawberry", "kiwifruit"]
data = [10, 5, 8, 12, 3]
series = pd.Series(data, index=index)

items1 = series.sort_index()
print(items1)
#Output result
apple         10
banana         8
kiwifruit      3
orange         5
strawberry    12

#Sort by data in descending order
items2 = series.sort_values(ascending=False)
print(items2)
#Output result
strawberry    12
apple         10
banana         8
orange         5
kiwifruit      3
