[Python] How to run pandas + multiprocessing (Pool) in Jupyter Notebook [pandas] Memo

Purpose

Problem

Conclusion

NG example

OK example

  1. Save the whole code below as func.py in the same folder as the .ipynb file
  2. Place an __init__.py file in that folder
  3. Import func in the Jupyter notebook and run it

func.py


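import glob
import os
from datetime import datetime as dt  # not used in this excerpt
from multiprocessing import Pool

import pandas as pd

FOLDER_PATH = r'folder_path'  # folder that holds the report files
FILE_TYPE = r'*.csv'
FILE_FORMAT = 'Report_%Y%m%d.csv'  # file-name pattern; not used in this excerpt


def read_report_to_dataframe():
    # List the paths of all CSV files in the folder
    csv_pathlist = glob.glob(os.path.join(FOLDER_PATH, FILE_TYPE))
    # pd.concat raises ValueError on an empty list, so return early
    if not csv_pathlist:
        return pd.DataFrame()
    # Read the files in parallel, one worker process per CPU core
    with Pool(os.cpu_count()) as p:
        df = pd.concat(p.map(read_report, csv_pathlist))

    return df


# Read one report file
def read_report(csv_path):
    # Candidate separators, tried in order
    separator_list = [';', ',']

    for sep in separator_list:
        df = pd.read_csv(filepath_or_buffer=csv_path,
                         engine='python',
                         parse_dates=[0],
                         index_col=[0],
                         skiprows=[1],
                         nrows=96,
                         sep=sep)
        # An empty data frame is taken to mean the separator did not match;
        # in that case, try the next one
        if not df.empty:
            break

    return df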

jupyter-notebook


import func
func.read_report_to_dataframe()
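
A likely reason this layout works (an assumption; the reasoning is not spelled out here): Pool hands the worker function to its child processes by pickling it, and pickle stores a function as a reference to its module and name rather than as code. A function in func.py can be looked up again by any process that can import func, while a function that exists only inside the notebook session may not be found by the child. The sketch below uses only the standard library and os.path.basename as a stand-in for a function that lives in an importable module.

```python
import os.path
import pickle

# A function that lives in an importable module is pickled as a reference
# (module name + function name), not as its code.
payload = pickle.dumps(os.path.basename)
references_by_name = b"basename" in payload

# The receiving side resolves that reference by importing the module again.
restored = pickle.loads(payload)
same_function = restored is os.path.basename

# A function with no importable name (here, a lambda) cannot be pickled.
try:
    pickle.dumps(lambda x: x)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False

print(references_by_name, same_function, lambda_pickles)
```

The same lookup happens for func.read_report when p.map sends it to the workers, which is why the function needs to sit in a module the workers can import.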

Summary
