[PYTHON] An error when running luigi tasks in parallel on Windows, and how I solved it

What is luigi

luigi is a handy library for running a series of data conversions in dependency order. For example, if conversion process A depends on conversion process B, and B depends on conversion process C, luigi checks the dependencies and executes the processes in order, starting from C. It is useful when an error occurs partway through a series of processes, because a re-run skips the parts that have already completed. Parts that do not depend on each other are processed in parallel.
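
The ordering and skipping behavior described above can be sketched in plain Python. This is only an illustration of the idea, not luigi's API: the `run` function, the `requires` dictionary, and the task names A, B, and C are hypothetical, and the parallel execution of independent parts is left out.

```python
# Plain-Python sketch of dependency-ordered execution (NOT luigi's API).
# All names here are hypothetical and only illustrate the idea.

def run(task, requires, done, log):
    """Run `task` after everything it requires; skip tasks already in `done`."""
    if task in done:
        return  # already completed in an earlier run, so skip it
    for dep in requires.get(task, []):
        run(dep, requires, done, log)  # dependencies go first
    log.append(task)  # stands in for "execute the conversion"
    done.add(task)

# A depends on B, and B depends on C.
requires = {"A": ["B"], "B": ["C"], "C": []}

# First run: nothing is done yet, so the order is C, B, A.
first_run = []
run("A", requires, set(), first_run)
print(first_run)

# Re-run after a failure in which only C had completed: C is skipped.
second_run = []
run("A", requires, {"C"}, second_run)
print(second_run)
```

In luigi itself, a task declares its dependencies and its output, and a task whose output already exists is treated as complete.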

For more information, see "Building a data pipeline with Python and Luigi".

What causes the error

If the number of parallel workers is 2 or more, the following error occurs:

PicklingError: Can't pickle <function update_tracking_url at 0x0000000001E100B8>: it's not found as luigi.worker.update_tracking_url

It seems that multiprocessing and pickle do not work well together on Windows only, but I'm not sure.
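
As background for the error message: pickle stores a function only as "module name + function name", and it fails when the function cannot be found under that name in its module. On Windows, multiprocessing starts child processes without fork and pickles what it hands to them, which is one plausible reason the problem appears only there. The following is a minimal sketch of that pickle behavior using only the standard library; the helper `can_pickle` is a hypothetical name, and the sketch does not reproduce the luigi error itself.

```python
import pickle


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if pickling fails."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A built-in function can be looked up by name in its module,
# so pickle can store it by reference.
print(can_pickle(len))

# A lambda has no name that can be looked up in its module,
# so the by-name lookup fails and pickling raises an error.
anonymous = lambda x: x + 1
print(can_pickle(anonymous))
```

On CPython 3 the first call prints True and the second prints False.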

Solution

Downgrade luigi to version 1.2.1. If you don't specify a version with pip, the latest version (2.3.0 at the time of writing) is installed, which is a trap.

Postscript (2016/8/27): The version provided by conda has been updated to 2.3.0. It also seems to be the only version available for Windows, so there seems to be no choice but to install with pip.

~~For anaconda: conda install luigi~~

With pip:

pip install luigi==1.2.1

This installs version 1.2.1.
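
To confirm which version actually ended up in the environment, a small check like the following may help. It is a sketch that assumes Python 3.8 or later (for `importlib.metadata`); on older interpreters, `pip show luigi` reports the same information. The helper names `installed_version` and `major` are hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major(version_string):
    """Return the major version number, e.g. 1 for '1.2.1'."""
    return int(version_string.split(".")[0])


version = installed_version("luigi")
if version is None:
    print("luigi is not installed")
else:
    print("luigi", version, "(major version", str(major(version)) + ")")
```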

This solved it in my environment. However, in 'Pickle crashing when trying to pickle "update_tracking_url" in luigi.worker?', someone reported that upgrading to version 2.0.1 solved the problem, so you may need to try a few versions to see which one works for you.

Addendum on the differences between versions 1 and 2 (2016/8/27)

If you're willing to patch the package yourself, versions up to 2.1.1 can be used. See 'Pickle crashing when trying to pickle "update_tracking_url" in luigi.worker?' for the edits.

Version 2 seems to be more polished overall. In particular:

- More parameter data types can be specified explicitly.
- The UI of the global scheduler has been cleaned up.
- The dependency graph display in the global scheduler UI no longer fails.
- Task parameters are shown in the UI. As far as I can tell, in version 1 you could not see whether a task had parameters.

And so on.

Finally

Mario went to the Olympics, but I'm sorry, luigi.
