I had a misunderstanding about the Luigi framework that cost me a lot of pain. Other people may run into the same trouble, so I'll write it down.
- Wait for the dependent task to complete
- The dependent task must succeed for subsequent processing to continue
- A task succeeds when it writes its file to the target specified by `output`
- A task fails when an exception is raised inside it
For example, consider the following case.
Suppose a task reads a list of about 1000 URLs from the file obtained through `input` and downloads a file from each URL. I think this is a common process, but there is a trap here. I don't want to download 1000 files serially, and I want to pass parameters to each task based on the data read from `input`, so I think it would be written with a [dynamic dependency](http://luigi.readthedocs.io/en/stable/tasks.html#dynamic-dependencies).
If even one of the 1000 created tasks fails, the subsequent processing will not be executed.
However, one or two tasks often fail because of a web server malfunction or a mistyped URL, and it is a problem if that stops all subsequent processing.
In this case, my conclusion is that the subsequent processing task and the task that generates the download tasks should not depend on each other, and the processing that connects them should be written outside of Luigi.
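One possible reading of this conclusion, as a rough sketch: the fan-out step runs in plain Python, records failures instead of propagating them, and the subsequent processing is then started separately on whatever succeeded. The names `download_all`, `process`, and the `fetch` callable are hypothetical, and `ThreadPoolExecutor` is only one of several ways to avoid serial downloads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL in parallel; collect failures instead of stopping."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one bad URL must not stop the rest
                failures[url] = exc
    return results, failures


def process(results):
    """Hypothetical subsequent processing over whatever was downloaded."""
    return sorted(results)
```

The caller decides what to do with `failures` (log them, retry later, or abort if there are too many), and `process` runs regardless of how many individual downloads failed.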