[PYTHON] Use shutil to delete all folders with a small number of files

When you crawl a large amount of data, folders whose download stopped partway through get in the way, so you will want to delete them all at once. I put together a function that makes this easy.

Environment

Windows 10, Anaconda 3.6.1

shutil

shutil is one of Python's standard libraries. It provides high-level file and directory operations, such as copying or deleting an entire directory tree. [Official Reference] https://docs.python.jp/3/library/shutil.html
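As a small illustration of why shutil is used here: os.rmdir can only remove an empty directory, while shutil.rmtree removes a directory together with everything inside it. The sketch below works entirely inside a temporary directory created for the demo; the folder and file names are made up for illustration.

```python
import os
import shutil
import tempfile

# Build a throwaway directory tree so nothing real is touched.
root = tempfile.mkdtemp()
nested = os.path.join(root, "a", "b")
os.makedirs(nested)
with open(os.path.join(nested, "file.txt"), "w") as f:
    f.write("dummy")

# os.rmdir only removes empty directories, so it fails on this tree.
try:
    os.rmdir(root)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# shutil.rmtree removes the directory and everything under it.
shutil.rmtree(root)
root_exists = os.path.exists(root)
```

Because rmtree deletes without asking and does not use the recycle bin, it is worth double-checking the path before calling it.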

Code

delete_folder.py


import os
import shutil

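# By default, delete folders with 3 or fewer entries in the current working directory
def delete_folder(directory_dir=None, size=3):
    # Resolve the default at call time rather than when the function is defined
    if directory_dir is None:
        directory_dir = os.getcwd()

    folder_list = os.listdir(directory_dir)
    # Keep only subdirectories; calling os.listdir() on a regular file raises an error
    folder_dir = [os.path.join(directory_dir, i) for i in folder_list
                  if os.path.isdir(os.path.join(directory_dir, i))
                  and len(os.listdir(os.path.join(directory_dir, i))) <= size]
    for folder in folder_dir:
        print(folder + ' will be removed')
        shutil.rmtree(folder)

    print('Done')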
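Since shutil.rmtree deletes folders permanently, it can help to preview the targets first. The following is a minimal sketch of a dry-run variant, not part of the original script: the names find_small_folders, delete_small_folders, and the dry_run parameter are hypothetical and introduced only for illustration. The demo runs inside a temporary directory.

```python
import os
import shutil
import tempfile


def find_small_folders(directory=None, size=3):
    """Return paths of subfolders of `directory` that hold `size` or fewer entries."""
    if directory is None:
        directory = os.getcwd()
    result = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip regular files; only directories are candidates.
        if os.path.isdir(path) and len(os.listdir(path)) <= size:
            result.append(path)
    return result


def delete_small_folders(directory=None, size=3, dry_run=True):
    """List the small folders and delete them only when dry_run is False."""
    targets = find_small_folders(directory, size)
    for path in targets:
        if dry_run:
            print(path + ' would be removed')
        else:
            print(path + ' will be removed')
            shutil.rmtree(path)
    return targets


# Demo: one folder with 2 files and one with 5 files.
base = tempfile.mkdtemp()
for name, n_files in [("small", 2), ("large", 5)]:
    os.makedirs(os.path.join(base, name))
    for i in range(n_files):
        open(os.path.join(base, name, "{}.txt".format(i)), "w").close()

planned = delete_small_folders(base, size=3, dry_run=True)   # preview only
delete_small_folders(base, size=3, dry_run=False)            # actually delete
remaining = sorted(os.listdir(base))
shutil.rmtree(base)  # clean up the demo directory
```

Defaulting dry_run to True is a deliberate choice in this sketch: an accidental call in the wrong directory only prints paths instead of deleting them.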
