Python script that removes the BOM from all UTF-8 files under a folder

Why I made it

For work I needed to convert UTF-8 files with a BOM into UTF-8 files without a BOM, so I wrote a script that converts them all at once.

It turned out to be unexpectedly handy, so I am publishing it. See the comments in the source for what each step does.

Script

convert.py


#---BOM removal tool
import sys, os, codecs, shutil

#---Get command line arguments
args = sys.argv

#---Get current path
current_path = os.getcwd()

#---Build the path of the temporary work file (os.path.join keeps it portable)
tempfile = os.path.join(current_path, 'tempfile')

#---Counter for the number of files processed
conv_count = 0

print("BOM file removal script")

#---If there is no argument, an error is output and the process ends.
if len(sys.argv) == 1:
    print("There are no arguments.")
    sys.exit(1)
else:
    #---Get the first argument
    filepath = args[1]

#---Check that the path given as the argument is an existing folder
if os.path.isdir(filepath):

    #---Walk the folder tree recursively with os.walk
    for (current, subfolders, files) in os.walk(filepath):
        
        #---Loop over the list of files found in this folder,
        #---taking them one at a time as fileName
        for fileName in files:
        
            #---Build the path of the file to process
            target_path = os.path.join(current, fileName)
            
            #---Conversion from UTF-8 with BOM to UTF-8 without BOM
            
            #---Open the file to process in read mode as UTF-8 with BOM
            with codecs.open(target_path, 'r', 'utf_8_sig') as r:
            
                #---Open the temporary file in write mode as UTF-8 without BOM
                with codecs.open(tempfile, 'w', 'utf-8') as w:
                
                    #---Read the file to process line by line (each line is assigned to line)
                    for line in r:
                    
                        #---Write the line to the temporary file
                        w.write(line)

            #---File replacement process
            #---Overwrite and copy the temporary file to the file to be processed
            shutil.copyfile(tempfile, target_path)
            
            #---Delete temporary files
            os.remove(tempfile)
            
            #---Count the processed file
            conv_count += 1

else:
    #---If the path of the specified argument does not exist, an error is output and the process ends.
    print("The specified folder does not exist.")
    sys.exit(1)

#---End message
print("Converted the files under " + filepath + " to UTF-8 without BOM")
print('Number of converted files: {}'.format(conv_count))
sys.exit(0)

How to run it

Python is platform-independent, so the script is intended to run on Windows, Mac, and Linux. The examples below assume you saved the script as convert.py.

・ For Windows

> py convert.py dir_path

・ For Mac and Linux

> python3 convert.py dir_path

Specify the full path of the target folder as dir_path.

Summary

The script recursively scans everything under the specified folder and processes every file it finds.
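Because the scan touches every file, it can be useful to list which files actually start with a BOM before converting anything. The following is a minimal sketch of such a check. The helper names has_utf8_bom and find_bom_files are hypothetical and are not part of convert.py.

```python
import codecs
import os


def has_utf8_bom(path):
    """Return True if the file at path starts with the UTF-8 BOM bytes."""
    with open(path, 'rb') as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def find_bom_files(root):
    """Walk root recursively and return the paths of files that start with a BOM."""
    found = []
    for current, subfolders, files in os.walk(root):
        for file_name in files:
            target_path = os.path.join(current, file_name)
            if has_utf8_bom(target_path):
                found.append(target_path)
    return found
```

Reading only the first three bytes in binary mode means the check does not need to decode the file, so it should also be safe to run over folders that contain non-text files.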

As an improvement, it would be more convenient to target only a specific type of file, for example by passing a file name pattern as an argument and processing only the files that match it.
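One possible way to add that kind of filtering is to match file names against a shell-style pattern with the standard fnmatch module. This is only a sketch. The function name iter_matching_files and the pattern argument are assumptions, not something the current script has.

```python
import fnmatch
import os


def iter_matching_files(root, pattern):
    """Yield paths under root whose file name matches a shell-style pattern."""
    for current, subfolders, files in os.walk(root):
        # fnmatch.filter keeps only the names that match, e.g. '*.csv'
        for file_name in fnmatch.filter(files, pattern):
            yield os.path.join(current, file_name)
```

In convert.py, the two nested for loops could then be replaced by a single loop such as `for target_path in iter_matching_files(filepath, '*.csv'):`, with the pattern taken from a second command line argument.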

I am sure there are smarter ways to remove a BOM, but Python lets you write this kind of processing with nothing but basic instructions, which makes it a convenient language after all.
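One candidate for a shorter approach is to work on the raw bytes instead of decoding and re-encoding the text. The sketch below removes the three BOM bytes in place and needs no temporary file. The function name strip_bom is hypothetical, and the sketch reads the whole file into memory, so it assumes the files are reasonably small.

```python
import codecs


def strip_bom(path):
    """Remove a leading UTF-8 BOM from the file at path.

    Returns True if a BOM was removed, False if the file had none.
    """
    with open(path, 'rb') as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        # No BOM, so leave the file untouched
        return False
    with open(path, 'wb') as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Since files without a BOM are skipped, the return value could also be used to count only the files that were actually changed.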
