[LINUX] Python program that looks for files with the same name

Search for files with the same name in the folder (directory) hierarchy

When a folder hierarchy grows large and deep, files get duplicated by accident, and files with identical or merely similar names end up scattered across the hierarchy. This tends to cause trouble when searching or editing.

In my case, I had text files that I had been keeping for a while in a notebook app that manages a hierarchy. When I used a tool to migrate them mechanically to another notebook app, it turned out that several files shared the same name. To clean them up, I wrote a program that creates a file list so that files with the same name are easier to find.

Program specifications

The program outputs the complete list (sorted by file name), so that you can check visually which entries are marked and which are unmarked but have similar names. (If you don't need that, just look at the marked ones.)

List making program (Python)

Each line of the list contains the file name and the folder name separated by ":". Lines are sorted in ascending order of file name. When a file has the same name (before the extension) as the preceding entry, "  ***** duplicated file?? ****" is appended to the end of the line. The search starts from the current folder as the top level. The output file is "out.txt", written in UTF-8 with LF-only line breaks, so adjust this to suit your environment.

I'm still new to Python, so I put this together from information on various websites. Comments and advice are welcome.
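To make the line format concrete, here is a minimal sketch that applies the marking rule to a small in-memory list instead of a real folder. The function name `mark_duplicates` and the sample entries are hypothetical, and it compares the name before the extension exactly (via `os.path.splitext`), which is an assumption about the intended rule rather than the program's own code.

```python
import os

def mark_duplicates(entries):
    """entries: list of (file_name, folder) pairs; returns formatted lines."""
    lines = []
    prev_stem = None
    # Sort by file name so that equal names end up on neighbouring lines.
    for name, folder in sorted(entries):
        stem = os.path.splitext(name)[0]  # name before the extension
        line = name + " : " + folder
        if stem == prev_stem:
            line += "  ***** duplicated file?? ****"
        prev_stem = stem
        lines.append(line)
    return lines

# Hypothetical sample data, not taken from a real folder.
sample = [("memo.txt", "./work"), ("todo.txt", "./work"), ("memo.txt", "./old")]
for line in mark_duplicates(sample):
    print(line)
```

With this sample, only the second "memo.txt" line carries the mark, because it repeats the name of the entry directly above it.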


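# -*- coding: utf-8 -*-

import codecs
import os

oList = []      # one "file name : folder" entry per file
odata = ""      # text that will be written to out.txt
prev = "...."   # name (before the first ".") of the previous entry

# Walk the hierarchy below the current folder and collect every file.
for root, dirs, files in os.walk(u'.'):
    for file_ in files:
        itm = file_ + u' : ' + root
        oList.append(itm)

# Sort by file name so that identical names end up on adjacent lines.
oList.sort()

for data_ in oList:
    # The text before the first "." is treated as the name without extension.
    wList = data_.split('.')
    # Substring test: also flags names that merely contain the previous name.
    if prev in wList[0]:
        data_ = data_ + "  ***** duplicated file?? ****"
    prev = wList[0]
    odata = odata + data_ + "\n"

# Write the list as UTF-8 with LF line endings.
fout = codecs.open(u'out.txt', "w", "utf-8")
fout.write(odata)
fout.close()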
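
The list is described as UTF-8 with LF line breaks, to be adjusted to your environment. One possible way to make that adjustable is sketched below. The helper `write_list` and its parameters are hypothetical names introduced for illustration; it relies on the `encoding` and `newline` arguments of `io.open`.

```python
import io

def write_list(lines, path="out.txt", encoding="utf-8", newline="\n"):
    """Write one entry per line using the given encoding and line ending."""
    # newline="\n" keeps LF; newline="\r\n" turns every "\n" into CRLF.
    with io.open(path, "w", encoding=encoding, newline=newline) as fout:
        for line in lines:
            fout.write(line + u"\n")
```

For example, `write_list(lines, encoding="cp932", newline="\r\n")` would be one plausible setting for a Japanese Windows environment; the default arguments reproduce the UTF-8 and LF output described above.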

