--anaconda3 is installed. For Anaconda setup, see Install anaconda on Mac
--If you don't have Anaconda installed, either install chardet with pip instead or install Anaconda first
--Install chardet
zsh
$ anaconda search -t conda chardet # Search anaconda.org for chardet
Using Anaconda API: https://api.anaconda.org
Run 'anaconda show <USER/PACKAGE>' to get more details:
Packages:
Name | Version | Package Types | Platforms
------------------------- | ------ | --------------- | ---------------
anaconda/chardet | 3.0.4 | conda | linux-ppc64le, linux-64, win-32, osx-64, linux-32, win-64
...
$ anaconda show anaconda/chardet # Pick one of the packages found and check how to install it
Using Anaconda API: https://api.anaconda.org
Name: chardet
Summary:
Access: public
Package Types: conda
Versions:
+ 2.3.0
+ 3.0.2
+ 3.0.3
+ 3.0.4
To install this package with conda run:
conda install --channel https://conda.anaconda.org/anaconda chardet
$ conda install --channel https://conda.anaconda.org/anaconda chardet # Copy and paste the install command printed by the previous step
Fetching package metadata .........
Solving package specifications: ..........
Package plan for installation in environment /Users/berry/.pyenv/versions/anaconda3-4.2.0:
The following packages will be downloaded:
package | build
---------------------------|-----------------
conda-env-2.6.0 | 0 601 B anaconda
chardet-3.0.4 | py35_0 188 KB anaconda
requests-2.14.2 | py35_0 725 KB anaconda
pyopenssl-16.2.0 | py35_0 70 KB anaconda
conda-4.3.22 | py35_0 516 KB anaconda
------------------------------------------------------------
Total: 1.5 MB
The following NEW packages will be INSTALLED:
chardet: 3.0.4-py35_0 anaconda
conda-env: 2.6.0-0 anaconda
The following packages will be UPDATED:
conda: 4.2.9-py35_0 --> 4.3.22-py35_0 anaconda
pyopenssl: 16.0.0-py35_0 --> 16.2.0-py35_0 anaconda
requests: 2.11.1-py35_0 --> 2.14.2-py35_0 anaconda
Proceed ([y]/n)? y
Fetching packages ...
conda-env-2.6. 100% |############################################################################| Time: 0:00:00 304.59 kB/s
chardet-3.0.4- 100% |############################################################################| Time: 0:00:02 83.22 kB/s
requests-2.14. 100% |############################################################################| Time: 0:00:52 14.02 kB/s
pyopenssl-16.2 100% |############################################################################| Time: 0:00:02 25.32 kB/s
conda-4.3.22-p 100% |############################################################################| Time: 0:00:18 28.66 kB/s
Extracting packages ...
[ COMPLETE ]|###############################################################################################| 100%
Unlinking packages ...
[ COMPLETE ]|###############################################################################################| 100%
Linking packages ...
[ COMPLETE ]|###############################################################################################| 100%
$
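Once the install finishes, a quick check from Python can confirm that chardet is importable in the active environment. The following is a minimal sketch; the helper name is_installed is made up for illustration and is not part of conda or chardet.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment.

    is_installed is a hypothetical helper name used only for this check.
    """
    return importlib.util.find_spec(module_name) is not None


if is_installed("chardet"):
    import chardet

    # Report the installed version and run one detection on a tiny sample.
    print("chardet", chardet.__version__, "is available")
    print(chardet.detect(b"hello"))
else:
    print("chardet is not installed in this environment")
```

If the second message is printed, the interpreter in use is probably not the one conda installed the package into.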
--Export the list of files under './file/' to CSV
python3
# support python3
import os
import datetime
import csv
from chardet.universaldetector import UniversalDetector
_path = './file/'
# get the encoding of a file
def univ_detect(file_dir):
    ud = UniversalDetector()
    with open(file_dir, 'rb') as fd:
        for b in fd:
            ud.feed(b)
            # stop reading as soon as the detector is confident
            if ud.done:
                break
    ud.close()
    return ud.result['encoding']
# get each file's name, path and last-modified timestamp
all_files = []
def get_file_list(file_path):
    # all_files = []
    file_list = os.listdir(file_path)
    for g in file_list:
        g_path = os.path.join(file_path, g)
        # last modified time
        last_modified = os.path.getmtime(g_path)
        dt = datetime.datetime.fromtimestamp(last_modified).strftime('%Y%m%d_%H:%M:%S')
        # chardet (detection can return None, e.g. for empty files)
        encode = 'Directory'
        if not os.path.isdir(g_path):
            encode = univ_detect(g_path) or 'Unknown'
        all_files.append([dt, g_path.split(_path,1)[1], ' '.join(['~',encode,'~'])])
        # subdirectory (one level deep only)
        if os.path.isdir(g_path):
            for j in os.listdir(g_path):
                j_path = os.path.join(g_path, j)
                # last modified time
                sub_last_modified = os.path.getmtime(j_path)
                sub_dt = datetime.datetime.fromtimestamp(sub_last_modified).strftime('%Y%m%d_%H:%M:%S')
                # chardet
                encode = 'Directory'
                if not os.path.isdir(j_path):
                    encode = univ_detect(j_path) or 'Unknown'
                all_files.append([sub_dt, j_path.split(_path,1)[1], ' '.join(['~',encode,'~'])])
    return file_list
    # return all_files
print(get_file_list(_path))
csv_file = [['Last_modified', 'file_path', 'encode']] # Header
csv_file.extend(all_files)
# newline='' lets the csv module control line endings itself
with open('file_checker.csv', 'w', newline='') as h:
    writer = csv.writer(h, lineterminator='\n')
    writer.writerows(csv_file)
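The script above only looks one level below './file/'. If deeper trees need to be listed, one possible variation is to let os.walk do the recursion. This is a sketch under assumptions, not the original method: the names list_tree and write_listing are hypothetical, the encoding detector is passed in as a parameter so the listing logic can be tried without chardet, and paths are made relative with os.path.relpath instead of split.

```python
import csv
import datetime
import os


def list_tree(root, detect):
    """Yield [timestamp, relative_path, label] for every entry under root.

    detect is any callable taking a file path and returning an encoding
    name or None (for example the univ_detect function from the script).
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames) + sorted(filenames):
            full = os.path.join(dirpath, name)
            mtime = os.path.getmtime(full)
            stamp = datetime.datetime.fromtimestamp(mtime).strftime('%Y%m%d_%H:%M:%S')
            if os.path.isdir(full):
                label = 'Directory'
            else:
                # Fall back to a placeholder when detection returns None.
                label = detect(full) or 'Unknown'
            yield [stamp, os.path.relpath(full, root), label]


def write_listing(root, out_path, detect):
    """Write the listing for root to a CSV file with a header row."""
    with open(out_path, 'w', newline='') as fh:
        writer = csv.writer(fh, lineterminator='\n')
        writer.writerow(['Last_modified', 'file_path', 'encode'])
        writer.writerows(list_tree(root, detect))
```

With the script's own function, the call would look like write_listing('./file/', 'file_checker.csv', univ_detect). Passing the detector in keeps the directory walk independent of chardet, which makes it easy to swap in a different detector later.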