[Python] Challenge 100 knocks! (010-014)

About the history so far

Please refer to the first post.

Knock status

9/24 added

Chapter 2: UNIX Command Basics

hightemp.txt is a tab-delimited file that stores records of the highest temperatures in Japan, with the columns "prefecture", "point", "℃", and "day". Create a program that performs the following processing, and run it with hightemp.txt as the input file. Then run the same process with UNIX commands and check the result against the program's output.

010. Counting the number of lines

Count the number of lines. Use the wc command for confirmation.

wc_010.py


# -*- coding: utf-8 -*-

import subprocess
import codecs

if __name__ == "__main__":
    filename = 'hightemp.txt'
    basepath = '/Users/masassy/PycharmProjects/Pywork/training/'

    # Count the lines. enumerate() starts at 0, so the total is the last index + 1.
    index = -1  # stays -1 (a count of 0) if the file is empty
    with codecs.open(filename, 'r', 'utf-8') as f:
        for index, data in enumerate(f):
            pass

    print("The number of lines in the file", index + 1)

    # Check the output with the wc command
    output = subprocess.check_output(["wc", "-l", basepath + filename])
    print(output.decode('utf-8'))

result


The number of lines in the file 24
      24 /Users/masassy/PycharmProjects/Pywork/training/hightemp.txt

Impressions: I opened the file and counted the lines using the index from enumerate. codecs is convenient because it lets you read a file with a specified character encoding.
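As a more compact alternative to tracking the index, the lines can be counted directly with a generator expression. The following is only a minimal sketch, not the code used above: the helper name count_lines and the sample text are assumptions made for illustration, and io.StringIO stands in for the opened file.

```python
import io


def count_lines(lines):
    """Count the items produced by an iterable of lines (hypothetical helper)."""
    # Iterating over a file object yields one line at a time,
    # so adding 1 per item gives the line count; an empty input gives 0.
    return sum(1 for _ in lines)


# io.StringIO stands in for an opened text file (illustrative sample data).
sample = io.StringIO("a\tb\nc\td\ne\tf\n")
print(count_lines(sample))  # prints 3
```

Note that wc -l counts newline characters, so the two results differ by one when the last line has no trailing newline.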

011. Replace tabs with spaces

Replace each tab character with one space character. Use the sed command, tr command, or expand command for confirmation.

tab2space_011.py


# -*- coding: utf-8 -*-

import subprocess
import codecs

if __name__ == "__main__":
    filename = 'hightemp.txt'
    basepath = '/Users/masassy/PycharmProjects/Pywork/training/'

    # read() returns the whole file as one string, readline() returns one line,
    # and readlines() returns a list of all lines
    with codecs.open(filename, 'r', 'utf-8') as f:
        r = f.read()

    # Copy the text character by character, replacing each tab with one space
    space_data = ''
    for tab_data in r:
        if tab_data == '\t':
            space_data += ' '
        else:
            space_data += tab_data

    print(space_data)

    # Check the output with the sed command
    # ("\t" here is a literal tab character, which Python passes to sed as-is)
    output = subprocess.check_output(["sed", "-e", "s/\t/ /g", basepath + filename])
    print(output.decode('utf-8'))

result


Kochi Prefecture Ekawasaki 41 2013-08-12
Saitama Prefecture Kumagaya 40.9 2007-08-16
Gifu Prefecture Tajimi 40.9 2007-08-16
(Omitted because the result is long)

Kochi Prefecture Ekawasaki 41 2013-08-12
Saitama Prefecture Kumagaya 40.9 2007-08-16
Gifu Prefecture Tajimi 40.9 2007-08-16
(Omitted because the result is long)

Process finished with exit code 0

Impressions: I was able to confirm the difference between read(), readline(), and readlines(). subprocess, which lets you run commands from Python, is really convenient.
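To make the difference between the three read methods concrete, here is a small sketch on in-memory data. The sample text is an assumption for illustration, and io.StringIO stands in for the opened file. It also shows str.replace, which performs the tab-to-space conversion in one call instead of a character loop.

```python
import io

text = "a\tb\nc\td\n"  # illustrative sample data, not hightemp.txt

# read() returns the whole content as a single string
whole = io.StringIO(text).read()

# readline() returns only the next line, including its newline
first = io.StringIO(text).readline()

# readlines() returns a list with one string per line
lines = io.StringIO(text).readlines()

# str.replace swaps every tab for one space in a single call
converted = whole.replace('\t', ' ')

print(repr(first))      # 'a\tb\n'
print(lines)            # ['a\tb\n', 'c\td\n']
print(repr(converted))  # 'a b\nc d\n'
```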

012. Save the first column in col1.txt and the second column in col2.txt

Save only the first column of each row as col1.txt and the second column as col2.txt. Use the cut command for confirmation.

cut_012.py


# -*- coding: utf-8 -*-

import codecs
import subprocess

if __name__ == "__main__":
    filename = 'hightemp.txt'
    writename1 = 'col1.txt'
    writename2 = 'col2.txt'
    basepath = '/Users/masassy/PycharmProjects/Pywork/training/'

    with codecs.open(filename, 'r', 'utf-8') as f:
        r = f.readlines()
    word_list1 = []
    word_list2 = []

    # Split each line on '\t' and add the first column to the list
    for temp1 in r:
        word_list1.append(temp1.split('\t')[0])
    # The with block closes the file, so the data is flushed to disk
    with codecs.open(writename1, 'w', 'utf-8') as f:
        for word in word_list1:
            f.write(word + '\n')

    # Do the same for the second column
    for temp2 in r:
        word_list2.append(temp2.split('\t')[1])
    with codecs.open(writename2, 'w', 'utf-8') as f:
        for word in word_list2:
            f.write(word + '\n')

    # Check the output with the cut command
    output = subprocess.check_output(["cut", "-f", "1,2", basepath + filename])
    print(output.decode('utf-8'))

result


*The cut command outputs the 1st and 2nd columns at the same time.
Kochi Prefecture Ekawasaki
Saitama Prefecture Kumagaya
Gifu Prefecture Tajimi
(Omitted because the result is long)

Process finished with exit code 0

col1.txt
Kochi Prefecture
Saitama Prefecture
Gifu Prefecture
(Omitted because the result is long)

col2.txt
Ekawasaki
Kumagaya
Tajimi
(Omitted because the result is long)

Impressions: I split the processing into separate passes for col1.txt and col2.txt, but there is probably a better way to do it ...
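One possible way to avoid the two separate passes is to split each line once and collect both columns in the same loop. This is only a sketch under assumptions: the helper name split_columns and the sample rows are made up for illustration, and every line is assumed to have at least two tab-separated fields.

```python
def split_columns(lines):
    """Return (first_column, second_column) lists from tab-delimited lines.

    Hypothetical helper; assumes every line has at least two fields.
    """
    col1, col2 = [], []
    for line in lines:
        # Drop the trailing newline, then split the line into its fields once
        fields = line.rstrip('\n').split('\t')
        col1.append(fields[0])
        col2.append(fields[1])
    return col1, col2


sample = ["A\tx\t1\n", "B\ty\t2\n"]  # illustrative rows, not hightemp.txt
col1, col2 = split_columns(sample)
print(col1)  # ['A', 'B']
print(col2)  # ['x', 'y']
```

Each list could then be written to its own file, one value per line, as in the program above.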

013. Merge col1.txt and col2.txt

Combine the col1.txt and col2.txt created in 012, and create a text file in which the first and second columns of the original file are arranged tab-delimited. Use the paste command for confirmation.

merge_013.py


# -*- coding: utf-8 -*-

import codecs
import subprocess

basepath = '/Users/masassy/PycharmProjects/Pywork/training/'
filename1 = 'col1.txt'
filename2 = 'col2.txt'
filename3 = 'col3.txt'

# Read each file with readlines() to get a list of lines
f1 = codecs.open(filename1, 'r', 'utf-8')
r1 = f1.readlines()
f1.close()

f2 = codecs.open(filename2, 'r', 'utf-8')
r2 = f2.readlines()
f2.close()

s_r1 = ''
s_r2 = ''

# Join the r1 list into one string, changing each \n to \t (\t acts as a sentinel)
for data in r1:
    s_r1 += str(data)
s_r1 = s_r1.replace('\n', '\t')

# Join the r2 list into one string (\n is left as it is because it is the sentinel)
for data in r2:
    s_r2 += str(data)

address = ''
i = 0
# Scan s_r1 character by character, adding data to address until the sentinel (\t)
for temp in s_r1:
    if temp != '\t':
        address += temp
    else:
        # Then add the s_r2 data to address until its sentinel (\n)
        address += '\t'
        while s_r2[i] != '\n':
            address += s_r2[i]
            i += 1
        address += '\n'
        i += 1

f3 = codecs.open(filename3, 'w', 'utf-8')
f3.write(address)
f3.close()

# Check the output with the paste command
output = subprocess.check_output(["paste", basepath + filename1, basepath + filename2])
print(output.decode('utf-8'))

result


Kochi Prefecture Ekawasaki
Saitama Prefecture Kumagaya
Gifu Prefecture Tajimi
(Omitted because the result is long)
Process finished with exit code 0

Impressions: I built up the data with a double loop.
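For comparison, the same merge can be sketched without sentinels by pairing the lines of the two files with zip. This is an alternative approach, not the method used above; the helper name merge_columns and the sample lists are assumptions for illustration.

```python
def merge_columns(col1_lines, col2_lines):
    """Join corresponding lines of two sequences with a tab (hypothetical helper)."""
    merged = []
    # zip pairs the first line of each input, then the second, and so on
    for left, right in zip(col1_lines, col2_lines):
        merged.append(left.rstrip('\n') + '\t' + right.rstrip('\n') + '\n')
    return ''.join(merged)


col1 = ["A\n", "B\n"]  # illustrative data, as returned by readlines()
col2 = ["x\n", "y\n"]
print(merge_columns(col1, col2), end='')
```

One difference from paste: zip stops at the end of the shorter input, whereas paste keeps going and leaves the missing field empty.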

014. Output N lines from the beginning

Receive the natural number N by means such as command line arguments, and display only the first N lines of the input. Use the head command for confirmation.

head_014.py


# -*- coding: utf-8 -*-

import codecs
import subprocess


def head(data, N):
    """Return the first N lines of the string data."""
    i = 0  # number of complete lines copied so far
    j = 0  # position of the next character in data
    msg = ''
    # Copy one character at a time; stop after N newlines or at the end of data
    while i < N and j < len(data):
        temp = data[j]
        msg += temp
        if temp == '\n':
            i += 1
        j += 1
    return msg


if __name__ == "__main__":
    filename = 'hightemp.txt'
    basepath = '/Users/masassy/PycharmProjects/Pywork/training/'
    with codecs.open(filename, 'r', 'utf-8') as f:
        r = f.read()
    N = 4
    msg = head(r, N)
    print(msg)

    # Confirm with the head command
    output = subprocess.check_output(["head", "-n", str(N), basepath + filename])
    print(output.decode('utf-8'))

result


Kochi Prefecture Ekawasaki 41 2013-08-12
Saitama Prefecture Kumagaya 40.9 2007-08-16
Gifu Prefecture Tajimi 40.9 2007-08-16
Yamagata Prefecture Yamagata 40.8 1933-07-25

Kochi Prefecture Ekawasaki 41 2013-08-12
Saitama Prefecture Kumagaya 40.9 2007-08-16
Gifu Prefecture Tajimi 40.9 2007-08-16
Yamagata Prefecture Yamagata 40.8 1933-07-25

Process finished with exit code 0

Impressions: It ended up looking like something written in C ...
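A less C-like version can be sketched with itertools.islice, which takes the first N items from any iterable, including a file object. The helper name head_lines and the sample text are assumptions for illustration, and io.StringIO stands in for the opened file.

```python
import io
from itertools import islice


def head_lines(lines, n):
    """Return the first n items of an iterable of lines as one string.

    Hypothetical helper; returns fewer lines if the input is shorter than n.
    """
    return ''.join(islice(lines, n))


# io.StringIO stands in for an opened text file (illustrative sample data).
sample = io.StringIO("1\n2\n3\n4\n5\n")
print(head_lines(sample, 2), end='')
```

To take N from the command line, as the task suggests, it could be read with int(sys.argv[1]) after importing sys.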
