[Python] I tried 100 Language Processing Knock 2020: Chapter 1

Introduction

I tried 100 Language Processing Knock 2020. Links to the other chapters are here, and the source code is here.

Chapter 1: Warm-up

No.00 Reverse order of character strings

Get a string in which the characters of the string "stressed" are arranged in reverse (from the end to the beginning).

Answer

000.py


text = 'stressed'  # named `text` so the built-in `str` is not shadowed
print(text[::-1])

# -> desserts

Comments

The string is reversed with a slice whose step is -1. It is nice that this kind of operation can be written so concisely.
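As a small sketch of how the slice compares with another approach, the built-in reversed() returns an iterator over the characters from last to first, which can be joined back into a string. The variable names below are only illustrative and are not part of the original answer.

```python
text = "stressed"

# Slice with a negative step: start at the end and walk backwards.
by_slice = text[::-1]

# reversed() yields the characters from last to first; join rebuilds a string.
by_reversed = "".join(reversed(text))

print(by_slice, by_reversed)
```

Both forms give the same string; the slice is shorter, while reversed() also works on iterables that do not support slicing.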

No.01 "パタトクカシーー"

Take the 1st, 3rd, 5th, and 7th characters of the string "パタトクカシーー" and get the string made by concatenating them.

Answer

001.py


text = "パタトクカシーー"
print(text[0:8:2])

# -> パトカー

Comments

Since the odd-numbered characters (1st, 3rd, 5th, 7th) are taken out, the slice step is set to 2.
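The following is a minimal sketch of how the start, stop, and step parts of a slice interact, assuming the original Japanese problem string "パタトクカシーー" (8 characters). The variable names are illustrative only.

```python
text = "パタトクカシーー"

# text[start:stop:step] picks indices start, start+step, ... below stop.
odd_chars = text[0:8:2]    # indices 0, 2, 4, 6 (1st, 3rd, 5th, 7th characters)

# Omitting start and stop defaults to the whole string.
same_result = text[::2]

# Starting at index 1 picks the remaining (even-numbered) characters.
even_chars = text[1::2]

print(odd_chars, same_result, even_chars)
```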

No.02 "パトカー" + "タクシー" = "パタトクカシーー"

Obtain the string "パタトクカシーー" by alternately joining the characters of "パトカー" and "タクシー" from the beginning.

Answer

002.py


str1 = "パトカー"
str2 = "タクシー"
print(''.join([s1 + s2 for s1, s2 in zip(str1, str2)]))

# -> パタトクカシーー

Comments

At first I thought about looping with an index, but the `zip` function lets you iterate over multiple sequences at once.
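One thing to keep in mind is that zip stops at the shorter input. The sketch below, which assumes the original Japanese strings "パトカー" and "タクシー", uses itertools.zip_longest with an empty fill value so that leftover characters are kept. The helper name interleave is hypothetical and not part of the original answer.

```python
from itertools import zip_longest


def interleave(a, b):
    """Alternate the characters of a and b; leftover characters are appended."""
    # fillvalue="" pads the shorter string so no character is dropped.
    return "".join(x + y for x, y in zip_longest(a, b, fillvalue=""))


print(interleave("パトカー", "タクシー"))  # -> パタトクカシーー
print(interleave("abc", "12345"))          # -> a1b2c345
```

With plain zip the second call would stop after "a1b2c3", which is fine for this problem because both strings have the same length.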

No.03 Pi

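Break the sentence "Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics." into words and create a list of the number of (alphabetic) characters in each word, in order of appearance.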

Answer

003.py


sentence = "Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics."
# Remove commas and periods, split on spaces, then measure each word
print([len(item) for item in sentence.replace(',', "").replace('.', "").split(' ')])

# -> [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9]

Comments

I tried using a list comprehension. It is convenient because a new list can be built in just a line or two.
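To make the comparison concrete, here is a sketch that builds the same list with an explicit loop and with a comprehension. It uses str.strip(",.") to drop punctuation at the ends of each word, which is a variation on the replace() calls in the answer; the variable names are illustrative.

```python
sentence = ("Now I need a drink, alcoholic of course, after the heavy "
            "lectures involving quantum mechanics.")

# Explicit loop: create an empty list and append one length per word.
lengths_loop = []
for word in sentence.split():
    lengths_loop.append(len(word.strip(",.")))

# List comprehension: the same logic in a single expression.
lengths_comp = [len(word.strip(",.")) for word in sentence.split()]

print(lengths_comp)
```

The lengths spell out the leading digits of pi, which is where the problem title comes from.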

No.04 Element symbol

Break the sentence "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can." into words. Take the first character of the 1st, 5th, 6th, 7th, 8th, 9th, 15th, 16th, and 19th words and the first two characters of the other words, and create an associative array (dictionary or map type) from each extracted string to the position of its word (its ordinal number from the beginning).

Answer

004.py


sentence = "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can."
words = sentence.split()
one_char_positions = [1, 5, 6, 7, 8, 9, 15, 16, 19]
symbols = {}  # named so the built-ins `str` and `dict` are not shadowed

for position, word in enumerate(words, start=1):
    if position == 12:
        symbols[word[:3:2]] = position  # special case: take 'Mg' from "Might" instead of 'Mi'
    elif position in one_char_positions:
        symbols[word[:1]] = position
    else:
        symbols[word[:2]] = position
print(symbols)

# -> {'H': 1, 'He': 2, 'Li': 3, 'Be': 4, 'B': 5, 'C': 6, 'N': 7, 'O': 8, 'F': 9, 'Ne': 10, 'Na': 11, 'Mg': 12, 'Al': 13, 'Si': 14, 'P': 15, 'S': 16, 'Cl': 17, 'Ar': 18, 'K': 19, 'Ca': 20}

Comments

The code feels a little long. Following the rules as stated, the 12th word "Might" would yield `Mi` rather than `Mg`; that bothered me, so I handle it separately with an `if` statement.
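If the rules are followed literally (so the 12th entry stays Mi), the loop can probably be condensed into a dictionary comprehension. This is only a sketch of that stricter reading, with illustrative names, and not the answer given above.

```python
sentence = ("Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations "
            "Might Also Sign Peace Security Clause. Arthur King Can.")

# Positions (1-based) whose words contribute only their first character.
one_char_positions = {1, 5, 6, 7, 8, 9, 15, 16, 19}

# enumerate(..., start=1) gives the 1-based word position directly.
symbols = {
    word[: 1 if position in one_char_positions else 2]: position
    for position, word in enumerate(sentence.split(), start=1)
}

print(symbols)
```

Using a set for the positions keeps the membership test cheap, although with nine entries the difference is negligible.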

No.05 n-gram

Create a function that creates an n-gram from a given sequence (string, list, etc.). Use this function to get the word bi-gram and the letter bi-gram from the sentence "I am an NLPer".

Answer

005.py


def n_gram(seq, n):
    # Slide a window of length n over seq and join each window into one string
    return ["".join(seq[i: i + n]) for i in range(len(seq) - n + 1)]

sentence = "I am an NLPer"
print(f"Word bi-gram:  {n_gram(sentence.split(), 2)}")
print(f"Character bi-gram:  {n_gram(sentence, 2)}")

# -> Word bi-gram:  ['Iam', 'aman', 'anNLPer']
#    Character bi-gram:  ['I ', ' a', 'am', 'm ', ' a', 'an', 'n ', ' N', 'NL', 'LP', 'Pe', 'er']

Comments

`join` is used to concatenate the elements of each window, which is why the word bi-grams come out as merged strings such as 'Iam'. Since the word bi-gram and the character bi-gram need the same processing, I made it a function, and I feel it turned out well.
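A possible variation, sketched below, returns each n-gram as a tuple instead of a joined string, so that word boundaries are preserved. The function name n_gram_tuples is hypothetical and not part of the original answer.

```python
def n_gram_tuples(seq, n):
    """Return the n-grams of seq as tuples, leaving the elements unjoined."""
    # Works for any sliceable sequence: a list of words or a plain string.
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


sentence = "I am an NLPer"
print(n_gram_tuples(sentence.split(), 2))
# -> [('I', 'am'), ('am', 'an'), ('an', 'NLPer')]
print(n_gram_tuples("NLP", 2))
# -> [('N', 'L'), ('L', 'P')]
```

Tuples are hashable, so these n-grams can still be placed in a set or used as dictionary keys.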

No.06 set

Find the sets of character bi-grams contained in "paraparaparadise" and "paragraph" as X and Y, respectively, and find the union, intersection, and difference of X and Y. In addition, find out whether the bi-gram "se" is included in X and in Y.

Answer

006.py


str1 = "paraparaparadise"
str2 = "paragraph"

def n_gram(seq, n):
    # A set comprehension keeps each distinct n-gram only once
    return {"".join(seq[i: i + n]) for i in range(len(seq) - n + 1)}

X = n_gram(str1, 2)
Y = n_gram(str2, 2)
print(f"Union:{X | Y}")
print(f"Intersection:{X & Y}")
print(f"Difference set:{X - Y}")

se = {"se"}
print(f"Is se included in X? :{se <= X}")  # subset test
print(f"Is se included in Y? :{se <= Y}")

# -> Union:{'ph', 'di', 'ar', 'gr', 'ad', 'is', 'se', 'ap', 'pa', 'ra', 'ag'}
#    Intersection:{'ra', 'ap', 'ar', 'pa'}
#    Difference set:{'is', 'di', 'se', 'ad'}
#    Is se included in X? :True
#    Is se included in Y? :False
# (the order of elements in a printed set may differ between runs)

Comments

The set methods `union()`, `intersection()`, and `difference()` can also be used to compute the union, intersection, and difference.
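As a rough sketch of that equivalence, the snippet below computes the same three results with the method forms and checks membership with the in operator instead of a subset test. The helper char_bigrams is a hypothetical name introduced only for this example.

```python
def char_bigrams(text):
    """Return the set of character bi-grams in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


X = char_bigrams("paraparaparadise")
Y = char_bigrams("paragraph")

# Method forms of the | & - operators.
union = X.union(Y)
intersection = X.intersection(Y)
difference = X.difference(Y)

# sorted() gives a stable order for display, since sets are unordered.
print(sorted(intersection))  # -> ['ap', 'ar', 'pa', 'ra']
print(sorted(difference))    # -> ['ad', 'di', 'is', 'se']

# Membership of a single element can be tested directly with `in`.
print("se" in X, "se" in Y)  # -> True False
```

One practical difference is that the methods accept any iterable as an argument, whereas the operators require both operands to be sets.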

No.07 Sentence generation using template

Implement a function that takes arguments x, y, z and returns the string "y at x is z". Furthermore, set x = 12, y = "temperature", z = 22.4, and check the execution result.

Answer

007.py


def template(x, y, z):
    return f"{y} at {x} is {z}"

print(template(12, "temperature", 22.4))

# -> temperature at 12 is 22.4

Comments

Nothing special.

No.08 Ciphertext

Implement the function cipher that converts each character of the given string according to the following specification.
・If the character is a lowercase letter, replace it with the character whose code is (219 - character code)
・Output all other characters as they are
Use this function to encrypt and decrypt an English message.

Answer

008.py


def cipher(sentence):
    return "".join([chr(219 - ord(ch)) if ch.islower() else ch for ch in sentence])

sen = "FireWork"
print(cipher(sen))
print(cipher(cipher(sen)))

# -> FrivWlip
#    FireWork

Comments

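This appears to be the Atbash cipher. Because 219 = ord('a') + ord('z'), each lowercase letter is mapped to its mirror image in the alphabet, so applying the cipher function twice restores the original string.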
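The mirroring can be checked with a short sketch. Note one deliberate difference from the answer above: this version also requires ch.isascii(), an assumption added here so that non-ASCII lowercase letters (for which 219 minus the code would be negative) are passed through unchanged.

```python
import string


def cipher(text):
    """Mirror ASCII lowercase letters (a<->z, b<->y, ...); keep everything else."""
    return "".join(
        chr(219 - ord(ch)) if ch.isascii() and ch.islower() else ch
        for ch in text
    )


# 97 + 122 == 219, so 219 - ord(ch) reflects a..z onto z..a.
print(ord("a") + ord("z"))              # -> 219
print(cipher(string.ascii_lowercase))   # -> zyxwvutsrqponmlkjihgfedcba

# Applying the function twice returns the original message.
message = "FireWork"
print(cipher(message), cipher(cipher(message)))  # -> FrivWlip FireWork
```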

No.09 Typoglycemia

Create a program that, for a sequence of words separated by spaces, keeps the first and last letters of each word and randomly rearranges the order of the other letters. However, words of length 4 or less are not rearranged. Give it a suitable English sentence (for example, "I couldn't believe that I could actually understand what I was reading : the phenomenal power of the human mind.") and check the execution result.

Answer

009.py


import random

sentence = "I couldn’t believe that I could actually understand what I was reading : the phenomenal power of the human mind."
new_sent = ""
for item in sentence.split():
    if len(item) > 4:
        new_item = []
        new_item.extend(item[0])
        new_item.extend(random.sample(item[1:-1], len(item) - 2))
        new_item.extend(item[-1])
        item = new_item
    new_sent += "".join(item) + " "

print(new_sent)

# -> (example; the result changes on every run)
#    I could’nt blveeie that I cuold atlculay utnresnadd what I was renadig : the pamohneenl pewor of the human mdin.

Comments

Besides random.sample, random.shuffle can also randomly rearrange the elements of a list. Because shuffle rearranges the original list in place, I think using it could make the code a little shorter.
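A version along those lines might look like the sketch below. The function name typoglycemia and its structure are assumptions made for this example, not the original answer; no expected output is shown because the result is random.

```python
import random


def typoglycemia(sentence):
    """Shuffle the inner letters of every word longer than 4 characters."""
    words = []
    for word in sentence.split():
        if len(word) > 4:
            middle = list(word[1:-1])
            random.shuffle(middle)  # shuffles in place and returns None
            word = word[0] + "".join(middle) + word[-1]
        words.append(word)
    # join avoids the trailing space left by appending " " after each word
    return " ".join(words)


original = "I could actually understand what I was reading"
scrambled = typoglycemia(original)
print(scrambled)
```

Since shuffle needs a mutable sequence, the inner letters are first copied into a list; a string cannot be shuffled directly.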

References

[upura/nlp100v2020: 100 Language Processing Knock 2020 solved with Python](https://github.com/upura/nlp100v2020)
Amateur Language Processing 100 Knock: Summary
