100 Language Processing Knocks, Chapter 1, in Python

I recently needed to learn Python, so I tried the 100 Language Processing Knocks. I start here with Chapter 1: Warm-up.

Language Processing 100 Knocks

http://www.cl.ecei.tohoku.ac.jp/nlp100/

00. Reverse order of strings

Get a string in which the characters of the string "stressed" are arranged in reverse (from the end to the beginning).

q00='stressed'
print(q00[::-1])
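
The slice `[::-1]` walks the string from the end to the beginning with a step of -1. As a side note, here is a minimal sketch comparing it with the built-in `reversed`; the variable names are only illustrative.

```python
text = "stressed"

# A slice with a negative step reads the string backwards.
by_slice = text[::-1]

# reversed() yields the characters one by one, so they must be joined again.
by_reversed = "".join(reversed(text))

print(by_slice, by_reversed)
```

Both give the same string; the slice is simply the shorter way to write it.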

01. "Patatokukashi" (パタトクカシーー)

Take the 1st, 3rd, 5th, and 7th characters of the string "パタトクカシーー" and concatenate them.

q01 = 'パタトクカシーー'
# Positions 1, 3, 5, 7 (1-based) are indices 0, 2, 4, 6 (0-based)
#print(q01[0]+q01[2]+q01[4]+q01[6])

# -> updated version: a slice with step 2, starting at index 0
print(q01[::2])  # パトカー
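
Slice notation has the form `s[start:stop:step]`, and Python indices start at 0, so "the 1st character" is index 0. The following is a small sketch on a made-up ASCII string (not part of the exercise) that shows which characters each starting index picks up.

```python
letters = "abcdefgh"

# Start at index 0: the 1st, 3rd, 5th, 7th characters.
odd_positions = letters[::2]

# Start at index 1: the 2nd, 4th, 6th, 8th characters.
even_positions = letters[1::2]

print(odd_positions, even_positions)
```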

02. "パトカー" + "タクシー" = "パタトクカシーー"

Alternately concatenate the characters of "パトカー" (police car) and "タクシー" (taxi) from the beginning to get the string "パタトクカシーー".

Solution 1

q021 = 'パトカー'
q022 = 'タクシー'

# Stop at the end of the shorter string
length = min(len(q021), len(q022))

ansq02 = ''
for i in range(length):
    ansq02 += q021[i] + q022[i]

print(ansq02)

Solution 2

q021 = 'パトカー'
q022 = 'タクシー'

# zip pairs up the characters; join concatenates the pairs
ansq022 = "".join(i + j for i, j in zip(q021, q022))

print(ansq022)
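
Both solutions stop at the end of the shorter string. The exercise does not need more than that, but if the leftover characters of the longer string should be kept, one possible sketch uses `itertools.zip_longest` from the standard library. The function name `interleave` is my own, not from the exercise.

```python
from itertools import zip_longest


def interleave(a, b):
    """Alternate the characters of a and b; leftovers of the longer string are appended."""
    # fillvalue="" makes the missing side contribute nothing once it runs out.
    return "".join(x + y for x, y in zip_longest(a, b, fillvalue=""))


print(interleave("abc", "12345"))
print(interleave("パトカー", "タクシー"))
```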

03. Pi

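Split the sentence "Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics." into words, and create a list of the number of (alphabetic) characters in each word, in order of appearance.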

q03="Now I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics."

ansq03=[len(i.strip(",.")) for i in q03.split()]

print(ansq03)
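
The answer above removes commas and periods with `strip`. As an alternative sketch, a regular expression can pick out only the runs of letters, so punctuation never reaches the count. This assumes the words contain only ASCII letters, which holds for this sentence.

```python
import re

sentence = ("Now I need a drink, alcoholic of course, "
            "after the heavy lectures involving quantum mechanics.")

# Each run of ASCII letters counts as one word; punctuation is never captured.
word_lengths = [len(word) for word in re.findall(r"[A-Za-z]+", sentence)]

print(word_lengths)
```

The lengths spell out the leading digits of pi, which is where the title of this problem comes from.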

04. Element symbol

Split the sentence "Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can." into words. Take the first character of the 1st, 5th, 6th, 7th, 8th, 9th, 15th, 16th, and 19th words and the first two characters of the other words, then create an associative array (dictionary type or map type) from each extracted string to the word's position (its number counting from the beginning).

q04="Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations Might Also Sign Peace Security Clause. Arthur King Can."

# Named ansq04 rather than dict so that the built-in dict type is not shadowed
ansq04 = {}

q04_list = [i.strip(",.") for i in q04.split()]
print(q04_list)

# 1-based positions of the words that contribute only their first character
q04_listNum = [1, 5, 6, 7, 8, 9, 15, 16, 19]

# start=1 makes idx the 1-based word position
for idx, val in enumerate(q04_list, start=1):
    if idx in q04_listNum:
        ansq04[val[0]] = idx
    else:
        ansq04[val[:2]] = idx

print(ansq04)
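
The same mapping can also be written as a single dictionary comprehension. This is only a sketch with names of my own choosing; it relies on every word in this sentence having at least two letters before any punctuation, so no stripping is needed.

```python
sentence = ("Hi He Lied Because Boron Could Not Oxidize Fluorine. New Nations "
            "Might Also Sign Peace Security Clause. Arthur King Can.")

# Word positions (1-based) that keep only their first character.
single_char_positions = {1, 5, 6, 7, 8, 9, 15, 16, 19}

symbols = {
    word[:(1 if position in single_char_positions else 2)]: position
    for position, word in enumerate(sentence.split(), start=1)
}

print(symbols)
```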

05. n-gram

Create a function that creates an n-gram from a given sequence (string, list, etc.). Use this function to get the word bi-grams and the character bi-grams from the sentence "I am an NLPer".

q05="I am an NLPer"

# character bi-grams: every pair of adjacent characters
char_bigram=[q05[i:i+2] for i in range(len(q05)-1)]
print(char_bigram)

# word bi-grams: every pair of adjacent words, joined with "-"
words=[i.strip(".,") for i in q05.split()]
words_bigram=["-".join(words[i:i+2]) for i in range(len(words)-1)]
print(words_bigram)
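
The problem statement asks for a function, while the code above builds the two bi-gram lists inline. One way to wrap the same slicing idea in a reusable function is sketched below. The name `n_gram` and its signature are my own choice; it works for any n and for both strings and lists because both support slicing.

```python
def n_gram(sequence, n):
    """Return every run of n consecutive items from a string or a list."""
    # A sequence of length L has L - n + 1 windows of size n (none if n > L).
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


sentence = "I am an NLPer"

char_bigrams = n_gram(sentence, 2)          # slices of a string are strings
word_bigrams = n_gram(sentence.split(), 2)  # slices of a list are lists

print(char_bigrams)
print(word_bigrams)
```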

06. Set

Find the sets of character bi-grams contained in "paraparaparadise" and "paragraph" as X and Y, respectively, and find the union, intersection, and difference of X and Y. In addition, find out whether the bi-gram 'se' is included in X and in Y.

def bigram(a):
    # All pairs of adjacent characters
    return [a[i:i+2] for i in range(len(a)-1)]

q061="paraparaparadise"
q062="paragraph"

# bigram() already returns a new list, so no deep copy is needed
bigramX_set=set(bigram(q061))
bigramY_set=set(bigram(q062))
print('bigramX_set =', bigramX_set)
print('bigramY_set =', bigramY_set)

#Union
print('Union =', bigramX_set | bigramY_set)
#Difference set (X - Y)
print('Difference set =', bigramX_set - bigramY_set)
#Intersection
print('Intersection =', bigramX_set & bigramY_set)
#Search: check X and Y separately, not their union
print("'se' in X =", 'se' in bigramX_set)
print("'se' in Y =", 'se' in bigramY_set)
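
Note that the set difference depends on the order of its operands: X - Y and Y - X are different sets. The sketch below (function and variable names are mine) prints both directions plus the symmetric difference, and sorts the results because sets have no fixed order.

```python
def char_bigrams(text):
    """Set of all adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


X = char_bigrams("paraparaparadise")
Y = char_bigrams("paragraph")

# Sets are unordered, so sort before printing to get stable output.
print("X - Y:", sorted(X - Y))
print("Y - X:", sorted(Y - X))
print("symmetric difference:", sorted(X ^ Y))
```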

07. Sentence generation by template

Implement a function that takes arguments x, y, z and returns the string "y at x is z". Furthermore, set x = 12, y = "temperature", z = 22.4, and check the execution result.

def maketext(x=1,y='Anko',z=10):
    # Fill the template "y at x is z"; the f-string converts numbers to text
    return f"{y} at {x} is {z}"

x,y,z=12,'temperature',22.4

print(maketext(x,y,z))  #=> 'temperature at 12 is 22.4'
#print(maketext())

08. Ciphertext

Implement the function cipher that converts each character of the given string according to the following specification.

- Replace each lowercase letter with the character whose code is (219 - character code)
- Output all other characters as they are

Use this function to encrypt and decrypt an English message.

#Q08
def cipher(a):
    ciptex_list=[]
    for ch in a:
        texCode=ord(ch)
        # ASCII lowercase letters are codes 97 ('a') to 122 ('z').
        # Use a chained comparison here: & is bitwise AND and binds tighter
        # than < and >, so "texCode>96 & texCode<123" also matches uppercase.
        if 96 < texCode < 123:
            updtexCode=chr(219-texCode)
        else:
            updtexCode=ch

        ciptex_list.append(updtexCode)

    return "".join(ciptex_list)

print(cipher('abcdef')) #=> 'zyxwvu'
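
Because 219 - (219 - c) = c, applying the conversion twice gives back the original text, so the same function both encrypts and decrypts. Here is a compact sketch of that round trip; `mirror_lowercase` is a name I made up, and the comparison `"a" <= ch <= "z"` limits the conversion to ASCII lowercase letters.

```python
def mirror_lowercase(text):
    """Map a<->z, b<->y, ... for ASCII lowercase letters; leave everything else unchanged."""
    return "".join(
        chr(219 - ord(ch)) if "a" <= ch <= "z" else ch
        for ch in text
    )


message = "Hello, World! 123"
encrypted = mirror_lowercase(message)
decrypted = mirror_lowercase(encrypted)  # the second pass undoes the first

print(encrypted)
print(decrypted)
```

Uppercase letters, digits, and punctuation pass through untouched in both directions.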

09. Typoglycemia

For a string of words separated by spaces, create a program that randomly rearranges the letters of each word while leaving the first and last letters in place. However, words with a length of 4 or less are not rearranged. Give an appropriate English sentence (for example, "I couldn't believe that I could actually understand what I was reading: the phenomenal power of the human mind.") and check the execution result.

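import random

def randsort(a):
    result = []
    # Split on whitespace and drop commas/periods attached to the words
    listA = [i.strip(',.') for i in a.split()]
    # Shuffle the characters of a string into a new string
    randchar = lambda x: ''.join(random.sample(x,len(x)))

    for i in listA:
        if len(i) > 4:
            # Keep the first and last characters; shuffle only the middle
            temp_word=i[0]+randchar(i[1:-1])+i[-1]
            result.append(temp_word)
        else:
            result.append(i)
    return result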

q09="I couldn't believe that I could actually understand what I was reading : the phenomenal power of the human mind ."

print(randsort(q09))
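
The function above returns a list of words. As a variation, the sketch below joins the words back into a sentence and accepts a random generator as an argument, so a seeded `random.Random` can make the output repeatable while testing. The names `typoglycemia` and `rng` are my own, and the seed value is arbitrary.

```python
import random


def typoglycemia(sentence, rng=random):
    """Shuffle the interior letters of every word longer than 4 characters."""
    shuffled_words = []
    for word in sentence.split():
        if len(word) > 4:
            middle = list(word[1:-1])
            rng.shuffle(middle)  # in-place shuffle of the interior letters
            word = word[0] + "".join(middle) + word[-1]
        shuffled_words.append(word)
    return " ".join(shuffled_words)


original = "I couldn't believe that I could actually understand what I was reading"
# Seeded only so that repeated runs give the same result.
scrambled = typoglycemia(original, rng=random.Random(0))

print(scrambled)
```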

This was my first time writing Python on my own. I had to look up a lot along the way, and I learned a lot from it. There are probably more efficient ways to solve these problems, but this is what I have for now.

References

"Getting the code value of a character" / "Getting a character from a code value" in Python http://d.hatena.ne.jp/flying-foozy/20111204/1323009984

Unicode HOWTO https://docs.python.jp/3/howto/unicode.html

Python: Compare two list elements with a set type set operation http://www.yukun.info/blog/2008/08/python-set-list-comparison.html

3.7 set type --set, frozenset http://docs.python.jp/2.5/lib/types-set.html
