[PYTHON] 100 language processing knocks 06 ~ 09

* 06. Set *

Find the sets of character bi-grams contained in "paraparaparadise" and "paragraph" as X and Y, respectively, and compute the union, intersection, and difference of X and Y. In addition, check whether the bi-gram 'se' is included in X and in Y.

nlp06.py


#!/usr/bin/env python
#coding:UTF-8
def char_ngram(n,seq):
    # Stop at len(seq)-n so that the last slice is still n characters long
    li = []
    for i in range(len(seq)-n+1):
        li.append(seq[i:i+n])
    return li

seq1 = "paraparaparadise"
seq2 = "paragraph"

X = set(char_ngram(2,seq1))
Y = set(char_ngram(2,seq2))

logical_sum = X.union(Y)
logical_product = X.intersection(Y)
logical_difference = X.difference(Y)#elements of X that are not in Y
# sorted() makes the print order reproducible
print("Union: " + str(sorted(logical_sum)))
print("Intersection: " + str(sorted(logical_product)))
print("Difference set: " + str(sorted(logical_difference)))
print("Is the bi-gram 'se' included in X? " + str('se' in X))
print("Is the bi-gram 'se' included in Y? " + str('se' in Y))

Execution result

Union: ['ad', 'ag', 'ap', 'ar', 'di', 'gr', 'is', 'pa', 'ph', 'ra', 'se']
Intersection: ['ap', 'ar', 'pa', 'ra']
Difference set: ['ad', 'di', 'is', 'se']
Is the bi-gram 'se' included in X? True
Is the bi-gram 'se' included in Y? False
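
For comparison, the set operations in this section can also be written with Python's set operators (|, &, -, ^). The following is only a minimal Python 3 sketch under that assumption, not the script above; the helper name ngram_set is made up for this illustration. It builds the n-grams as a set directly and prints both one-sided differences as well as the symmetric difference.

```python
def ngram_set(n, seq):
    """Return the set of n-grams of seq (hypothetical helper for this sketch)."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


X = ngram_set(2, "paraparaparadise")
Y = ngram_set(2, "paragraph")

# sorted() is used only to make the printed order stable.
print("Union:", sorted(X | Y))
print("Intersection:", sorted(X & Y))
print("Difference X - Y:", sorted(X - Y))
print("Difference Y - X:", sorted(Y - X))
print("Symmetric difference:", sorted(X ^ Y))
print("'se' in X:", "se" in X)
print("'se' in Y:", "se" in Y)
```

Note that X - Y and Y - X are different sets, while X ^ Y contains the elements that belong to exactly one of the two.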

* 07. Sentence generation by template *

Implement a function that takes arguments x, y, z and returns the string "y at x is z". Furthermore, set x = 12, y = "temperature", z = 22.4, and check the execution result.

nlp07.py


def combine(x,y,z):
    # Build the sentence "y at x is z" from the three arguments
    s1 = " at "
    s2 = " is "
    seq = str(y)+s1+str(x)+s2+str(z)
    return seq
x = 12
y = "temperature"
z = 22.4
print(combine(x,y,z))

Execution result

temperature at 12 is 22.4
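
Since the task is about filling a template, one alternative worth trying is string.Template from the standard library. The following is a minimal Python 3 sketch, not the script above; the names SENTENCE and fill_template are hypothetical, and the template text is taken from the task statement.

```python
from string import Template

# Template text from the task statement; $x, $y and $z are the placeholders.
SENTENCE = Template("$y at $x is $z")


def fill_template(x, y, z):
    """Substitute the three values into the template (hypothetical helper)."""
    return SENTENCE.substitute(x=x, y=y, z=z)


print(fill_template(12, "temperature", 22.4))
```

Keeping the template text in one place means the wording can be changed without touching the function body.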

* 08. Ciphertext *

Implement the function cipher that converts each character of the given character string according to the following specification: a lowercase letter is replaced with the character whose code is (219 - character code), and any other character is output as it is. Use this function to encrypt and decrypt an English message.

nlp08.py


#!/usr/bin/env python
#coding:UTF-8
def cipher(text):
    enc = ""
    for char in text:
        # Lowercase letters are the codes 97 ('a') to 122 ('z')
        if 97 <= ord(char) <= 122:
            enc += chr(219-ord(char))
        else:
            enc += char
    return enc
text = "Machine Learning"
print(cipher(text))#encryption
print(cipher(cipher(text)))#decryption (applying cipher twice restores the original)

Execution result

Mzxsrmv Lvzimrmt
Machine Learning
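
Because ord('a') + ord('z') is 219, the rule (219 - character code) maps a to z, b to y, and so on, so the same cipher can be expressed as a translation table. The following is a minimal Python 3 sketch under that observation, not the script above; the names TABLE and cipher_translate are made up for this illustration.

```python
import string

# Map each lowercase letter to its mirror image: a <-> z, b <-> y, ...
# This equals chr(219 - ord(c)) because ord('a') + ord('z') == 219.
TABLE = str.maketrans(string.ascii_lowercase, string.ascii_lowercase[::-1])


def cipher_translate(text):
    """Hypothetical alternative to cipher() built on str.translate."""
    return text.translate(TABLE)


message = "Machine Learning"
encrypted = cipher_translate(message)
print(encrypted)
print(cipher_translate(encrypted))  # applying it twice restores the message
```

Characters that are not in the table, such as uppercase letters and spaces, are passed through unchanged by str.translate.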

* 09. Typoglycemia *

Create a program that, for a sequence of words separated by spaces, keeps the first and last letters of each word and randomly rearranges the order of the other letters. However, words with a length of 4 or less are not rearranged. Give an appropriate English sentence (for example, "I couldn't believe that I could actually understand what I was reading: the phenomenal power of the human mind.") and check the execution result.

nlp09.py


#!/usr/bin/env python
#coding:UTF-8
#Typoglycemia
import random
def Typo(word):
    # Only words longer than 4 characters are shuffled
    if len(word)>4:
        wordli = list(word[1:-1])# inner letters, without the first and last
        random.shuffle(wordli)
        neword = word[0] + ''.join(wordli)+ word[-1]
        return neword
    else:
        return word

sentence="I couldn't believe that I could actually understand what I was reading : the phenomenal power of the human mind."

typostr=""
# Put a space before the final period so that it is treated as its own word
for word in sentence.replace("."," .").split():
    typostr +=" "+Typo(word)
print(typostr)

Execution result

I clundo't bivelee that I colud allctuay uedtnasrnd what I was riednag : the phnoneeaml peowr of the hmaun mind .

There should be a better way to write the typostr += " " + Typo(word) part.
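
One possible way to tidy up the string-building loop mentioned above is str.join with a generator expression, which also avoids the leading space. The following is a minimal Python 3 sketch, not a definitive rewrite; the names typo and typoglycemia and the optional rng parameter are assumptions added for this illustration.

```python
import random


def typo(word, rng=random):
    """Shuffle the inner letters of words longer than 4 characters."""
    if len(word) <= 4:
        return word
    inner = list(word[1:-1])
    rng.shuffle(inner)
    return word[0] + "".join(inner) + word[-1]


def typoglycemia(sentence, rng=random):
    # str.join builds the result in one pass and adds no leading space.
    return " ".join(typo(word, rng) for word in sentence.split())


rng = random.Random(0)  # fixed seed so that repeated runs give the same shuffle
print(typoglycemia("I couldn't believe that I could actually understand", rng))
```

Unlike the script above, this sketch does not separate a trailing period from the last word, so punctuation attached to a word is treated as its last character.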
