This is a record of my attempt at the 2015 edition of the 100 Language Processing Knocks. The environment is Ubuntu 16.04 LTS + Python 3.5.2 :: Anaconda 4.1.1 (64-bit). A list of past knocks is available here (http://qiita.com/segavvy/items/fb50ba8097d59475f760).
Find the sets of character bi-grams contained in "paraparaparadise" and "paragraph" as X and Y, respectively, and find the union, intersection, and difference of X and Y. In addition, check whether the bi-gram 'se' is included in X and in Y.
The finished code:
main.py
# coding: utf-8
def n_gram(target, n):
    '''Create n-grams from the specified list
    Arguments:
    target -- target list
    n -- the n of the n-gram (1 for uni-gram, 2 for bi-gram...)
    Return value:
    list of grams
    '''
    result = []
    for i in range(0, len(target) - n + 1):
        result.append(target[i:i + n])
    return result
# Create the sets
set_x = set(n_gram('paraparaparadise', 2))
print('X:' + str(set_x))
set_y = set(n_gram('paragraph', 2))
print('Y:' + str(set_y))
# Union
set_or = set_x | set_y
print('Union:' + str(set_or))
# Intersection
set_and = set_x & set_y
print('Intersection:' + str(set_and))
# Difference set
set_sub = set_x - set_y
print('Difference set:' + str(set_sub))
# Is 'se' included?
print('se is included in X:' + str('se' in set_x))
print('se is included in Y:' + str('se' in set_y))
Execution result:
Terminal
X:{'ar', 'se', 'di', 'is', 'pa', 'ap', 'ad', 'ra'}
Y:{'ar', 'gr', 'ph', 'ra', 'pa', 'ap', 'ag'}
Union:{'ar', 'gr', 'se', 'ph', 'di', 'is', 'pa', 'ap', 'ad', 'ra', 'ag'}
Intersection:{'ar', 'ap', 'pa', 'ra'}
Difference set:{'di', 'is', 'se', 'ad'}
se is included in X:True
se is included in Y:False
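Note that a set is unordered, so the elements may be printed in a different order on another run. As a small sketch that is not part of the original answer, wrapping each set in sorted() gives a stable display. The variable names sorted_x and sorted_y are introduced here only for illustration.

```python
def n_gram(target, n):
    """Return the list of n-grams of target (same logic as main.py)."""
    return [target[i:i + n] for i in range(len(target) - n + 1)]


set_x = set(n_gram('paraparaparadise', 2))
set_y = set(n_gram('paragraph', 2))

# sorted() returns a list in alphabetical order, so the display is stable
sorted_x = sorted(set_x)
sorted_y = sorted(set_y)
print('X:' + str(sorted_x))
print('Y:' + str(sorted_y))
```

With this, X is always shown as ['ad', 'ap', 'ar', 'di', 'is', 'pa', 'ra', 'se'] and Y as ['ag', 'ap', 'ar', 'gr', 'pa', 'ph', 'ra'].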
n_gram() is reused from the previous question.
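Because n_gram() only slices its argument, the same function works for a string (character n-grams) and for a list of words (word n-grams). The following is a rough sketch of that; the sample sentence and the variable names are my own choices for illustration. One point to be careful about: word n-grams come back as lists, which cannot be put in a set, so they need to be converted to tuples first.

```python
def n_gram(target, n):
    """Return the list of n-grams of target (same logic as main.py)."""
    result = []
    for i in range(0, len(target) - n + 1):
        result.append(target[i:i + n])
    return result


# Character bi-grams: slicing a string yields strings
char_bigrams = n_gram('paragraph', 2)
print(char_bigrams)

# Word bi-grams: slicing a list yields lists
words = 'I am an NLPer'.split()  # sample sentence chosen for illustration
word_bigrams = n_gram(words, 2)
print(word_bigrams)

# Lists are unhashable, so convert each gram to a tuple before making a set
word_bigram_set = set(tuple(gram) for gram in word_bigrams)
print(('am', 'an') in word_bigram_set)
```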
Looking at the answers of more experienced people, set.union(), set.intersection(), and [set.difference()](http://docs.python.jp/3/library/stdtypes.html#set.difference) are often used. They are said to be more readable, but I am not good at English, so |, &, and - feel more intuitive to me, and I used those here.
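For comparison, here is a minimal sketch of the method style next to the operator style. Both give the same sets for this problem. One difference I am aware of is that the methods accept any iterable as an argument, while the operators require both sides to be sets. The variable names are only for illustration.

```python
def n_gram(target, n):
    """Return the list of n-grams of target (same logic as main.py)."""
    return [target[i:i + n] for i in range(len(target) - n + 1)]


set_x = set(n_gram('paraparaparadise', 2))
set_y = set(n_gram('paragraph', 2))

# Method style
union_m = set_x.union(set_y)
inter_m = set_x.intersection(set_y)
diff_m = set_x.difference(set_y)

# Both styles produce the same results
print(union_m == (set_x | set_y))
print(inter_m == (set_x & set_y))
print(diff_m == (set_x - set_y))

# The methods accept any iterable, so the list can be passed directly
union_from_list = set_x.union(n_gram('paragraph', 2))
print(union_from_list == union_m)

# The operators require a set on both sides; a list raises TypeError
try:
    set_x | n_gram('paragraph', 2)
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print(operator_accepts_list)
```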
Even so, I did not expect paraparaparadise to turn up in a place like this ^^ That's all for the 7th knock. If you notice any mistakes, I would appreciate it if you could point them out.