[Python] Try to classify ramen shops by natural language processing

Introduction

Hello, I'm a copy-and-paste data scientist.

About three years ago I gave a goofy lightning talk called "Ramen and Natural Language Processing". It looks embarrassingly shabby now, so I decided to redo it in Python.

It turned out long, so here's the whole thing in three lines:

  1. Collect ramen-shop reviews and tokenize them with MeCab
  2. Train an LDA topic model with gensim so that every shop gets a topic vector
  3. Use topic-vector similarity to find shops similar to a favorite shop

Method

We use a technique called statistical latent semantic analysis (here, LDA: latent Dirichlet allocation). Roughly speaking, it tells you what topics a document contains and in what proportions.

Each document is assigned a ratio over topics (a topic vector), and comparing those ratios lets us compute, for example, that shop A and shop B are close.
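As a toy illustration with made-up numbers (not output from the actual model), closeness between topic vectors can be measured with cosine similarity:

from sklearn.metrics.pairwise import cosine_similarity

shop_a = [[0.7, 0.2, 0.1]]  # mostly topic 0
shop_b = [[0.6, 0.3, 0.1]]  # also mostly topic 0
shop_c = [[0.1, 0.1, 0.8]]  # mostly topic 2

print(cosine_similarity(shop_a, shop_b))  # ~0.98 -> A and B are close
print(cosine_similarity(shop_a, shop_c))  # ~0.29 -> A and C are not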

I don't use it in exactly that textbook form, but existing application examples and books on topic models may be helpful.

Mainly used: Python, MeCab (morphological analysis), gensim (LDA), plus scikit-learn and pandas (similarity calculation and display).

Rough flow

First, I'll list the overall flow.

  1. Prepare document data such as reviews
  2. Prepare for training the classifier (dictionary and corpus)
  3. Train the classifier: learn an LDA model with gensim
  4. Classify shops: run document data through the trained classifier to get topic allocations
  5. Find similar shops

Let's work through it from here.

1. Word-of-mouth data preparation

# Read the collected document data
from io_modules import load_data  # self-made DB helper
rows = load_data(LOAD_QUERY, RAMEN_DB)

# Extract stems with the stems() function from the reference article
from utils import stems  # follows the reference article's implementation almost as-is
docs = [stems(row) for row in rows]

"""
docs = [
  ['Large serving', 'Impressions', 'Direction', 'Best', 'ramen', ...
  ['ramen', 'queue', 'Cold', 'Hot', 'joy', ...
   ...
]
"""

2. Advance preparation

From here, we will actually perform LDA using gensim. First, create a dictionary and corpus for gensim.

Loading gensim

import gensim

Creating a dictionary

It's a bit confusing, but this is not the user dictionary MeCab uses for tokenization; it's the dictionary gensim uses to map words appearing in the documents to word IDs.

dictionary = gensim.corpora.Dictionary(docs)
dictionary.save_as_text('./data/text.dict')  # save
# dictionary = gensim.corpora.Dictionary.load_from_text('./data/text.dict')  # reload next time

"""
Word ID Word appearance count
1543 Clam 731
62 Easy 54934
952 Warm 691
672 hot 1282
308 Thank you 4137
・
・
"""

Creating a corpus

Next, convert the collected reviews into a bag-of-words corpus; this is what the classifier will be trained on.

corpus = [dictionary.doc2bow(doc) for doc in docs]
gensim.corpora.MmCorpus.serialize('./data/text.mm', corpus)  # save
# corpus = gensim.corpora.MmCorpus('./data/text.mm')  # reload next time

"""\
doc_id  word_id frequency of occurrence
     6      150       3  # word_id=150:Bean sprouts
     6      163       9  # word_id=163:soy sauce
     6      164       1
     6      165       1
・
・
"""

There is some debate about whether it's necessary, but this time we apply TF-IDF weighting to the corpus before running LDA.

tfidf = gensim.models.TfidfModel(corpus)
corpus_tfidf = tfidf[corpus]

# Computed once, so save it with pickle
import pickle
with open('./data/corpus_tfidf.dump', mode='wb') as f:
    pickle.dump(corpus_tfidf, f)

# Reload next time:
# with open('./data/corpus_tfidf.dump', mode='rb') as f:
#     corpus_tfidf = pickle.load(f)
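To see what the TF-IDF step does to a single document, compare the raw counts with the weights (the numbers below are illustrative, not actual output); ubiquitous words like "ramen" get down-weighted:

print(corpus[6][:3])         # [(150, 3), (163, 9), (164, 1)]   raw counts
print(tfidf[corpus[6]][:3])  # [(150, 0.41), (163, 0.52), (164, 0.07)]   TF-IDF weights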

3. Learning the classifier

Now that everything is ready, let's actually run LDA with gensim. This time I used 50 topics.

# Depending on the amount of documents, training may take several hours.
# '18/12/03 postscript: LdaMulticore with more workers may be much faster
lda = gensim.models.LdaModel(corpus=corpus_tfidf, id2word=dictionary,
                             num_topics=50, minimum_probability=0.001,
                             passes=20, update_every=0, chunksize=10000)
lda.save('./data/lda.model')  # save
# lda = gensim.models.LdaModel.load('./data/lda.model')  # reload next time
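As that postscript suggests, the parallel variant is a near drop-in option; a minimal sketch (workers is typically the number of physical cores minus one; note LdaMulticore has no update_every parameter):

lda = gensim.models.LdaMulticore(corpus=corpus_tfidf, id2word=dictionary,
                                 num_topics=50, minimum_probability=0.001,
                                 passes=20, chunksize=10000, workers=3)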

Now let's display the contents of the learned model.

Some topics that express general impressions are mixed in (#0, #36, #42, etc.), but on the whole the topics capture styles of ramen (#2: miso, #49: iekei ("family line"), etc.), so it looks like a reasonable classifier was learned.

for i in range(50):
    print('tpc_{0}: {1}'.format(i, lda.print_topic(i)[0:80]+'...'))

==============

tpc_0: 0.019*Impressed+ 0.014*impact+ 0.013*Long-sought+ 0.012*Difficulty+ 0.012*delicious+ 0.011*ramen+ 0.010*Deep emotion+...
tpc_1: 0.035*Grilled pork+ 0.022*chilled Chinese noodles+ 0.018*hot+ 0.010*Addictive+ 0.009*Stubborn+ 0.008*delicious+ 0.008*Ma...
tpc_2: 0.050*miso+ 0.029*Miso+ 0.017*ginger+ 0.013*butter+ 0.012*Bean sprouts+ 0.011*lard+ 0.009*corn+...
tpc_3: 0.013*Flavor+ 0.010*garlic+ 0.010*Rich+ 0.009*roasted pork fillet+ 0.008*oil+ 0.008*Rich+ 0.008*...
tpc_4: 0.010*Soy sauce+ 0.009*use+ 0.009*kelp+ 0.008*Material+ 0.007*soup+ 0.007*seafood+ 0.007*roasted pork fillet...
tpc_5: 0.015*Come+ 0.014*Clams+ 0.012*Thin+ 0.010*ramen+ 0.010*popularity+ 0.010*It feels good+ 0.010*...
tpc_6: 0.047*Shrimp+ 0.046*shrimp+ 0.014*sesame+ 0.014*shrimp+ 0.012*Addictive+ 0.008*delicious+ 0.008*Sukiyaki...
tpc_7: 0.016*Unpalatable+ 0.015*Expectations+ 0.013*bad+ 0.012*Sorry+ 0.012*delicious+ 0.011*usually+ 0.011*ramen...
tpc_8: 0.070*Soba+ 0.015*Soboro+ 0.013*Attach+ 0.012*Mentaiko+ 0.012*chicken+ 0.010*Rich+ 0.010*delicious+...
tpc_9: 0.041*Citron+ 0.024*Japanese style+ 0.017*Stew+ 0.010*Trefoil+ 0.010*life+ 0.009*delicious+ 0.009*seafood+...
tpc_10: 0.040*Vegetables+ 0.027*garlic+ 0.018*Extra+ 0.013*Garlic+ 0.010*Bean sprouts+ 0.010*Less+ 0.009*Ca...
tpc_11: 0.026*Handmade+ 0.023*Offal+ 0.016*Ginger+ 0.010*spicy+ 0.010*ramen+ 0.009*delicious+ 0.008*Feeling...
tpc_12: 0.031*Buckwheat+ 0.030*Soba+ 0.029*Chinese+ 0.016*Plain hot water+ 0.011*Shamo chicken+ 0.008*delicious+ 0.007*ramen+...
tpc_13: 0.057*black+ 0.023*black+ 0.020*Black+ 0.018*Soy sauce+ 0.011*stamina+ 0.010*oyster+ 0.009*Appearance...
tpc_14: 0.060*Tanmen+ 0.048*shrimp+ 0.019*Vegetables+ 0.014*Chinese cabbage+ 0.011*Fish ball+ 0.009*Gyoza+ 0.007*delicious...
tpc_15: 0.073*Spicy+ 0.015*Spicy+ 0.012*miso+ 0.011*Peppers+ 0.011*Sansho+ 0.010*Spicy+ 0.010*spicy miso+ 0...
tpc_16: 0.031*Aoba+ 0.029*Mesh+ 0.019*double+ 0.012*seafood+ 0.010*trend+ 0.009*instant+ 0.009*Rame...
tpc_17: 0.041*Replacement ball+ 0.017*Replacement ball+ 0.014*Tonkotsu+ 0.014*Mustard+ 0.010*Extra fine+ 0.010*ramen+ 0.009*Red...
tpc_18: 0.032*Nostalgic+ 0.023*Easy+ 0.016*meaning+ 0.012*ramen+ 0.011*friendly+ 0.010*Feeling+ 0.010*Ah...
tpc_19: 0.027*Lemon+ 0.016*Normal+ 0.011*guts+ 0.009*Regrettable+ 0.009*steak+ 0.008*Rich+ 0.008*Delicious...
tpc_20: 0.088*Niboshi+ 0.009*Soba+ 0.008*fragrance+ 0.008*ramen+ 0.008*soup+ 0.007*roasted pork fillet+ 0.007*Soy sauce...
tpc_21: 0.023*sushi+ 0.015*Recommended+ 0.012*favorite+ 0.010*ramen+ 0.009*delicious+ 0.008*Growing up+ 0.008*...
tpc_22: 0.025*Fried+ 0.021*Fashionable+ 0.017*Fashionable+ 0.016*Cafe+ 0.014*Fashionable+ 0.014*atmosphere+ 0.011*...
tpc_23: 0.024*value+ 0.022*White miso+ 0.018*miso+ 0.014*red miso+ 0.010*ultimate+ 0.010*delicious+ 0.009*burnt+...
tpc_24: 0.095*Fried rice+ 0.040*set+ 0.017*mini+ 0.013*Gyoza+ 0.012*ramen+ 0.011*delicious+ 0.009*...
tpc_25: 0.024*Oden+ 0.015*Nostalgic+ 0.013*Grilled meat+ 0.011*flat+ 0.010*Dark mouth+ 0.010*ramen+ 0.009...
tpc_26: 0.010*Off+ 0.009*ramen+ 0.009*delicious+ 0.008*serious+ 0.008*Delicious+ 0.008*Noisy+ 0.008...
tpc_27: 0.073*Mochi+ 0.032*Kimchi+ 0.012*Spicy miso+ 0.010*Delicious+ 0.010*delicious+ 0.008*roasted pork fillet+ 0.00...
tpc_28: 0.021*Sudachi+ 0.019*Shichimi+ 0.018*Men+ 0.015*onion+ 0.011*Onion+ 0.010*Disappointing+ 0.010*Attach...
tpc_29: 0.079*Gyoza+ 0.026*beer+ 0.011*delicious+ 0.010*ramen+ 0.009*draft beer+ 0.009*Soy sauce+ 0.008...
tpc_30: 0.021*Tightening+ 0.018*Asexual+ 0.018*germ+ 0.015*Sake lees+ 0.010*Cooked in water+ 0.009*crab+ 0.009*Rich+ 0....
tpc_31: 0.051*Champon+ 0.024*student+ 0.015*Tantan+ 0.011*seafood+ 0.009*shock+ 0.009*Genuine+ 0.009*Delicious...
tpc_32: 0.025*odor+ 0.023*odor+ 0.016*smell+ 0.010*secret+ 0.010*Delicious+ 0.010*ramen+ 0.010*Soup...
tpc_33: 0.010*Soy sauce+ 0.009*roasted pork fillet+ 0.008*seafood+ 0.008*taste+ 0.007*soup+ 0.007*Menma+ 0.007*good...
tpc_34: 0.074*curry+ 0.040*Fried rice+ 0.015*Ganso+ 0.011*spices+ 0.010*set+ 0.008*delicious+ 0.008*La...
tpc_35: 0.068*Tomato+ 0.031*cheese+ 0.015*Italian+ 0.014*pasta+ 0.011*hormone+ 0.011*risotto+ 0.00...
tpc_36: 0.038*Colleague+ 0.014*strongest+ 0.010*hard+ 0.010*ramen+ 0.010*Dantotsu+ 0.009*delicious+ 0.009*topic...
tpc_37: 0.059*Tonkotsu+ 0.026*soy sauce+ 0.025*children+ 0.015*Delicious+ 0.012*Muddy+ 0.012*ramen+ 0.01...
tpc_38: 0.027*rice+ 0.025*rice ball+ 0.022*rice+ 0.016*Rice porridge+ 0.014*rice+ 0.012*pickles+ 0.011*set...
tpc_39: 0.026*Yuzu+ 0.019*pale+ 0.009*Aging+ 0.009*Grilled pork+ 0.008*Soy sauce+ 0.008*roasted pork fillet+ 0.007*soup...
tpc_40: 0.042*Udon+ 0.012*Skipjack+ 0.009*Yeah+ 0.009*tempura+ 0.009*ramen+ 0.008*delicious+ 0.008*Feeling...
tpc_41: 0.023*Salty+ 0.020*Who+ 0.012*junk+ 0.012*Attach+ 0.009*French+ 0.008*chef+ 0.008*Ra...
tpc_42: 0.029*friend+ 0.028*Delicious+ 0.015*queue+ 0.015*delicious+ 0.013*ramen+ 0.013*Easy+ 0.012*...
tpc_43: 0.012*Menma+ 0.011*roasted pork fillet+ 0.010*Soy sauce+ 0.009*Leek+ 0.009*good+ 0.008*seafood+ 0.008*soup...
tpc_44: 0.040*Attach+ 0.014*Rich+ 0.013*Slimy+ 0.013*Split+ 0.013*seafood+ 0.013*Fishmeal+ 0.011*Prime+ 0....
tpc_45: 0.019*Bad taste+ 0.017*Rock glue+ 0.017*Crowndaisy+ 0.012*No. 1 in Japan+ 0.010*delicious+ 0.009*ramen+ 0.008*line...
tpc_46: 0.074*Wonton+ 0.045*men+ 0.015*roasted pork fillet+ 0.009*delicious+ 0.008*Wonton+ 0.008*Soy sauce+ 0.007*...
tpc_47: 0.027*Usually+ 0.019*series+ 0.017*Soba noodles+ 0.012*Pickled+ 0.010*old+ 0.010*Delicious+ 0.010*Hard...
tpc_48: 0.018*half+ 0.014*salad+ 0.014*dessert+ 0.014*cuisine+ 0.013*Izakaya+ 0.012*tofu+ 0.010*set...
tpc_49: 0.068*Family line+ 0.019*spinach+ 0.013*Seaweed+ 0.010*Soy sauce+ 0.010*roasted pork fillet+ 0.010*Dark+ 0.010*La...

4. Store classification (topic classification)

Now that the classifier is trained, passing a shop's reviews through it assigns the shop to topics (i.e., computes its topic vector).

First, let's feed in a single Tabelog pickup review of the Shinpuku Saikan main store in Kyoto, a shop I've loved since junior high school, and see how it performs. Before that, to make the results easier to follow, here is what kind of ramen Shinpuku Saikan serves.

"Shinpuku Saikan Kyoto" スクリーンショット 2016-07-23 3.30.54.png

It's black ... It's a black ramen.

However, unlike its appearance, it is a delicious soy sauce ramen that is surprisingly light and rich.

Now, what kind of result will the classifier built this time return?

# Quote source: http://tabelog.com/kyoto/A2601/A260101/26000791/dtlrvwlst/763925/
# By the way, I personally prefer it without the raw egg lol
>> text = "When I was on a business trip to Kyoto, I stopped by the main store of Shinpuku Saikan, which I had longed for.(Omission)After all, the soup is adjusted, the noodles are boiled, the ingredients are served, the char siu is delicious, and the main store is even more delicious! I felt that. Anyway, this is cheap at 650 yen. "Chinese soba" at Shinpuku Saikan is characterized by a black soup like the broth of char siu and a generous amount of Kujo green onions. Of course, "Yakimeshi" is irresistible.(Omission)On my way home, I happened to see another customer's order (reading the reviews here later, I learned it was the "extra-large Shinfuku soba" with raw egg), and I was shocked: "Oh, I could have ordered that?" Raw egg seems to go well with the soup at Shinpuku Saikan, and I will definitely ask for it next time."
>> vec = dictionary.doc2bow(stems(text))

# Show the classification result
>> print(lda[vec])
[(0, 0.28870310712135505), (8, 0.25689765230576195), (13, 0.3333132412551591), (31, 0.081085999317724824)]

# Contents of each topic, in descending order of weight
>> lda.print_topic(13)  # captures the black ramen that defines Shinpuku Saikan
0.057*black+ 0.023*black+ 0.020*Black+ 0.018*Soy sauce+ 0.011*stamina+ 0.010*oyster+ 0.009*Appearance+ 0.008*Dark+ 0.008*ramen+ 0.008*delicious

>> lda.print_topic(0)
0.019*Impressed+ 0.014*impact+ 0.013*Long-sought+ 0.012*Difficulty+ 0.012*delicious+ 0.011*ramen+ 0.010*Deep emotion+ 0.010*queue+ 0.010*Delicious+ 0.008*delicious

>> lda.print_topic(8)
0.070*Soba+ 0.015*Soboro+ 0.013*Attach+ 0.012*Mentaiko+ 0.012*chicken+ 0.010*Rich+ 0.010*delicious+ 0.008*roasted pork fillet+ 0.008*Easy+ 0.007*rice

>> lda.print_topic(31)
0.051*Champon+ 0.024*student+ 0.015*Tantan+ 0.011*seafood+ 0.009*shock+ 0.009*Genuine+ 0.009*delicious+ 0.008*ramen+ 0.008*Vegetables+ 0.008*Special

The black-ramen topic comes out on top, just as it should.

Now that we know it works, let's concatenate several hundred sentences from around the web that mention the Shinpuku Saikan main store and run them through the classifier.

>> str = "Sentences collected on the WEB"
>> sinpuku_vec = dictionary.doc2bow(utils.stems(str))
>> print(lda[sinpuku_vec])
[(0, 0.003061940579476011), (5, 0.001795672854987279), (7, 0.016165280743592875), (11, 0.0016683462844631061), (13, 0.387457274481951), (16, 0.048457912903426922), (18, 0.025816920842756448), (19, 0.0014647251485231138), (20, 0.0018013651819984121), (21, 0.001155430885775867), (24, 0.11249915373166983), (25, 0.0030405756373518885), (26, 0.0031413889216075561), (27, 0.0030955757983300515), (29, 0.0021349369911582098), (32, 0.006158571006380364), (34, 0.061260735988294568), (36, 0.0023903609848973475), (37, 0.020874795314517719), (41, 0.0018301667593946488), (42, 0.27803177713836785), (45, 0.0055461332216832828), (46, 0.0016396961473594117), (47, 0.0056507918659765869)]

>> lda.print_topic(13)  #value: 0.38
0.057*black+ 0.023*black+ 0.020*Black+ 0.018*Soy sauce+ 0.011*stamina+ 0.010*oyster+ 0.009*Appearance+ 0.008*Dark+ 0.008*ramen+ 0.008*delicious

>> lda.print_topic(42)  #value: 0.27
0.029*friend+ 0.028*Delicious+ 0.015*queue+ 0.015*delicious+ 0.013*ramen+ 0.013*Easy+ 0.012*Wow+ 0.011*Feeling+ 0.011*Famous+ 0.011*Many

>> lda.print_topic(24)  #value: 0.11
0.095*Fried rice+ 0.040*set+ 0.017*mini+ 0.013*Gyoza+ 0.012*ramen+ 0.011*delicious+ 0.009*Delicious+ 0.009*Single item+ 0.008*order+ 0.008*roasted pork fillet
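Since the raw list above is hard to scan, a small convenience snippet (not in the original post) sorts the topics by weight:

for topic_id, weight in sorted(lda[sinpuku_vec], key=lambda x: -x[1])[:3]:
    print('tpc_{0}: {1:.3f}'.format(topic_id, weight))
# tpc_13: 0.387 / tpc_42: 0.278 / tpc_24: 0.112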

With many opinions now reflected, fried rice appears as the third topic.

The writer of the first review doesn't seem to have ordered it, but Shinpuku Saikan is in fact also famous for its black fried rice. From this (?) we can see that topic classification of ramen shops by collective intelligence works.

5. Find a similar store

The classifier can now compute a topic vector for any ramen shop. Finally, let's use it to find similar shops.

In principle you can find shops similar to any shop, but continuing from the previous section, I'll look for shops similar to the Shinpuku Saikan main store.

I'll omit the detailed code, but the rough procedure is: compute LDA topic vectors for ramen shops nationwide, then compute each one's similarity to the topic vector of the Shinpuku Saikan main store, as follows.
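The calc_vecs helper lives in a self-made module that isn't shown in the post. A minimal sketch of what it might do, assuming reviews is an iterable of (name, pref, text) tuples (that input format is my assumption):

import numpy as np
from gensim import matutils

def calc_vecs(reviews, lda, dictionary):
    """For each shop, turn its review text into a dense LDA topic vector."""
    names, prefs, vecs = [], [], []
    for name, pref, text in reviews:
        bow = dictionary.doc2bow(stems(text))
        vecs.append(matutils.sparse2full(lda[bow], lda.num_topics))
        names.append(name)
        prefs.append(pref)
    return names, prefs, np.array(vecs)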

# Various settings
MIN_SIMILARITY = 0.6    # similarity threshold
RELATE_STORE_NUM = 20   # number of similar shops to extract

# LDA topic vectors for ramen shops nationwide
from my_algorithms import calc_vecs  # self-made module (sketched above)
(names, prefs, vecs) = calc_vecs(reviews, lda, dictionary)

# Similarity between the Shinpuku Saikan main store and every shop's topic vector
# (convert the sparse gensim vector to a dense array before cosine_similarity)
from gensim import matutils
from sklearn.metrics.pairwise import cosine_similarity
sinpuku_dense = matutils.sparse2full(lda[sinpuku_vec], lda.num_topics)
similarities = cosine_similarity([sinpuku_dense], vecs)[0]

# Show similar shops
import pandas as pd
df = pd.DataFrame({'name': names,
                   'pref': prefs,
                   'similarity': similarities})
relate_store_list = df[df.similarity > MIN_SIMILARITY] \
                      .sort_values(by="similarity", ascending=False) \
                      .head(RELATE_STORE_NUM)
print(relate_store_list)

==============

id  similarity  pref      name
 0       0.934  toyama    Makoto
 1       0.898  hokkaido  Isono Kazuo
 2       0.891  shiga     Kinkuemon Mitsui Outlet Park Shiga Ryuo
 3       0.891  kyoto     Shinpuku Saikan Higashi Tsuchikawa store
 4       0.888  osaka     Kingemon Dotombori store
 5       0.886  chiba     Charcoal ramen
 6       0.874  osaka     Kingemon Esaka store
 7       0.873  toyama    Iroha Shosui Main Store
 8       0.864  osaka     Kingemon Umeda store
 9       0.861  mie       Hayashiya
10       0.847  niigata   Ramen Tsurikichi
11       0.846  osaka     Kingemon Main Store
12       0.838  toyama    Menhachi Gotabiya
13       0.837  aichi     Kikuya Hotel
14       0.820  hyogo     Nakanoya
15       0.814  kyoto     Kinkuemon Kyoto Saiin store
16       0.807  aichi     yokoji
17       0.804  kumamoto  favorite ramen shop
18       0.792  kyoto     Shinpuku Saikan Kumiyama store
19       0.791  niigata   Goingoin

Let's pick up some and check the result.

"Makoto Ya Toyama" スクリーンショット 2016-07-23 2.31.42.png

"Isono Kazuo Hokkaido" スクリーンショット 2016-07-23 2.32.56.png

"Kanekuemon Mitsui Outlet Park Shiga Ryuo Store" スクリーンショット 2016-07-23 2.33.43.png

They're black ... superbly black. I checked the others too, and apart from Nakanoya in Hyogo, every one of them was a black ramen.

Shinpuku Saikan has several branches, and it's a good sign that they made it into the results even though shop names were stripped out before the analysis.

Speaking of black ramen, Toyama Black is the famous one, and several such shops appeared; but I suspect Shinpuku Saikan's taste is rather different from Toyama Black. It's hard to deny that these shops were picked for the blackness of the ramen rather than the taste. I'd like to treat this subtlety as a point for future improvement.

Summary

This time, with ramen as the theme, I showed how to extract various kinds of knowledge from text data lying around on the net with (almost) no human intervention.

Compared with classification built on human-curated supervised data, the results may look unreliable, but for something obtained this easily with existing Python libraries, I think they're not bad at all.

Of course, this isn't limited to ramen: it can handle all sorts of subjects as long as there is a reasonable amount of text. Feel free to try it on other documents.
