[PYTHON] [WIP] Pre-processing memo in natural language processing

This article is a memo of notes I took while reading Preprocessing in Natural Language.

Numerical replacement

Numbers are normalized as well: values such as 99 and 1.235 are all replaced with 0. That makes sense, since the specific numeric values usually seem to have little to do with what you want to analyze with natural language processing.
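As a note to myself, the replacement could be sketched roughly as follows. This is a minimal sketch, not the code from the article I read: the regular expression and the function name `normalize_numbers` are my own assumptions, and the pattern only covers plain integers and decimals (no signs, thousands separators, or full-width digits).

```python
import re

# Assumed pattern: a run of digits, optionally followed by a decimal part
# (matches "99" and "1.235"; does not handle signs or separators like "1,000").
NUMBER_PATTERN = re.compile(r"\d+(?:\.\d+)?")


def normalize_numbers(text: str) -> str:
    """Replace every number found in the text with the single token "0"."""
    return NUMBER_PATTERN.sub("0", text)


if __name__ == "__main__":
    print(normalize_numbers("It cost 99 yen and weighed 1.235 kg."))
    # prints: It cost 0 yen and weighed 0 kg.
```

Collapsing every number to one token keeps the vocabulary small, at the cost of discarding the values themselves, so it only fits tasks where the values do not matter.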

Stop word removal

This is not limited to this section, but it would have been nice to have a code sample. Also, the explanation of how to choose stop words, based on meaning and on frequency, was just right as a review.
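Since there was no sample, here is a rough sketch of the two ways of choosing stop words mentioned above: a hand-picked list of words with little meaning, and the most frequent words in the corpus. The word list, the function names, and the `top_n` parameter are all hypothetical and chosen only for illustration. The input is assumed to be already tokenized.

```python
from collections import Counter

# Hypothetical hand-picked list of words that carry little meaning.
STOP_WORDS = {"a", "an", "the", "is", "of", "and"}


def remove_by_list(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in a fixed stop word list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in stop_words]


def frequent_words(docs, top_n):
    """Pick the top_n most frequent words across all documents as stop words."""
    counts = Counter(t.lower() for doc in docs for t in doc)
    return {word for word, _ in counts.most_common(top_n)}


if __name__ == "__main__":
    docs = [["The", "cat", "is", "cute"], ["The", "dog", "is", "big"]]
    print(remove_by_list(docs[0]))        # list-based removal
    by_freq = frequent_words(docs, top_n=2)
    print(remove_by_list(docs[0], by_freq))  # frequency-based removal
```

The list-based approach needs someone to maintain the list, while the frequency-based approach adapts to the corpus but can remove frequent words that are actually important, so `top_n` has to be checked by eye.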

Comparison with and without preprocessing

The difference in execution time is large. Isn't that more important than the difference in accuracy?
