[PYTHON] Machine learning on meeting data for HRTech

TL;DR

Using meeting data, we will build a model that produces something like this:

[Image: membersearch-min.png]

The entire article is here.

1. Premise

The meeting data used here is derived from parliamentary minutes.

2. Machine learning

We compute the sentences that characterize each member, and the similarity between members. What makes two sets of remarks "similar" is open to debate, but here we measure similarity with two methods: TF-IDF (2.1) and doc2vec (2.2).

Hey, you there! You were just thinking **TF-IDF is not machine learning**, weren't you?!

Yes, that's right. But the average person doesn't know the difference, so as long as the results look good, nobody will notice. So if you just say

"The AI worked really hard! (smile)"

there should be almost no problem. There are as many ideas of what AI is as there are people. (Said with a straight face!)

2.1 Calculation of similarity between people using TF-IDF

Please see here. With the scipy library, even sparse matrices, which tend to get large, can be processed at a comfortable speed.
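As a rough illustration of the idea, here is a minimal sketch of TF-IDF similarity between members on top of scipy sparse matrices. The function names (`tfidf_matrix`, `member_similarity`), the input layout (a dict of member name to an already tokenized list of words), and the smoothed IDF formula are assumptions made for this example; the linked article may weight terms differently.

```python
from collections import Counter

import numpy as np
from scipy import sparse


def tfidf_matrix(tokens_by_member):
    """Build an L2-normalised TF-IDF matrix (members x terms) as a CSR matrix."""
    members = list(tokens_by_member)
    vocab = {}  # term -> column index
    rows, cols, counts = [], [], []
    for row, member in enumerate(members):
        for term, count in Counter(tokens_by_member[member]).items():
            col = vocab.setdefault(term, len(vocab))
            rows.append(row)
            cols.append(col)
            counts.append(float(count))

    shape = (len(members), len(vocab))
    tf = sparse.csr_matrix((counts, (rows, cols)), shape=shape)

    # Document frequency: number of members whose remarks contain each term.
    df = np.bincount(tf.indices, minlength=len(vocab))
    # Smoothed IDF (one common variant, chosen here as an assumption).
    idf = np.log((1.0 + len(members)) / (1.0 + df)) + 1.0

    tfidf = tf @ sparse.diags(idf)  # scale every column by its IDF
    norms = np.sqrt(np.asarray(tfidf.multiply(tfidf).sum(axis=1)).ravel())
    norms[norms == 0.0] = 1.0  # avoid dividing by zero for empty rows
    tfidf = sparse.diags(1.0 / norms) @ tfidf  # L2-normalise every row
    return members, vocab, tfidf.tocsr()


def member_similarity(tfidf):
    """Cosine similarity between all pairs of members (rows are unit length)."""
    return (tfidf @ tfidf.T).toarray()
```

Because every row is scaled to unit length, the product of the matrix with its transpose gives cosine similarities directly, and the matrix stays sparse until the final `toarray()` call. For Japanese minutes the tokens would come from a morphological analyzer, which is outside the scope of this sketch.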

2.2 Calculation of similarity between people with doc2vec

Please see here. Training the model is easy with the gensim package.

3. Network analysis

Please see here.

Create a network graph, then use a technique called the Louvain method to cluster people who make similar remarks.
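The clustering step can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses `louvain_communities` from networkx (available in networkx 2.8 and later), while the article does not say which Louvain implementation it relies on. The helper name `cluster_members`, the `threshold` parameter, and the input layout (a dict mapping member pairs to similarity scores) are hypothetical.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities


def cluster_members(similarities, threshold=0.0, seed=0):
    """Build a weighted graph from pairwise similarities and cluster it.

    similarities: dict mapping (member_a, member_b) -> similarity score.
    Pairs at or below `threshold` are not connected.
    """
    graph = nx.Graph()
    for (a, b), weight in similarities.items():
        graph.add_node(a)
        graph.add_node(b)
        if weight > threshold:
            graph.add_edge(a, b, weight=weight)

    # Louvain greedily maximises modularity using the edge weights.
    communities = louvain_communities(graph, weight="weight", seed=seed)

    # Store the cluster index on each node, similar to the "cluster" field
    # in the JSON shown later in this article.
    for cluster_id, members in enumerate(communities):
        for member in members:
            graph.nodes[member]["cluster"] = cluster_id
    return graph, communities
```

Louvain involves random node ordering, so fixing `seed` keeps the cluster assignment stable between runs. Cluster numbers themselves carry no meaning; only the grouping does.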

Visualizing it gives the following result.

[Image: path_to_fig.png]

Written out as JSON, this network graph looks like this:

{
  "directed": false,
  "multigraph": false,
  "graph": {},
  "nodes": [
    {
      "size": 3,
      "cluster": 1,
      "id": "Tomomi Inada"
    },
    {
      "size": 54,
      "cluster": 3,
      "id": "Taro Aso"
    },
    {
      "size": 142,
      "cluster": 0,
      "id": "Hiroshige Seko"
    },
    {
      "size": 39,
      "cluster": 4,
      "id": "Yasuhisa Shiozaki"
    },
    {
      "size": 30,
      "cluster": 1,
      "id": "Sanae Takaichi"
    },
    {
      "size": 95,
      "cluster": 1,
      "id": "Shinzo Abe"
    }
  ],
  "links": [
    {
      "weight": 0.5984722375869751,
      "source": "Tomomi Inada",
      "target": "Hiroshige Seko"
    },
    {
      "weight": 0.9666371941566467,
      "source": "Tomomi Inada",
      "target": "Shinzo Abe"
    },
    {
      "weight": 0.48173508048057556,
      "source": "Tomomi Inada",
      "target": "Yasuhisa Shiozaki"
    },
    {
      "weight": 0.4896692633628845,
      "source": "Tomomi Inada",
      "target": "Sanae Takaichi"
    },
    {
      "weight": 0.7263149619102478,
      "source": "Taro Aso",
      "target": "Hiroshige Seko"
    },
    {
      "weight": 0.6178034543991089,
      "source": "Taro Aso",
      "target": "Shinzo Abe"
    },
    {
      "weight": 0.46518972516059875,
      "source": "Taro Aso",
      "target": "Yasuhisa Shiozaki"
    },
    {
      "weight": 0.8961162567138672,
      "source": "Hiroshige Seko",
      "target": "Yasuhisa Shiozaki"
    },
    {
      "weight": 1.2007122039794922,
      "source": "Hiroshige Seko",
      "target": "Shinzo Abe"
    },
    {
      "weight": 0.945235550403595,
      "source": "Hiroshige Seko",
      "target": "Sanae Takaichi"
    },
    {
      "weight": 0.9955565333366394,
      "source": "Yasuhisa Shiozaki",
      "target": "Shinzo Abe"
    },
    {
      "weight": 0.9067516922950745,
      "source": "Yasuhisa Shiozaki",
      "target": "Sanae Takaichi"
    },
    {
      "weight": 1.053189754486084,
      "source": "Sanae Takaichi",
      "target": "Shinzo Abe"
    }
  ]
}
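To make the structure concrete, here is a small sketch that reads a graph in this layout with the standard library only, groups members by their `cluster` value, and finds the heaviest link. The embedded JSON is a trimmed copy of the data above (three nodes and three links); the helper names `members_by_cluster` and `strongest_link` are made up for this example. The layout resembles the node-link format that networkx can export, although the article does not state how the file was produced.

```python
import json
from collections import defaultdict

# Trimmed copy of the JSON above, kept short for illustration.
GRAPH_JSON = """
{
  "directed": false,
  "multigraph": false,
  "graph": {},
  "nodes": [
    {"size": 3, "cluster": 1, "id": "Tomomi Inada"},
    {"size": 142, "cluster": 0, "id": "Hiroshige Seko"},
    {"size": 95, "cluster": 1, "id": "Shinzo Abe"}
  ],
  "links": [
    {"weight": 0.5984722375869751, "source": "Tomomi Inada", "target": "Hiroshige Seko"},
    {"weight": 0.9666371941566467, "source": "Tomomi Inada", "target": "Shinzo Abe"},
    {"weight": 1.2007122039794922, "source": "Hiroshige Seko", "target": "Shinzo Abe"}
  ]
}
"""


def members_by_cluster(graph):
    """Group node ids by the value of their "cluster" field."""
    clusters = defaultdict(list)
    for node in graph["nodes"]:
        clusters[node["cluster"]].append(node["id"])
    return dict(clusters)


def strongest_link(graph):
    """Return the link dict with the largest "weight"."""
    return max(graph["links"], key=lambda link: link["weight"])


graph = json.loads(GRAPH_JSON)
```

Calling `members_by_cluster(graph)` gives one list of names per cluster id, and `strongest_link(graph)` returns the pair of members whose remarks are scored as most similar.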
