[PYTHON] I tried to summarize the graphical modeling.

Introduction

I could not find an article on Qiita dedicated to graphical modeling, so here is a quick overview.

"Introduction to graphical modeling that I can't hear anymore" http://www.slideshare.net/Kawamoto_Kazuhiko/ss-35483453

1. Graphical modeling overview

1) What is graphical modeling?

Roughly, it is a model for Bayesian inference.

(More precisely) It is a graph representation of the conditional independence relations between random variables. Conditional independence is the key to both modeling and efficient algorithms: the variables are connected, but because the joint distribution factorizes, there is no combinatorial explosion.
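To make the "no combinatorial explosion" remark concrete, here is a minimal sketch. It assumes a chain of n binary variables x1 → x2 → ... → xn, which is my own illustrative setup and not taken from the slides, and it simply counts the free parameters of a full joint table against those of the factorized model.

```python
def full_joint_params(n: int) -> int:
    """Free parameters of an unrestricted joint table over n binary variables."""
    return 2 ** n - 1


def chain_params(n: int) -> int:
    """Free parameters when p(x1..xn) = p(x1) * prod_i p(x_i | x_{i-1}).

    p(x1) needs 1 number; each p(x_i | x_{i-1}) needs 2 (one per parent value).
    """
    return 1 + 2 * (n - 1)


# Assumed example sizes; the gap widens exponentially with n.
for n in (3, 10, 20):
    print(n, full_joint_params(n), chain_params(n))
```

The full table grows exponentially in n, while the chain grows linearly, which is the practical payoff of conditional independence.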

2) Latent variable (hidden variable)

By defining an unobservable latent variable z alongside the observable variable x, the model can express quantities that are not directly observed.
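As a rough illustration of x and z, the sketch below assumes a tiny two-component model in which z selects a component and x is drawn given z. The probability tables and function names are hypothetical values chosen only for the example.

```python
# Hypothetical model: z is never observed, x is what we actually see.
P_Z = {"a": 0.3, "b": 0.7}                 # p(z)
P_X_GIVEN_Z = {"a": {0: 0.9, 1: 0.1},      # p(x | z)
               "b": {0: 0.2, 1: 0.8}}


def marginal_x(x: int) -> float:
    """p(x) = sum_z p(z) * p(x | z): the latent variable is summed out."""
    return sum(P_Z[z] * P_X_GIVEN_Z[z][x] for z in P_Z)


def posterior_z(x: int) -> dict:
    """p(z | x) by Bayes' rule: infer the hidden value from an observation."""
    px = marginal_x(x)
    return {z: P_Z[z] * P_X_GIVEN_Z[z][x] / px for z in P_Z}


print(marginal_x(1))
print(posterior_z(1))
```

Observing x alone is enough to update the belief about z, even though z itself is never measured.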

2. Types of graphical modeling

1) Model type

・ Markov random field: graphical modeling with an undirected graph
・ Bayesian network: graphical modeling with a directed graph

-Supplement-
Intuitively, the Bayesian network is easier to understand. However, when checking conditional independence:
・ Markov random field: it is enough to condition on the adjacent nodes.
・ Bayesian network: dependencies also arise through child nodes, so determining whether variables are conditionally independent is more complicated.
In this sense, the Markov random field is a simpler model than the Bayesian network.
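The difference in the supplement can be sketched by computing the set of nodes one has to condition on to separate a node from the rest of the graph (its Markov blanket). The toy graph and the function names below are assumptions made for illustration.

```python
def mrf_blanket(edges, node):
    """Undirected graph: the nodes to condition on are just the neighbours."""
    blanket = set()
    for u, v in edges:
        if u == node:
            blanket.add(v)
        elif v == node:
            blanket.add(u)
    return blanket


def bn_blanket(edges, node):
    """Directed graph with (parent, child) edges: parents, children, and the
    children's other parents are all needed."""
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    co_parents = {p for p, c in edges if c in children and p != node}
    return parents | children | co_parents


# Hypothetical toy graph, read once as undirected and once as directed.
edges = [("A", "C"), ("B", "C"), ("C", "D")]
print(mrf_blanket(edges, "A"))  # undirected reading: only the neighbour C
print(bn_blanket(edges, "A"))   # directed reading A -> C <- B: B enters via C
```

In the directed reading, B has to be included for A even though the two are not adjacent, which is why the check is more involved for Bayesian networks.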

2) Types of latent variables

Depending on the type of latent variable, the Markov random field takes the following forms:
・ Discrete latent variable: hidden Markov model
・ Continuous latent variable: Kalman filter (normal distribution), particle filter (other than normal distribution)
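For the discrete case, the following is a minimal sketch of forward filtering in a hidden Markov model. The two-state tables are hypothetical numbers, and the function name is my own. The Kalman filter follows the same predict/update pattern with Gaussian densities in place of the tables.

```python
def forward_filter(init, trans, emit, observations):
    """Return p(z_t | x_1..x_t) for each step t of a discrete HMM.

    init[i]     = p(z_1 = i)
    trans[i][j] = p(z_{t+1} = j | z_t = i)
    emit[i][x]  = p(x_t = x | z_t = i)
    """
    n = len(init)
    belief = list(init)
    filtered = []
    for t, x in enumerate(observations):
        if t > 0:
            # Predict: push the previous belief through the transition model.
            belief = [sum(belief[i] * trans[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalise.
        belief = [belief[j] * emit[j][x] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        filtered.append(belief)
    return filtered


# Hypothetical two-state example (state 0 = "calm", state 1 = "busy").
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.8, 0.2], [0.3, 0.7]]   # columns: observation 0 or 1
for step in forward_filter(init, trans, emit, [0, 0, 1]):
    print([round(p, 3) for p in step])
```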

If the graph has a loop structure, an approximate solution can be obtained with belief propagation (the sum-product algorithm).
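The sketch below runs sum-product message passing on a 3-node cycle and compares the result with brute-force marginals. The graph, the potentials, and the fixed iteration count are all assumptions chosen to keep the example small. It is not the implementation from the article linked in the next section.

```python
from itertools import product

# Hypothetical 3-node cycle 0 - 1 - 2 - 0 with binary states.
EDGES = [(0, 1), (1, 2), (0, 2)]
UNARY = {0: [0.7, 0.3], 1: [0.5, 0.5], 2: [0.5, 0.5]}


def pair(a: int, b: int) -> float:
    """Pairwise potential: neighbouring nodes prefer to agree."""
    return 2.0 if a == b else 1.0


NEIGHBOURS = {n: [v if u == n else u for u, v in EDGES if n in (u, v)]
              for n in UNARY}


def loopy_bp(iterations: int = 50):
    """Sum-product message passing; approximate because the graph has a loop."""
    # msgs[(i, j)][s] is the message from node i to node j about state s of j.
    msgs = {(i, j): [1.0, 1.0] for i in UNARY for j in NEIGHBOURS[i]}
    for _ in range(iterations):
        new = {}
        for (i, j) in msgs:
            out = []
            for sj in (0, 1):
                total = 0.0
                for si in (0, 1):
                    incoming = 1.0
                    for k in NEIGHBOURS[i]:
                        if k != j:
                            incoming *= msgs[(k, i)][si]
                    total += UNARY[i][si] * pair(si, sj) * incoming
                out.append(total)
            norm = sum(out)
            new[(i, j)] = [m / norm for m in out]
        msgs = new
    beliefs = {}
    for i in UNARY:
        b = [UNARY[i][s] for s in (0, 1)]
        for k in NEIGHBOURS[i]:
            b = [b[s] * msgs[(k, i)][s] for s in (0, 1)]
        norm = sum(b)
        beliefs[i] = [v / norm for v in b]
    return beliefs


def exact_marginals():
    """Brute-force enumeration, feasible only because the graph is tiny."""
    marg = {i: [0.0, 0.0] for i in UNARY}
    for states in product((0, 1), repeat=3):
        w = 1.0
        for i in UNARY:
            w *= UNARY[i][states[i]]
        for u, v in EDGES:
            w *= pair(states[u], states[v])
        for i in UNARY:
            marg[i][states[i]] += w
    return {i: [v / sum(m) for v in m] for i, m in marg.items()}


print(loopy_bp())
print(exact_marginals())
```

On this small cycle the two results are close but not identical, which is what "approximate solution" means here. Convergence is not guaranteed on loopy graphs in general.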

3. Implementation

The following commentary article is easy to understand: "Implementing Markov Random Field / Belief Propagation with Python networkx" http://sinhrks.hatenablog.com/entry/2014/12/27/232506

As the title suggests, it implements the model with Python's networkx module, using image processing as the subject.

4. Summary

The ability to model visually with graphs is said to be the strength of graphical modeling, but I feel that skill is required to actually take advantage of that strength.

Utilization in web advertising

Having described graphical modeling and latent variables, I would like to show how graphical modeling is currently used in the advertising industry. As an example, I take native advertising, a current trend, and write based on an article and a paper (Yahoo!).

1. Article
[Series] Adventures over Insight (Utilization of topic model for native advertising) The 5th Future Forecast of Content Marketing -The adventure of pursuing potential interest continues-
http://marketing.itmedia.co.jp/mm/articles/1410/02/news001.html

2. Paper
Title: Search for content-linked advertising using a topic model
Authors: Hiroshi Yamamoto, Masaki Noguchi, Shingo Ono, Koji Tsukamoto (Yahoo Japan Corporation)
http://www.anlp.jp/proceedings/annual_meeting/2014/pdf_dir/P2-14.pdf

1. Article introduction

1). Current issues with the "Topic Marketing System"

a) Improved Topic interpretation accuracy and speed

The topic model made it possible to summarize text quickly and easily. However, reading latent interests from the list of feature words in a topic still requires manual human work that relies on prior knowledge and imagination, so improvement is necessary.

b) Rapid content generation from Topic (context building)

Uses word2vec (state-of-the-art natural language processing technology developed and published by Google).

c) Fusion with native advertising

Fusion with native advertising platform

2). How will the approach to potential interests change?

a) From static to dynamic approach (anticipating signs of change in latent interest)

It becomes necessary to use a dynamic learning model.

b) More accurate targeting

With the omni-channelization of purchasing information and the evolution of behavior prediction by machine learning, highly accurate targeted advertising becomes possible.

2. Paper introduction

1) Assumption:

The criterion for ad selection is the similarity between:
・ Words in the content
・ Words in the ad text

2) Problem:

Even if there is an ad that suits the content, it does not become a candidate for display when the words used in the content and in the ad are different.

3) Proposed method:

To solve the problem that appropriate ads cannot become candidates because of word mismatch, the words in both the page content and the ad text are converted into topics.
⇒ The paper proposes a method that searches for ads in the converted topic space.
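The idea of searching in topic space can be sketched as follows. The word-to-topic table, the averaging step, and the cosine ranking are my own assumptions for illustration and are not necessarily what the paper does. In practice the table would come from a trained topic model.

```python
import math

# Hypothetical word -> topic-probability table (topics: [travel, finance]).
WORD_TOPICS = {
    "flight": [0.9, 0.1], "hotel": [0.8, 0.2], "airfare": [0.9, 0.1],
    "resort": [0.85, 0.15], "loan": [0.1, 0.9], "mortgage": [0.05, 0.95],
}


def to_topic_vector(words):
    """Average the topic distributions of the known words in a text."""
    known = [WORD_TOPICS[w] for w in words if w in WORD_TOPICS]
    if not known:
        return [0.0, 0.0]
    return [sum(col) / len(known) for col in zip(*known)]


def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# The page and the travel ad share no words, only a topic.
page = ["flight", "hotel"]
ads = {"travel_ad": ["airfare", "resort"], "finance_ad": ["loan", "mortgage"]}

page_vec = to_topic_vector(page)
ranking = sorted(ads,
                 key=lambda name: cosine(page_vec, to_topic_vector(ads[name])),
                 reverse=True)
print(ranking)
```

A pure word match would give the travel ad a score of zero here, while the topic vectors place it next to the page.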

4) Novelty of the paper:

By combining word search and topic search, the paper showed that the accuracy of ad search improves compared with using either one alone.
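How the two searches are combined is not described in this summary, so the sketch below simply assumes a linear interpolation of a word-overlap score and a topic similarity score. The scoring functions, the weight, and the candidate data are all hypothetical.

```python
def word_score(page_words, ad_words):
    """Jaccard overlap of the two word sets (stand-in for a word search score)."""
    page_set, ad_set = set(page_words), set(ad_words)
    union = page_set | ad_set
    return len(page_set & ad_set) / len(union) if union else 0.0


def combined_score(word_s, topic_s, weight=0.5):
    """Linear interpolation; weight=1.0 is word-only, weight=0.0 is topic-only."""
    return weight * word_s + (1.0 - weight) * topic_s


# Hypothetical candidates: (ad words, precomputed topic similarity to the page).
page_words = ["cheap", "flight", "tokyo"]
candidates = {
    "ad_same_words": (["cheap", "flight", "deals"], 0.6),
    "ad_same_topic": (["airfare", "discount"], 0.9),
    "ad_unrelated": (["cheap", "mortgage"], 0.1),
}

for name, (ad_words, topic_s) in candidates.items():
    w = word_score(page_words, ad_words)
    print(name, round(w, 2), round(combined_score(w, topic_s), 2))
```

With word search alone, the unrelated ad outranks the on-topic ad because it happens to share one word. The combined score reverses that order while still rewarding exact word matches.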

5) Supplement (Natural language processing and topic model)

・ Natural language processing: What is "natural language processing" that is also used for smartphones? (http://logmi.jp/45207)

・ Topic model: Beginning of topic model (https://speakerdeck.com/yamano357/tokyowebmining46th)

I think reading these will give you a rough understanding.
