[PYTHON] An AI beginner tries to make a Pronama-chan bot

To begin with, it is debatable whether this counts as AI at all, so the title may be wrong... Anyway, I tried to make a Pronama-chan ChatBot by collecting Pronama-chan's tweets from Twitter. When you say something on Slack, Pronama-chan replies.

Our in-house Slack had gained a bot channel for a certain blogger, and when I asked about it, I was told the ChatBot was built from Twitter. Then there is no choice but to make a ChatBot from Pronama-chan's Twitter! That's why I started making it.

Trying to make it

Roughly, it seems that you can make it with the following flow.

  1. Use a Twitter crawler to collect the tweets into a DB
  2. Feed the DB data into ElasticSearch
  3. Use Python's SlackBot to pass each Slack message to ElasticSearch and post the returned text

I just used the source that my colleague wrote, so I will skip the detailed explanation this time... :sweat_drops:


I don't want to run ElasticSearch on my machine, so this time I will start CentOS with Vagrant and run it there.

Preparing Vagrant (CentOS)

cd <Appropriate directory>
mkdir pronama-chan-bot && cd $_
vagrant init <CentOS Box file name>
vagrant up
vagrant ssh
sudo yum update -y

From here on, work inside the CentOS VM started above.

Java installation

It seems that Java is required to install `analysis-kuromoji`, described later, so install it.

sudo yum install -y wget
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u65-b17/jdk-8u65-linux-x64.rpm
sudo rpm -ivh jdk-8u65-linux-x64.rpm
java -version

Install ElasticSearch

* What is ElasticSearch?

A full-text search engine provided by Elastic (a mechanism for finding the documents that contain a target word within a large amount of document data).

Install by referring to here

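sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
sudo vi /etc/yum.repos.d/elasticsearch.repo

Write the following into the file

[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=https://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1

Then install

sudo yum install -y elasticsearch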

ElasticSearch plugin installation

# Install kuromoji for Japanese full-text search
sudo /usr/share/elasticsearch/bin/plugin install analysis-kuromoji

# Install this too, because we apparently use neologd, an extended version of the IPA dictionary
sudo /usr/share/elasticsearch/bin/plugin install org.codelibs/elasticsearch-analysis-kuromoji-neologd/2.4.1

# A plugin that lets you view the results in a web browser
sudo /usr/share/elasticsearch/bin/plugin install polyfractal/elasticsearch-inquisitor

# A plugin for monitoring ElasticSearch
sudo /usr/share/elasticsearch/bin/plugin install royrusso/elasticsearch-HQ

ElasticSearch settings

sudo vi /etc/elasticsearch/elasticsearch.yml

Change it as follows

http.compression: true
network.publish_host: ""
network.host: ""
network.bind_host: ""
transport.tcp.port: 9300
transport.tcp.compress: true
http.port: 9200

Startup settings

sudo chkconfig elasticsearch on
sudo service elasticsearch start

Python installation (use pyenv)

sudo yum install -y git
git clone https://github.com/yyuu/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile
pyenv install anaconda3-4.1.1
pyenv rehash
pyenv global anaconda3-4.1.1
python --version

Run Twitter Crawler

Use tweepy, a Python library for the Twitter API, to fetch the account's past timeline and save it in the DB.

Approximate image.py

import tweepy

# Twitter API settings (the four credentials come from your Twitter app)
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
api = tweepy.API(auth)

# Get the timeline (repeat with a lower max_id until the whole timeline has been fetched)
statuses = api.user_timeline(screen_name='pronama', max_id=None, count=200)

# Process the acquired data for DB storage

# Save the acquired timeline in the DB
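
For reference, here is a minimal sketch of how the "repeat until everything is fetched" loop and the DB save could look. It is not my colleague's actual crawler: `fetch_page`, the `tweets` table, and the choice of SQLite are all assumptions made for illustration. The only Twitter-specific fact it relies on is that `max_id` is inclusive, so the next request asks for the last seen id minus one.

```python
import sqlite3


def fetch_all_statuses(fetch_page, page_size=200):
    """Walk backwards through a timeline until no more statuses come back.

    fetch_page(max_id, count) is a hypothetical callable returning a list of
    objects with an integer `id` attribute, newest first.
    """
    collected = []
    max_id = None
    while True:
        page = fetch_page(max_id=max_id, count=page_size)
        if not page:
            break
        collected.extend(page)
        # max_id is inclusive, so ask for strictly older statuses next time
        max_id = page[-1].id - 1
    return collected


def save_statuses(conn, statuses):
    """Store id and text in a hypothetical `tweets` table, skipping duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, text TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, text) VALUES (?, ?)",
        [(s.id, s.text) for s in statuses],
    )
    conn.commit()
```

With tweepy, `fetch_page` could be something like `lambda max_id, count: api.user_timeline(screen_name='pronama', max_id=max_id, count=count)`. Note that the Twitter API only goes back a limited number of tweets, so "all" really means "as far back as the API allows".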

Feed the DB of crawling results to ElasticSearch

Template settings?

PUT the JSON settings to localhost:9200/_template/<template_name> and it's OK. In the template, set the index name and set kuromoji as the tokenizer. I'm not sure about this part yet, so I'll investigate later (which usually means I never will).
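
Since I have not looked into the template properly, the following is only a guess at what such a template might contain for ElasticSearch 2.x, not the one actually used. The sub-field names `kuromoji` and `2gram` and the type name `speech` are taken from the search query and bulk URL in this article; the analyzer and tokenizer names `kuromoji_analyzer`, `bigram_analyzer`, and `bigram_tokenizer` are hypothetical.

```python
import json


def build_template(index_pattern):
    """Build a 2.x-style index template body (a sketch under assumptions)."""
    return {
        # Indices whose names match this pattern pick up the settings below
        "template": index_pattern,
        "settings": {
            "analysis": {
                "tokenizer": {
                    # Character bigrams as a fallback for words kuromoji misses
                    "bigram_tokenizer": {
                        "type": "nGram",
                        "min_gram": 2,
                        "max_gram": 2,
                    },
                },
                "analyzer": {
                    # kuromoji_tokenizer is provided by the analysis-kuromoji plugin
                    "kuromoji_analyzer": {
                        "type": "custom",
                        "tokenizer": "kuromoji_tokenizer",
                    },
                    "bigram_analyzer": {
                        "type": "custom",
                        "tokenizer": "bigram_tokenizer",
                    },
                },
            }
        },
        "mappings": {
            "speech": {
                "properties": {
                    "text": {
                        "type": "string",
                        # Multi-fields, searchable as text.kuromoji and text.2gram
                        "fields": {
                            "kuromoji": {"type": "string", "analyzer": "kuromoji_analyzer"},
                            "2gram": {"type": "string", "analyzer": "bigram_analyzer"},
                        },
                    }
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_template("<index_name>*"), indent=2))
```

The printed JSON is what would be sent with PUT to the `_template` endpoint.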

Delete index once

Send DELETE to localhost:9200/<index_name> and it's OK

Save the data pulled from DB to json once

Convert DB data to json using Python

Bulk Insert json data

POST to localhost:9200/<index_name>/speech/_bulk with --data-binary <json data>
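
The bulk endpoint does not take a single JSON document. It expects newline-delimited JSON where each document is preceded by an action line, and the body must end with a newline. A minimal sketch of the conversion, assuming each DB row has already been read into a dict with hypothetical `id` and `text` keys:

```python
import json


def to_bulk_body(rows):
    """Turn rows into the newline-delimited body expected by _bulk."""
    lines = []
    for row in rows:
        # Action line: index and type come from the URL, so only _id is given
        lines.append(json.dumps({"index": {"_id": str(row["id"])}}))
        # Source line: keep Japanese text readable instead of \u escapes
        lines.append(json.dumps({"text": row["text"]}, ensure_ascii=False))
    # Every line, including the last one, must be terminated by a newline
    return "".join(line + "\n" for line in lines)
```

Writing the result to a UTF-8 file and passing it to curl with `--data-binary @<file>` keeps the newlines intact, which plain `-d` would not.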

ElasticSearch itself should be working at this point, so run the following command and check that a result is returned.

curl -XGET 'http://localhost:9200/<index_name>/_search?pretty' -d '
{
  "query": {
      "function_score": {
          "functions": [
              {
                  "random_score": {
                    "seed" : "999999999"
                  }
              }
          ],
          "query": {
              "query_string": {
                  "query": "text.kuromoji:<Text>^100 OR text.2gram:<Text>^10"
              }
          },
          "score_mode": "multiply"
      }
  },
  "size": 1,
  "sort": {
      "_score": {
          "order": "desc"
      }
  },
  "track_scores": true
}'

Yay!! For some reason the reply to "Yahho" is "I did it!", but it seems to be working!
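
The bot later has to generate this same query from whatever was said on Slack, so here is a minimal sketch of building it in Python. The function names are my own, and the escaping step is an addition that the curl example does not need: characters such as `:` or `(` have a meaning in query_string syntax and would otherwise break the search when they appear in a chat message.

```python
import json
import re

# Characters that act as operators in query_string syntax
_RESERVED = re.compile(r'([+\-=&|!(){}\[\]^"~*?:\\/])')


def escape_query_text(text):
    """Backslash-escape operators; < and > cannot be escaped, so drop them."""
    text = text.replace('<', ' ').replace('>', ' ')
    return _RESERVED.sub(r'\\\1', text)


def build_query(text, seed="999999999"):
    """Build the same search body as the curl example for the given text."""
    escaped = escape_query_text(text)
    return {
        "query": {
            "function_score": {
                # Random factor so the same input does not always get the same reply
                "functions": [{"random_score": {"seed": seed}}],
                "query": {
                    "query_string": {
                        "query": "text.kuromoji:{0}^100 OR text.2gram:{0}^10".format(escaped)
                    }
                },
                "score_mode": "multiply",
            }
        },
        "size": 1,
        "sort": {"_score": {"order": "desc"}},
        "track_scores": True,
    }


if __name__ == "__main__":
    print(json.dumps(build_query("Yahho"), ensure_ascii=False, indent=2))
```

A fixed seed gives repeatable results for one input; passing a different seed per message would presumably make the replies more varied.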

SlackBot settings

Now let's set this up as a SlackBot.

Register Bot users in Slack

Register the bot user from [Add Configuration] here. Fill in each item as appropriate. This time, register it with the name "@pronama_chan".

Don't forget to make a note of the "API Token" displayed on the next screen.
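
The slackbot library reads this token from a `slackbot_settings.py` file placed next to the bot script. A minimal sketch, assuming the token is kept out of the source in a hypothetical `PRONAMA_BOT_TOKEN` environment variable:

```python
# slackbot_settings.py
import os

# Assumption: the token noted above is exported as PRONAMA_BOT_TOKEN
# (a made-up variable name) instead of being written into this file.
API_TOKEN = os.environ.get("PRONAMA_BOT_TOKEN", "")

# Text the bot sends when no handler matches the message
DEFAULT_REPLY = "..."
```

Keeping the token in the environment means the settings file can be committed without leaking it.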

Create a bot channel in Slack

Don't forget to invite the @pronama_chan user created above.

Create SlackBot

Create the Slack bot using Python's slackbot library.

This is also a rough image.py

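import requests
from slackbot.bot import Bot
from slackbot.bot import respond_to, default_reply

# hostname and index_name are assumed to be defined elsewhere

def bot_response(userid, word):
    # POST the search query to ElasticSearch
    response = requests.post(
        'http://{}/{}/_search'.format(hostname, index_name),
        <JSON string to be thrown to ElasticSearch generated from word>.encode('utf-8'))

    # Pull the reply text out of the search result
    return <Extracted character string hit from response>

# Assumption: match any mention and pass the captured text as word
@respond_to('(.*)')
def chat(message, word):
    response = bot_response(message._get_user_id(), word)
    message.reply(response)

def main():
    bot = Bot()
    bot.run()

if __name__ == "__main__":
    main()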
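
The "extract the hit from the response" step is left as a placeholder above, so here is a minimal sketch of what it could look like. It assumes the tweet body is stored in a `text` field (guessed from the `text.kuromoji` field in the query), and the fallback string is made up.

```python
def extract_reply(result, fallback="..."):
    """Return the text of the top hit from a parsed _search response.

    `result` is the decoded JSON body, for example response.json() when
    using requests. Falls back when nothing matched.
    """
    hits = result.get("hits", {}).get("hits", [])
    if not hits:
        return fallback
    return hits[0].get("_source", {}).get("text", fallback)
```

Returning a fallback instead of raising keeps the bot answering even when a message matches nothing in the index.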

Tried making it

Try sending a message from Slack

Kita━━━ ヽ (∀ ゚) people (゚ ∀ ゚) people (゚ ∀) ノ ━━━ !!


I'm an AI beginner working at a web shop, but I managed to make a Pronama-chan ChatBot. Using technologies you don't normally use is fun but tiring...

