[PYTHON] I implemented Human In The Loop ― Part ① Dashboard ―

Introduction

Human in the loop (HITL) means having humans intervene in the judgment and control of AI. It is regarded as one way to deploy AI in real-world settings where quality control is difficult. [Reference 1] In this series (three parts planned), I implement a machine learning model, a monitoring dashboard, and a verification tool as a simple web application, to give a concrete picture of HITL. For monitoring in HITL, a dashboard that shows the actual data and the AI's behavior at the same time is effective. In this Part 1, I set the goal, introduce the data used, and implement the monitoring dashboard.


HITL Implementation Table of Contents
Part ① Dashboard ← This time
Part ② Verification tool
Part ③ HITL (① and ② + model retraining mechanism)

■ Implementation result

simpledashboard.gif

■ Environment

Python 3.7.7
dash 1.16.1
dash-bootstrap-components 0.10.7
dash-core-components 1.12.1
dash-html-components 1.1.1
dash-renderer 1.8.1
dash-table 4.10.1
plotly 4.10.0
Flask 1.1.2
lightgbm 3.0.0

Goal setting

HITL can be built in various ways depending on the task. This series adopts a design based on anomaly detection.

■ Implement the following in the web application

--Monitoring
  --Dashboard (chart)
  --Machine learning model
    --Detects known anomalies with a supervised model
  --Detection model
    --Detects unknown data with an unsupervised model (unknown: not yet verified as normal or abnormal)
--Expert verification
  --Implemented with annotation tools in mind
  --The verifier checks whether each unknown detection is normal or abnormal
--Scenario
  --The detection model flags data as unknown (and the machine learning model judges it to be abnormal)
  --The verifier checks the flagged data and determines that it is normal
  --Given that judgment, the model is retrained on the relevant data so that it judges it as normal

HITL_1.png

Data used

The implementation uses the Credit Card Fraud Detection dataset from Kaggle. [Reference 2]

This is an imbalanced dataset whose target variable is credit card fraud, and it is often used to try out anomaly detection models.

The original data contains the PCA-transformed features V1 to V28. Because the purpose here is a demonstration that conveys the overall picture, the input features are limited to two variables (V4 and V14) so that they are easy to understand when visualized.
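As a rough illustration of restricting the input to two features, the sketch below reads a CSV with the same column names as the Kaggle file (`V4`, `V14`, `Class`) and keeps only those columns. The function name `load_two_features` and the inline sample values are made up for this example and are not taken from the article's code.

```python
import csv
import io


def load_two_features(file_obj, features=("V4", "V14"), target="Class"):
    """Read a creditcard.csv-style file and keep only the chosen columns."""
    rows = []
    labels = []
    for record in csv.DictReader(file_obj):
        rows.append([float(record[name]) for name in features])
        # The label is 0 (normal) or 1 (fraud); float() tolerates "0.0" as well.
        labels.append(int(float(record[target])))
    return rows, labels


# Tiny inline sample with made-up values, standing in for the real file.
sample = io.StringIO(
    "Time,V4,V14,Amount,Class\n"
    "0,1.38,-0.31,149.62,0\n"
    "1,4.50,-6.10,1.00,1\n"
)
X, y = load_two_features(sample)
```

With the real dataset, the same function could be called on `open("creditcard.csv", newline="")`, assuming the file has been downloaded from Kaggle under that name.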

In the demo, monitoring, verification, and retraining are performed on the above data plus artificially created unknown normal data. To fit the scenario, the artificial unknown normal data must satisfy the following conditions.

① The supervised model falsely detects it (wrongly judges it to be abnormal).
② The detection model can detect it (that is, the artificial data is clearly different from the original training data).
③ Even after retraining, the supervised model keeps its existing abnormality judgments.

Based on the above, I created the demo data shown below. (1 to 8 s: known data (abnormal at 3 s), 9 to 11 s: unknown data, repeated twice)

HITL_2.png
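The timing of the demo data can be written down as a small schedule. This is only a sketch of one reading of the description: it assumes one row per second, an 11-second cycle, and a single abnormal row at second 3 of each cycle. The function name and the labels `known_normal`, `known_abnormal`, and `unknown` are hypothetical.

```python
def build_demo_schedule(cycles=2):
    """Return one (second, kind) pair per second of the demo stream."""
    schedule = []
    for cycle in range(cycles):
        offset = cycle * 11  # each cycle is assumed to last 11 seconds
        for second in range(1, 12):
            if second == 3:
                kind = "known_abnormal"  # the known fraud case
            elif second <= 8:
                kind = "known_normal"
            else:
                kind = "unknown"  # artificial unknown normal data
            schedule.append((offset + second, kind))
    return schedule


schedule = build_demo_schedule()
```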

Technology used

This section describes the frameworks and models used in the implementation.

■ Visualization part

--Dash and plotly (versions as listed under Environment)

■ Machine learning model, detection model

--Machine learning model: LightGBM
  --Displays the binary classification score (normal: 0, fraud: 1)
  --Initially uses a model trained in advance
--Detection model: KNN
  --Uses the neighbor distance as the anomaly score and binarizes it with a threshold (normal: 0, unknown: -1)
  --Initially uses a model trained in advance
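The detection model's idea can be sketched in a few lines: use the distance to a neighbor as the anomaly score, then cut it at a threshold. The sketch is dependency-free and assumes the score is the Euclidean distance to the k-th nearest training point. The article does not say which neighbor distance is used, and `k`, `threshold`, and the function names are illustrative choices rather than the article's values.

```python
import math


def knn_anomaly_score(train_points, query, k=3):
    """Distance from `query` to its k-th nearest training point."""
    distances = sorted(
        math.sqrt(sum((q - t) ** 2 for q, t in zip(query, point)))
        for point in train_points
    )
    return distances[k - 1]


def detect_unknown(train_points, query, k=3, threshold=1.0):
    """Return -1 (unknown) when the score exceeds the threshold, else 0 (normal)."""
    return -1 if knn_anomaly_score(train_points, query, k) > threshold else 0


# Made-up (V4, V14)-style training points clustered near the origin.
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
near = detect_unknown(train, (0.05, 0.05))  # inside the cluster
far = detect_unknown(train, (5.0, 5.0))     # far from every training point
```

A point far from all training data gets a large score and is labeled unknown, which is condition ② that the artificial data has to meet.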

Implementation

The implementation is split into the following parts.

    1. Dashboard ← this time
    2. Verification tool
    3. HITL ((1) and (2) + model retraining mechanism)

1. Dashboard

■ Implementation details

--Goal: Simple dashboard (chart + α)
--Input: Table data (CSV file)
--Output: Data to be verified, with high (observed) anomaly scores
--Abnormality judgment: Display the model's anomaly score side by side with the actual data
--Other: Real-time updates
--+α: In addition to the chart, a pie chart, distplots, bar graphs, and a table are implemented
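One way to picture the "Output" item above is as a filter over the monitored rows. The sketch below assumes each row carries the classifier's fraud score under `score` and the detection model's label under `detector` (0 or -1). The field names, the 0.5 threshold, and the sample values are hypothetical and not taken from the article's code.

```python
def select_for_verification(rows, score_threshold=0.5):
    """Keep rows whose fraud score is high or that the detector marked unknown."""
    return [
        row for row in rows
        if row["score"] >= score_threshold or row["detector"] == -1
    ]


# Made-up rows: actual data (V4, V14) side by side with the model outputs.
stream = [
    {"V4": 0.2, "V14": 0.1, "score": 0.02, "detector": 0},
    {"V4": 4.5, "V14": -6.1, "score": 0.97, "detector": 0},
    {"V4": 9.0, "V14": 9.0, "score": 0.80, "detector": -1},
]
to_verify = select_for_verification(stream)
```

In a Dash application, a function like this would typically be called from a callback that fires periodically, so that the table of rows to verify is refreshed as new data arrives.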

■ Implementation image

HITL_3.png

■ Implementation result

simpledashboard.gif

Summary

This time, I implemented a dashboard for monitoring in Human In The Loop. The dashboard visualizes the actual data and the AI score in real time, which makes it easier to see which parts of the actual data the AI is paying attention to. (In this data, the AI seems to judge a transaction as fraudulent when V4 is positive and V14 is negative.) With Dash and plotly, I found the parts that would otherwise be written in HTML and CSS relatively easy to code. Above all, I hope you get a feel for how easily a web application that updates in real time like this can be implemented.

If you have any suggestions for improvement or questions, I would appreciate your comments.

References

  1. Human-in-the-loop AI that creates a better business, society, and future https://note.com/masayamori/n/n2764e3cecc05

  2. Kaggle - Credit Card Fraud Detection https://www.kaggle.com/mlg-ulb/creditcardfraud

  3. Create a web application that can do machine learning with Dash [Step1] https://wimper-1996.hatenablog.com/entry/2019/10/28/dash_machine_learning1

  4. Code published http://github.com/utmoto

  5. Dash https://dash.plotly.com/

  6. plotly https://plotly.com/
