[PYTHON] I tried to quantify how interesting the conversation is in a Zoom meeting

Introduction

Recently, more and more meetings and classes are being held over Zoom, but I feel it is hard to tell how interested people are in the conversation when we are not face-to-face. So I thought: why not try quantifying it? That is how this project came about.

Since this is my first post, some parts may be rough, but I hope you read it to the end :sweat:

Purpose

Capture an image (or video) of a Zoom meeting, recognize the faces in it, and measure how interested each participant is in the conversation.

Implementation

As a test

This time, I decided to use Amazon Rekognition to recognize the faces of people attending the Zoom meeting.

I referred to this article for how to use it. https://qiita.com/G-awa/items/477f2324552cb908ecd0
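Note that boto3 reads AWS credentials from the usual locations (environment variables, the profile created by `aws configure`, etc.), so those need to be set up before the code below will run. Here is a minimal connectivity check I find useful; the region and the file name `sample.jpg` are just placeholders, not part of the main script:


import boto3

# Create the Rekognition client (the region here is only an example)
rekognition = boto3.client('rekognition', region_name='ap-northeast-1')

# Send one local image and print how many faces were found
with open('sample.jpg', 'rb') as f:
    response = rekognition.detect_faces(Image={'Bytes': f.read()}, Attributes=['ALL'])
print(len(response['FaceDetails']), "face(s) detected")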

detect_face.py


import cv2
import numpy as np
import boto3

#Settings such as scale and color
scale_factor = .15
green = (0,255,0)
red = (0,0,255)
frame_thickness = 2
cap = cv2.VideoCapture(0)
rekognition = boto3.client('rekognition')

#font size
fontscale = 1.0
#Font color(B, G, R)
color = (0, 120, 238)
#font
fontface = cv2.FONT_HERSHEY_DUPLEX

#Loop until you press q.
while(True):

    #Capture frame
    ret, frame = cap.read()
    height, width, channels = frame.shape

    #Convert to JPEG. The image is sent to the API over the Internet, so keep it small.
    small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
    ret, buf = cv2.imencode('.jpg', small)

    #Send the frame to Amazon Rekognition
    faces = rekognition.detect_faces(Image={'Bytes':buf.tobytes()}, Attributes=['ALL'])

    #Draw a box around the face
    for face in faces['FaceDetails']:
        smile = face['Smile']['Value']
        cv2.rectangle(frame,
                      (int(face['BoundingBox']['Left']*width),
                       int(face['BoundingBox']['Top']*height)),
                      (int((face['BoundingBox']['Left']+face['BoundingBox']['Width'])*width),
                       int((face['BoundingBox']['Top']+face['BoundingBox']['Height'])*height)),
                      green if smile else red, frame_thickness)
        emotions = face['Emotions']
        i = 0
        for emotion in emotions:
            cv2.putText(frame,
                        str(emotion['Type']) + ": " + str(emotion['Confidence']),
                        (25, 40 + (i * 25)),
                        fontface,
                        fontscale,
                        color)
            i += 1

    #Show the result on the display
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

When I actually ran this code, face recognition and emotion analysis worked! However, processing video was too heavy and the program stopped partway through, so I decided to load still images instead. (The code above is from the article I referenced.)
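As an aside, the video version is heavy mainly because every captured frame triggers an API call. One way to lighten it, which I did not pursue here, would be to send only every N-th frame to Rekognition. A minimal sketch of that idea, reusing the same calls as detect_face.py:


import cv2
import boto3

cap = cv2.VideoCapture(0)
rekognition = boto3.client('rekognition')
scale_factor = .15
frame_count = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame_count += 1

    # Send only every 30th frame to the API (roughly once a second at 30 fps)
    if frame_count % 30 == 0:
        height, width, channels = frame.shape
        small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
        ret, buf = cv2.imencode('.jpg', small)
        faces = rekognition.detect_faces(Image={'Bytes': buf.tobytes()}, Attributes=['ALL'])
        print(len(faces['FaceDetails']), "face(s) in this frame")

    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()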

Screen capture

I referred to this article for image capture. https://qiita.com/koara-local/items/6a98298d793f22cf2e36

I used PIL to capture the screen.

capture.py


from PIL import ImageGrab

ImageGrab.grab().save("./capture/PIL_capture.png")

I created a separate folder called capture and saved the screenshot there.
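One small point: PIL will not create the capture folder for you, and saving fails if it does not exist yet. A version that creates the folder first before saving the same file:


import os
from PIL import ImageGrab

# Make sure the output folder exists before saving the screenshot
os.makedirs("./capture", exist_ok=True)
ImageGrab.grab().save("./capture/PIL_capture.png")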

Implementation

face_detect.py


import cv2
import numpy as np
import boto3

#Settings such as scale and color
scale_factor = .15
green = (0,255,0)
red = (0,0,255)
frame_thickness = 2
#cap = cv2.VideoCapture(0)
rekognition = boto3.client('rekognition')

#font size
fontscale = 1.0
#Font color(B, G, R)
color = (0, 120, 238)
#font
fontface = cv2.FONT_HERSHEY_DUPLEX


from PIL import ImageGrab

ImageGrab.grab().save("./capture/PIL_capture.png")

#Capture frame
#ret, frame = cap.read()
frame = cv2.imread("./capture/PIL_capture.png")
height, width, channels = frame.shape
frame = cv2.resize(frame,(int(width/2),int(height/2)),interpolation = cv2.INTER_AREA)

#Convert to JPEG. The image is sent to the API over the Internet, so keep it small.
small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
ret, buf = cv2.imencode('.jpg', small)

#Send the image to Amazon Rekognition
faces = rekognition.detect_faces(Image={'Bytes':buf.tobytes()}, Attributes=['ALL'])

#Draw a box around each face
for face in faces['FaceDetails']:
    smile = face['Smile']['Value']
    cv2.rectangle(frame,
                    (int(face['BoundingBox']['Left']*width/2),
                    int(face['BoundingBox']['Top']*height/2)),
                    (int((face['BoundingBox']['Left']/2+face['BoundingBox']['Width']/2)*width),
                    int((face['BoundingBox']['Top']/2+face['BoundingBox']['Height']/2)*height)),
                    green if smile else red, frame_thickness)
    #Add up an "interest" score: positive emotions add their confidence, negative ones subtract it
    emotions = face['Emotions']
    score = 0
    for emotion in emotions:
        if emotion["Type"] == "HAPPY":
            score = score + emotion["Confidence"]
        elif emotion["Type"] == "SURPRISED":
            score = score + emotion["Confidence"]
        elif emotion["Type"] == "DISGUSTED":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "ANGRY":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "CONFUSED":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "CALM":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "SAD":
            score = score - emotion["Confidence"]

    #Draw the score above the face once all emotions have been added up
    cv2.putText(frame,
                "interested" + ": " + str(round(score, 2)),
                (int(face['BoundingBox']['Left']*width/2),
                 int(face['BoundingBox']['Top']*height/2)),
                fontface,
                fontscale,
                color)

#Show the result on the display
cv2.imshow('frame', frame)
cv2.waitKey(0)
cv2.destroyAllWindows()

I used OpenCV to read the captured image. Amazon Rekognition returns a confidence score for each of the emotions HAPPY, DISGUSTED, SURPRISED, ANGRY, CONFUSED, CALM, and SAD, so I treat HAPPY and SURPRISED as positive emotions (high interest) and the others as negative emotions (low interest). The resulting interest score, which falls in the range -100 to 100, is drawn on each recognized face.

[Screenshot: スクリーンショット 2020-11-17 172257.png]

Since I could not gather people on Zoom, I am borrowing an image of people from https://tanachannell.com/4869
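The scoring logic in face_detect.py can also be written as a small helper, which makes the positive/negative split easier to see. This is just a sketch; the function name and the set constants are mine, not part of the script above:


POSITIVE = {"HAPPY", "SURPRISED"}
NEGATIVE = {"DISGUSTED", "ANGRY", "CONFUSED", "CALM", "SAD"}

def interest_score(emotions):
    """Positive emotions add their confidence, negative ones subtract it.
    The confidences sum to roughly 100, so the score stays in about -100 to 100."""
    score = 0
    for emotion in emotions:
        if emotion["Type"] in POSITIVE:
            score += emotion["Confidence"]
        elif emotion["Type"] in NEGATIVE:
            score -= emotion["Confidence"]
    return round(score, 2)

# Usage with the Rekognition response:
# score = interest_score(face['Emotions'])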

Amazon Rekognition has other features, so if you are interested, please take a look! https://docs.aws.amazon.com/ja_jp/rekognition/latest/dg/faces-detect-images.html

Problems

- If there are many participants in the Zoom meeting, the displayed text overlaps and becomes very hard to read.
- Since this captures the whole screen rather than the Zoom window itself, the command prompt ends up in the image unless it is minimized immediately after execution (a possible workaround is sketched below).
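For the second problem, PIL's ImageGrab can also capture just a rectangular region of the screen instead of the whole display, so one workaround would be to pass the area where the Zoom window sits. The coordinates below are only an example and would need to match your own layout:


from PIL import ImageGrab

# Capture only the region (left, top, right, bottom) where the Zoom window is shown.
# These coordinates are placeholders for illustration.
zoom_area = (0, 0, 1280, 720)
ImageGrab.grab(bbox=zoom_area).save("./capture/PIL_capture.png")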

Finally

I started writing this because I had made something I wanted people to see, and writing it up let me relive the process of building it, which was a learning experience in itself. It would be a lot of fun if something I made like this spread out into the world!

GitHub https://github.com/r-301/zoom-response-check
