Creating and reading a mark sheet with Python and OpenCV (tips for reliable reading)

I decided to run a paper questionnaire at work, so I built a mark sheet reader, mainly following the article "Making a simple OMR (mark sheet reader) with Python and OpenCV". With that approach, blank (unanswered) rows were often misrecognized, so I hope the adjustments described here are a useful reference.

Overall flow of the questionnaire

  1. Create a questionnaire with an embedded QR code in Excel. The QR code carries information that identifies the page number of the questionnaire and the individual respondent.

  2. Print and distribute

  3. Scan the collected questionnaires and convert them from PDF to JPG. I simply used a free conversion site on the net.

  4. Read the survey results from the converted JPG file

Environment

1. Create questionnaire

2. Scan

3. Read the mark sheet

Key points

1. Points when creating a questionnaire

2. Points when scanning

3. Points when reading the mark sheet

Details

1. Create questionnaire

1. Create QR code

There are multiple question sheets this time. To tell which sheet a set of answers belongs to and who filled it in, I decided to embed a QR code in each question sheet and, when reading, write the answers to CSV together with the information from the QR code.

--Reference: Generate and save QR code image with Python, Pillow, qrcode

def makeQr(qrMessage, fileName='result.png', filePath='resultQrCode/'):
    """Create a QR code from qrMessage and save it under filePath.

    Args:
        qrMessage (str): Text to encode in the QR code.
        fileName (str, optional): Output file name. Defaults to 'result.png'.
        filePath (str, optional): Output directory. Defaults to 'resultQrCode/'.
    """
    import os
    import qrcode

    img = qrcode.make(qrMessage)
    # Create the output directory if it does not exist yet
    os.makedirs(filePath, exist_ok=True)
    # os.path.join adds a path separator only when filePath lacks one
    outPath = os.path.join(filePath, fileName)

    img.save(outPath)
    print('File out:' + outPath)

if __name__ == '__main__':
    import re
    import sys

    # args[1]: QR text, args[2]: file name, args[3]: output path,
    # args[4:]: additional lines of QR text
    args = sys.argv
    if 1 < len(args):
        if re.findall('[/:*?"<>|]', args[1]):
            print('[Error] Prohibited characters in the QR text: / : * ? " < > |')
        elif 2 == len(args):
            makeQr(args[1])
        elif re.findall('[:*?"<>|]', args[2]):
            print('[Error] Prohibited characters in the file name: : * ? " < > |')
        elif 3 == len(args):
            makeQr(args[1], args[2])
        elif re.findall('[*?"<>|]', args[3]):
            print('[Error] Prohibited characters in the output path: * ? " < > |')
        elif 4 == len(args):
            makeQr(args[1], args[2], args[3])
        else:
            # Join the remaining arguments as extra lines of the QR text
            qrMessage = args[1]
            for qrMessageList in args[4:]:
                qrMessage = qrMessage + '\r\n' + qrMessageList
            makeQr(qrMessage, args[2], args[3])
    else:
        print('error: no arguments were given')
        print('usage1: makeQr.exe [QR_Text]')
        print('usage2: makeQr.exe [QR_Text] [Output_FileName]')
        print('usage3: makeQr.exe [QR_Text] [Output_FileName] [Output_FilePath]')
        print('usage4: makeQr.exe [QR_Text] [Output_FileName] [Output_FilePath] [QR_Text_Line2...]')

So that anyone can use this later, I turned it into an exe with py2exe. (PyInstaller would also work, but the result was heavy, so I chose py2exe here.)

--Reference: Using py2exe with python 3.5 - 3.7

2. Preparation of markers

Because the questions and the marks are placed together on one sheet, the mark area has to be cropped out of the scan. To do that, place distinctive black-and-white images (markers) at the four corners of the area to crop. Since this sheet also carries a QR code, the position-detection squares that the QR code itself uses (outlined in red in the figure) cannot be reused as markers. I am also working in Excel, and I judged that putting ★ as text inside a shape (AutoShape) would be the easiest option, so I prepared the ★ marker as a shape. The marker ★ also has to be passed to the analysis program as an image file, so I pasted the Excel shape into Paint or a similar tool and saved it. The image must have the same size and margins as the AutoShape, so it is best to shrink the canvas height and width to the minimum before pasting.

3. Preparation of mark

Prepare questions and marks in Excel. The points are as follows.

4. Resize the marker to fit the paper width and paste the marker

Because Excel enlarges or reduces the sheet when printing, the marker saved as an image and the marker as printed can differ in size, and recognition may then fail. The print zoom ratio therefore has to be obtained for each sheet.

--Reference: ExecuteExcel4Macro "Page.Setup ()"

Public Function getPrintZoomPer(sheetName As String) As Integer
    Worksheets(sheetName).Activate
    ExecuteExcel4Macro "Page.Setup(,,,,,,,,,,,,{1,#N/A})"
    ExecuteExcel4Macro "Page.Setup(,,,,,,,,,,,,{#N/A,#N/A})"
    getPrintZoomPer = ExecuteExcel4Macro("Get.Document(62)")
End Function

Keep the source marker shape on one sheet and paste copies of it at the four corners of each sheet that needs markers. When pasting, scale the marker by the reciprocal of the zoom ratio obtained above. Because the number of questions and choices varies, I did this in VBA instead of fixing the positions by hand.

Public Sub insertMaker(sheetName As String, pasteCellStr As String, _
         printZoomPer As Integer)
    ' sheetName: name of the destination sheet
    ' pasteCellStr: address of the cell to paste at, e.g. A4 or B1
    ' printZoomPer: print zoom of the destination sheet, in percent
    Dim srcShape As shape
    Set srcShape = Worksheets("sheet1").Shapes("marker") 
    ' sheet1: name of the sheet that holds the source marker shape
    ' marker: name of the source marker shape
    srcShape.Copy
    With Worksheets(sheetName).Pictures.Paste
        .Top = Worksheets(sheetName).Range(pasteCellStr).Top
        .Left = Worksheets(sheetName).Range(pasteCellStr).Left
        .Name = "marker"
        .Width = .Width * 100 / printZoomPer
    End With
End Sub
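The scaling in insertMaker is plain ratio arithmetic, and the same kind of ratio appears again on the reading side, where the marker image is resized by scan_dpi / marker_dpi. The following is a minimal sketch of both calculations in Python. The function names and the sample numbers are hypothetical and only illustrate the arithmetic; they are not part of the original macros.

```python
def pasted_width(shape_width: float, print_zoom_percent: int) -> float:
    """Width to give the pasted marker so that it prints at shape_width."""
    return shape_width * 100 / print_zoom_percent


def printed_width(pasted: float, print_zoom_percent: int) -> float:
    """Width on paper after Excel applies the print zoom."""
    return pasted * print_zoom_percent / 100


def marker_pixels_in_scan(marker_px: int, marker_dpi: int, scan_dpi: int) -> int:
    """Size of the marker template after scaling it to the scan resolution."""
    return int(marker_px * scan_dpi / marker_dpi)


if __name__ == "__main__":
    w = pasted_width(30.0, 80)                  # sheet printed at 80 %
    print(w)                                    # 37.5
    print(printed_width(w, 80))                 # 30.0
    print(marker_pixels_in_scan(60, 120, 200))  # 100
```

Enlarging the pasted shape by 100 / zoom cancels the print zoom, so the marker on paper keeps the size of the saved marker image.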

5. Insert QR code

The QR code text is built as "questionnaire type + questionnaire page number + branch office + person number + number of choices + number of questions". The exe file created in "1. Create QR code" is called from the Excel macro through WScript.Shell.

Public Function makeQr(ByVal QrMessage As String, ByVal fileName As String) As String
    Dim WSH, wExec, sCmd As String
    Set WSH = CreateObject("WScript.Shell")
    
    ' ThisWorkbook.Path has no trailing backslash, so add one before each name
    sCmd = ThisWorkbook.Path & "\makeQR.exe " & QrMessage & " " & fileName & " " & _
           ThisWorkbook.Path & "\resultQrCode"
    Set wExec = WSH.Exec("%ComSpec% /c " & sCmd)
    
    Do While wExec.Status = 0
        DoEvents
    Loop
    makeQr = wExec.StdOut.readall

    Set wExec = Nothing
    Set WSH = Nothing

End Function
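On the reading side, the text stored in the QR code has to be split back into its six fields. The article does not say how the fields are separated, so the sketch below assumes a comma, which is not among the characters that makeQr.exe rejects. The class name SheetInfo, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass

FIELD_SEPARATOR = ","  # assumption: the article does not state the separator


@dataclass(frozen=True)
class SheetInfo:
    survey_type: str
    page: int
    branch: str
    person_id: str
    n_col: int  # number of choices per question
    n_row: int  # number of questions on the page

    def to_qr_text(self) -> str:
        """Build the text to embed in the QR code."""
        return FIELD_SEPARATOR.join(str(v) for v in (
            self.survey_type, self.page, self.branch,
            self.person_id, self.n_col, self.n_row))

    @classmethod
    def from_qr_text(cls, text: str) -> "SheetInfo":
        """Parse the text read back from the QR code."""
        parts = text.strip().split(FIELD_SEPARATOR)
        if len(parts) != 6:
            raise ValueError(f"expected 6 fields, got {len(parts)}: {text!r}")
        survey_type, page, branch, person_id, n_col, n_row = parts
        return cls(survey_type, int(page), branch, person_id,
                   int(n_col), int(n_row))


if __name__ == "__main__":
    info = SheetInfo("A", 2, "Tokyo", "0042", 6, 9)
    text = info.to_qr_text()
    print(text)                                  # A,2,Tokyo,0042,6,9
    print(SheetInfo.from_qr_text(text) == info)  # True
```

Carrying the number of choices and questions in the QR code means the reader can take n_col and n_row from the sheet itself instead of hard-coding them.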

After all of these steps, the result looks like the following (Enquete.PNG).

Print and distribute the sheets created this way.

2. Scan

After the survey is complete, scan the sheets. Because the mark sheet reading side has a resolution setting, I fixed the scan resolution at 200 dpi this time. (The multifunction printer at work could not save directly in JPG format, so I saved as PDF and converted the files with a PDF-to-JPG conversion site.) Also note that the reading side uses OpenCV, and double-byte characters could not be used in the file names.

3. Read the mark sheet

1. Read QR code

Put the collected JPG files in one folder, read the QR code from each file, and return its contents as a string.

def qrCodeToStr(filePath):
    """Read the string from the QR code.

    Args:
        filePath (str): Path of the image file containing the QR code.
    Returns:
        str: Text decoded from the QR code (empty string on failure).
    """
    import cv2

    img = cv2.imread(filePath, cv2.IMREAD_GRAYSCALE)
    # Decode the QR code
    qr = cv2.QRCodeDetector()
    data, _, _ = qr.detectAndDecode(img)

    if data == '':
        print('[ERROR] No QR code was found in ' + filePath)
    else:
        print(data)
    return data
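Processing a whole folder and writing the answers to CSV, as described in the overall flow, can be sketched as follows. The function collect_results and its parameters are hypothetical. The per-file reader is passed in as a callable so that the loop does not depend on how a single sheet is decoded.

```python
import csv
from pathlib import Path
from typing import Callable, Iterable, List


def collect_results(folder: str,
                    read_sheet: Callable[[str], List[object]],
                    out_csv: str,
                    patterns: Iterable[str] = ("*.jpg", "*.jpeg")) -> int:
    """Run read_sheet on every JPG in folder and write one CSV row per file.

    Returns the number of files processed.
    """
    # Sort so that the CSV rows come out in a stable order
    files = sorted(p for pat in patterns for p in Path(folder).glob(pat))
    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for path in files:
            writer.writerow(read_sheet(str(path)))
    return len(files)
```

In this article's terms, read_sheet would be a small wrapper that calls qrCodeToStr on the file, takes the number of choices and questions from the decoded text, and returns the list produced by the mark sheet reader.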

2. Read the mark sheet

This is almost the same as in the article "Making a simple OMR (mark sheet reader) with Python and OpenCV". The changes are as follows.

--The marker-matching threshold is looped in a for statement, from the highest value downward, until a value that works is found.
--This time unanswered questions were accepted but multiple answers were not, so marks are first extracted as the cells whose sum is more than 4 times the row average. If that yields multiple answers, the cells above half of the maximum value are extracted instead.

def changeMarkToStr(scanFilePath, n_col, n_row, message):
    """Read the mark sheet and append the answer for each question to message.

    Args:
        scanFilePath (str): Path of the JPEG file containing the mark sheet.
        n_col (int): Number of choices (number of columns).
        n_row (int): Number of questions (number of rows).
        message (list): List that receives the results; modified in place.
    Returns:
        list: message, with scanFilePath inserted at the front and one entry
            per question appended (answer number, 'multi answer:...' or 'None').
    """
    ### n_col = 6 #Number of marks per line
    ### n_row = 9 #Number of lines of mark
    import numpy as np
    import cv2

    ### Marker settings
    marker_dpi = 120  # Screen resolution (marker size)
    scan_dpi = 200  # Resolution of the scanned image

    # Read the marker file in grayscale (mode = 0)
    marker = cv2.imread('img/setting/marker.jpg', 0)

    # Get the size of the marker (shape is (height, width), so reverse it)
    w, h = marker.shape[::-1]

    # Resize the marker to the scan resolution; cv2.resize takes (width, height)
    marker = cv2.resize(marker, (int(w*scan_dpi/marker_dpi), int(h*scan_dpi/marker_dpi)))

    ###Load scanned image
    img = cv2.imread(scanFilePath,0)

    res = cv2.matchTemplate(img, marker, cv2.TM_CCOEFF_NORMED)

    ## Crop the mark area from the marker positions, retrying with lower
    ## thresholds. The extraction settings are as follows.
    margin_top = 1  # Number of margin rows at the top
    margin_bottom = 0  # Number of margin rows at the bottom

    scan = img  # keep the full scan so every attempt crops from the original
    for threshold in [0.8, 0.75, 0.7, 0.65, 0.6]:

        loc = np.where(res >= threshold)
        xs = sorted(loc[1])
        ys = sorted(loc[0])
        mark_area = {}
        try:
            mark_area['top_x'] = xs[0]
            mark_area['top_y'] = ys[0]
            mark_area['bottom_x'] = xs[-1]
            mark_area['bottom_y'] = ys[-1]

            # Gap between the two outermost matches on each side
            topX_error = xs[1] - xs[0]
            bottomX_error = xs[-1] - xs[-2]
            topY_error = ys[1] - ys[0]
            bottomY_error = ys[-1] - ys[-2]
            img = scan[mark_area['top_y']:mark_area['bottom_y'], mark_area['top_x']:mark_area['bottom_x']]

            if (topX_error < 5 and bottomX_error < 5 and topY_error < 5 and bottomY_error < 5):
                break
        except IndexError:
            # Too few matches at this threshold; try the next one
            continue

    # Next, to simplify the later processing, resize the cropped image
    # to an integer multiple of the number of columns and rows.
    # Here each column and each row becomes 100 pixels.
    # The row count includes the margin rows between the mark area and the markers.

    n_row = n_row + margin_top + margin_bottom
    img = cv2.resize(img, (n_col*100, n_row*100))

    ### Blur
    img = cv2.GaussianBlur(img,(5,5),0)

    ### Binarize; with THRESH_OTSU the threshold is chosen automatically and the 50 is ignored
    res, img = cv2.threshold(img, 50, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

    ### Invert black and white
    img = 255 - img
    cv2.imwrite('img/res.png',img)

    # Mark recognition

    ### Prepare a list to hold the results
    result = []
    
    ### Process row by row (skipping the margin rows)
    for row in range(margin_top, n_row - margin_bottom):
        
        ### Cut out only the row being processed
        tmp_img = img[row*100:(row+1)*100,]
        area_sum = []  # Pixel sum of each cell in this row

        ### Process each mark
        for col in range(n_col):

            ### Sum the pixel values of each mark area with NumPy
            area_sum.append(np.sum(tmp_img[:,col*100:(col+1)*100]))

        ### Judge a cell as marked when its sum exceeds 4 times the row average.
        ### Cells that were actually filled in came to 4.9 to 6 times the average,
        ### while some rows with nothing filled in still reached about 3 times.
        ### A rule based on 3 times the median cannot be used when zeros continue.
        area_sum = np.array(area_sum)
        ressss = (area_sum > np.average(area_sum) * 4)
        # The rule above can pick up several cells, so in that case
        # keep only the cells above half of the maximum value.
        if np.sum(ressss) > 1:
            ressss = (area_sum > np.max(area_sum) * 0.5)
        result.append(ressss)

    for x in range(len(result)):
        res = np.where(result[x]==True)[0]+1
        if len(res)>1:
            message.append('multi answer:' + str(res))
        elif len(res)==1:
            message.append(int(res[0]))  # plain int instead of a NumPy scalar
        else:
            message.append('None')
    message.insert(0,scanFilePath)
    print(message)
    return message
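The decision rule for one row can be tried in isolation, which makes it easier to tune the factors against your own scans. The sketch below reimplements only that rule; pick_marks, its parameter names, and the sample sums are hypothetical and are not taken from real scan data.

```python
import numpy as np


def pick_marks(area_sum, mean_factor=4.0, max_ratio=0.5):
    """Return a boolean array that is True where a cell counts as marked."""
    sums = np.asarray(area_sum, dtype=float)
    # First pass: cells whose sum exceeds mean_factor times the row average
    marked = sums > sums.mean() * mean_factor
    if marked.sum() > 1:
        # Several cells passed: keep those above max_ratio of the darkest cell
        marked = sums > sums.max() * max_ratio
    return marked


if __name__ == "__main__":
    one_mark = [120, 90, 5000, 110, 80, 100]  # third choice filled in
    blank = [120, 90, 100, 110, 80, 100]      # nothing filled in
    print(np.flatnonzero(pick_marks(one_mark)) + 1)  # [3]
    print(np.flatnonzero(pick_marks(blank)) + 1)     # []
```

One property of this rule is worth checking against your own sheets. Because the sums are non-negative, two cells can both exceed 4 times the row average only when the row has nine or more columns. With six choices, a row with two filled cells therefore fails the first test and is reported as unanswered, not as a multiple answer.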

This is rough because it started as a memo to myself, but I hope it helps.
