We decided to run a paper questionnaire at work, so I built a mark sheet reader, mainly following the article "Making a simple OMR (mark sheet reader) with Python and OpenCV". Blank (unanswered) rows were often misrecognized at first, so I hope the tips below are a useful reference.

I also referred to various other sources; they are cited individually where they are used.

Overall flow of the questionnaire

Create a questionnaire in Excel with a QR code embedded. The QR code carries information that identifies the page number of the questionnaire and the respondent.
Print and distribute
Scan the collected questionnaires and convert them from PDF to JPG. I converted them with a free conversion site on the net.
Read the survey results from the converted JPG files

Environment

1. Create questionnaire

Python 3.5
qrcode
Pillow
Excel

2. Scan

Workplace scanner (multifunction device)
PDF => JPG conversion site Example: https://pdftoimage.com/ja/

3. Read the mark sheet

Python 3.5
OpenCV-Contrib-Python 4.4
numpy 1.15

Key points

1. Points when creating a questionnaire

Make the position marker (feature point) printed on the paper the same size as the marker image file that the program uses as its reference.
Set the mark positions on the questionnaire at equal intervals
Place the position markers (feature points) so that the distance from their upper-left corner to the marks is an exact integer multiple of the mark interval.

2. Points when scanning

Do not include double-byte (full-width) characters in the file name; OpenCV cannot read such paths. I think extra processing could handle them, but since these are only intermediate files, I simply avoided full-width characters and did nothing more.
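
As one possible safeguard, the scanned file names can be checked before they are handed to OpenCV. The following is a minimal sketch using only the standard library; the helper names has_non_ascii and find_problem_files are hypothetical and are not part of the original program.

```python
import os


def has_non_ascii(name):
    """Return True if the name contains any character outside ASCII."""
    try:
        name.encode('ascii')
    except UnicodeEncodeError:
        return True
    return False


def find_problem_files(folder):
    """List the file names in folder that contain non-ASCII characters."""
    return [f for f in sorted(os.listdir(folder)) if has_non_ascii(f)]
```

Calling find_problem_files on the folder that holds the converted JPG files returns the names that should be renamed before reading.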

3. Points when reading the mark sheet

Set the extraction conditions according to the answer rules (whether multiple selections are allowed, and whether blank answers are allowed). This time, multiple selections were not allowed and blank answers were allowed.

Details

1. Create questionnaire

1. Create QR code

Since there are multiple question sheets this time, I embedded a QR code in each sheet to identify which questions the sheet covers and who filled it in. When a sheet is read, the answers are written to CSV together with this information.

--Reference: Generate and save QR code image with Python, Pillow, qrcode

def makeQr(qrMessage, fileName='result.png', filePath='resultQrCode/'):
    """Create a QR code from the argument qrMessage and save it in filePath
    Args:
        qrMessage (str):Text to encode in the QR code
        fileName (str, optional):Output file name. Defaults to 'result.png'.
        filePath (str, optional):Output folder path. End it with "/" (a separator is appended if it is missing). Defaults to 'resultQrCode/'.
    """
    import qrcode
    import os

    img = qrcode.make(qrMessage)
    if not os.path.isdir(filePath):
        os.makedirs(filePath)
    if not(filePath[-1] == '\\' or filePath[-1] == '/'):
        filePath = filePath + '\\'
    
    img.save(filePath + fileName)
    print('File out:' + filePath + fileName)

if __name__ == '__main__':
    import re
    import sys
    
    args = sys.argv
    if 1 < len(args):
        if re.findall('[/:*?"<>|]', args[1]):
            print('[Error] Prohibited character in the QR text: /:*?"<>|')
        elif 2 == len(args):
            makeQr(args[1])
        elif re.findall('[:*?"<>|]', args[2]):
            print('[Error] Prohibited character in the file name: :*?"<>|')
        elif 3 == len(args):
            makeQr(args[1],args[2])
        elif re.findall('[*?"<>|]', args[3]):
            print('[Error] Prohibited character in the file path: *?"<>|')
        elif 4 == len(args):
            makeQr(args[1],args[2],args[3])
        else:
            # The 5th and later arguments become additional lines of the QR text
            qrMessage = args[1]
            for qrMessageList in args[4:]:
                qrMessage = qrMessage + '\r\n' + qrMessageList
            makeQr(qrMessage,args[2],args[3])
    else:
        print('error: no arguments were given')
        print('usage1: makeQr.exe [QR_Text]')
        print('usage2: makeQr.exe [QR_Text] [Output_FileName]')
        print('usage3: makeQr.exe [QR_Text] [Output_FileName] [Output_FilePath]')
        print('usage4: makeQr.exe [QR_Text] [Output_FileName] [Output_FilePath] [QR_Text_Line2...]')

So that anyone can use this later, I turned it into an exe with py2exe. (PyInstaller would also work, but the result was heavy, so I chose py2exe here.)
--Reference: Using py2exe with Python 3.5 - 3.7

2. Preparation of markers

Because the questions and the mark area are on the same sheet, the mark area has to be cut out of the scanned image. To do that, prepare a distinctive black-and-white image at each of the four corners of the area to cut out. Since this sheet also carries a QR code, the position markers that the QR code itself uses (outlined in red in the original figure) cannot be used. Also, since I am using Excel, I figured it would be easy to set ★ as text inside a shape (AutoShape), so I prepared ★ as a shape. The ★ marker also has to be passed to the analysis program as an image file, so I pasted the Excel shape into Paint or a similar tool and saved it. The size and margins of the image must match the AutoShape, so it is best to shrink the canvas height and width to the minimum before pasting.

3. Preparation of mark

Prepare questions and marks in Excel. The points are as follows.

Space the cells to be marked at equal heights and widths, measured from the upper-left corner of the marker.
Make sure the upper-left marker does not extend beyond the height of its row
Write the mark symbols ([a], etc.) in light-colored, vertically written letters

4. Resize the marker to fit the paper width and paste the marker

Excel enlarges or reduces the sheet when printing, so the marker saved as an image and the marker on the printed page can differ in size, and recognition may then fail. The print zoom therefore has to be obtained for each sheet.
--Reference: ExecuteExcel4Macro "Page.Setup()"

Public Function getPrintZoomPer(sheetName As String) As Integer
    Worksheets(sheetName).Activate
    ExecuteExcel4Macro "Page.Setup(,,,,,,,,,,,,{1,#N/A})"
    ExecuteExcel4Macro "Page.Setup(,,,,,,,,,,,,{#N/A,#N/A})"
    getPrintZoomPer = ExecuteExcel4Macro("Get.Document(62)")
End Function

Keep the source marker on one sheet and paste it at the four corners of each sheet that needs markers. When pasting, scale the marker by the reciprocal of the print zoom obtained above. Since the number of questions and choices varies, I did this in VBA instead of fixing the positions by hand.

Public Sub insertMaker(sheetName As String, pasteCellStr As String, _
         printZoomPer As Integer)
    ' sheetName:Sheet name of the paste destination
    ' pasteCellStr:Address of the cell to paste to. Example: A4, B1, etc.
    ' printZoomPer:Print zoom (percent) of the destination sheet
    Dim srcShape As shape
    Set srcShape = Worksheets("sheet1").Shapes("marker") 
    ' sheet1:Name of the sheet that holds the source marker Shape
    ' marker:Name of the source marker Shape
    srcShape.Copy
    With Worksheets(sheetName).Pictures.Paste
        .Top = Worksheets(sheetName).Range(pasteCellStr).Top
        .Left = Worksheets(sheetName).Range(pasteCellStr).Left
        .Name = "marker"
        .Width = .Width * 100 / printZoomPer
    End With
End Sub
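
The width correction in insertMaker is plain arithmetic. As a rough illustration in Python (the function name compensated_width and the sample numbers are made up for this example):

```python
def compensated_width(width, print_zoom_percent):
    """Width to give the pasted marker so that it comes out at `width`
    when the sheet is printed at print_zoom_percent percent."""
    return width * 100 / print_zoom_percent


# A marker that should print 40 points wide on a sheet printed at 80%
# has to be pasted 50 points wide, because 50 * 0.80 = 40.
print(compensated_width(40, 80))
```

With a print zoom of 100 the width is returned unchanged, so sheets that are not scaled need no special handling.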

5. Insert QR code

Put "questionnaire type + questionnaire page number + branch office + person number + number of choices + number of questions" into the QR code. The exe file created in "1. Create QR code" is called from the Excel macro with WScript.Shell.

Public Function makeQr(ByVal QrMessage As String, ByVal fileName As String) As String
    Dim WSH, wExec, sCmd As String
    Set WSH = CreateObject("WScript.Shell")
    
    ' ThisWorkbook.Path has no trailing "\", so add the separator explicitly
    sCmd = ThisWorkbook.Path & "\makeQR.exe " & QrMessage & " " & fileName & " " & _
           ThisWorkbook.Path & "\resultQrCode"
    Set wExec = WSH.Exec("%ComSpec% /c " & sCmd)
    
    Do While wExec.Status = 0
        DoEvents
    Loop
    makeQr = wExec.StdOut.readall

    Set wExec = Nothing
    Set WSH = Nothing

End Function

After all of the steps above, the sheet looks like the one below.

This is what gets printed and distributed.

2. Scan

After the survey is complete, scan the sheets. The mark sheet reading side has a resolution setting, so I fixed the scan resolution at 200 dpi this time. (The multifunction device at work could not save directly in JPG format, so I saved as PDF and converted with a PDF => JPG conversion site.) Also, because OpenCV is used on the reading side, note that double-byte characters cannot be used in the file name.
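
The reason the scan resolution has to be fixed is that the marker image is resized by the ratio of the two resolutions before matching. A small sketch of that arithmetic, using the 120 dpi and 200 dpi values from the reading code later in this article (the function name scaled_marker_size and the 60 x 60 sample size are assumptions for illustration):

```python
def scaled_marker_size(width_px, height_px, marker_dpi=120, scan_dpi=200):
    """Pixel size the marker image is resized to so that it matches the
    marker as it appears in a scan taken at scan_dpi."""
    # Multiply before dividing, as the reading code does
    return (int(width_px * scan_dpi / marker_dpi),
            int(height_px * scan_dpi / marker_dpi))


# A 60 x 60 pixel marker image saved at 120 dpi
print(scaled_marker_size(60, 60))
```

If the sheets were scanned at a different resolution, scan_dpi would have to change with it, otherwise the resized marker no longer matches the printed one.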

3. Read the mark sheet

1. Read QR code

Put the collected JPG files in one folder, read the QR code from each file, and return the decoded string.

def qrCodeToStr(filePath):
    """Read the string from the QR code
    Args:
        filePath (String):The path of the image file containing the QR code
    Returns:
        String:Result of reading the QR code(empty string on failure)
    """
    import cv2

    img = cv2.imread(filePath, cv2.IMREAD_GRAYSCALE)
    #QR code decoding
    qr = cv2.QRCodeDetector()
    data,_,_ = qr.detectAndDecode(img)

    if data == '':
        print('[ERROR] QR code was not found in ' + filePath)
    else:
        print(data)
    return data
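
The decoded string then has to be split back into the fields that were put into the QR code. The article does not say how the fields are separated, so the sketch below assumes a comma; the names SheetInfo and parse_qr_payload and the sample payload are likewise hypothetical.

```python
from collections import namedtuple

# Field order follows the payload described in "5. Insert QR code".
# The comma separator is an assumption made for this sketch.
SheetInfo = namedtuple('SheetInfo', [
    'survey_type', 'page', 'branch', 'person', 'n_col', 'n_row'])


def parse_qr_payload(data, sep=','):
    """Split the decoded QR string into named fields.

    Returns None when the string is empty or has the wrong field count."""
    if not data:
        return None
    parts = data.split(sep)
    if len(parts) != len(SheetInfo._fields):
        return None
    survey_type, page, branch, person, n_col, n_row = parts
    # Number of choices and number of questions are needed as integers
    return SheetInfo(survey_type, page, branch, person, int(n_col), int(n_row))


info = parse_qr_payload('A,2,Tokyo,0015,6,9')
print(info.n_col, info.n_row)
```

The n_col and n_row fields correspond to the number of choices and the number of questions that the mark sheet reading function takes as arguments.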

2. Read the mark sheet

This is almost the same as this article (Making a simple OMR (mark sheet reader) with Python and OpenCV). The changes are as follows.

--The matching threshold is looped from the highest value downward in a for statement to find a value that works
--This time blank answers were accepted and multiple answers were not, so marks are first extracted at more than 4 times the average value, and if several marks are extracted, at more than half of the maximum value

def changeMarkToStr(scanFilePath, n_col, n_row, message):
    """Read the mark sheet and append the answer to each question to message
    Args:
        scanFilePath (String):Path of JPEG file including mark sheet format
        n_col (int):Number of choices(Number of columns)
        n_row (int):Number of questions(Number of lines)
        message (list):List that the answers are appended to
    Returns:
        list:message with scanFilePath inserted at the front and one entry appended per question(selected number, 'None', or 'multi answer:...')
    """
    ### n_col = 6 #Number of marks per line
    ### n_row = 9 #Number of lines of mark
    import numpy as np
    import cv2

    ###Marker settings
    marker_dpi = 120 #Screen resolution(Marker size)
    scan_dpi = 200 #Scanned image resolution

    #Read the marker file in grayscale(mode = 0)
    marker=cv2.imread('img/setting/marker.jpg',0) 

    #Get the size of the marker(width, height)
    w, h = marker.shape[::-1]

    #Resize the marker to the scan resolution(cv2.resize takes (width, height))
    marker = cv2.resize(marker, (int(w*scan_dpi/marker_dpi), int(h*scan_dpi/marker_dpi)))

    ###Load scanned image
    img = cv2.imread(scanFilePath,0)

    res = cv2.matchTemplate(img, marker, cv2.TM_CCOEFF_NORMED)

    ##Search for the markers, lowering the threshold until the matches are consistent
    margin_top = 1 #Number of top margin lines
    margin_bottom = 0 #Number of bottom margin lines
 
    mark_area = None
    for threshold in [0.8, 0.75, 0.7, 0.65, 0.6]:
    
        loc = np.where( res >= threshold)
        xs = sorted(loc[1])
        ys = sorted(loc[0])
        if len(xs) < 2:
            #Too few matches at this threshold, so try the next lower one
            continue

        mark_area = {'top_x': xs[0], 'top_y': ys[0],
                     'bottom_x': xs[-1], 'bottom_y': ys[-1]}

        #Gap between the two outermost matches on each side
        topX_error = xs[1] - xs[0]
        bottomX_error = xs[-1] - xs[-2]
        topY_error = ys[1] - ys[0]
        bottomY_error = ys[-1] - ys[-2]

        if (topX_error < 5 and bottomX_error < 5 and topY_error < 5 and bottomY_error < 5):    
            break

    #Cut out the mark area only once, after the search, so that a retry at a
    #lower threshold is never applied to an already cropped image
    if mark_area is not None:
        img = img[mark_area['top_y']:mark_area['bottom_y'],mark_area['top_x']:mark_area['bottom_x']]

    #Next, to simplify the later processing, resize the cut-out image
    #to an integer multiple of the number of columns and rows.
    #Here, each column and each row becomes 100 pixels.
    #When counting the rows, include the margin between the mark area and the markers.

    n_row = n_row + margin_top + margin_bottom
    img = cv2.resize(img, (n_col*100, n_row*100))

    ###Blur
    img = cv2.GaussianBlur(img,(5,5),0)

    ###Binarize(THRESH_OTSU chooses the threshold automatically, so the 50 is ignored)
    _, img = cv2.threshold(img, 50, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

    ###Black and white inversion
    img = 255 - img
    cv2.imwrite('img/res.png',img)

    #Mark recognition

    ###Prepare an array to put the result
    result = []
    
    ###Row-by-row processing(margin rows are excluded)
    for row in range(margin_top, n_row - margin_bottom):
        
        ###Cut out only the row to be processed
        tmp_img = img [row*100:(row+1)*100,]
        area_sum = [] #Array to put the total value

        ###Processing of each mark
        for col in range(n_col):

            ###Find the total value of the images in each mark area with NumPy
            area_sum.append(np.sum(tmp_img[:,col*100:(col+1)*100]))

        ###Judge whether the total value of the image area is more than 4 times the average value
        ###Cells that were actually filled in came out at 4.9 to 6 times the average; some cells that were not filled in at all reached 3 times
        ###Using 3 times the median does not work when zeros continue
        marked = (area_sum > np.average(area_sum) * 4)
        #The condition above can extract several marks, so in that case keep only those above half of the maximum value.
        if np.sum(marked) > 1:
            marked = (area_sum > np.max(area_sum) * 0.5)
        result.append(marked)

    for x in range(len(result)):
        res = np.where(result[x]==True)[0]+1
        if len(res)>1:
            message.append('multi answer:' + str(res))
        elif len(res)==1:
            message.append(res[0])
        else:
            message.append('None')
    message.insert(0,scanFilePath)
    print(message)
    return message
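
The per-row decision inside the function can be tried in isolation on toy numbers. The sketch below restates the same two-step rule with NumPy only; the function name judge_row and the sample totals are made up for illustration and do not come from real scans.

```python
import numpy as np


def judge_row(area_sum):
    """Return the 1-based column numbers judged as marked in one row."""
    area_sum = np.asarray(area_sum, dtype=float)
    # First pass: cells whose total is more than 4 times the row average
    marked = area_sum > np.average(area_sum) * 4
    # If several cells pass, keep only those above half of the maximum
    if np.sum(marked) > 1:
        marked = area_sum > np.max(area_sum) * 0.5
    return [int(i) + 1 for i in np.where(marked)[0]]


# Toy totals for rows with six choices (made-up numbers)
print(judge_row([100, 120, 5000, 110, 90, 100]))  # third cell filled in
print(judge_row([100, 120, 110, 90, 100, 105]))   # nothing filled in
```

Because the first pass compares against a multiple of the row average, a row where nothing stands out returns an empty list, which is how blank answers are accepted.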

This is rough because it is a memo for myself, but I hope it helps.

Reading and creating a mark sheet using Python OpenCV (Tips for reading well)
