Overview

After running GCP text detection from Power Automate Desktop (PAD), I applied a mosaic to the detected characters in the image with Python's OpenCV. This article records the points to watch out for when doing so. (Animation: the numbers in the image are mosaicked.)

Prerequisites

Windows 10 Pro 20H2, Power Automate Desktop 2.2.20339.22608, GCP Vision API

Python 3.8.5, pandas 1.1.4, numpy 1.18.5, opencv-python 4.4.0.46. The environment is set up so that .py files can be executed directly.

I installed Python with the installer from https://www.python.org/ rather than a distribution, and installed the libraries separately. This information is current as of December 2020.

Notes

PAD has an action that runs Python 2, but it is unclear whether libraries such as OpenCV are available to that built-in interpreter, so I call the Python installed in my own environment instead. As I described in a previous article, WinAutomation could generate code with the Write Text action and run it with any Python. PAD places restrictions on that approach. I use the same approach here, but because of those restrictions it is not officially supported at this time.

Flow image

Flow creation

Create two folders "Landscape" and "LMOut" on the desktop.

On the 4th line, enter the API key you obtained in the "Text detection" action.

Folder.GetSpecialFolder SpecialFolder: Folder.SpecialFolder.DesktopDirectory SpecialFolderPath=> SpecialFolder
Folder.GetFiles Folder: $'''%SpecialFolder%\\Landscape''' FileFilter: $'''*.jpg''' IncludeSubfolders: False FailOnAccessDenied: True SortBy1: Folder.SortBy.NoSort SortDescending1: False SortBy2: Folder.SortBy.NoSort SortDescending2: False SortBy3: Folder.SortBy.NoSort SortDescending3: False Files=> Files
LOOP FOREACH CurrentItem IN Files
    Cognitive.Google.Vision.TextDetectionFromFile APIKey: $'''''' ImageFile: CurrentItem Timeout: 30 Response=> JSONResponse StatusCode=> StatusCode
    File.WriteText File: $'''%SpecialFolder%/Landscape/%CurrentItem.NameWithoutExtension%.json''' TextToWrite: JSONResponse AppendNewLine: False IfFileExists: File.IfFileExists.Overwrite Encoding: File.FileEncoding.UTF8NoBOM
    File.WriteText File: $'''%SpecialFolder%/Landscape/%CurrentItem.NameWithoutExtension%.py''' TextToWrite: $'''aaaaaaaa''' AppendNewLine: False IfFileExists: File.IfFileExists.Overwrite Encoding: File.FileEncoding.UTF8
    System.RunDOSCommand DOSCommandOrApplication: $'''%SpecialFolder%/Landscape/%CurrentItem.NameWithoutExtension%.py''' WorkingDirectory: $'''%SpecialFolder%/Landscape''' StandardOutput=> CommandOutput StandardError=> CommandErrorOutput ExitCode=> CommandExitCode
    File.Delete Files: $'''%SpecialFolder%/Landscape/%CurrentItem.NameWithoutExtension%.py'''
    File.Delete Files: $'''%SpecialFolder%/Landscape/%CurrentItem.NameWithoutExtension%.json'''
    File.Move Files: CurrentItem Destination: $'''%SpecialFolder%\\LMOut''' IfFileExists: File.IfExists.Overwrite MovedFiles=> MovedFiles
END

Inserting the Python code

This is the main topic of this article. In the flow image, the Python code sits on the 6th line, but in the code above I have intentionally replaced it with aaaaaaaa.

Ideally the Python code would simply be written in place of aaaaaaaa, but in PAD the "Text to write" field does not accept line breaks.

The reason I want to use "Text to write" is that generating the Python code from this action lets me pass PAD variables to Python.

Fortunately, PAD runs on Robin, an RPA language. Actions appear as visual blocks in PAD, but underneath they are text, as in the code above. So if you copy the "Write text to file" action and paste it into a text editor, you can treat it as plain text.

You can then paste the Python code over the aaaaaaaa part and paste the whole thing back into PAD, which stores the Python code in the "Text to write" field.
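The copy, edit, and paste-back step can also be sketched in Python. This is only an illustration of the manual procedure, not a PAD feature: the function name embed_source and the shortened action text are hypothetical, and the result still has to be pasted into PAD by hand.

```python
def embed_source(robin_text, python_source, placeholder="aaaaaaaa"):
    """Return robin_text with its single placeholder replaced by python_source."""
    found = robin_text.count(placeholder)
    if found != 1:
        # Refuse to guess when the placeholder is missing or ambiguous.
        raise ValueError("expected exactly one placeholder, found {}".format(found))
    return robin_text.replace(placeholder, python_source)


# Hypothetical one-line action text, shortened from the flow above.
action = "File.WriteText File: $'''sample.py''' TextToWrite: $'''aaaaaaaa''' AppendNewLine: False"
source = "import cv2\nprint(cv2.__version__)"

# The printed text is what would be pasted back into the PAD designer.
print(embed_source(action, source))
```

Requiring exactly one placeholder is a design choice in this sketch: it avoids silently rewriting the wrong action when several actions have been copied together.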

Python code for the aaaaaaaa part

#Library load
import json
import pandas as pd
import numpy as np
import cv2

#JSON reading
with open("%CurrentItem.NameWithoutExtension%.json", "r", encoding="utf-8") as content:
    data = json.loads(content.read())

#Loading images
img = cv2.imread("%CurrentItem.Name%")
xmax = img.shape[1]
ymax = img.shape[0]

#Mosaic definition

def mosaic(img, rect, size):
    #Get the area to apply the mosaic
    (x01, y01, x02, y02) = rect
    w = x02-x01
    h = y02-y01
    i_rect = img[y01:y02, x01:x02]
    #Shrink to size x size, then enlarge back to the original dimensions
    i_small = cv2.resize(i_rect, (size, size))
    i_mos = cv2.resize(i_small, (w, h), interpolation=cv2.INTER_AREA)
    #Overlay the mosaic on a copy of the image
    img2 = img.copy()
    img2[y01:y02, x01:x02] = i_mos
    return img2


#Coordinate extraction and mosaic processing (index 0 holds the whole detected text, so start at 1)
for c in range(1, len(data["responses"][0]["textAnnotations"])):
    pointdata = data["responses"][0]["textAnnotations"][c]["boundingPoly"]["vertices"]
    df_target = pd.read_json(json.dumps(pointdata))

    x0 = int(df_target.fillna(0).iat[0, 0])
    x1 = int(df_target.fillna(xmax).iat[1, 0])
    x2 = int(df_target.fillna(xmax).iat[2, 0])
    x3 = int(df_target.fillna(0).iat[3, 0])

    xlist = np.array([x0, x1, x2, x3])
    xp1 = np.min(xlist)
    xp2 = np.max(xlist)

    y0 = int(df_target.fillna(0).iat[0, 1])
    y1 = int(df_target.fillna(0).iat[1, 1])
    y2 = int(df_target.fillna(ymax).iat[2, 1])
    y3 = int(df_target.fillna(ymax).iat[3, 1])

    ylist = np.array([y0, y1, y2, y3])
    yp1 = np.min(ylist)
    yp2 = np.max(ylist)

    pointlist = np.array([xp1, yp1, xp2, yp2])

    img = mosaic(img, pointlist, 5)
#Save image
cv2.imwrite("/Users/Username/Desktop/LMOut/%CurrentItem.NameWithoutExtension%_masked.jpg", img)

About the above code: it extracts the position information of the detected characters from the JSON returned by the Vision API and applies a mosaic to each one. I loaded the coordinates into pandas because the location information returned by the Vision API is sometimes missing, for example when characters extend beyond the image, and I wanted to fill those gaps with 0 or the maximum value of the image.
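As an alternative to pandas, the same fill rule can be sketched with plain dictionaries. This is a minimal sketch, not the code used in the flow: the function name bounding_rect is hypothetical, and it assumes each boundingPoly has four vertices in the same order the code above relies on (top-left, top-right, bottom-right, bottom-left).

```python
def bounding_rect(vertices, xmax, ymax):
    """Return (x_min, y_min, x_max, y_max) for a four-vertex boundingPoly."""
    # Defaults per vertex, mirroring the fillna calls in the article's code:
    # missing x becomes 0 on the left side and xmax on the right side,
    # missing y becomes 0 on the top side and ymax on the bottom side.
    x_defaults = (0, xmax, xmax, 0)
    y_defaults = (0, 0, ymax, ymax)
    xs = [int(v.get("x", d)) for v, d in zip(vertices, x_defaults)]
    ys = [int(v.get("y", d)) for v, d in zip(vertices, y_defaults)]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up vertices where the two left-hand points have no "x" key.
vertices = [{"y": 10}, {"x": 120, "y": 10}, {"x": 120, "y": 40}, {"y": 40}]
print(bounding_rect(vertices, xmax=640, ymax=480))  # prints (0, 10, 120, 40)
```

Looking keys up by name rather than by column position means the result does not depend on which key happens to appear first in the JSON.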

Important points

Text enclosed in % is processed as a variable in PAD. Most importantly, single quotation marks already have a meaning in the Robin language as string delimiters, so every string in the Python code must use double quotes. Note that the backslash \ is an escape character in the Robin language. % cannot be escaped.
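These rules can be checked before pasting the code into PAD. The following is a rough sketch, assuming a simple line-by-line scan is enough. The function name find_pad_hazards and its messages are hypothetical, and it cannot tell an intended PAD variable such as %CurrentItem.Name% from a stray percent sign, so every percent sign is reported for manual review.

```python
def find_pad_hazards(python_source):
    """Return (line_number, message) pairs for characters the notes above warn about."""
    hazards = []
    for number, line in enumerate(python_source.splitlines(), start=1):
        if "'" in line:
            # Single quotes clash with Robin string delimiters.
            hazards.append((number, "single quote"))
        if "\\" in line:
            # Backslash is an escape character in Robin.
            hazards.append((number, "backslash"))
        if "%" in line:
            # Percent signs are read as PAD variable markers.
            hazards.append((number, "percent sign"))
    return hazards


sample = "\n".join([
    'img = cv2.imread("%CurrentItem.Name%")',
    "name = 'out'",
    'path = "C:\\temp"',
])
for hazard in find_pad_hazards(sample):
    print(hazard)
```

An empty list means none of the three characters were found, which is a necessary check rather than a guarantee that PAD will accept the text.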

Summary

As of December 2020, the "Write text to file" action does not accept multi-line text in the PAD editor.
This is only a guess, but I think the restriction exists because things break when single quotation marks are used in the Robin language.
See here for the Robin language.
The file triggers available through WinAutomation's drag and drop were immediate and powerful. You can also use Power Automate's file triggers with PAD, but I don't think they suit a flow like this one.
Setting up a Python environment may be difficult in practice, but I thought it would be good to have this technique at hand, so I wrote this article.
There may be little demand for it, but I am publishing it as part of tidying things up in the last two days of this year.

[PYTHON] Precautions when using OpenCV from Power Automate Desktop