Use Python and MeCab with Azure Functions

Purpose of this article

I want to run simple natural language processing (morphological analysis plus a little extra) with MeCab as a pre-processing step for Azure Data Factory. It would be convenient to implement it as a function that can later be called from various services such as Logic Apps, so I considered two implementation options.

  1. Azure Functions (this article)
  2. Azure Databricks (Using Python and MeCab with Azure Databricks)

Since this does not involve heavy processing such as machine learning, Azure Functions should be sufficient, so that is what I implemented.

To state the conclusions first:

**・ A function triggered by an HTTP request can be implemented in Azure Functions by following the article below.**

Create an Azure Functions project using Visual Studio Code https://docs.microsoft.com/ja-jp/azure/azure-functions/functions-create-first-function-vs-code?pivots=programming-language-csharp

**・ MeCab can also be used on the Functions side by adding "mecab-python3" to the requirements.txt of the project created from VS Code.**

**・ The monthly free grant of Azure Functions is fairly generous, so you can try it for free for a while.**

The rest of this article is a memo of the points where I stumbled.

There is a lot I do not fully understand yet, so please point out any mistakes.

Azure Functions overview

Azure Functions is an event-driven serverless computing platform. A function starts when its configured trigger fires and consumes computing resources only while it is running, which cuts wasteful cost.

Billing system

There are three hosting plans for Functions. The pay-as-you-go plan (the Consumption plan) is the usual serverless option, and it is the one I chose this time.

Azure Functions pricing https://azure.microsoft.com/ja-jp/services/functions/#pricing

Billing is determined by the number of executions, the execution time, and the memory usage. There is a free grant that resets every month, and it appears to cover up to 1 million executions and 400,000 GB-seconds.
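As a rough way to reason about the free grant, the sketch below multiplies executions, average duration, and memory into GB-seconds and compares the totals with the two limits quoted above. The function name `monthly_usage` and its parameters are hypothetical, and the calculation ignores any rounding the platform may apply when it meters usage.

```python
FREE_EXECUTIONS = 1_000_000  # free executions per month (figure quoted above)
FREE_GB_SECONDS = 400_000    # free GB-seconds per month (figure quoted above)


def monthly_usage(executions, avg_seconds, memory_gb):
    """Estimate monthly usage and whether it stays inside the free grant."""
    # GB-seconds = number of runs x seconds per run x memory in GB
    gb_seconds = executions * avg_seconds * memory_gb
    return {
        "executions": executions,
        "gb_seconds": gb_seconds,
        "within_free_grant": (
            executions <= FREE_EXECUTIONS and gb_seconds <= FREE_GB_SECONDS
        ),
    }


# Example: 100,000 runs a month, 0.5 s each, using 0.25 GB of memory
print(monthly_usage(100_000, 0.5, 0.25))
```

Both limits are checked because a very short function can exhaust the execution count long before it reaches the GB-seconds limit.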

Runtime stack (language)

You can choose from .NET Core, Node.js, Python, Java, and PowerShell Core. One thing to note is that if you choose Python as the runtime stack, Linux is the only supported OS. As of May 2020, the Linux pay-as-you-go plan is not available in either of the two Japan regions, so you need to select another region.

Supported regions are listed in "[Available products by region](https://azure.microsoft.com/en-us/global-infrastructure/services/?products=functions&regions=us-east,us-east-2,us-central,us-north-central,us-south-central,us-west-central,us-west,us-west-2,japan-west,japan-east)".

Trigger

Triggers for launching Functions include HTTP Request, Timer, Blob, Event Hubs, and others. HTTP Request looked the easiest to use, so I implemented the function with it for now. A Blob trigger that fires when a blob file is created also looks handy.
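As a rough illustration of the HTTP Request trigger, the sketch below shows what an `__init__.py` handler could look like. It assumes the default binding names from the VS Code template (`req` for the request, the return value for the response). The parameter name `sentence` and the helper `read_sentence` are hypothetical names chosen for this example, and the guarded import exists only so that the helper can be tried where the azure-functions package is not installed.

```python
import json

try:
    import azure.functions as func  # provided by the azure-functions package
except ImportError:  # allows trying read_sentence() without the SDK installed
    func = None


def read_sentence(params, body):
    """Return 'sentence' from the query parameters, else from the JSON body."""
    sentence = params.get("sentence")
    if sentence is None and isinstance(body, dict):
        sentence = body.get("sentence")
    return sentence


def main(req):  # req is an azure.functions.HttpRequest
    try:
        body = req.get_json()
    except ValueError:  # the request body is not valid JSON
        body = None

    sentence = read_sentence(req.params, body)
    if not sentence:
        return func.HttpResponse(
            "Pass 'sentence' in the query string or in the JSON body.",
            status_code=400,
        )
    # Echo the input back; the actual analysis would replace this step
    payload = json.dumps({"sentence": sentence}, ensure_ascii=False)
    return func.HttpResponse(payload, mimetype="application/json")
```

Keeping the parameter lookup in a plain function makes it possible to test that part without starting the Functions host.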

Create Functions Apps from the Azure portal

I wanted to use Python on the pay-as-you-go plan, so I created the app in the East Asia region for the time being. (Any region that supports Linux on the pay-as-you-go plan will do. Geographically, Korea Central in Seoul would have been closer.)

With that region selected, the Linux pay-as-you-go plan can be chosen. A storage account linked to Functions is also required, so I created a new one this time.

Application Insights for monitoring is also created at this point, so you can view the execution logs.


This will create four resources.

  1. Functions app (Functions)
  2. Application Insights
  3. Storage account
  4. App Service Plan

I am not sure why an App Service Plan resource is created even though the consumption (serverless) plan was selected as the pricing plan.

Local development and deployment to Functions

With .NET you can create a function directly in the Azure portal, but unfortunately this is not supported for Python. The procedure is therefore to develop and test locally and then deploy to Functions.


I followed the article below for the specific steps.

Create an Azure Functions project using Visual Studio Code https://docs.microsoft.com/ja-jp/azure/azure-functions/functions-create-first-function-vs-code?pivots=programming-language-csharp

The following tools are required locally, so install them.

・ Node.js
・ Python 3.8, Python 3.7, or Python 3.6
・ Visual Studio Code
・ Python extension for Visual Studio Code
・ Azure Functions extension for Visual Studio Code
・ Azure Functions Core Tools
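A quick way to confirm the command-line part of this list is a small script like the one below. It is only a sketch: it assumes the tools expose the usual command names (`node`, `code`, and `func`) on the PATH, and it does not check the two VS Code extensions.

```python
import shutil
import sys

# Assumed command names: node (Node.js), code (VS Code), func (Core Tools)
REQUIRED_COMMANDS = {
    "Node.js": "node",
    "Visual Studio Code": "code",
    "Azure Functions Core Tools": "func",
}
SUPPORTED_PYTHON = {(3, 6), (3, 7), (3, 8)}  # versions listed above


def check_environment():
    """Report which prerequisites are visible from this interpreter."""
    report = {
        name: shutil.which(command) is not None
        for name, command in REQUIRED_COMMANDS.items()
    }
    report["Python 3.6-3.8"] = sys.version_info[:2] in SUPPORTED_PYTHON
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'found' if ok else 'missing'}")
```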

Stumbling points

1. Error when running Azure Functions in the local environment (1)

A permission error occurred when running Functions locally.

Error message: `The file cannot be read because script execution is disabled on this system.`

I had to change the PowerShell execution policy. I raised the ExecutionPolicy to RemoteSigned for the first run only. After I set it back to Restricted, the error no longer occurred.

About Execution Policies https://docs.microsoft.com/ja-jp/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7

2. Error when running Azure Functions in the local environment (2)

The following line in `__init__.py` was flagged with an error: `import azure.functions as func`

Error message: `Unable to import 'azure.functions' pylint(import-error) [3, 1]`

The cause seemed to be that several Python versions were installed side by side locally. Switching to a different Python runtime resolved it.
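When several Python versions coexist, it can help to confirm which interpreter is actually running and whether it can see the package. The snippet below is a minimal sketch using only the standard library; `describe_interpreter` is a hypothetical helper name.

```python
import importlib.util
import sys


def describe_interpreter(module_name="azure.functions"):
    """Show which Python is running and whether it can find module_name."""
    try:
        found = importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:  # raised when the parent package is missing
        found = False
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "module_found": found,
    }


print(describe_interpreter())
```

Running it with each installed interpreter in turn shows which one has the azure-functions package available, and that is the one to select in the editor.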

3. Add MeCab to Functions

If you add a library to the project's requirements.txt, pip install appears to be run for it on the Functions side at deployment. So I added the following.

・ mecab-python3
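For reference, a requirements.txt along the following lines should be enough, assuming the file generated by the VS Code template already lists azure-functions. Depending on the mecab-python3 version, a separate dictionary package may also have to be listed; that case is not covered here.

```text
azure-functions
mecab-python3
```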

After that, MeCab is used as usual.

__init__.py


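import MeCab

# "-Ochasen" selects ChaSen-style output (one tab-separated line per morpheme)
mecab = MeCab.Tagger("-Ochasen")
# `sentence` is the input string, e.g. text taken from the HTTP request
parsedsentence = mecab.parse(sentence)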
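The string returned by `parse` is plain text, so a caller such as Logic Apps will usually want it in a structured form. The sketch below turns ChaSen-style output into a list of dictionaries. It assumes the first four tab-separated columns are surface form, reading, base form, and part of speech, which should be checked against the actual output. `parse_chasen` and `analyze` are hypothetical helper names.

```python
def parse_chasen(output):
    """Turn ChaSen-style MeCab output into a list of token dictionaries."""
    tokens = []
    for line in output.splitlines():
        if not line or line == "EOS":  # EOS marks the end of a sentence
            continue
        fields = line.split("\t")
        # Assumed column order: surface, reading, base form, part of speech
        fields += [""] * (4 - len(fields))  # pad short lines
        tokens.append(
            {
                "surface": fields[0],
                "reading": fields[1],
                "base": fields[2],
                "pos": fields[3],
            }
        )
    return tokens


def analyze(sentence, tagger):
    """Parse a sentence with any object that offers a MeCab-like parse()."""
    return parse_chasen(tagger.parse(sentence))
```

Inside the function app this would be called as `analyze(sentence, mecab)` with the tagger created above. Passing the tagger in as an argument also lets the parsing logic be tested without MeCab installed.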

Summary

Getting MeCab to work in Functions was surprisingly quick. For uses like this one, Functions is sufficient. (That said, I do not really understand why it works, including where the MeCab dictionary itself lives ...)

The pay-as-you-go plan has a maximum timeout of 10 minutes, so it is only suited to lightweight processing. For anything more elaborate, it seems better to use another Functions hosting plan or to move to Databricks.

Azure Functions scale and hosting https://docs.microsoft.com/ja-jp/azure/azure-functions/functions-scale#service-limits


Checking the cost

You can check the cost by opening Metrics in the Azure portal. I looked at the result of running the simple function "return the MeCab morphological analysis of one input sentence" once.


The unit appears to be Function Execution Units. Since Function Execution Units are measured in MB-milliseconds, I converted the value to GB-seconds, which is the billing unit.

The value this time was 163.58 k = 163,580 MB-milliseconds, so 163,580 / 1,024,000 = 0.15974609375 GB-seconds (dividing by 1,024 converts MB to GB, and dividing by 1,000 converts milliseconds to seconds).
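The same conversion can be written as a small helper so that other metric readings can be converted in the same way. This is a minimal sketch; the function and constant names are made up for this example.

```python
MB_PER_GB = 1024
MS_PER_SECOND = 1000


def execution_units_to_gb_seconds(mb_milliseconds):
    """Convert Function Execution Units (MB-milliseconds) to GB-seconds."""
    return mb_milliseconds / (MB_PER_GB * MS_PER_SECOND)


measured = 163_580  # value read from the metric in this article
print(f"{execution_units_to_gb_seconds(measured):.6f} GB-seconds")
```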

Since up to 1 million executions and 400,000 GB-seconds are free every month, light testing should fit comfortably within the free grant. Note, however, that the Blob storage created alongside the function app is billed separately.
