[PYTHON] You can use assert and Enum (or a decorator) to check compliance with type annotation constraints without the help of mypy.

0. Introduction

For type checking in Python 3, the typing module and mypy are well known.


- @Mski_iksm's Qiita article "Python type checking using typehint and mypy"
- @Papi_tokei's Qiita article "Practice!! Introduction to Python type (Type Hints)"
- SWEet "Let's type check with Python"
- Mizzsugar's blog "The first step of generics learning with Python and TypeScript"


__*mypy* checks from outside the Python script file whether the type annotation constraints are being followed.__

__In contrast, the *assert* statement and the enumeration type *Enum* let you check the types and values of method arguments and return values in a closed (self-contained) form inside the script file.__

__You can also implement the same thing with a decorator.__
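
As a minimal sketch of the *assert*-based idea, the function below checks the type of its argument on entry and the type of its result just before returning. The function name split_words is hypothetical and is not part of the article's code.

```python
from typing import List


def split_words(text: str) -> List[str]:
    # Check the argument type at the start of the function.
    assert isinstance(text, str), "text must be a string."
    words = text.split()
    # Check the return value just before returning it.
    assert all(isinstance(w, str) for w in words), "every word must be a string."
    return words


print(split_words("Mary went to Paris"))  # ['Mary', 'went', 'to', 'Paris']
```

Note that *assert* statements are skipped when Python is started with the -O option, so checks written this way are active only in normal (non-optimized) runs.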

1. How to use *assert* and *Enum*

(Reference)

- Note.nkmk.me "type function to get / determine type with Python, isinstance function"
- [CodeZine "Explanation of how to use the really useful function "assertions" in Python! From "Python Trick""](https://codezine.jp/article/detail/12179)
- What is an assert statement?

(Problem setting)

  1. Use *spacy* to extract the words of a specific "named entity" type from a sentence. When multiple words belonging to the specified "named entity" type are found, count the number of times each word appears in the sentence and return the words in order of frequency.
  2. The types (labels) of "named entities" that *spacy* can recognize are limited.
  3. The user passes, as an argument, the "named entity" type whose words they want to extract from the sentence.
  4. The method takes two arguments. The first is the text to be analyzed (must be of type *str*), and the second is the label (name) of the "named entity".
  5. If the text passed as an argument is not of type *str*, an error is raised. Likewise, if the "named entity" label name passed as an argument is not among the defined "named entity" labels, an error is raised.
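
Item 1 mentions counting how often each word appears and returning the words in order of frequency. A minimal sketch of that step, assuming the extracted words are already available as a list of strings, could use collections.Counter. The helper name order_by_frequency is hypothetical.

```python
from collections import Counter
from typing import List


def order_by_frequency(words: List[str]) -> List[str]:
    # Count how many times each word appears.
    counts = Counter(words)
    # most_common() sorts by count, highest first; ties keep first-seen order.
    return [word for word, _ in counts.most_common()]


print(order_by_frequency(["Paris", "America", "Paris", "France", "America", "Paris"]))
# ['Paris', 'America', 'France']
```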

Python3


from enum import Enum
from typing import List, Dict
import spacy

class NamedEntityLabel(Enum):
    Jinmei: str = "PERSON"
    Chimei: str = "LOC"

    # A function defined in an Enum body is a method, not a member.
    # staticmethod makes it explicit that it is called on the class itself.
    @staticmethod
    def extract_named_entity_wordlist(text: str, ne_label: str) -> List[str]:
        # Check the type of the first argument
        assert type(text) is str, 'The text data to be entered must be a string.'
        # Check the value of the second argument
        right_value_list = [e.name for e in NamedEntityLabel]
        assert ne_label in right_value_list, 'The named entity label entered has not yet been defined.'
        # If there is no problem with the type and value of the two arguments received, execute the following
        nlp = spacy.load('ja_ginza')
        text = text.replace("\n", "")
        doc = nlp(text)
        word_list = [ent.text for ent in doc.ents if ent.label_ == NamedEntityLabel[ne_label].value]
        return word_list


Policy

__You could simply write *if ne_label not in ["PERSON", "LOC"]*, but here, in order to state the "defined named entity types" explicitly in the code, the "named entity class" is defined as an enumeration type (*Enum*).__
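
To illustrate the difference, here is a minimal sketch that puts the hard-coded list next to a check derived from the *Enum* definition. The two helper function names are hypothetical. With the *Enum* version, adding a label to the class is enough to make it valid everywhere.

```python
from enum import Enum


class NamedEntityLabel(Enum):
    Jinmei = "PERSON"
    Chimei = "LOC"


# Hard-coded version: the allowed labels live only inside this list literal.
def is_defined_value_hardcoded(label_value: str) -> bool:
    return label_value in ["PERSON", "LOC"]


# Enum version: the allowed labels are read from the class definition.
def is_defined_value(label_value: str) -> bool:
    return label_value in [e.value for e in NamedEntityLabel]


print(is_defined_value("LOC"))  # True
print(is_defined_value("ORG"))  # False
```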

Try it out

(When the intended value is passed as an argument)

__Extract the words corresponding to the specified named entity label from the received *text* and return them in order of frequency of occurrence (*Named Entity Recognition*).__

Python3


text = """Today I came to America with Mary. I went through Paris, France."""
 
NamedEntityLabel.extract_named_entity_wordlist(text, "Chimei")
#result
['America', 'France', 'Paris']

NamedEntityLabel.extract_named_entity_wordlist(text, "Jinmei")
#result
['Mary']

(When an unintended value is passed as an argument)

Python3


NamedEntityLabel.extract_named_entity_wordlist(text, "Soshikimei")
#result
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in extract_named_entity_wordlist
AssertionError: The named entity label entered has not yet been defined.

Python3


NamedEntityLabel.extract_named_entity_wordlist(127, "Soshikimei")
#result
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in extract_named_entity_wordlist
AssertionError: The text data to be entered must be a string.

Python3


NamedEntityLabel.extract_named_entity_wordlist(127, "Jinmei")
#result
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in extract_named_entity_wordlist
AssertionError: The text data to be entered must be a string.

You can also use a decorator, as described on the web pages below.

- [NANSYSTEM "Define enum in Python3.7, generate Enum from string, get string from Enum value, get list, give behavior"](https://nansystem.com/python-enum-definition-and-convert-from-string-and-get-list/)
- Python learning channel by PyQ "What is a Python class method (@classmethod)? Explaining the difference between the usage and the method"

Next, let's see how to use the decorator.

2. How to use a decorator and *Enum* without using *assert*

Python3


class NamedEntityLabel(Enum):
    Jinmei: str = "Person"
    Chimei: str = "LOC"
    Soshikimei: str = "ORG"

    @classmethod
    def value_check(cls, target_value):
        # Return the member whose name matches target_value
        for e in cls:
            if e.name == target_value:
                return e
        # No member has this name
        raise ValueError('{} is not a defined named entity label'.format(target_value))

(How to use)

Python3


NamedEntityLabel.value_check("Jinmei")

#Execution result
NamedEntityLabel.value_check("Jinmei")
<NamedEntityLabel.Jinmei: 'Person'>

Python3


NamedEntityLabel.value_check("EmailAddress")

#Execution result
ValueError: EmailAddress is not a defined named entity label

Using the above, write the following code.

Python3


class NamedEntityLabel_2(Enum):
    Jinmei: str = "PERSON"
    Chimei: str = "LOC"

    @classmethod
    def value_check(cls, target_value):
        # Return the member whose name matches target_value
        for e in cls:
            if e.name == target_value:
                return e
        raise ValueError('{} is not a defined named entity label'.format(target_value))

    @staticmethod
    def extract_named_entity_wordlist(text: str, ne_label: str) -> List[str]:
        # Check the type of the first argument
        assert type(text) is str, 'The text data to be entered must be a string.'
        # Check the value of the second argument (raises ValueError if it is undefined)
        e = NamedEntityLabel_2.value_check(ne_label)
        # If there is no problem with the type and value of the two arguments received, execute the following
        nlp = spacy.load('ja_ginza')
        text = text.replace("\n", "")
        doc = nlp(text)
        word_list = [ent.text for ent in doc.ents if ent.label_ == e.value]
        return word_list

(When the intended value is passed as an argument)

Python3


text = """Today I came to America with Mary. I went through Paris, France."""
 
NamedEntityLabel_2.extract_named_entity_wordlist(text, "Chimei")
#result
['America', 'France', 'Paris']

(When an unintended value is passed as an argument)

Python3


NamedEntityLabel_2.extract_named_entity_wordlist(text, "Soshikimei")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in extract_named_entity_wordlist
  File "<stdin>", line 9, in value_check
ValueError: Soshikimei is not a defined named entity label

Python3


NamedEntityLabel_2.extract_named_entity_wordlist(127, "Soshikimei")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in extract_named_entity_wordlist
AssertionError: The text data to be entered must be a string.

(Reference)

Code that uses a decorator to check whether a method's type annotation constraints are being followed is also presented on the following web page:

- CosmoSonic21 blog "Implement function argument type checking with decorator in Python"
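
As a rough illustration of that idea (this is not the code from the blog above), a decorator can read a function's annotations and compare them with the arguments it actually receives. The sketch below is a minimal version under stated assumptions: the names enforce_annotations and count_characters are hypothetical, and only annotations that are plain classes such as str or int are checked.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_annotations(func):
    """Hypothetical decorator: check plain-class annotations at call time."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Map the received arguments to their parameter names.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes such as str or int are checked; generic
            # annotations such as List[str] are skipped in this sketch.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_annotations
def count_characters(text: str) -> int:
    return len(text)


print(count_characters("Mary"))  # 4
```

Calling count_characters(127) raises a TypeError before the function body runs, which plays the same role as the *assert* on the first argument in the listings above.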


3. How to use, in the main processing, the *Enum* class instance obtained from the received argument

The following approach is the simplest.

(Method)

  1. Pass the received named entity label name to the enumeration class (where the named entities are defined) as *NamedEntityLabel[name]* to obtain an instance of the named entity class. (Strictly speaking, this looks up an existing member by name rather than calling a constructor.)
  2. The data processing uses the obtained instance of the named entity class.
  3. If an undefined named entity label is received, a *KeyError* is raised at the stage of obtaining the instance of the named entity class, and no further processing is performed.

__First, pass a suitable named entity label name and see what happens when you try to obtain an instance of the enum class (where the named entities are defined).__

__Enumeration (*Enum*) class declaration__

Python3


from enum import Enum
import enum
from typing import List, Dict

@enum.unique
class NamedEntityLabel(Enum):
    Jinmei: str = "PERSON"
    Chimei: str = "LOC"

__Pass arbitrary data to obtain an instance of the enum (*Enum*) class__

__(If the value passed is a *name* defined in the *Enum* class)__ __The instance is created successfully__

Python3


named_entity_instance_test = NamedEntityLabel["Jinmei"]
print(named_entity_instance_test)
#Execution result: The instance was created successfully
NamedEntityLabel.Jinmei
#Extract the name of the created instance
print(named_entity_instance_test.name)
#Execution result
Jinmei
#Extract the value of the generated instance
print(named_entity_instance_test.value)
#Execution result
PERSON

__(If the value passed is a *name* not defined in the *Enum* class)__ __No instance is created and an error occurs__

Python3


named_entity_instance_test = NamedEntityLabel["EMAIL"]
#Execution result: An error occurs because a name that is not defined in the NamedEntityLabel class (an Enum) was passed.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ocean/.pyenv/versions/3.9.0/lib/python3.9/enum.py", line 355, in __getitem__
    return cls._member_map_[name]
KeyError: 'EMAIL'
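
For comparison, an *Enum* class supports two kinds of lookup: square brackets look up a member by name, while calling the class looks up a member by value. The following minimal sketch redefines the same two-member class so that it runs on its own.

```python
from enum import Enum


class NamedEntityLabel(Enum):
    Jinmei = "PERSON"
    Chimei = "LOC"


# Square brackets look up a member by its name.
by_name = NamedEntityLabel["Jinmei"]
# Calling the class looks up a member by its value.
by_value = NamedEntityLabel("PERSON")
# Both expressions return the same, already existing member.
print(by_name is by_value)  # True

# An unknown name raises KeyError.
try:
    NamedEntityLabel["EMAIL"]
except KeyError as error:
    print("KeyError:", error)

# An unknown value raises ValueError.
try:
    NamedEntityLabel("EMAIL")
except ValueError as error:
    print("ValueError:", error)
```

Which exception to expect therefore depends on the lookup style. The listings in this section use the lookup by name, so an undefined label surfaces as a KeyError.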

__(If the value passed is a *name* defined in the *Enum* class)__ __The instance is created successfully__

Python3


#Variables passed to the constructor when creating an instance of the NamedEntityLabel class
#This variable assumes the case of storing the value received from the user
input_data_ok = "Jinmei"
input_data_ng = "Emailaddress"

named_entity_instance_ok = NamedEntityLabel[input_data_ok]
#No error occurs
print(named_entity_instance_ok)
#Execution result: The instance has been created successfully.
NamedEntityLabel.Jinmei
#Extract the name of the created instance
print(named_entity_instance_ok.name)
#Execution result
Jinmei
 
#Extract the value of the generated instance
print(named_entity_instance_ok.value)
#Execution result
PERSON

__(If the value passed is a *name* not defined in the *Enum* class)__ __No instance is created and an error occurs__

Python3


named_entity_instance_ng = NamedEntityLabel[input_data_ng]
#Execution result: An error occurs because a name that is not defined in the NamedEntityLabel class (an Enum) was passed.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ocean/.pyenv/versions/3.9.0/lib/python3.9/enum.py", line 355, in __getitem__
    return cls._member_map_[name]
KeyError: 'Emailaddress'

__Based on the above, rewrite the entire script.__

@Ksato9700's Qiita article "What's new in Python 3.4.0 (2) --enum"

(1) Define the *Enum* class

__Define the labels that can be used as named entities as an enumeration class.__

Python3


class NamedEntityLabel_3(Enum):
    Jinmei : str = "PERSON"
    Chimei : str = "LOC"

(2) Define the method that performs the main processing

__Pay attention to the following two places, which use an instance of the NamedEntityLabel_3 class.__

__(1st place)__ named_entity_instance = NamedEntityLabel_3[ne_label]

__(2nd place)__ word_list = [ent.text for ent in doc.ents if ent.label_ == named_entity_instance.value]

Python3


def extract_named_entity_wordlist(text: str, ne_label: str) -> List[str]:
    # Check the type of the first argument
    assert type(text) is str, 'The text data to be entered must be a string.'
    # Check the value of the second argument
    # If the word received as an argument is not defined as a name in the NamedEntityLabel_3 class,
    # a KeyError is raised when trying to obtain a NamedEntityLabel_3 instance from it.
    named_entity_instance = NamedEntityLabel_3[ne_label]
    # If there is no problem with the type and value of the two arguments received, execute the following
    nlp = spacy.load('ja_ginza')
    text = text.replace("\n", "")
    doc = nlp(text)
    word_list = [ent.text for ent in doc.ents if ent.label_ == named_entity_instance.value]
    return word_list

(When the intended value is passed as an argument)

Python3


extract_named_entity_wordlist(text, "Chimei")
#Execution result
['America', 'France', 'Paris']

(When an unintended value is passed as an argument)

Python3


extract_named_entity_wordlist(text, "Soshikimei")
#Execution result
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in extract_named_entity_wordlist
  File "/Users/ocean/.pyenv/versions/3.9.0/lib/python3.9/enum.py", line 355, in __getitem__
    return cls._member_map_[name]
KeyError: 'Soshikimei'

- __The pairs of named entity label name (Name) and value (Value) could also be defined as *key* / *value* pairs in a dictionary (*dict*) object.__
- __However, here, in order to state the "predefined named entity types" explicitly in the code, a class called the "named entity class" is defined as an enumeration type (*Enum*).__
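
The two options can be placed side by side in a minimal sketch. The dictionary name NAMED_ENTITY_LABELS and the replacement value "GPE" are hypothetical and serve only to show that a dictionary can be altered after definition, whereas *Enum* members cannot be reassigned.

```python
from enum import Enum
from typing import Dict

# Dictionary version: label name -> label value.
NAMED_ENTITY_LABELS: Dict[str, str] = {"Jinmei": "PERSON", "Chimei": "LOC"}


class NamedEntityLabel(Enum):
    Jinmei = "PERSON"
    Chimei = "LOC"


# Both lookups return the same label value.
print(NAMED_ENTITY_LABELS["Chimei"])     # LOC
print(NamedEntityLabel["Chimei"].value)  # LOC

# The dictionary can be changed later without any error.
NAMED_ENTITY_LABELS["Chimei"] = "GPE"

# Reassigning an Enum member is rejected with AttributeError.
try:
    NamedEntityLabel.Chimei = "GPE"
except AttributeError as error:
    print("AttributeError:", error)
```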
