Create a JSON object mapper in Python

This article is the 23rd day article of Python Part 2 Advent Calendar 2020.

Self-introduction

It's been almost eight years as a back-end engineer, my name is Kouki Hoshi. Although the range of defense is server side, it is a wide and shallow engineer who has experience in improving application, infrastructure, testing, and development environment. Recently I've been working with python for almost a year using a framework called chalice.

Writing this time

When creating the API, I sometimes want to map the JSON received as input to the intended class ... I didn't have a library like that in python, so I made it myself. The version of python is 3.8 (from typing import get_origin get_args/can only be 3.8 or later).

Why make it?

It seems that you can easily do it with json.loads of json, which is a general-purpose library. For that reason, I seemed to have to work hard to write the settings, and I felt that it would be difficult to write them universally.

Even in Java, there is a famous library called Jackson for a long time, so I thought that I could find it, but there is no such description. So, I made it by trying to make it better.

I will make

Overall design policy

Dropping an instance of a class into some form of data is called serialization. Conversely, converting some data format into an instance of a class is called deserialization.

Here, let's assume the following usage.

@dataclass
class Hello:
    hello: str

objectMapper = ObjectMapper()
instance = objectMapper.deserialize('{"hello": "mapper"}', Hello)
print(instance.hello) ##output: mapper

This time, we will make it with the following class structure. スクリーンショット 2020-12-23 1.28.23.png

First write the interface

import json
from typing import Type, TypeVar, List
from abc import ABCMeta, abstractmethod

class NotImplementedError(Exception):
    def __init__(self, message):
        super().__init__(message)    

class JsonDeserializer(metaclass=ABCMeta):
    @abstractmethod
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        pass

    @abstractmethod
    def deserialize(self, json: object, mappingClass: type) -> object:
        pass

T = TypeVar('T')

class ObjectMapper:
    deserializers: List[JsonDeserializer] = []

    def deserialize(self, jsonText: str, mappingClass: Type[T]) -> T:
        jsonData = json.loads(jsonText)
        for deserializer in ObjectMapper.deserializers:
            if deserializer.canDeserialize(jsonData, mappingClass):
                return deserializer.deserialize(jsonData, mappingClass)
        raise NotImplementedError(f'Cannot deserialize json({jsonData}) to class({mappingClass}).')

In this way, you can have the JsonDeserializer object in the ObjectMapper in list format. If you find a compatible JsonDeserializer (canDeserialize (json, mappingClass) == true), Use it to make it available for various classes.

Simple case capture

First, write a unit test for the literal.

from main.json_deserializer import ObjectMapper

class TestObjectMapper:
    def test_deserializeInt(self):
        actual = ObjectMapper().deserialize('1', int)
        assert actual == 1

    def test_deserializeStr(self):
        actual = ObjectMapper().deserialize('"1"', str)
        assert actual == '1'

    def test_deserializeFloat(self):
        actual = ObjectMapper().deserialize('1.5', float)
        assert actual == 1.5

    def test_deserializeNull(self):
        actual = ObjectMapper().deserialize('null', str)
        assert actual is None
$ python -m pytest
================================================================ short test summary info =================================================================
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeInt - main.json_deserializer.NotImplementedError: Cannot deserialize json(1) to c...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeStr - main.json_deserializer.NotImplementedError: Cannot deserialize json(1) to c...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeFloat - main.json_deserializer.NotImplementedError: Cannot deserialize json(1.5) ...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeNull - main.json_deserializer.NotImplementedError: Cannot deserialize json(Null) ...
=================================================================== 5 failed in 0.21s ====================================================================```

I haven't implemented it yet, so of course I can do it. Implements JsonDeserializer for literals.

class ObjectMapper:
    deserializers: List[JsonDeserializer] = [
 LiteralDeserializer () ## added
    ]

class LiteralDeserializer(JsonDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return type(json) in [int, float, str, type(None)]

    def deserialize(self, json: object, mappingClass: type) -> object:
        return json
collected 4 items                                                                                                                                        

tests/test_object_mapper.py ....

=================================================================== 4 passed in 0.03s ====================================================================

###It's a literal, but if you want to make it a type such as date and time / Enum

Now add the following test.

from datetime import datetime, date
from enum import Enum

class Member(Enum):
    JOHN = 'john'
    BOB = 'bob'

 ----(abridgement)----

    def test_deserializeDate(self):
        actual = ObjectMapper().deserialize('"2020-12-23"', date)
        assert actual == date(2020, 12, 23)

    def test_deserializeNaiveDateTime(self):
        actual = ObjectMapper().deserialize('"2020-12-23T03:00:00"', datetime)
        assert actual == datetime(2020, 12, 23, 3, 0, 0)

    def test_deserializeAwareDateTime(self):
        actual = ObjectMapper().deserialize('"2020-12-23T03:00:00+0900"', datetime)
        assert actual == datetime(2020, 12, 23, 3, 0, 0, tzinfo=timezone(timedelta(hours=+9), 'JST'))

    def test_deserializeEnum(self):
        actual = ObjectMapper().deserialize('"john"', Member)
        assert actual is Member.JOHN

At daytime, Naive(No time zone), Aware(There is a time zone)there is. It may be confusing because it can not be calculated between different dates and times, so it may be better to use Aware for everything, but This time, we will make it possible to create the corresponding data in either case.

Test execution as a trial

================================================================ short test summary info =================================================================
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeDate - AssertionError: assert '2020-12-23' == datetime.date(2020, 12, 23)
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeNativeDateTime - AssertionError: assert '2020-12-23T03:00:00' == datetime.datetim...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeAwareDateTime - AssertionError: assert '2020-12-23T03:00:00+0900' == datetime.dat...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeEnum - AssertionError: assert 'john' is <Member.JOHN: 'john'>
============================================================== 4 failed, 4 passed in 0.12s ===============================================================

...That's why I implemented it.

class DateDeserializer(JsonDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return mappingClass == date

    def deserialize(self, json: object, mappingClass: type) -> object:
        dt = datetime.strptime(json, '%Y-%m-%d')
        return date(dt.year, dt.month, dt.day)

class DatetimeDeserializer(JsonDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return mappingClass == datetime

    def deserialize(self, json: object, mappingClass: type) -> object:
        try:
 return datetime.strptime (json,'% Y-% m-% dT% H:% M:% S% z') # Quite miscellaneous ..
        except ValueError:
            return datetime.strptime(json, '%Y-%m-%dT%H:%M:%S')

class EnumDeserializer(JsonDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return issubclass(mappingClass, Enum)

    def deserialize(self, json: object, mappingClass: type) -> object:
        for enum in mappingClass:
            if enum.value == json:
                return enum

class ObjectMapper:
    deserializers: List[JsonDeserializer] = [
 DatetimeDeserializer (), ## added
 EnumDeserializer (), ## added
 DateDeserializer (), ## added
        LiteralDeserializer()
    ]
collected 8 items                                                                                                                                        

tests/test_object_mapper.py ........

=================================================================== 8 passed in 0.07s ====================================================================

###Handle container classes

Now we will deal with container classes. This time, create the following container class.

First, create a container common class. To do something like this This is because the container class has to request deserialization again for another internal structure. (It's a little tricky, but again for the internal structure ObjectMapper._I'm calling deserialize)

class ContainerDeserializer(JsonDeserializer):
    def deserializeChild(self, json: object, mappingClass: type) -> object:
        return ObjectMapper._deserialize(json, mappingClass)


class ObjectMapper:
    deserializers: List[JsonDeserializer] = [
       ...JsonDeserializer
    ]
    def deserialize(self, jsonText: str, mappingClass: Type[T]) -> T:
        return self._deserialize(json.loads(jsonText), mappingClass)

    @staticmethod
    def _deserialize(jsonData: object, mappingClass: Type[T]) -> T:
        for deserializer in ObjectMapper.deserializers:
            if deserializer.canDeserialize(jsonData, mappingClass):
                return deserializer.deserialize(jsonData, mappingClass)
        raise NotImplementedError(f'Cannot deserialize json({jsonData}) to class({mappingClass}).')

And first, since we changed the structure, make sure that the existing structure is not broken.

collected 8 items                                                                                                                                        

tests/test_object_mapper.py ........                                                                                                               [100%]

=================================================================== 8 passed in 0.07s ====================================================================

It looks okay, so add the test again.

@dataclass
class Person:
    name: str
    age: int

@dataclass
class Group:
    name: str
    leader: Person


## --- Add test ----

    def test_deserializeRawList(self):
 actual = ObjectMapper (). deserialize ('[{"age": 35, "name": "Suzuki"}, {"age": 21, "name": "Yamada"}]', list)
        assert len(actual) == 2
        assert actual[0].age == 35
 assert actual [0] .name =='Suzuki'
        assert actual[1].age == 21
 assert actual [1] .name =='Yamada'

    def test_deserializeTypedList(self):
 actual = ObjectMapper (). deserialize ('[{"age": 35, "name": "Suzuki"}, {"age": 21, "name": "Yamada"}]', List [Person])
        assert len(actual) == 2
        assert type(actual[0]) == Person
        assert actual[0].age == 35
 assert actual [0] .name =='Suzuki'
        assert type(actual[1]) == Person
        assert actual[1].age == 21
 assert actual [1] .name =='Yamada'

    def test_deserializeRawDict(self):
 actual = ObjectMapper (). deserialize ('{"ID1": {"age": 35, "name": "Suzuki"}, "ID3": {"age": 21, "name": "Yamada"}} ', dict)
        assert len(actual) == 2
        assert actual['ID1'].age == 35
 assert actual ['ID1']. name =='Suzuki'
        assert actual['ID3'].age == 21
 assert actual ['ID3']. name =='Yamada'

    def test_deserializeTypedDict(self):
        actual = ObjectMapper().deserialize(
 '{" ID1 ": {" age ": 35," name ":" Suzuki "}," ID3 ": {" age ": 21," name ":" Yamada "}}',
            Dict[str, Person]
        )
        assert len(actual) == 2
        assert type(actual['ID1']) == Person
        assert actual['ID1'].age == 35
 assert actual ['ID1']. name =='Suzuki'
        assert type(actual['ID3']) == Person
        assert actual['ID3'].age == 21
 assert actual ['ID3']. name =='Yamada'

    def test_deserializeRawObject(self):
 actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', object)
 assert actual.name =='group'
        assert actual.leader.age == 35
 assert actual.leader.name =='Suzuki'

    def test_deserializeTypedObject(self):
 actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', Group)
        assert type(actual) == Group
 assert actual.name =='group'
        assert type(actual.leader) == Person
        assert actual.leader.age == 35
 assert actual.leader.name =='Suzuki'

Test execution as a trial...Naturally implemented n (ry

================================================================ short test summary info =================================================================
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeRawList - main.json_deserializer.NotImplementedError: Cannot deserialize json([{'...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeTypedList - main.json_deserializer.NotImplementedError: Cannot deserialize json([...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeRawDict - main.json_deserializer.NotImplementedError: Cannot deserialize json({'I...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeTypedDict - main.json_deserializer.NotImplementedError: Cannot deserialize json({...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeTypedObject - main.json_deserializer.NotImplementedError: Cannot deserialize json...
FAILED tests/test_object_mapper.py::TestObjectMapper::test_deserializeRawObject - main.json_deserializer.NotImplementedError: Cannot deserialize json({...
============================================================== 6 failed, 8 passed in 0.36s ===============================================================

Implementation.

from inspect import signature, _ParameterKind
from typing import Type, TypeVar, List, Dict, Set, get_origin, get_args

class ListDeserializer(ContainerDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return get_origin(mappingClass) == list or mappingClass == list

    def deserialize(self, json: object, mappingClass: type) -> object:
        genericParams = get_args(mappingClass)
        hasGenericParams = genericParams is not None and len(genericParams) > 0
        param = genericParams[0] if hasGenericParams else object
        return [self.deserializeChild(el, param) for el in json]

class DictDeserializer(ContainerDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return get_origin(mappingClass) == dict or mappingClass == dict

    def deserialize(self, json: object, mappingClass: type) -> object:
        genericParams = get_args(mappingClass)
        hasKeyParam = genericParams is not None and len(genericParams) > 0
        hasValueParam = genericParams is not None and len(genericParams) > 1
        return {
            self.deserializeChild(k, genericParams[0] if hasKeyParam else object)
            : self.deserializeChild(v, genericParams[1] if hasValueParam else object)
            for k, v in json.items()
        }

class ObjectDeserializer(ContainerDeserializer):
    def canDeserialize(self, json: object, mappingClass: type) -> bool:
        return type(json) == dict

    def deserialize(self, json: object, mappingClass: type) -> object:
        if mappingClass == object:
            return self.createRawObject(json)
        else:
            return self.createObject(json, mappingClass)
    
    def createRawObject(self, args: Dict[str, object]) -> object:
        className = ''.join(key.title() for key in args.keys()) + 'Obejct'
        newInstance = type(className, (object,), {})()
        for k, v in args.items():
            setattr(newInstance, k, self.deserializeChild(v, object))
        return newInstance

    def createObject(self, args: Dict[str, object], mappingClass: type) -> object:
        annotations = self.findAnnotations(mappingClass)
        requireArgs = self.findInitRequireArgs(mappingClass)

        result = object.__new__(mappingClass)
        initArgs = {}
        for k, v in args.items():
            val = self.deserializeChild(v, annotations.get(k, object))
            if k in requireArgs:
                initArgs[k] = val
            else:
                setattr(result, k, val)
        for req in requireArgs:
            if req not in initArgs:
                initArgs[req] = None
        result.__init__(**initArgs)
        return result
    
    @staticmethod
    def findAnnotations(mappingClass: type) -> Dict[str, type]:
        if hasattr(mappingClass, '__annotations__'):
            return mappingClass.__annotations__
        for k, v in signature(mappingClass.__init__).parameters.items():
            return {k:v.annotation for k,v in signature(mappingClass.__init__).parameters.items()}

    @staticmethod
    def findInitRequireArgs(mappingClass: type) -> Set[str]:
        return {
            k for k, v in signature(mappingClass.__init__).parameters.items()
            if v.name != 'self' and v.kind == _ParameterKind.POSITIONAL_OR_KEYWORD
        }


class ObjectMapper:
    deserializers: List[JsonDeserializer] = [
        DateDeserializer(),
        DatetimeDeserializer(),
        EnumDeserializer(),
        LiteralDeserializer(),
 ListDeserializer (), ## added
 DictDeserializer (), ## added
 ObjectDeserializer () ## added
    ]

Test run

collected 14 items                                                                                                                                       

tests/test_object_mapper.py ..............                                                                                                         [100%]

=================================================================== 14 passed in 0.18s ===================================================================

####Behavior description of Object deserializer

When it comes to mapping Objects, the difficulty suddenly increased...So I will explain it briefly.

Also, to deserialize the data contained in the object, You have to get the class of data, but there are two places where it may be stored:

-If class has a type definition(Group of code below, GroupWithoutDataclass)

@dataclass
class Person:
    name: str
    age: int

# pattern 1
@dataclass
class Group:
    name: str
    leader: Person

# Pattern 2
class GroupWithoutDataclass:
    name: str
    leader: Person

# Pattern 3
class GroupInitDef:
    def __init__(self, name: str, leader: Person):
        self.name = name
        self.leader = leader

Also in pythonPerson('AA', 23)And runPersonclass__init__The method is executed. Variadic arguments in this method(*args, **kwargs etc.)An error will occur if you do not specify any variables other than those defined in. for that reason,def findInitRequireArgs(mappingClass: type) -> Set[str]I am getting the variable name to be defined in.

And this time, when the JSON data and the mapping data have the following relationship, the following implementation is used.

-JSON data does not have the data required for mapping → Fill with None -JSON data has unnecessary data for mapping → Attribute is added with setattr

This time, I'm doing this, but in some cases it may be better to make an error.

Added a test just in case.(Group has been tested earlier, so omitted)

    def test_deserializeTypedObjectWithoutDataclass(self):
 actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', GroupWithoutDataclass)
        assert type(actual) == GroupWithoutDataclass
 assert actual.name =='group'
        assert type(actual.leader) == Person
        assert actual.leader.age == 35
 assert actual.leader.name =='Suzuki'

    def test_deserializeTypedObjectInitDef(self):
 actual = ObjectMapper (). deserialize ('{"name": "group", "leader": {"age": 35, "name": "Suzuki"}}', GroupInitDef)
        assert type(actual) == GroupInitDef
 assert actual.name =='group'
        assert type(actual.leader) == Person
        assert actual.leader.age == 35
 assert actual.leader.name =='Suzuki'

    def test_deserializeTypedObjectEmptyJson(self):
        actual = ObjectMapper().deserialize('{}', Group)
        assert type(actual) == Group
        assert actual.name == None
        assert actual.leader == None

    def test_deserializeTypedObjectRedundantData(self):
 actual = ObjectMapper (). deserialize ('{"name": "group", "id": 2, "leader": {"age": 35, "name": "Suzuki", "role": "chief" }}', Group)
 assert actual.name =='group'
        assert actual.id == 2
        assert type(actual.leader) == Person
        assert actual.leader.age == 35
 assert actual.leader.name =='Suzuki'
 assert actual.leader.role =='Chief'
collected 18 items                                                                                                                                       

tests/test_object_mapper.py ..................                                                                                                     [100%]

=================================================================== 18 passed in 0.17s ===================================================================

###If you want to devise more...

Actually, there are some more complicated specifications, so it may be necessary to adjust the specifications according to the project.

However, this time, the purpose is to map the JSON received by API etc. to the class. JSON models usually don't need that complicated requirements.

##Finally

This time, I created an object mapper that instantiates a class from JSON. It's not a very difficult implementation, and you can handle anything by adding deserialization logic each time. I don't think maintenance is that hard.

Type definitions are becoming more and more convenient in python, so there may be something good in the future if you keep this mechanism in mind.

Recommended Posts

Create a JSON object mapper in Python
Create a function in Python
Create a dictionary in Python
How to create a JSON file in Python
Create a datetime object from a string in Python (Python 3.3)
Create a DI Container in Python
Create a binary file in Python
Create a Kubernetes Operator in Python
Create a random string in Python
Create a simple GUI app in Python
[GPS] Create a kml file in Python
[Python / Django] Create a web API that responds in JSON format
Create a Python module
Handling json in python
Create SpatiaLite in Python
Object oriented in python
Create a Python environment
Create a Vim + Python test environment in 1 minute
Create a GIF file using Pillow in Python
I want to create a window in Python
Create a standard normal distribution graph in Python
Create a virtual environment with conda in Python
How to generate a Python object from JSON
Create a simple momentum investment model in Python
Create a new page in confluence with Python
Create a package containing global commands in Python
Create a MIDI file in Python using pretty_midi
Create a loop antenna pattern in Python in KiCad
Insert an object inside a string in Python
[Docker] Create a jupyterLab (python) environment in 3 minutes!
Take a screenshot in Python
Parse a JSON string written to a file in Python
Create a data collection bot in Python using Selenium
Create a Wox plugin (Python)
Easily format JSON in Python
String object methods in Python
I tried to create a class that can easily serialize Json in Python
[LINE Messaging API] Create a rich menu in Python
Null object comparison in Python
Create gif video in Python
Create a plugin to run Python Doctest in Vim (2)
Create a plugin to run Python Doctest in Vim (1)
In Python, create a decorator that dynamically accepts arguments Create a decorator
Make a bookmarklet in Python
Create a python numpy array
Convert / return class object to JSON format in Python
Python script to create a JSON file from a CSV file
Draw a heart in Python
Create a directory with python
Create a fake Minecraft server in Python with Quarry
Create a local scope in Python without polluting the namespace
Create a list in Python with all followers on twitter
Temporarily save a Python object and reuse it in another Python
Create a child account for connect with Stripe in Python
Let's create a script that registers with Ideone.com in Python.
Create code that outputs "A and pretending B" in python
Create a tool to check scraping rules (robots.txt) in Python
[Python] Use JSON format data as a dictionary type object
Maybe in a python (original title: Maybe in Python)
Write a binary search in Python
[python] Manage functions in a list