[PYTHON] Google App Engine Datastore and Search API integration

You often want to store data in the Datastore and also register it in a Search API Index (hereafter, "Index" means the Search API Index). If you are not careful about transactions, however, you can end up with data that is stored in the Datastore but not registered in the Index.

Therefore, let us consider a method of registering data in the Datastore and Index (almost) at the same time.

  1. How to use _post_put_hook (consistency is not guaranteed)
  2. Simple way to use @ndb.transactional (consistency guaranteed)
  3. How to use Task Queue (consistency guaranteed)
  4. Conclusion

**Method 1 is not recommended.** However, examples of it are in circulation (one was posted as an answer on Stack Overflow, and Ferris 2.2 uses it), so it is included here for reference.

1. How to use _post_put_hook

**Not recommended** unless you have a specific reason.

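from google.appengine.ext import ndb


class User(ndb.Model):
    name = ndb.StringProperty()

    def _post_put_hook(self, future):
        # Raises here if the put failed, so nothing below runs in that case.
        result = future.get_result()
        # Register in the Search API Index ...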

_post_put_hook is called after a User is put. If the put failed, future.get_result() raises an error, so nothing is registered in the Index. However, registration in the Index can still fail even though the User was stored successfully.

In other words, this method does not guarantee that the Datastore and the Index stay consistent. That is acceptable if consistency is not required; otherwise, avoid it.
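
The failure mode can be reproduced without App Engine. The following is a minimal sketch in plain Python, not ndb code: FakeDatastore, FlakyIndex, and put_with_post_hook are hypothetical stand-ins that only mimic the order of operations in method 1 (entity write first, index write afterwards in a hook).

```python
class FakeDatastore(object):
    """In-memory stand-in for the Datastore (hypothetical, for illustration)."""

    def __init__(self):
        self.entities = {}

    def put(self, key, value):
        self.entities[key] = value


class FlakyIndex(object):
    """Stand-in for a Search API index whose put() always fails."""

    def __init__(self):
        self.documents = {}

    def put(self, doc_id, fields):
        raise RuntimeError('index temporarily unavailable')


def put_with_post_hook(datastore, index, key, name):
    """Mimics method 1: write the entity, then index it in a post-put step."""
    datastore.put(key, {'name': name})    # the entity write has succeeded here
    try:
        index.put(key, {'name': name})    # runs afterwards, outside any transaction
    except RuntimeError:
        return False                      # the entity stays stored but unindexed
    return True


datastore = FakeDatastore()
index = FlakyIndex()
indexed = put_with_post_hook(datastore, index, 'user-1', 'Yohei')
print(indexed)                            # False
print('user-1' in datastore.entities)     # True
print('user-1' in index.documents)        # False
```

Nothing undoes the entity write when the index write fails, which is the inconsistency described above.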

2. How to simply use @ndb.transactional

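from google.appengine.api import search
from google.appengine.ext import ndb

index = search.Index(name='user')  # the index name 'user' is an assumption
user = User(name='Yohei')

@ndb.transactional
def put():
    user.put()
    doc = search.Document(
        doc_id=user.key.urlsafe(),
        fields=[
            search.TextField(name='name', value=user.name),
        ])
    # Keep this as the last step of the transaction.
    index.put(doc)

put()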

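This guarantees consistency as long as index.put is the last operation in the transaction: if index.put raises, the transaction is rolled back and the User is not stored. However, you have to be careful in cases like the following.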

user = User(name='Yohei')

@ndb.transactional
def put():
    user.put()
    doc = search.Document(
        doc_id=user.key.urlsafe(),
        fields=[
            search.TextField(name='name', value=user.name),
        ],)
    index.put(doc)
    # do something
    ...
    
put()

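If an error occurs in the "do something" part, the transaction is rolled back and the User is not stored in the Datastore, but the document may already be registered in the Index, because index.put is not part of the Datastore transaction and is not undone.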
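
To make the ordering problem concrete, here is a minimal sketch in plain Python. It is a simulation under stated assumptions, not ndb code: FakeDatastore and its transaction() context manager are hypothetical stand-ins that buffer writes and apply them only on a clean exit, while index_documents stands in for an Index whose writes take effect immediately.

```python
import contextlib


class FakeDatastore(object):
    """In-memory stand-in for the Datastore (hypothetical, for illustration)."""

    def __init__(self):
        self.entities = {}

    @contextlib.contextmanager
    def transaction(self):
        """Buffers writes and applies them only if the block finishes cleanly."""
        pending = {}
        yield pending                    # an exception in the block surfaces here
        self.entities.update(pending)    # commit: skipped when the block raised


datastore = FakeDatastore()
index_documents = {}   # stand-in for the Index; not covered by the transaction


def put_user(key, name):
    with datastore.transaction() as txn:
        txn[key] = {'name': name}
        index_documents[key] = {'name': name}   # takes effect immediately
        raise ValueError('failure in the "do something" part')


try:
    put_user('user-1', 'Yohei')
except ValueError:
    pass

print('user-1' in datastore.entities)   # False
print('user-1' in index_documents)      # True
```

The entity write is discarded with the transaction, while the index write survives, so the two stores disagree.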

3. How to use Task Queue

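**Tasks can be added to a Task Queue transactionally.**[^2-1] A transactional task is enqueued only if the transaction commits. Therefore, if you add a "register in the Search API Index" task to the Task Queue inside the transaction, consistency can be guaranteed.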

from google.appengine.api import taskqueue
from google.appengine.ext import ndb

user = User(name='Yohei')
user2 = User(name='Yohei2')

@ndb.transactional(xg=True)
def put():
    user.put()
    user2.put()
    taskqueue.add(
        url='/put_to_index',
        params={
            'key': user.key.urlsafe(),
            'name': user.name},
        transactional=True,)
    taskqueue.add(
        url='/put_to_index',
        params={
            'key': user2.key.urlsafe(),
            'name': user2.name},
        transactional=True,)
    # do something
    ...
    
put()

This guarantees consistency even if the "do something" part raises an error. As in the example, two index registrations can be queued in the same transaction. Note that you can add at most five transactional tasks in a single transaction, so avoid this method where many writes have to happen at the same time.
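
The behaviour of transactional tasks can be sketched in plain Python. This is a simulation under stated assumptions, not the taskqueue API: FakeTransaction, run_in_transaction, and put_user are hypothetical names, and the sketch only models the two properties described above (tasks are released together with the writes, and at most five tasks fit in one transaction).

```python
MAX_TRANSACTIONAL_TASKS = 5   # the limit mentioned in the text above


class FakeTransaction(object):
    """Stand-in for a Datastore transaction (hypothetical, for illustration)."""

    def __init__(self):
        self.writes = {}
        self.tasks = []

    def put(self, key, value):
        self.writes[key] = value

    def add_task(self, url, params):
        if len(self.tasks) >= MAX_TRANSACTIONAL_TASKS:
            raise ValueError('too many transactional tasks')
        self.tasks.append((url, params))


def run_in_transaction(datastore, queue, func):
    """Applies writes and releases tasks together, or neither if func raises."""
    txn = FakeTransaction()
    func(txn)                        # an exception here skips the commit below
    datastore.update(txn.writes)
    queue.extend(txn.tasks)


def put_user(txn, fail=False):
    txn.put('user-1', {'name': 'Yohei'})
    txn.add_task('/put_to_index', {'key': 'user-1', 'name': 'Yohei'})
    if fail:
        raise RuntimeError('failure in the "do something" part')


datastore, queue = {}, []
try:
    run_in_transaction(datastore, queue, lambda txn: put_user(txn, fail=True))
except RuntimeError:
    pass
print(datastore, queue)              # {} []

run_in_transaction(datastore, queue, put_user)
print(len(datastore), len(queue))    # 1 1
```

Because the indexing task exists only when the write exists, a failure anywhere inside the transaction leaves nothing to index.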

4. Conclusion

Use method 2 or method 3, choosing whichever fits the situation.
