[PYTHON] Create a private DMP with zero initial cost and zero development with BigQuery

Private DMPs are popular these days, but most of them are built on Treasure Data (YBI). Treasure Data is very useful (especially td-js-sdk), but it is a bit expensive.

BigQuery, on the other hand, is attractive for its low price and query speed, but getting data in and out is much less convenient than with Treasure Data.

So I built an application on GAE (Python) that brings Treasure Data-like ease of use to BigQuery. If you use the published source code, you can set it up with zero development.

Source code: https://github.com/mats116/ElasticBigQuery

What it can do

Get logs from td-js-sdk

- It works just like Treasure Data; you only need to change the endpoint as shown below.
- writeKey is static at the moment, but I plan to make it possible to issue keys and manage permissions from a UI.
- Only JSONP, which has been the default since td-js-sdk v1.4.0, is supported.
- BigQuery is not schemaless, but the API infers the schema automatically and creates datasets and tables as needed.
- Generated table names end with the date (UTC), in the form table_id + YYYYMMDD, so each table is a daily table.
- As a feature unique to this app, the following parameters are stored URL-decoded:
  - td_path
  - td_referrer
  - td_url

pageviews.js


<script type="text/javascript">
  var td = new Treasure({
    host: 'elasticbigquery.appspot.com',
    pathname: '/dmp/v1/event/',
    writeKey: 'thie_is_static_setting_yet',
    database: '<dataset_id>'
  });

  td.trackPageview('<table_id>');
</script>
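
The server-side behavior listed above (daily table names, URL decoding, and schema detection) can be sketched in Python as follows. This is a minimal illustration and not the code from the repository: the helper names `daily_table_name`, `decode_td_params`, and `infer_schema` are hypothetical, the type mapping is an assumption about how detection might work, and the sketch targets plain Python 3 rather than the GAE runtime.

```python
from datetime import datetime, timezone
from urllib.parse import unquote

# Parameters that the article says are stored URL-decoded.
DECODED_KEYS = ("td_path", "td_referrer", "td_url")


def daily_table_name(table_id, now=None):
    """Return table_id + YYYYMMDD, using the UTC date (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    return table_id + now.astimezone(timezone.utc).strftime("%Y%m%d")


def decode_td_params(record):
    """Return a copy of record with the td_* URL fields percent-decoded."""
    decoded = dict(record)
    for key in DECODED_KEYS:
        if key in decoded:
            decoded[key] = unquote(decoded[key])
    return decoded


def infer_schema(record):
    """Guess a BigQuery-style field list from one JSON record (assumed mapping)."""
    fields = []
    for name, value in sorted(record.items()):
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            field_type = "BOOLEAN"
        elif isinstance(value, int):
            field_type = "INTEGER"
        elif isinstance(value, float):
            field_type = "FLOAT"
        else:
            field_type = "STRING"
        fields.append({"name": name, "type": field_type})
    return fields
```

Deriving the suffix from UTC rather than local time means a pageview sent in the morning in Japan can land in the previous day's table, which is worth remembering when querying across daily tables.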

Get logs from web beacons

- The endpoint below returns a transparent GIF.
  - //elasticbigquery.appspot.com/dmp/v1/beacon/<dataset_id>/measurement
- Any GET parameters on the request are recorded.
- Some parameters, such as the referrer, are collected by default.
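
As a rough sketch of what such a beacon handler does, the Python below turns the query string into a row and returns a 1x1 transparent GIF. It is an illustration under assumptions, not the repository's handler: `handle_beacon`, the `referrer` column name, and the response headers are hypothetical choices.

```python
import base64
from urllib.parse import parse_qsl

# A widely used 1x1 transparent GIF, base64-encoded.
TRANSPARENT_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_beacon(query_string, referrer=None):
    """Build the row to log and the GIF response for one beacon hit."""
    row = dict(parse_qsl(query_string))  # GET parameters become columns
    if referrer is not None:
        # Collected by default unless the caller already sent the same key.
        row.setdefault("referrer", referrer)
    headers = {
        "Content-Type": "image/gif",
        "Cache-Control": "no-store",  # avoid cached hits that never reach the server
    }
    return row, headers, TRANSPARENT_GIF
```

On the page side, the beacon would typically be embedded as an image whose URL is the endpoint above with whatever GET parameters you want to record appended.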

Issuing a cookie ID

- A cookie named bqid is issued under the GAE domain xxx-xxx.appspot.com.
- Change the domain and the ID name to suit your purpose.
- You can check it at the following endpoint.
  - http://elasticbigquery.appspot.com/dmp/v1/bqid/get
- If you add callback=hoge to the GET parameters, you can use it as JSONP.
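
The JSONP behavior of the bqid endpoint can be illustrated with the sketch below. The response shape (`{"bqid": ...}`), the use of `uuid4` for the ID, and the function names are all assumptions made for illustration; the actual endpoint may differ.

```python
import json
import re
import uuid

# Only allow simple JavaScript identifiers as the callback name.
CALLBACK_RE = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")


def new_bqid():
    """Generate a random ID for the cookie (uuid4 is an assumption)."""
    return uuid.uuid4().hex


def bqid_response(bqid, callback=None):
    """Return (content_type, body): plain JSON, or JSONP when callback is given."""
    payload = json.dumps({"bqid": bqid})
    if callback is None:
        return "application/json", payload
    if not CALLBACK_RE.match(callback):
        raise ValueError("invalid callback name")
    return "application/javascript", "%s(%s);" % (callback, payload)
```

Validating the callback name is a deliberate choice here: echoing an arbitrary `callback` value back into a script response would let a caller inject JavaScript.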

What I want to add next

Account control

- I want to be able to grant BigQuery dataset permissions from the UI to accounts that signed up with OAuth (Google).

writeKey control

- I want to be able to issue a writeKey from the UI. (It is currently static.)
  - Looking it up in Cloud Datastore on every request would be heavy, so maybe memcache?
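
The memcache idea above is a cache-aside lookup. The sketch below shows the pattern with in-memory stand-ins: a `set` plays the role of Cloud Datastore and a `dict` plays the role of memcache. The class and method names are hypothetical, and nothing here is implemented in the project yet.

```python
class WriteKeyStore:
    """Cache-aside lookup: check the cache first, then the slower store."""

    def __init__(self, datastore, cache=None):
        self.datastore = datastore  # stand-in for Cloud Datastore (a set of keys)
        self.cache = {} if cache is None else cache  # stand-in for memcache
        self.datastore_reads = 0  # counts how often the slow path is taken

    def is_valid(self, write_key):
        if write_key in self.cache:
            return self.cache[write_key]
        self.datastore_reads += 1
        valid = write_key in self.datastore
        self.cache[write_key] = valid  # misses are cached too
        return valid

    def revoke(self, write_key):
        self.datastore.discard(write_key)
        self.cache.pop(write_key, None)  # invalidate so the change is seen
```

With a real memcache you would also set an expiry on each entry, so that a key revoked by another instance stops being accepted after a bounded delay.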

Export function

- I want to be able to export query results to S3 and Google Cloud Storage.
  - I wonder whether it can be integrated nicely with Apps Script.

Impressions

- This is a complete free ride on td-js-sdk...
- When an insert into BigQuery fails, the record is put on the Task Queue, so it is built fairly robustly.
- Because this is a personal project, I am looking for a company that can load-test it on GAE.
- I am not good at building UIs, so please help.
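
The fallback to the Task Queue mentioned above follows a simple pattern: try the insert, and if it fails, keep the row for a later retry. The sketch below illustrates that pattern only. A `deque` stands in for the GAE Task Queue, `insert_fn` is a hypothetical function that writes one row to BigQuery, and the class name is made up for this example.

```python
from collections import deque


class BufferedInserter:
    """Insert rows immediately, queueing any that fail for a later retry."""

    def __init__(self, insert_fn):
        self.insert_fn = insert_fn  # hypothetical: writes one row, raises on failure
        self.pending = deque()  # stand-in for the GAE Task Queue

    def insert(self, table, row):
        """Try to insert now; on failure, keep the row and return False."""
        try:
            self.insert_fn(table, row)
            return True
        except Exception:
            self.pending.append((table, row))
            return False

    def retry_pending(self):
        """Retry each queued row once; rows that fail again stay queued."""
        delivered = 0
        for _ in range(len(self.pending)):
            table, row = self.pending.popleft()
            try:
                self.insert_fn(table, row)
                delivered += 1
            except Exception:
                self.pending.append((table, row))
        return delivered
```

The point of the design is that the tracking request can return to the browser right away even when BigQuery is temporarily unavailable, because delivery is deferred rather than dropped.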

* How to set it up on GAE

This is for reference. If you are new to GAE, good luck.

What to prepare

- A Google account (an @gmail address is fine)
- A PC with the Google App Engine SDK installed

Creating a project

- Create a new project in the Google Developers Console.
  - For the App Engine location, **us-central** is the closest to Japan.
  - The **BigQuery API** should be enabled by default, but check just in case.

Get source code

- It is published on GitHub, so clone it however you like.
  - https://github.com/mats116/ElasticBigQuery

Change the settings

- Open `app.yaml` and change the project name.
  - In the source code, it is here.

Deploy

- Deploy from Google App Engine Launcher.

(Screenshot: Google App Engine Launcher)
