[Python] [memo] Sending table data to GCS with BigQuery scheduled queries and Cloud Functions

Goal

When a scheduled query set up in BigQuery finishes, export the resulting table to CSV and send the file to GCS.

Implementing the Cloud Functions export process triggered by Pub/Sub

Pub/Sub settings

(Screenshot: creating the Pub/Sub topic)
  1. Enter the topic ID that will trigger Cloud Functions

Example: here we use "example-1"
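
For reference, the topic can also be created from code instead of the console. The following is a minimal sketch using the google-cloud-pubsub client; "project_name" is a placeholder project ID and not part of the original setup.

# Minimal sketch: create the trigger topic with the google-cloud-pubsub client.
# "project_name" is a placeholder; "example-1" matches the topic ID above.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("project_name", "example-1")
publisher.create_topic(request={"name": topic_path})
print("Created topic: {}".format(topic_path))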

Cloud Functions settings

(Screenshot: Cloud Functions configuration)

Here we place a script that, when the trigger fires, reads the BQ table and sends it to GCS as a CSV file.

  1. Set the trigger to "Cloud Pub/Sub" and select the "example-1" topic created above
  2. Any language will do; this time the code is written inline in Python

List the required modules in requirements.txt, and write the actual processing in main.py.

The implementation does roughly the following: "export yesterday's table on BQ to GCS".

# requirements.txt: the module to install with pip
google-cloud-bigquery

# main.py
from google.cloud import bigquery
from datetime import date, timedelta


def export_table(event, context):
    """Export yesterday's date-suffixed BQ table to GCS as a CSV file."""
    client = bigquery.Client()
    yesterday = date.today() - timedelta(days=1)

    project = "project_name"
    dataset_id = "data_set_name"
    table_id = "table_name_yyyymmdd_" + yesterday.strftime('%Y%m%d')

    # Destination object in GCS, e.g. gs://dir_name/20200101.csv
    destination_uri = "gs://{}/{}".format("dir_name", yesterday.strftime('%Y%m%d') + ".csv")

    dataset_ref = client.dataset(dataset_id, project=project)
    table_ref = dataset_ref.table(table_id)

    # Extract the table to GCS (CSV is the default export format)
    extract_job = client.extract_table(
        table_ref,
        destination_uri,
        location="US",
    )
    extract_job.result()  # wait for the export job to finish

Finally, set "export_table" as the function to execute (the entry point).
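
Before waiting for the real schedule, you can fire the function manually by publishing a test message to the topic. A minimal sketch, again using the google-cloud-pubsub client with "project_name" as a placeholder:

# Minimal sketch: manually publish a message to "example-1" to fire the function.
# "project_name" is a placeholder project ID.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("project_name", "example-1")
future = publisher.publish(topic_path, b"manual trigger for export_table")
print("Published message id: {}".format(future.result()))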

Scheduling with BigQuery

BigQuery has a built-in "scheduled query" feature, so we use that.

(Screenshot: creating a scheduled query in the BigQuery console)

Start by entering a valid query and creating a new schedule for it.

(Screenshot: scheduled query settings, including the Pub/Sub notification topic)

By setting "example-1" in the Cloud Pub/Sub topic field at the bottom, the trigger fires once the scheduled query finishes creating the table.
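
As a side note, the same scheduled query with a notification topic can also be created programmatically instead of through the console. The sketch below uses the google-cloud-bigquery-datatransfer client; the query text, display name, and schedule are placeholder assumptions, not the exact settings used in this memo.

# Sketch: create a scheduled query with a Pub/Sub notification topic from Python.
# Requires the google-cloud-bigquery-datatransfer package; all names are placeholders.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = client.common_project_path("project_name")

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="data_set_name",
    display_name="daily table build",
    data_source_id="scheduled_query",
    params={
        "query": "SELECT * FROM `project_name.data_set_name.source_table`",
        "destination_table_name_template": "table_name_yyyymmdd_{run_date}",
        "write_disposition": "WRITE_TRUNCATE",
    },
    schedule="every 24 hours",
    # Completion notifications go to this topic, which fires the Cloud Function.
    notification_pubsub_topic="projects/project_name/topics/example-1",
)

client.create_transfer_config(parent=parent, transfer_config=transfer_config)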

With the above, once the scheduled query in BigQuery completes, the table is exported to CSV and the file is sent to GCS.
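
If you want to confirm that the export actually arrived, a quick check with the google-cloud-storage client (assuming that package is installed and "dir_name" is the bucket used in main.py) looks like this:

# Sketch: confirm that yesterday's CSV actually landed in the bucket.
# "dir_name" is the placeholder bucket name from main.py.
from datetime import date, timedelta
from google.cloud import storage

yesterday = date.today() - timedelta(days=1)
blob_name = yesterday.strftime('%Y%m%d') + ".csv"

client = storage.Client()
bucket = client.bucket("dir_name")
print(blob_name, "exists:", bucket.blob(blob_name).exists())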

Afterword

When we migrated Rundeck from an on-premise server to the cloud, the service account was switched and the new one had not been granted permission to write to GCS. Since the work itself was quite small in scale, just "create a table that completes within BQ" and "send that table's data to GCS", I decided to complete it using GCP features alone.

It is nothing major, but I am leaving a memo of the implementation here.
