A Python script that imports dated CSV files into BigQuery as a time-partitioned table

Background

I wanted a Python script that loads CSV files with a date in the file name, such as xxxx_20200930.csv, into BigQuery, using that date as the partition time. This version assumes that a large number of CSV files are stored in a directory and its subdirectories.
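
As a rough illustration of the naming convention, the sketch below maps a file name to a table ID with a partition decorator (table$YYYYMMDD). The function name partition_table_id, the regular expression, and the default table name dataset.table_name are assumptions made for this example, not part of any BigQuery API.

```python
import re
from datetime import datetime

# Assumed naming rule: an underscore, eight digits, then ".csv" at the end.
DATE_SUFFIX = re.compile(r"_(\d{8})\.csv$")


def partition_table_id(file_name, table="dataset.table_name"):
    """Return 'table$YYYYMMDD' for a file named like xxxx_YYYYMMDD.csv."""
    match = DATE_SUFFIX.search(file_name)
    if match is None:
        raise ValueError(f"no date suffix in file name: {file_name}")
    date = match.group(1)
    # strptime raises ValueError for impossible dates such as 20200231.
    datetime.strptime(date, "%Y%m%d")
    return f"{table}${date}"


print(partition_table_id("xxxx_20200930.csv"))  # dataset.table_name$20200930
```

Validating the date up front means a misnamed file fails with a clear error before any load job is submitted.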

Sample script

main.py



from google.cloud import bigquery
import glob

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    allow_quoted_newlines=True,
    time_partitioning=bigquery.TimePartitioning()
)

# "**" with recursive=True matches the directory itself and everything below it
path = "../some/dir/**/*.csv"
files = glob.glob(path, recursive=True)

for file_name in files:
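    # Take the eight-digit date (YYYYMMDD) that follows the last underscore
    date = file_name.split('_')[-1][0:8]
    table_id = 'dataset.table_name$' + date  # Partition decorator: load into this date's partition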

    with open(file_name, "rb") as source_file:
        job = client.load_table_from_file(
            source_file,
            table_id,
            job_config=job_config
        )

    job.result()  # Waits for the job to complete.

    table = client.get_table(table_id)  # Make an API request.
    print(
        "Loaded {} rows and {} columns to {}".format(
            table.num_rows, len(table.schema), table_id
        )
    )
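
The loop above assumes every matched file name ends with an underscore and an eight-digit date. A slightly more defensive way to collect the files might look like the sketch below, which walks the directory tree with pathlib and skips anything that does not follow the naming rule. The helper name find_dated_csvs and the regular expression are assumptions for illustration only.

```python
import re
from pathlib import Path

# Assumed naming rule: the file stem ends with an underscore and eight digits.
DATED_STEM = re.compile(r"_(\d{8})$")


def find_dated_csvs(root):
    """Return sorted (path, yyyymmdd) pairs for dated CSV files under root."""
    found = []
    # rglob walks root and every directory below it.
    for path in Path(root).rglob("*.csv"):
        match = DATED_STEM.search(path.stem)
        if match and path.is_file():
            found.append((path, match.group(1)))
    # Sort by date, then by path, so loads run in a predictable order.
    return sorted(found, key=lambda item: (item[1], str(item[0])))
```

Each returned pair could then drive the load loop, for example by iterating over find_dated_csvs("../some/dir") and appending the date to the table ID as before.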

Reference

Loading data from a local data source (https://cloud.google.com/bigquery/docs/loading-data-local?hl=ja#loading_data_from_a_local_data_source)
