Get data via the Salesforce API (Bulk API) in Python and load it into BigQuery

Background

I wanted to get data from Salesforce using Python. This time I used the simple_salesforce library and the Bulk API, one of the Salesforce APIs, to get the data and load it into BigQuery.

Authentication

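Prepare the following three values, which are passed to Salesforce() in the implementation example: username, password, and security token.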
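
As a minimal sketch of keeping those three values out of the source code, they could be read from environment variables. The variable names SF_USERNAME, SF_PASSWORD and SF_SECURITY_TOKEN and the helper load_salesforce_credentials are hypothetical, chosen only for illustration; they are not part of simple_salesforce.

```python
import os

# Hypothetical variable names; use whatever names your environment defines.
ENV_KEYS = {
    "username": "SF_USERNAME",
    "password": "SF_PASSWORD",
    "security_token": "SF_SECURITY_TOKEN",
}


def load_salesforce_credentials(env=None):
    """Return the three login values as keyword arguments for Salesforce()."""
    env = os.environ if env is None else env
    # Collect every variable that is unset or empty so the error names them all
    missing = [name for name in ENV_KEYS.values() if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[name] for arg, name in ENV_KEYS.items()}
```

With this helper, the login could be written as Salesforce(**load_salesforce_credentials()) instead of hard-coding the strings.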

Implementation example


import json

from google.cloud import bigquery
from simple_salesforce import Salesforce

# Intermediate file shared by the download step and the load step
DL_FILE_NAME = 'dl_file_name'


class SalesforceAPI:
    def __init__(self):
        # Log in with the three values described under "Authentication"
        self.sf = Salesforce(
            username='USERNAME',
            password='PASSWORD',
            security_token='SECURITY_TOKEN'
        )

    def execute(self):
        self.dl_file()
        self.load_to_bq()

    def dl_file(self):
        # Run the SOQL query through the Bulk API and write one JSON object per line
        res = self.sf.bulk.TABLE_NAME.query('SELECT column1, column2 FROM TABLE_NAME')
        with open(DL_FILE_NAME, mode='w', encoding='utf-8') as f:
            for d in res:
                # ensure_ascii=False keeps Japanese text from being written as \uXXXX escapes
                f.write(json.dumps(d, ensure_ascii=False) + "\n")

    def load_to_bq(self):
        client = bigquery.Client('project')
        dataset_id = 'dataset'
        table_id = 'table_name'
        table_ref = bigquery.DatasetReference(client.project, dataset_id).table(table_id)

        job_config = bigquery.LoadJobConfig()
        job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
        job_config.autodetect = True  # Specify a schema instead if necessary

        # Load the same file that dl_file() wrote
        with open(DL_FILE_NAME, "rb") as source_file:
            job = client.load_table_from_file(
                source_file, table_ref, job_config=job_config
            )
            job.result()  # Wait for the load job to finish
            print("Loaded {} rows into {}:{}.".format(
                job.output_rows, dataset_id, table_id))
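
The file-writing step can also be tried on its own, without a Salesforce connection. The following is a minimal sketch, not part of the implementation above: write_ndjson is a hypothetical helper name, and dropping the "attributes" entry is an assumption on my part (records returned through the Bulk interface may carry such a metadata entry, and here I assume it is not wanted in BigQuery).

```python
import json


def write_ndjson(records, path):
    """Write records as newline-delimited JSON and return the number of lines written."""
    count = 0
    with open(path, mode="w", encoding="utf-8") as f:
        for record in records:
            # Assumption: the "attributes" metadata entry is not needed downstream
            row = {k: v for k, v in record.items() if k != "attributes"}
            # ensure_ascii=False writes non-ASCII text as-is instead of \uXXXX escapes
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Each line of the resulting file is one JSON object, which is the layout that SourceFormat.NEWLINE_DELIMITED_JSON expects.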

Reference

I made the strongest SOQL execution tool I could think of

- The SOQL execution tool introduced in that article is convenient.
- I used it to write the SOQL for the download.
