Regularly fetch exchange rates on Heroku and upload the data to Amazon S3

Build a development environment

Create a virtualenv

$ mkdir cronapp && cd cronapp
$ mkvirtualenv venv

Log in to Heroku

(venv)$ heroku login

Add Python libraries

(venv)$ cat >> requirements.txt << EOF
APScheduler==3.0.4
awscli==1.9.11
boto3==1.2.2
botocore==1.3.11
colorama==0.3.3
docutils==0.12
futures==3.0.3
httplib2==0.9
jmespath==0.9.0
pyasn1==0.1.9
python-dateutil==2.4.2
pytz==2015.7
requests==2.8.1
rsa==3.2.3
six==1.10.0
tzlocal==1.2
wheel==0.26.0
EOF
(venv)$ pip install -r requirements.txt

Create a script to check operation

(venv)$ vi cron.py
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', minutes=3)
def job_3min():
    print('[cron.py:job_3min] Start.')

sched.start()
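
To check the behavior locally before deploying, you can run the script directly; it prints a line every three minutes until you stop it with Ctrl+C:

(venv)$ python cron.py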

Add a Procfile to run the script periodically

(venv)$ echo "bot: python cron.py" > Procfile

Add a .gitignore

(venv)$ cat >> .gitignore << EOF
venv
*.pyc
.idea
EOF

Create a local Git repository

(venv)$ git init && git add . && git commit -m "initial commit"

Deploy to Heroku

Create an app on Heroku

(venv)$ heroku create

Deploy the application

(venv)$ git push heroku master

Assign a dyno to the bot process

(venv)$ heroku ps:scale bot=1
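
You can check that the dyno is up with:

(venv)$ heroku ps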

Operation check

(venv)$ heroku logs
2015-12-07T01:36:20.343967+00:00 app[bot.1]: [cron.py:job_3min] Start.
2015-12-07T01:39:20.346373+00:00 app[bot.1]: [cron.py:job_3min] Start.
2015-12-07T01:42:20.344067+00:00 app[bot.1]: [cron.py:job_3min] Start.

Fetch exchange rate data from openexchangerates.org and upload it to S3

Change cron.py to the following

- Create an AWS IAM user and obtain an Open Exchange Rates API key in advance.
- Open Exchange Rates updates its data around 1 to 2 minutes past each hour, so the job is scheduled at 10 minutes past the hour to leave a margin.
- The keys are hardcoded in the listing below for brevity; a sketch that reads them from Heroku config vars follows the listing.

import requests, json, datetime, pytz, logging
import boto3, botocore
from apscheduler.schedulers.blocking import BlockingScheduler

logging.basicConfig()
sched = BlockingScheduler()

@sched.scheduled_job('cron', minute='10', hour='*/1')
def job_crawl():
    print('[cron.py:job_crawl] Start.')
    
    ####################################
    # API Keys
    ####################################

    OPEN_EXCHANGE_API_URL = 'https://openexchangerates.org/api/latest.json?app_id='
    OPEN_EXCHANGE_APP_ID = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'

    AWS_ACCESS_KEY_ID = 'xxxxxxxxxxxxxxxx'
    AWS_SECRET_ACCESS_KEY = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
    AWS_REGION_NAME = 'xx-xxxxx-x'
    AWS_S3_BUCKET_NAME = 'xxxxxxxxxxx'

    ####################################
    # Retrieve json data from openexchangerates.org
    ####################################

    res = requests.get(OPEN_EXCHANGE_API_URL + OPEN_EXCHANGE_APP_ID)
    json_data = json.loads(res.text)
    del json_data['disclaimer']
    del json_data['license']
    json_text = json.dumps(json_data)

    timestamp = json_data['timestamp']
    exchange_date = datetime.datetime.fromtimestamp(timestamp, tz=pytz.utc)

    ####################################
    # Upload json data to S3 bucket
    ####################################

    if json_text:

        #
        # AWS Session
        #
        session = boto3.session.Session(aws_access_key_id=AWS_ACCESS_KEY_ID,
                                        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                                        region_name=AWS_REGION_NAME)
        s3 = session.resource('s3')
        bucket = s3.Bucket(AWS_S3_BUCKET_NAME)

        #
        # Upload Latest
        #
        bucket_latest_key_name = 'exchange/latest.json'
        obj = bucket.Object(bucket_latest_key_name)
        response = obj.put(
            Body=json_text.encode('utf-8'),
            ContentEncoding='utf-8',
            ContentType='application/json'
        )

        #
        # Upload Daily Data
        #
        bucket_prefix_daily = "{0:%Y-%m-%d}".format(exchange_date)
        bucket_daily_key_name = 'exchange/' + bucket_prefix_daily + '/' + bucket_prefix_daily + '.json'
        obj = bucket.Object(bucket_daily_key_name)
        response = obj.put(
            Body=json_text.encode('utf-8'),
            ContentEncoding='utf-8',
            ContentType='application/json'
        )

        #
        # Upload Hourly Data
        #
        bucket_hourly_prefix = "{0:%Y-%m-%d-%H}".format(exchange_date)
        bucket_hourly_key_name = 'exchange/' + bucket_prefix_daily + '/' + bucket_hourly_prefix + '.json'
        try:
            # If the json file already exists, do nothing
            s3.Object(AWS_S3_BUCKET_NAME, bucket_hourly_key_name).load()
        except botocore.exceptions.ClientError as e:
            # If the json file doesn't exist yet, upload it
            obj = bucket.Object(bucket_hourly_key_name)
            response = obj.put(
                Body=json_text.encode('utf-8'),
                ContentEncoding='utf-8',
                ContentType='application/json'
            )

    print('[cron.py:job_crawl] Done.')


sched.start()
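
The API key and AWS credentials are hardcoded above to keep the listing short, but that also commits them to Git. A minimal alternative sketch, assuming you store the values as Heroku config vars (for example with heroku config:set OPEN_EXCHANGE_APP_ID=xxxx; the variable names here are illustrative, not from the original script), is to read them from the environment:

import os

# Illustrative config-var names -- set each one beforehand with `heroku config:set NAME=value`
OPEN_EXCHANGE_APP_ID = os.environ['OPEN_EXCHANGE_APP_ID']
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
AWS_REGION_NAME = os.environ['AWS_REGION_NAME']
AWS_S3_BUCKET_NAME = os.environ['AWS_S3_BUCKET_NAME']

boto3 also picks up AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION from the environment by itself, so with those names set the explicit session arguments could be dropped entirely.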

Deploy again to apply the changes

(venv)$ git add . && git commit -m "changed cron job"
(venv)$ git push heroku master

Operation check

(venv)$ heroku logs
2015-12-07T03:10:00.003862+00:00 app[bot.1]: [cron.py:job_crawl] Start.
2015-12-07T03:10:01.856428+00:00 app[bot.1]: [cron.py:job_crawl] Done.
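
Since awscli is included in requirements.txt, the uploaded objects can also be listed from the command line (substituting your bucket name; this assumes AWS credentials are configured in the local environment):

(venv)$ aws s3 ls s3://xxxxxxxxxxx/exchange/ --recursive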

Bonus

Rename App

(venv)$ heroku apps:rename cronapp

Update the Git remote

(venv)$ git remote rm heroku
(venv)$ heroku git:remote -a cronapp

Reference sites

- How to deploy a Django app on heroku in just 5 minutes
- Miscellaneous notes about deploying the django app on Heroku
- cron (Python) without add-ons on heroku
