2020.11.16 update: I also made an Alibaba Cloud version.
TL;DR: HTML containing SSI directives (`include virtual`) is uploaded to a specific S3 bucket; a Lambda function expands the included sources and stores the combined result in another bucket.
Since S3 is a static site hosting service, it cannot, in principle, do any dynamic processing on the server side. Naturally, the CDN in front of it (CloudFront), which caches the content, cannot either.
In that case, the standard approach would be to build a local development environment, keep the includes as separate files locally, and combine them at build time. But for an existing site that originally used SSI and is being migrated to S3 + CloudFront, that workflow unfortunately cannot be introduced in every project.
So, for sites that already use SSI, I tried adjusting things on the AWS side so that the directives can be used as they are, without any replacement.
(As a prerequisite, the IAM settings are assumed to be complete.)
Basically, I referred to this article and customized it for my own use: Do something like SSI with S3 and Lambda
On the reference site, it bothered me a little that you had to add a `.ssi` suffix to the original file's extension, so I prepared a separate temp bucket and adjusted things so that files can be processed without adding a suffix.
Basically, the only services used are S3 and Lambda, plus CloudFront if you need it.
S3: Prepare two buckets, a temp bucket to upload files to and a publishing bucket.

- Temp bucket: the name can be anything; this time it is `s3-ssi-include-base`.
- Publishing bucket: the name can be anything; this time it is `s3-ssi-include`.
It is assumed that access permissions are set appropriately. The temp bucket only stores files to be handed over to the public bucket, so it does not need to be public. If you also put CloudFront in front of your publishing bucket, that bucket does not need to be public either.
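As a minimal sketch of keeping the temp bucket private, this is the configuration one would pass to boto3's `put_public_access_block` (the bucket name is this article's example; the call itself is commented out since it requires AWS credentials):

```python
# Block all public access on the temp bucket, which is only a hand-off area.
public_access_block = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# With boto3 this would be applied as:
# import boto3
# boto3.client("s3").put_public_access_block(
#     Bucket="s3-ssi-include-base",
#     PublicAccessBlockConfiguration=public_access_block,
# )
```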
From the temp bucket's details page, go to "Properties" → "Event notifications" → "Create event notification", so that the Lambda function is invoked when a file is uploaded (PUT event).

- Event type: PUT
- Destination: Lambda function

After selecting these, save the settings for now. After creating the Lambda function later, come back to this screen and set:

- Specify Lambda function: choose the Lambda function created earlier in "Choose from your Lambda functions"
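The same notification the console creates can be expressed as the configuration dict that boto3's `put_bucket_notification_configuration` expects. A sketch, assuming placeholder names and ARNs (the `Id` and the Lambda ARN below are hypothetical):

```python
# Notification configuration: fire the Lambda function on PUT uploads.
notification_config = {
    "LambdaFunctionConfigurations": [
        {
            "Id": "ssi-on-upload",  # hypothetical identifier
            # Placeholder ARN for the function created in the next section
            "LambdaFunctionArn": "arn:aws:lambda:ap-northeast-1:123456789012:function:s3-ssi-include",
            "Events": ["s3:ObjectCreated:Put"],
        }
    ]
}

# With boto3 this would be applied as:
# import boto3
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="s3-ssi-include-base",
#     NotificationConfiguration=notification_config,
# )
```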
Lambda
Create a Lambda function that detects the PUT event and, if SSI (Server Side Includes) directives are written in the uploaded HTML file, expands the includes and stores the result in another bucket. On the permissions side, edit the function's resource-based policy so that the temp bucket is allowed to invoke it. I think the following will be helpful: Lambda resource-based policy when triggered by S3
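As a sketch of that resource-based policy grant, these are the parameters one would pass to boto3's Lambda `add_permission` call (the function name and statement ID are placeholders for this setup; the source ARN points at the temp bucket):

```python
# Allow the S3 service, on behalf of the temp bucket, to invoke the function.
add_permission_params = {
    "FunctionName": "s3-ssi-include",          # hypothetical function name
    "StatementId": "AllowS3InvokeFromTempBucket",  # hypothetical statement ID
    "Action": "lambda:InvokeFunction",
    "Principal": "s3.amazonaws.com",
    "SourceArn": "arn:aws:s3:::s3-ssi-include-base",  # temp bucket ARN
}

# With boto3 this would be applied as:
# import boto3
# boto3.client("lambda").add_permission(**add_permission_params)
```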
```python
import json
import os
import logging
import boto3
from botocore.errorfactory import ClientError
import re
import urllib.parse

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client('s3')

def lambda_handler(event, context):
    logger.info('## ENVIRONMENT VARIABLES')
    logger.info(os.environ)
    logger.info('## EVENT')
    logger.info(event)
    input_bucket = event['Records'][0]['s3']['bucket']['name']
    output_bucket = os.environ['S3_BUCKET_TARGET']
    logger.info('## INPUT BUCKET')
    logger.info(input_bucket)
    input_key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    logger.info('## INPUT KEY')
    logger.info(input_key)
    try:
        # Get the uploaded file
        response = s3.get_object(Bucket=input_bucket, Key=input_key)
        if not input_key.endswith('.html'):
            # Non-HTML files are copied to the output bucket as-is
            s3.copy_object(Bucket=output_bucket, Key=input_key,
                           CopySource={'Bucket': input_bucket, 'Key': input_key})
        else:
            input_html = response['Body'].read().decode('utf-8')
            output_html = input_html
            # Extract the SSI include paths
            include_path_base = re.findall(r'<!--#include virtual="/(.*?)" -->.*?\n', input_html, flags=re.DOTALL)
            logger.info('## PATH BASE')
            logger.info(include_path_base)
            if len(include_path_base) > 0:
                for include_path in include_path_base:
                    logger.info('## PATH')
                    logger.info(include_path)
                    # Get the include file from the same bucket
                    try:
                        include = s3.get_object(Bucket=input_bucket, Key=include_path)
                        include_html = include['Body'].read().decode('utf-8')
                        # Expand the include directive
                        output_html = output_html.replace(
                            '<!--#include virtual="/' + include_path + '" -->', include_html)
                    except ClientError:
                        # Include file not found: leave the directive as-is
                        pass
            # Write the result to the public bucket
            logger.info('## OUTPUT BUCKET')
            logger.info(output_bucket)
            output_key = input_key
            logger.info('## OUTPUT KEY')
            logger.info(output_key)
            out_s3 = boto3.resource('s3')
            s3_obj = out_s3.Object(output_bucket, output_key)
            response = s3_obj.put(Body=bytes(output_html, 'UTF-8'))
    except Exception as e:
        logger.info(e)
        raise e
```
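The include-expansion logic at the heart of the handler can be tried standalone, without S3. A minimal demo with a hard-coded partial in place of the `get_object` call (the pattern is simplified to match just the directive, without the trailing-newline part used above):

```python
import re

# Page HTML containing one SSI directive, as it would arrive in the temp bucket
html = ('<html><body>\n'
        '<!--#include virtual="/parts/header.html" -->\n'
        '<p>content</p>\n'
        '</body></html>\n')

# Stand-in for the include files stored in the bucket
partials = {"parts/header.html": "<header>Site Header</header>"}

output = html
# Find every include path, then splice the partial's HTML into the page
for path in re.findall(r'<!--#include virtual="/(.*?)" -->', html):
    if path in partials:
        output = output.replace('<!--#include virtual="/%s" -->' % path, partials[path])

print(output)
```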
On the Lambda management screen, set the public bucket name (this time `s3-ssi-include`) in the environment variable `S3_BUCKET_TARGET`.

Also, save the function at this point, then go back to the S3 side and set __Specify Lambda function: choose the Lambda function created earlier in "Choose from your Lambda functions"__.
This completes the mechanism: when a file is uploaded to the temp S3 bucket, Lambda expands the include files and transfers the result to the original public S3 bucket.
Files can now be stored in S3 just as if they were uploaded to an ordinary web server. So if you need to migrate a site that uses SSI to S3 + CloudFront, you can move the files over without replacing every SSI directive at once. If you instead inline the common files that SSI used to manage into each page, those common parts end up hard-coded everywhere, which increases operational effort and risk; and honestly, the bulk replacement itself carries a risk of human error, so I would rather avoid it. Given that, I think this mechanism is quite convenient. The disadvantages are that it costs a little more because it uses two buckets, and that the setup is somewhat hard to understand at a glance, so it needs to be properly documented and shared.
These days, with the spread of CI/CD and Docker, I think there are fewer situations where the above is a concern. But not every site in the world is like that, so I suspect there is still some demand for this.
That's all from the field.