Serving contents from S3 via CloudFront
2022-06-20 (Updated on 2022-06-27)
This website is generated with Zola and delivered from Amazon S3 via Amazon CloudFront. This blog post describes what I did to make content delivery work in this configuration.
Plan for contents delivery
I intended to deploy my website to an S3 bucket and deliver it through a CloudFront distribution. The idea itself seemed very straightforward.
How Zola locates contents
Zola locates the contents of individual sections and pages at a path like /{parent section path}/{section or page title}/index.html; e.g., /blog/0002-serving-contents-from-s3-via-cloudfront/index.html for this page.
When it refers to the contents, however, it omits /index.html from the path, as in /{parent section path}/{section or page title}, and expects a server to expand it with a trailing /index.html; e.g., /blog/0002-serving-contents-from-s3-via-cloudfront for this page.
Unfortunately, expanding a subdirectory path with index.html is not an easy* task for a CloudFront distribution.
(*It turned out to be not easy at all!)
Introducing CloudFront Functions
To address the above issue, we can use CloudFront Functions. The guide provided by AWS includes a CloudFront Function use case for exactly this situation. However, this seemingly easy task turned out to be not that easy at all. I had to deal carefully with the URI specifications, and my findings were:
- A URI may end with an anchor ID; i.e., followed by a hash (#).
  - You may have to insert [/]index.html between the last URI segment and the hash.
- An anchor ID may contain any symbols including dots (see Difficulties in anchor IDs in the past post).
- You cannot simply determine that a file extension is specified when you just find a dot in the URI as the above use case does.
- An anchor ID may even contain hashes and slashes because any symbols in a markdown section title are kept.
- You have to first locate the first hash in a URI to separate an actual path and an anchor ID. This processing should be legal if I correctly understand the syntax of a URI.
- As far as I tested, a section or page title may not contain dots because Zola recognizes it as a language code delimiter as soon as Zola finds one. So you should not supply /index.html if the last path segment of a URI excluding an anchor ID contains a dot because it should be a resource other than a section or page.
- A URI may contain a query part starting with a question mark (?).
  - You may have to insert [/]index.html between the last URI segment and the question mark.
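To make the point about dots concrete, here is a minimal sketch that contrasts a dot check on the whole URI with a dot check on the last path segment only. Both function names are hypothetical, and the first one only approximates the kind of check the AWS use case performs; it is not a copy of it.

```javascript
// Hypothetical: reports an extension as soon as a dot appears anywhere in the URI.
function hasExtensionNaive(uri) {
    return uri.indexOf('.') !== -1;
}

// Hypothetical: strips the anchor ID and the query first,
// then looks for a dot in the last path segment only.
function hasExtensionInLastSegment(uri) {
    var hashIndex = uri.indexOf('#');
    var path = hashIndex === -1 ? uri : uri.substring(0, hashIndex);
    var queryIndex = path.indexOf('?');
    if (queryIndex !== -1) {
        path = path.substring(0, queryIndex);
    }
    // lastIndexOf returns -1 when there is no slash, so substring(0) keeps the whole path.
    var lastSegment = path.substring(path.lastIndexOf('/') + 1);
    return lastSegment.indexOf('.') !== -1;
}
```

For a URI such as /blog/post#v1.0, the first check reports a file extension although the path names a page, while the second one does not.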
Thus, my algorithm was:
- A URI is given → uri.
- Locate a first optional hash (#) in uri and separate a fragment (substring starting from # or empty) from it → [uri, fragment].
- Locate a first optional question mark (?) in uri and separate a query (substring starting from ? or empty) from it → [uri, query].
- Locate the last slash (/) in uri and separate the last path segment (substring starting from /) from it → [uri, last path segment].
- If last path segment contains no dots (.), expand last path segment with,
  - "index.html" if last path segment ends with /,
  - "/index.html" otherwise
- Return a new URI = uri + last path segment + query + fragment
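
The steps above can be sketched in ES5-style JavaScript roughly as follows. This is only an illustration of the list, not the actual handler linked below; the function name expandIndexHtml is made up for this sketch, and it works on a plain string rather than on a CloudFront event.

```javascript
// Hypothetical helper illustrating the steps; not the linked implementation.
function expandIndexHtml(uri) {
    // Separate the fragment (from the first '#', or empty).
    var fragment = '';
    var hashIndex = uri.indexOf('#');
    if (hashIndex !== -1) {
        fragment = uri.substring(hashIndex);
        uri = uri.substring(0, hashIndex);
    }
    // Separate the query (from the first '?', or empty).
    var query = '';
    var queryIndex = uri.indexOf('?');
    if (queryIndex !== -1) {
        query = uri.substring(queryIndex);
        uri = uri.substring(0, queryIndex);
    }
    // Separate the last path segment (from the last '/').
    // Assumption: without any slash, the whole remainder is the last segment.
    var slashIndex = uri.lastIndexOf('/');
    var lastSegment = slashIndex === -1 ? uri : uri.substring(slashIndex);
    uri = slashIndex === -1 ? '' : uri.substring(0, slashIndex);
    // Expand only when the last segment has no dot, i.e. looks like a section or page.
    if (lastSegment.indexOf('.') === -1) {
        if (lastSegment.charAt(lastSegment.length - 1) === '/') {
            lastSegment += 'index.html';
        } else {
            lastSegment += '/index.html';
        }
    }
    return uri + lastSegment + query + fragment;
}
```

With this sketch, /blog/post and /blog/post/ both become /blog/post/index.html, /style.css is left untouched, and a query or an anchor ID stays at the end of the expanded path.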
The handler function I implemented can be viewed here.
By the way, the JavaScript engine for CloudFront Functions is based on ECMAScript 5.1, so you may find it outdated.
Unit testing CloudFront Functions
Unit testing CloudFront Functions was also challenging.
I found this article useful.
The problem is that the CloudFront Functions runtime allows neither the module.exports idiom nor the export modifier.
So there was no standard way to export a function from a CloudFront Functions script.
A workaround suggested by the above article was to use babel-plugin-rewire, which injects functions to access internal variables and functions in an imported script.
When I tried babel-plugin-rewire, I ran into one of its issues: an unused internal function was removed.
This was problematic because the handler function itself, which is invoked by the runtime, is never called inside the source file.
As I mentioned earlier, neither the module.exports idiom nor the export modifier worked.
My workaround was to add another function, handlerImpl, and make handler simply call it; then I could test handlerImpl instead.
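
The shape of this workaround can be sketched as below. The names handler and handlerImpl come from the description above, but the body of handlerImpl is only a placeholder assumed for this sketch (it handles a trailing slash and nothing else), not my actual rewrite logic.

```javascript
// The real logic lives here so that tests can reach it.
function handlerImpl(event) {
    var request = event.request;
    var uri = request.uri;
    // Placeholder rewrite (an assumption for this sketch): trailing slash only.
    if (uri.charAt(uri.length - 1) === '/') {
        request.uri = uri + 'index.html';
    }
    return request;
}

// Entry point invoked by the CloudFront Functions runtime; it only delegates.
function handler(event) {
    return handlerImpl(event);
}
```

Because handler calls handlerImpl, handlerImpl is no longer an unused function inside the file, which is presumably why it survives the transformation and can be tested.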
I configured Jest to process *.js files in a specific folder with Babel + babel-plugin-rewire.
My jest.config.js file can be viewed here.