Terraform provisioned static site hosting on AWS S3 with CloudFront

January 30th, 2019

I recently set up a couple of static sites by hand, using CloudFront in front of S3 for https. I decided that the next time I needed one I'd automate it with Terraform and Terragrunt. This blog post is a brain dump of my notes on that.

If you just want to get a static site up quickly you should use something like Netlify instead. It'll be much quicker and less painful!

These sites are really low volume, so the hosting works out as effectively free. The CloudFront free tier in AWS allows for 50 GB of data transfer and 2 million requests per month, which is way more than I actually need. I haven't worked out the costs outside the free tier, but if something gets popular enough to start hitting those limits I'll worry about it then.

I registered devwhoops.com so I'd have a new domain to experiment on instead of breaking my existing sites.

My requirements

  • Site contents stored in a bucket on S3
  • DNS hosted on DNSimple as they're my DNS provider
  • https only with automatic redirect from http
  • Free and automatically renewing https certificate
  • Works on the apex domain, i.e. devwhoops.com
  • Redirects www.<apex domain> to the bare domain, i.e. www.devwhoops.com to devwhoops.com.

I have also made this work for just a subdomain with no redirects but have left that part out to make this post shorter. It's straightforward enough to take what I have here and delete the parts that aren't needed for a single subdomain.

The moving parts

There are several moving parts needed to make this work on AWS:

  1. The source bucket, devwhoops.com, set up to allow public website access
  2. A private bucket for access logs, devwhoops.com-logs
  3. The redirect bucket, www.devwhoops.com, also set up to allow public website access
  4. AWS Certificate Manager (ACM) certificate for both devwhoops.com and www.devwhoops.com
  5. Validation DNS CNAME record for devwhoops.com so ACM will issue the certificate
  6. Validation DNS CNAME record for www.devwhoops.com so ACM will issue the certificate
  7. A CloudFront distribution for the source bucket using a custom origin
  8. A CloudFront distribution for the www -> apex redirect using a custom origin
  9. A DNS ALIAS record for devwhoops.com that points at the CloudFront distribution
  10. A DNS CNAME record for www.devwhoops.com that points at the www redirect CloudFront distribution

The source buckets must be publicly available over HTTP rather than private S3 buckets to allow things like redirects to work. There's a great explanation in the "Is this really necessary?" sidebar here.

Yes, you do need an entire CloudFront distribution to redirect www.devwhoops.com to devwhoops.com. It's the only way to support the redirect via HTTPS as far as I am aware.

The code

I put the code needed to create all the moving parts into a single Terraform module that has enough input variables to customize the solution per site. I use Terragrunt to handle re-using this module and configuring it for each specific site.

The S3 buckets

Three S3 buckets are needed: one for the site content, one for logs and one for the redirect. I also turned on versioning in the site bucket so it's possible to go back in time if required.

The main site bucket index and error documents are configurable as different static sites might need to use these in different ways.

resource "aws_s3_bucket" "site" {
  bucket = "${var.site_domain}"

  website {
    index_document = "${var.bucket_index_document}"
    error_document = "${var.bucket_error_document}"
  }

  logging {
    target_bucket = "${aws_s3_bucket.site_log_bucket.id}"
  }

  versioning {
    enabled = true
  }
}

resource "aws_s3_bucket" "redirect_to_apex" {
  bucket = "www.${var.site_domain}"

  website {
    redirect_all_requests_to = "https://${var.site_domain}"
  }
}

resource "aws_s3_bucket" "site_log_bucket" {
  bucket = "${var.site_domain}-logs"
  acl = "log-delivery-write"
}
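
The resources above reference a few input variables that aren't shown. A minimal sketch of how they might be declared follows. The variable names come from the snippets above, but the descriptions and default values are assumptions for illustration, not values from the real module.

```hcl
// Hypothetical declarations for the variables used by the bucket resources.
// The defaults are illustrative assumptions, not values from the real module.
variable "site_domain" {
  description = "Apex domain of the site, e.g. devwhoops.com"
}

variable "bucket_index_document" {
  description = "Object S3 serves for directory-style requests"
  default     = "index.html"
}

variable "bucket_error_document" {
  description = "Object S3 serves when a request fails, e.g. a 404 page"
  default     = "404.html"
}
```

Leaving site_domain without a default means every site has to set it explicitly, which seems like the safer choice for a value that names buckets and DNS records.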

Bucket permissions

The code above names the bucket after the site domain. As the bucket has to be public, there's nothing stopping someone guessing the access URL and going to it directly. I'm not worried about that for the sites I'm working with, but if you are, you can use the Terraform random_id resource to generate a random string to use as part of the bucket name.
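
As a rough sketch of the random_id idea, the site bucket definition could be changed along these lines. The resource name bucket_suffix and the suffix length are assumptions, and the logging and versioning blocks are left out to keep it short. This should be fine for CloudFront because the distribution refers to the bucket through its website_endpoint attribute rather than by a hard-coded name.

```hcl
// Sketch only: "bucket_suffix" is a hypothetical resource name.
resource "random_id" "bucket_suffix" {
  // 4 random bytes become 8 hex characters
  byte_length = 4
}

resource "aws_s3_bucket" "site" {
  // Appends the random hex string so the bucket name can't be
  // guessed from the site domain alone
  bucket = "${var.site_domain}-${random_id.bucket_suffix.hex}"

  website {
    index_document = "${var.bucket_index_document}"
    error_document = "${var.bucket_error_document}"
  }
}
```

The random value is stored in the Terraform state, so the bucket name stays stable across runs unless the random_id resource itself is recreated.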

The public permissions are below. I find that a lot of Terraform code by volume is specifying policies for resources!

resource "aws_s3_bucket_policy" "site" {
  bucket = "${aws_s3_bucket.site.id}"
  policy = "${data.aws_iam_policy_document.site_public_access.json}"
}

data "aws_iam_policy_document" "site_public_access" {
  statement {
    actions = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.site.arn}/*"]

    principals {
      type = "AWS"
      identifiers = ["*"]
    }
  }

  statement {
    actions = ["s3:ListBucket"]
    resources = ["${aws_s3_bucket.site.arn}"]

    principals {
      type = "AWS"
      identifiers = ["*"]
    }
  }
}

resource "aws_s3_bucket_policy" "redirect_to_apex" {
  bucket = "${aws_s3_bucket.redirect_to_apex.id}"
  policy = "${data.aws_iam_policy_document.redirect_to_apex.json}"
}

data "aws_iam_policy_document" "redirect_to_apex" {
  statement {
    actions = ["s3:GetObject"]
    resources = ["${aws_s3_bucket.redirect_to_apex.arn}/*"]

    principals {
      type = "AWS"
      identifiers = ["*"]
    }
  }

  statement {
    actions = ["s3:ListBucket"]
    resources = ["${aws_s3_bucket.redirect_to_apex.arn}"]

    principals {
      type = "AWS"
      identifiers = ["*"]
    }
  }
}

Certificates

AWS Certificate Manager can generate and renew the https certificates for free. Before it will issue a certificate it needs proof of domain ownership, which here means being able to create a CNAME record on the domain. I adapted this article to use DNSimple. The aws_acm_certificate_validation resource handles waiting for the validation to pass before moving on to creating the CloudFront distribution. This is very handy!

The certificates must be in the us-east-1 region. I usually work in eu-west-1 so I need to use the Terraform provider alias support to have a provider in the correct region.

ACM Certificate

// Needed because certificate for cloudfront must be in us-east-1
provider "aws" {
  alias = "virginia"
  region = "us-east-1"
}

resource "aws_acm_certificate" "default" {
  provider = "aws.virginia"
  domain_name = "${var.site_domain}"
  // domain_name already covers the apex, so only the www name is listed here
  subject_alternative_names = ["www.${var.site_domain}"]
  validation_method = "DNS"
}

DNSimple Validation CNAME records

There are two names to validate, the www and non-www version of the apex domain. This means that ACM wants to see two CNAME records added to the domain to validate them.

resource "dnsimple_record" "validation" {
  domain = "${var.site_domain}"
  // remove the apex domain from the resource_record_name otherwise dnsimple errors
  name  = "${replace(aws_acm_certificate.default.domain_validation_options.0.resource_record_name, ".${var.site_domain}.", "")}"
  type  = "${aws_acm_certificate.default.domain_validation_options.0.resource_record_type}"
  // Remove the trailing . as dnsimple removes it anyway and the domain still gets validated.
  // If the . isn't removed then this will always want to update
  value = "${replace(aws_acm_certificate.default.domain_validation_options.0.resource_record_value, "/\\.$/", "")}"
  ttl = "60"
}

resource "dnsimple_record" "alt_validation" {
  domain = "${var.site_domain}"
  // remove the apex domain from the resource_record_name otherwise dnsimple errors
  name  = "${replace(aws_acm_certificate.default.domain_validation_options.1.resource_record_name, ".${var.site_domain}.", "")}"
  type  = "${aws_acm_certificate.default.domain_validation_options.1.resource_record_type}"
  // Remove the trailing . as dnsimple removes it anyway and the domain still gets validated.
  // If the . isn't removed then this will always want to update
  value = "${replace(aws_acm_certificate.default.domain_validation_options.1.resource_record_value, "/\\.$/", "")}"
  ttl = "60"
}
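
The dnsimple_record resources above assume a DNSimple provider has already been configured somewhere in the module or its parent. A minimal sketch of that configuration is below. The two variable names are hypothetical, and the real setup may well differ.

```hcl
// Assumed provider setup; the variable names are hypothetical.
variable "dnsimple_token" {
  description = "DNSimple API token"
}

variable "dnsimple_account" {
  description = "DNSimple account ID"
}

provider "dnsimple" {
  token   = "${var.dnsimple_token}"
  account = "${var.dnsimple_account}"
}
```

The provider can also read these values from the DNSIMPLE_TOKEN and DNSIMPLE_ACCOUNT environment variables, which keeps the token out of version control.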

Waiting for validation

The two sections above will create the certificate and the CNAME records but the certificate won't be available until the validation has passed.

resource "aws_acm_certificate_validation" "default" {
  provider = "aws.virginia"
  certificate_arn = "${aws_acm_certificate.default.arn}"
  validation_record_fqdns = [
    "${dnsimple_record.validation.hostname}",
    "${dnsimple_record.alt_validation.hostname}"
  ]
}

CloudFront

Now that the buckets and the validated certificate resources are set up, the next step is to create the CloudFront distributions.

For the amount of traffic that the sites I'm putting up will get, CloudFront is essentially free. The biggest pain in working with it is how long changes take to deploy. I saw times between 15 and 30 mins when I was working on this. Now they're set up I hopefully shouldn't have to change them much.

CloudFront handles compression, and with the right configuration it's possible to get really good results on website test tools like the Audits tab built into Chrome.

As mentioned before, CloudFront can only use http to talk to the S3 website bucket, which is why origin_protocol_policy is http-only. (The https config is there because it's not optional in Terraform.)

locals {
  s3_origin_id = "S3-${var.site_domain}"
}

resource "aws_cloudfront_distribution" "s3_distribution" {
  origin {
    domain_name = "${aws_s3_bucket.site.website_endpoint}"
    origin_id = "${local.s3_origin_id}"

    // The origin must be http even if it's on S3 for redirects to work properly
    // so the website_endpoint is used and http-only as S3 doesn't support https for this
    custom_origin_config {
      http_port = 80
      https_port = 443
      origin_protocol_policy = "http-only"
      origin_ssl_protocols = ["TLSv1.2"]
    }
  }

  aliases = ["${var.site_domain}"]

  enabled = true
  is_ipv6_enabled = true
  default_root_object = "${var.default_root_object}"

  logging_config {
    bucket = "${aws_s3_bucket.site_log_bucket.bucket_domain_name}"
    include_cookies = false
  }

  default_cache_behavior {
    allowed_methods = ["GET", "HEAD"]
    cached_methods = ["GET", "HEAD"]
    target_origin_id = "${local.s3_origin_id}"

    "forwarded_values" {
      "cookies" {
        forward = "none"
      }
      query_string = false
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl = "${var.min_ttl}"
    max_ttl = "${var.max_ttl}"
    default_ttl = "${var.default_ttl}"
    compress = true
  }

  viewer_certificate {
    acm_certificate_arn = "${aws_acm_certificate_validation.default.certificate_arn}"
    ssl_support_method = "sni-only"
    minimum_protocol_version = "TLSv1.1_2016"
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }

  custom_error_response  = "${var.custom_error_response}"
}
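
The distribution above takes its root object, TTLs and error responses from input variables. A possible set of declarations is sketched below. The TTL defaults simply mirror the literal values used in the redirect distribution in the next section, and the other defaults are assumptions.

```hcl
// Hypothetical declarations; the TTL defaults mirror the literal values
// used in the redirect distribution, the rest are assumptions.
variable "default_root_object" {
  default = "index.html"
}

variable "min_ttl" {
  default = 0
}

variable "max_ttl" {
  default = 31536000 // one year in seconds
}

variable "default_ttl" {
  default = 86400 // one day in seconds
}

// A list of maps, one per custom_error_response block
variable "custom_error_response" {
  type    = "list"
  default = []
}
```

Each map in custom_error_response would use the same keys as the block it replaces, for example error_code, response_code, response_page_path and error_caching_min_ttl.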

Redirect CloudFront Distribution

This is very similar to the block above.

resource "aws_cloudfront_distribution" "redirect_distribution" {
  origin {
    domain_name = "${aws_s3_bucket.redirect_to_apex.website_endpoint}"
    origin_id = "${local.s3_origin_id}"

    // The redirect origin must be http even if it's on S3 for redirects to work properly
    // so the website_endpoint is used and http-only as S3 doesn't support https for this
    custom_origin_config {
      http_port = 80
      https_port = 443
      origin_protocol_policy = "http-only"
      origin_ssl_protocols = ["TLSv1.2"]
    }
  }

  aliases = ["www.${var.site_domain}"]

  enabled = true
  is_ipv6_enabled = true

  default_cache_behavior {
    allowed_methods = ["GET", "HEAD"]
    cached_methods = ["GET", "HEAD"]
    target_origin_id = "${local.s3_origin_id}"

    forwarded_values {
      cookies {
        forward = "none"
      }
      query_string = false
    }

    viewer_protocol_policy = "redirect-to-https"
    min_ttl = 0
    max_ttl = 31536000
    default_ttl = 86400
  }

  viewer_certificate {
    acm_certificate_arn = "${aws_acm_certificate_validation.default.certificate_arn}"
    ssl_support_method = "sni-only"
    minimum_protocol_version = "TLSv1.1_2016"
  }

  restrictions {
    geo_restriction {
      restriction_type = "none"
    }
  }
}

DNS for access

DNSimple provides ALIAS records which are like CNAME but work for apex domains.

resource "dnsimple_record" "access" {
  domain = "${var.site_domain}"
  name = ""
  type = "ALIAS"
  value = "${aws_cloudfront_distribution.s3_distribution.domain_name}"
  ttl = "3600"
}

resource "dnsimple_record" "alt_access" {
  domain = "${var.site_domain}"
  name = "www"
  type = "CNAME"
  value = "${aws_cloudfront_distribution.redirect_distribution.domain_name}"
  ttl = "3600"
}
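
When checking that the DNS records point where they should, it can help to expose the targets as module outputs. This is a sketch with made-up output names. Nothing in the setup above depends on it.

```hcl
// Hypothetical outputs; the names are illustrative.
output "site_distribution_domain" {
  // Target of the apex ALIAS record
  value = "${aws_cloudfront_distribution.s3_distribution.domain_name}"
}

output "redirect_distribution_domain" {
  // Target of the www CNAME record
  value = "${aws_cloudfront_distribution.redirect_distribution.domain_name}"
}

output "site_bucket" {
  // Name of the bucket that site content is uploaded to
  value = "${aws_s3_bucket.site.id}"
}
```

After an apply, terraform output (or terragrunt output) prints these values, which can then be compared with what the DNS records actually resolve to.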

Terragrunt

With the module set up I can create a site using Terragrunt with code like this:

terragrunt = {
  terraform {
    source = "git::git@github.com:chrismcg/tf_modules.git//services/static-site?ref=v1.0.3"
    // source = "../../../../tf_modules/services//static-site"
  }

  include = {
    path = "${find_in_parent_folders()}"
  }
}

site_domain = "devwhoops.com"

It's really straightforward to re-use the Terraform code for multiple sites without a lot of copy and paste.
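
To illustrate, a second site would only need its own terraform.tfvars along these lines. The domain example.org and the error document name are placeholders, and whether the module needs the two bucket variables set explicitly depends on the defaults it declares.

```hcl
// terraform.tfvars for a second, hypothetical site
terragrunt = {
  terraform {
    source = "git::git@github.com:chrismcg/tf_modules.git//services/static-site?ref=v1.0.3"
  }

  include = {
    path = "${find_in_parent_folders()}"
  }
}

// Only the per-site values change between sites
site_domain           = "example.org"
bucket_index_document = "index.html"
bucket_error_document = "error.html"
```

Pinning the module with ref means each site can move to a newer module version on its own schedule.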

Conclusion

As I mentioned at the start, using something like Netlify is probably a better choice than setting this up yourself. But if there are constraints that mean you have to keep everything within AWS, it's not too hard to set up.

This could be taken further by adding CodePipeline/CodeDeploy but for now the occasional aws s3 sync is working fine for me!

The code snippets above have been edited from the real code in my private repo. If you run into trouble with them feel free to drop me a line by email or in the comments below and I'll see if I can help figure out what's going on.