AWS CDK starter project - Configuration, multiple environments and GitHub CI/CD

26 Feb 2024

This blog will go through the thought process and reasoning involved in setting up a CDK starter project. The starter project can be found here: https://github.com/rehanvdm/cdk-starter. We will cover topics like configuration, environments, build systems, CI/CD processes and GitHub Workflows.

Architecture
Structure and Setup
- Build System - Wireit and ESR
- Other
  - TS Paths
  - Husky, ESLint, Prettier
Configuration
Multiple environments
CI/CD with GitHub Workflows

Architecture

The architecture image above shows the architecture for a single environment. Environments can be deployed in the same or different AWS Accounts and regions. A read-only copy of the draw.io file can be found here.

The starter project consists of two CDK stacks defined in /stacks. There is a Backend Stack located at /stacks/backend.ts which deploys a TypeScript Lambda Function that returns a random number when hit through its Function URL. Environment variables for the min and max are used for the range in which the random number is generated. These environment variables are passed in from the configuration determined by the current environment.

The Function URL is then passed using CloudFormation Exports to the Frontend stack located at /stacks/frontend.ts. The Frontend stack has a CloudFront that returns a standard index.html with a link to showcase the API is working. It also reverse proxies the /api path to the Function URL so that the site (index.html) and the API are on the same domain (this prevents CORS).

Frontend `index.html`

Backend API response

The resources defined are just to have some AWS resources to work with. We intentionally splits the stacks to create dependencies which will be talked about in a future blog.

Structure and Setup

The standard CDK structure (cdk init) is not used, it never felt “right” to me, but this is a matter of preference. The current structure is as follows:

/config - The environments and configuration
/scripts - The scripts used by the wireit build system
/src - Contains application source code like the Lambda code and the frontend website
/stacks - AWS CDK stacks and anything CDK related to infrastructure
/tests - Vitest tests for application source and optionally CDK infrastructure (cdk tests are not included)
index.ts - Is the CDK entry point which is set in the cdk.json

There is a top level package.json used for both application and infrastructure. The same applies for ESLint, Prettier, Vite and other auxiliary tools.

Build System - Wireit and ESR

Wireit is used as the build system, it calls actions written in a TS file located at /scripts/index.ts. These scripts can then be called from the package.json as normal npm run commands, for example:

package.json

{
  "name": "cdk-starter",
  "version": "0.0.0",
  "description": "",
  "scripts": {
    ...
    "build-src": "wireit",
    "cdk:diff:dev": "wireit",
    ...
  },
  "wireit": {
    ...
    "build-src": {
      "command": "esr ./scripts/index.ts -c build-src",
      "files": [
        "scripts/index.ts",
        "src/**"
      ],
      "output": [
        "dist/**"
      ]
    },
    "cdk:diff:dev": {
      "command": "esr ./scripts/index.ts -c cdk -o diff -e dev",
      "dependencies": [
        "build-src"
      ]
    },
    ...
  }
}

Calling the npm run cdk:diff:dev command, will in result call the command:

esr ./scripts/index.ts -c cdk -o diff -e dev

The script arguments are:

-c cdk The command to be used is the cdk one. This -c param must always be passed when calling the script to determine the top level action, other options are validate and build-src. See /scripts/index.ts for the implementation.
-o diff The operation to use for the cdk command, must be one of: diff, deploy or hotswap.
-e dev The environment to use for the cdk command, must be one of: dev, stage or production.

The esbuild-runner (esr) package is used to run TS files instead of tsc. It uses esbuild under the hood which just transpiles and executes TS code. It does not verify syntax like tsc does, we leave that to the IDE while developing but there are Husky/Git hooks that check this before committing code as well as GitHub Workflows.

Before wireit calls the cdk:diff:dev command, it will run the commands for its dependencies first, which in this case is build-src.

We also leverage wireit caching, for example: the build-src command specifies the inputs (files) as the source directory(/src) and the build script(/scripts/index.ts) and the output of the command to be in the /dist folder. If the inputs (files) of the command does not change, then the cached output is used, saving valuable time by only running commands when it is needed.

Other

TS Paths

We use TS Paths instead of specifying the relative paths, which can be a nightmare if you refactor or copy and paste within the project and have deep directory structures.

Transforming relative paths that would have looked like:

import { ApiEnv, envToObject as apiEnvToObject } from "../src/backend/lambda/api/environment";
import { EnvironmentConfig } from "../config/index";

To TS Paths that look like:

import { ApiEnv, envToObject as apiEnvToObject } from "@backend/lambda/api/environment";
import { EnvironmentConfig } from "@config/index";

The tsconfig.json that makes this possible is:

{
  "compilerOptions": {
    ...
    "baseUrl": ".",
    "paths": {
      "@tests/*": ["tests/*"],
      "@config/*": ["config/*"],
      "@backend/*": ["src/backend/*"]
    }
  },
  ...
}

Note that you also need to register this with test systems which in this case is Vitest, see vite.config.ts.

Husky, ESLint, Prettier

ESLint and Prettier are set up to do linting and formatting alongside tsc to validate TS syntax. These commands are run by the npm run validate command, it can be run manually or left for Husky to run before committing with Git hooks.

Configuration

Configuration is stored in TS so that we can define the shape/Type of the configuration object. This is so that we do not have to do runtime validation but rather rely on TS compile time checking. Each environment has a configuration, example shown below.

config/index.ts

export type EnvironmentConfig = {
  env: Environment;
  aws: AwsEnvironment;
  randomNumberMin: number;
  randomNumberMax: number;
};
export type Config = Record<Environment, EnvironmentConfig>;

export const config: Config = {
  dev: {
    env: "dev",
    aws: {
      region: "us-east-1",
      account: "111111111111",
      profile: "rehan-dev",
    },
    randomNumberMin: 1,
    randomNumberMax: 100,
  },
  stage: {
    ...
  },
  prod: {
    env: "prod",
    aws: {
      region: "us-east-1",
      account: "333333333333",
      profile: "rehan-prod",
    },
    randomNumberMin: 100,
    randomNumberMax: 1000,
  },
};

Here we can see that each environment is being deployed into there own AWS Account, this is not a requirement but considered best practice. They can also deployed into the same account.

This is made possible by the consistent naming structure used in the project, which just prepends the stack name that includes the environment name. The naming function is not only used to name the AWS resources but also the logical IDs of the constructs, this is not a requirement but just a force of habit.

Their usage is as follows:

index.ts

...
const app = new cdk.App();
const backend = new Backend(app, "starter-backend-" + envConfig.env, { env: awsEnv }, envConfig);
new Frontend(app, "starter-frontend-" + envConfig.env, { env: awsEnv }, envConfig, {
  apiOrigin: backend.apiOrigin,
});
app.synth();

/stacks/backend.ts

...
export class Backend extends cdk.Stack {

 constructor(scope: Construct, id: string, stackProps: cdk.StackProps, config: EnvironmentConfig) {
  super(scope, id, stackProps);

  function name(name: string): string {
    return id + "-" + name;
  }
  
  const apiLambda = new lambda.Function(this, name("lambda-api"), {
    functionName: name("api"),
    ...
  });
  ...
  }

Generates names like:

starter-backend-dev for the Backend stack
starter-backend-dev-api for the Lambda in the Backend stack
starter-frontend-dev for the Frontend stack

An important characteristic of any system is determinism. It’s always better to statically store the config which ensures the same deployment will always produce the same outcome. It is possible to read config dynamically via API with something like AWS SSM Parameters or other means, but is not considered best practice as your system is no longer deterministic.

Multiple environments

There are currently 3 environments, dev, stage and prod defined in config/index.ts. We are tying each environment to a git branch as well as to an AWS Account and Region.

Environment	Git Branch	AWS Account	AWS Region
dev	develop	111111111111	us-east-1
stage	stage	222222222222	us-east-1
prod	main	333333333333	us-east-1

The setup offers the flexibility to deploy all environments into the same AWS accounts (done for example code) or into different AWS accounts (best practice) as shown above. The AWS region can also be varied per environment, but it’s not advised. The current config can be changed to deploy stacks over multiple regions, but this is out of scope for now.

The tie between the CDK and which Environment it uses comes from the -e flag on the esr ./scripts/index.ts -c cdk -o deploy -e prod command.

For the Git Branch association, it stems by creating GitHub Workflows and setting that branch as the starting condition. For example the prod environment uses the .github/workflows/production_deploy.yaml workflow and specifies that it should run when there are pushes to the main branch.

on:
  push:
    branches:
      - "main"

Then further down we set the deploy command as npm run cdk:deploy:prod which in return does the script command mentioned above.

For the sake of simplicity, we won’t create ephemeral environments per developer/feature branch as this can lead to an increase in cost and complexity. It is possible with the current configuration, but it is out of scope for now.

There are lots of benefits to using ephemeral environments like allowing developers to test and experiment with actual deployed resources before committing to the develop branch, but for now we only assume the environments above.

Use feature flags and team communication to collaborate safely on the same main branches.

CI/CD with GitHub Workflows

We follow a modified/simplified gitflow model, the process looks as follows:

Developers start work on feature branches that can be named anything
Make a PR from the feature branch to the develop branch and merge (squash commit). Code committed to develop must be safe to transition to stage at any point in time.
Make a PR from the develop branch to stage and merge (merge commit). This allows QA(quality assurance/test team) to test. Fixes to staging can be made by following the usual process of feature -> develop -> stage.
Make a PR from the stage branch to main and merge (merge commit). Hotfixes to production should be avoided if possible. They can be made by following the usual process of feature -> develop -> stage -> main, or if a feature branch is merged to main then it must be back merged to stage and develop.

Some of the deviations from gitflow include:

Feature branches can start with anything
There is a long-lived stage branch instead of short-lived release/vX.X.X branches
We minimize back merges from higher environments to lower ones and always try to follow the normal route of: feature -> develop -> stage -> main

To help elevate pushing broken code/infra to the main branches (develop, stage, main) the following actions are run on PR open:

A Test workflow that validates the TS syntax, linting and formatting. It also builds the source code and runs tests.
A Validate workflow that validates the PR title to follow the conventional commit naming standard (cheat sheet). Note that we added release to the list of allowed types so that we can easily identify when we merge the main branches.
The CDK diff workflows are per environment and will show the diff for its environment and all higher environments. For example, a PR to develop will do CDK diffs for the dev, stage and prod environments. This is so that you can easily catch errors that are low-hanging fruit, like a name that is too long or to verify certain infra created in only certain environments.

The following recommended process for this setup is not enforced and is out of scope. A future blog will cover GitHub settings to enforce processes with branch protection rules, required checks, rule sets, environments, code reviews, deployment/workflow approvals, merge queues, etc.

For all PRs manually do the following before merging:

Ask for a code review, do not merge if not peer-reviewed.
Look at the CDK diffs to ensure changes are safe.
Fix failed validation and tests.

PR to develop

PR branches: Anything -> develop
PR name: Follow conventional commit, example: feat: the cool new thing. If you track tickets with something external like Jira, include the ticket number in the commit message. That enables a reverse lookup/search between GitHub and your ticketing system allowing anyone to do a lookup and see what ticket the changes relates to.
Merge type: Squash commit. There are likely multiple small commits in your PR, by squashing you create neat/clean history on the develop branch and it also allows your work to be rolled back easily.
Example: https://github.com/rehanvdm/cdk-starter/pull/5

PR to stage

PR branches: develop -> stage. The develop branch should be the only branch that gets merged into stage.
PR name: release(stage): develop to stage
Merge type: Merge commit. This preserves the feature branch commits from develop and makes them visible on the stage branch.
Example: https://github.com/rehanvdm/cdk-starter/pull/6

PR to prod

PR branches: stage -> main. The stage branch should be the only branch that merges into main.
PR name: release(prod): stage to main
Merge type: Merge commit. This preserves the feature branch commits from develop and makes them visible on the main branch.
Example: https://github.com/rehanvdm/cdk-starter/pull/7

GitHub AWS OIDC authentication

The GitHub to AWS authentication is done using IAM roles. The role is set up to be assumable by GitHub for certain repos, branches, environments, etc. This role is usually created manually to prevent a chicken and egg situation. Read this blog from AWS on the process: Use IAM roles to connect GitHub Actions to actions in AWS

At a bare minimum, your role needs the following trust policy and statement policies to deploy CDK. Make replacements with your values below, note that if you CDK bootstrapped your account, then you have the cdk-hnb659fds-XXX roles that are used for deployment.

The following trust policy gives all repos access to assume the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::111111111111:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                },
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": "repo:GITHUB_USERNAME_OR_ORGANISATION_NAME/*"
                }
            }
        }
    ]
}

The following IAM policies give permissions that allows the role to assume the CDK roles:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AssumeCdkRoles",
            "Effect": "Allow",
            "Action": [
                "sts:AssumeRole",
                "iam:PassRole"
            ],
            "Resource": "arn:aws:iam::111111111111:role/cdk-hnb659fds-*"
        }
    ]
}

Conclusion

This article is considered an updated and more complete solution provided by 4 Methods to configure multiple environments in the AWS CDK. We covered topics like configuration, environments, build systems, CI/CD processes and GitHub Workflows that are needed to go beyond a “hello world” CDK application. Stay tuned for posts that will further improve the CDK starter project.