CDK Shorts #2 – Parallel Deployments
Deploying stacks in parallel is outside the scope of what the CDK and CloudFormation do by default. It is up to the caller to orchestrate the deployment and specify the order of the stacks when this granularity is desired.
In this post we show how the deployment time of a basic three-stack application can be reduced by deploying stacks in parallel where possible. The stacks in question are:
- stacks/Infrastructure: contains all the resources used by the Service stacks, such as VPCs, databases, etc. In this example it only contains a DynamoDB Table.
- stacks/ServiceA: one of the Service stacks. It only contains a Lambda that receives the DynamoDB Table name as an environment variable from the Infrastructure stack. It therefore depends on the Infrastructure stack, which must be deployed first.
- stacks/ServiceB: exactly the same as stacks/ServiceA.
In theory, we could have looped over an array and created as many stacks as we want (ServiceX), but the example keeps it concrete and simple with only two Service stacks.
The project that is referenced in this post can be found here: https://github.com/rehanvdm/aws-cdk-parallel-deploy
The Problem
You have to wait a long time when you have a large CDK project that needs to deploy many stacks. The default behaviour of the CDK is to deploy the stacks one after another, in dependency order, when you specify the * wildcard to deploy all of them. The CDK does a great job of keeping track of which stacks depend on each other, but by default it does not deploy independent stacks in parallel.
The deploy * command will deploy the stacks in the following order (theoretical time indicated next to each stack), for a total of about 3 minutes:
- stacks/Infrastructure - 1 minute
- stacks/ServiceA - 1 minute
- stacks/ServiceB - 1 minute
Solution
- Do not specify the * when running the deploy command. Explicitly deploy the stacks in the correct order and use the --exclusively true argument on the deploy command.
- Synthesize the cloud assembly output.
- Pass the cloud assembly output as input to all the deploy commands.
1. Deploy Order
So our deployment order needs to change to the following, for a total of about 2 minutes:
- stacks/Infrastructure - 1 minute
- Parallel deployment of stacks/ServiceA and stacks/ServiceB - 1 minute
This is entirely up to your build/deployment script. In this project we use a GULP file as a build script to make the process platform-agnostic. This is a basic implementation of the fourth method explained in one of my other blog posts, "4 Methods to configure multiple environments in the AWS CDK".
It is important to specify the --exclusively true argument when deploying the ServiceX stacks so that they don’t both try to deploy the Infrastructure stack at the same time.
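The ordering described above can be sketched in a few lines of JavaScript. This is only an illustration and not part of the referenced project: the `dependencies` map and the `deploymentWaves` function are hypothetical names, and the stack names are the ones used in the commands later in this post. Stacks that end up in the same wave do not depend on each other, so they can be deployed in parallel.

```javascript
// Hypothetical dependency map: stack name -> stacks that must be deployed first.
const dependencies = {
  "parallel-deploy-infra": [],
  "parallel-service-a": ["parallel-deploy-infra"],
  "parallel-service-b": ["parallel-deploy-infra"],
};

// Group stacks into "waves": every stack in a wave only depends on stacks
// from earlier waves, so all stacks in one wave can be deployed in parallel.
function deploymentWaves(deps) {
  const remaining = new Set(Object.keys(deps));
  const deployed = new Set();
  const waves = [];

  while (remaining.size > 0) {
    // A stack is ready once all of its dependencies have been deployed.
    const wave = [...remaining].filter((name) =>
      deps[name].every((dep) => deployed.has(dep))
    );
    if (wave.length === 0) {
      throw new Error(
        "Circular or unknown dependency among: " + [...remaining].join(", ")
      );
    }
    for (const name of wave) {
      remaining.delete(name);
      deployed.add(name);
    }
    waves.push(wave);
  }
  return waves;
}

const waves = deploymentWaves(dependencies);
console.log(JSON.stringify(waves));
// [["parallel-deploy-infra"],["parallel-service-a","parallel-service-b"]]
```

With one minute per wave, the two waves match the roughly two-minute total of the parallel order, compared to three minutes when the stacks are deployed one by one.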
2. Synth Cloud Assembly outputs
The CDK synth command writes a Cloud Assembly to the directory you specify with the --output <cloud_asm_path> argument. AWS mentions the Cloud Assembly but does not highlight that it makes parallel deployments possible. It allows you to run the synth command once and then specify --app <cloud_asm_path> for every subsequent deployment.
This is required when we run the deploy command in parallel. Even though we specify which stacks to deploy (--exclusively true), the CDK would otherwise rebuild the cloud assembly for the whole project every time. This creates race conditions and wastes a lot of compute resources. It is thus better to do this step only once and then pass the result down as an artifact to the rest of the deployments, which each deploy only their own stacks.
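As a minimal sketch of the "synth once" step in a Node.js build script: the helper name `synthOnce` and its injectable `run` parameter are assumptions made for illustration and do not come from the referenced project. The `--output` argument and the directory name are the ones used in this post.

```javascript
const { execSync } = require("child_process");

// Same directory name as used in the commands of this post.
const CLOUD_ASSEMBLY_DIR = "./cloud_assembly_output";

// Hypothetical helper: synthesize the whole app exactly once and return the
// directory that later `cdk deploy --app <dir>` calls can share.
// `run` is injectable so the function can be tried without the CDK installed.
function synthOnce(outputDir = CLOUD_ASSEMBLY_DIR, run = execSync) {
  const command = `cdk synth --output "${outputDir}"`;
  // execSync throws if the command exits with a non-zero code,
  // so a failed synth stops the script before any deployment starts.
  run(command, { stdio: "inherit" });
  return outputDir;
}

// Usage sketch (in a project where the CDK CLI is available):
// const assemblyDir = synthOnce();
```

Because the function returns the directory, the same value can be handed to every deployment that follows, which keeps the synth step in exactly one place.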
3. Use Cloud Assembly outputs
The --app argument, which defaults to the app value in the cdk.json file, needs to be overridden with --app <cloud_asm_path>. This instructs the CDK to use the pre-generated cloud assembly output instead of running the app command from the cdk.json file to regenerate the cloud assembly every time.
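A small sketch of how a build script might assemble such a deploy command. The `deployCommand` function is a hypothetical helper, not part of the project, and the check for a `manifest.json` file assumes the assembly directory has the same layout the CDK normally writes to `cdk.out`. Failing early when the directory is missing avoids starting several parallel deployments that would all fail for the same reason.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: build a `cdk deploy` command that reads a
// pre-generated cloud assembly instead of re-running the app from cdk.json.
function deployCommand(stackName, assemblyDir, { exclusively = false } = {}) {
  // Assumption: a synthesized cloud assembly contains a manifest.json file.
  const manifest = path.join(assemblyDir, "manifest.json");
  if (!fs.existsSync(manifest)) {
    throw new Error(
      `No cloud assembly found in ${assemblyDir}; run "cdk synth --output ${assemblyDir}" first`
    );
  }

  const parts = ["cdk", "deploy", `"${stackName}"`, "--app", `"${assemblyDir}"`];
  if (exclusively) {
    // Deploy only this stack, without also deploying the stacks it depends on.
    parts.push("--exclusively", "true");
  }
  return parts.join(" ");
}

// Usage sketch, after the synth step has produced ./cloud_assembly_output:
// deployCommand("parallel-service-a", "./cloud_assembly_output", { exclusively: true });
```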
Putting it all together:
The high level commands:
> tsc
> cdk synth --output ./cloud_assembly_output
> cdk deploy "parallel-deploy-infra" --app ./cloud_assembly_output
IN PARALLEL
> cdk deploy "parallel-service-a" --app ./cloud_assembly_output --exclusively true
> cdk deploy "parallel-service-b" --app ./cloud_assembly_output --exclusively true
This is what it looks like in my GULP deploy script:
gulp.task("deploy", async callback =>
{
    try
    {
        let config = await getConfig();
        printConfig(config);

        /* Convert TSC to JS for CDK */
        await CommandExec("npm", ["run build"], paths.workingDir);

        /* Create Cloud Assembly */
        await CommandExec("cdk",[`synth "${stackNames.infra}" --profile ${config.AWSProfileName} ` +
            ` --output ${paths.cloudAssemblyOutPath}`], paths.workingDir);

        /* Deploy Infra stack */
        await CommandExec("cdk",[`deploy "${stackNames.infra}" --require-approval=never ` +
            ` --profile ${config.AWSProfileName} --progress=events --app ${paths.cloudAssemblyOutPath}`], paths.workingDir);

        /* Deploy Service Stacks in parallel */
        let serviceStacks = [stackNames.serviceA, stackNames.serviceB];
        let arrPromises = [];
        for (let stackName of serviceStacks)
        {
            arrPromises.push(
                CommandExec("cdk",[`deploy "${stackName}" --require-approval=never ` +
                    ` --profile ${config.AWSProfileName} --progress=events --app ${paths.cloudAssemblyOutPath} --exclusively true`],
                    paths.workingDir, true, process.env, `[${stackName}] `)
            );
        }
        await Promise.all(arrPromises);

        callback();
    }
    catch (e)
    {
        callback(e);
    }
});
A section of the GULP deploy script; the complete file can be found in the project repository.
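The `CommandExec` helper used in the script is not shown in this post. Purely as an assumption about what such a helper might do (this is not the author's implementation, which lives in the repository), the sketch below runs a command in a shell and prefixes each output line, so that the interleaved logs of parallel deployments can still be told apart. The names `runPrefixed` and `prefixLines` are hypothetical.

```javascript
const { spawn } = require("child_process");

// Prefix every non-empty line so output from parallel commands stays readable.
function prefixLines(text, prefix) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .map((line) => prefix + line)
    .join("\n");
}

// Hypothetical stand-in for a helper like CommandExec: resolves when the
// command exits with code 0 and rejects otherwise.
function runPrefixed(command, args, cwd, prefix = "") {
  return new Promise((resolve, reject) => {
    // Run the full command line through a shell so tools like `cdk` and `npm` resolve.
    const child = spawn([command, ...args].join(" "), { cwd, shell: true });

    // Note: output arrives in chunks, so a long line may occasionally be
    // split across two chunks and receive the prefix twice.
    const print = (chunk) => {
      const out = prefixLines(chunk.toString(), prefix);
      if (out) console.log(out);
    };
    child.stdout.on("data", print);
    child.stderr.on("data", print);

    child.on("error", reject);
    child.on("close", (code) =>
      code === 0
        ? resolve()
        : reject(new Error(`${prefix}${command} exited with code ${code}`))
    );
  });
}

// Usage sketch: run two commands concurrently, each tagged with its own prefix.
// await Promise.all([
//   runPrefixed("cdk", ['deploy "parallel-service-a" --exclusively true'], ".", "[parallel-service-a] "),
//   runPrefixed("cdk", ['deploy "parallel-service-b" --exclusively true'], ".", "[parallel-service-b] "),
// ]);
```

One design point to keep in mind: Promise.all rejects as soon as one deployment fails, while the other deployments keep running in the background. If you want to wait for every deployment and then report all failures together, Promise.allSettled is an alternative.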
TL;DR
CDK stacks can be deployed in parallel by generating a cloud assembly output and then specifying the order explicitly.
The project that is referenced in this post can be found here: https://github.com/rehanvdm/aws-cdk-parallel-deploy