Amazon CodeBuild boilerplate for cached Docker image builds
How to cache docker builds on AWS CodeBuild?
I switched a lot of my spin-off projects to AWS recently. One of the components I love is AWS CodeBuild - a true godsend for the lazy :) CodeBuild provides a cost-effective way to manage your builds, especially if you build smaller images infrequently. You do not have to spin up a new EC2 host and wait for provisioning; instead, you just hit the Build button (or automate the pipeline), and magic happens. Since Amazon gives you 100 free build minutes a month, chances are that you will stay in the free tier forever, and you can also automate pushes to your private ECR repository.
My expectations from a good build pipeline are not exaggerated:
- Build fast so we can be cheap - this means I want a Docker cache. This is important for me because I work mostly with R, and downloading/installing packages is a royal pain (even binary ones)
- Make a nice pipeline that can be copy-pasted to multiple projects, because I am lazy
Caching Docker layers is a bit tricky - CodeBuild directly supports two cache modes:
- S3 - they explicitly state this is not recommended for Docker (also, you will be accumulating S3 GET requests like crazy)
- local - this actually means the build host: CodeBuild will 'try' to cache your layers on the host, and if a subsequent build happens to land on the same host, it will reuse them. I never managed to hit this cache unless I ran subsequent builds within the same minute - but my time between builds is measured in days or weeks, not minutes.
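For the record, the local layer cache is enabled on the project, not in the buildspec. A minimal sketch using the AWS CLI, assuming an existing project (the project name is a placeholder):

```shell
# Turn on local Docker layer caching for a CodeBuild project
# ("my-build-project" is a hypothetical name)
aws codebuild update-project \
  --name my-build-project \
  --cache '{"type": "LOCAL", "modes": ["LOCAL_DOCKER_LAYER_CACHE"]}'
```

Even with this set, the cache only helps when two builds land on the same host, which is exactly the problem described above.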
Officially, there are no other solutions, but there is a way to use Docker itself - just pull the image you are about to rebuild first:
- My build and my repository happen to live in the same region - pulling from my own ECR repository is actually cheaper than hitting rstudio :)
- It is damn fast
- Since my base R image is the same, and I very rarely change it, I am guaranteed to hit everything but the last few steps that actually copy the program and set up my Docker entrypoint.
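Stripped of the CodeBuild plumbing, the trick is just two Docker commands (here REPO stands for the full ECR repository URL):

```shell
# Warm the local layer store from the registry; the very first
# build has nothing to pull, hence the || true
docker pull "$REPO:latest" || true

# Let the build reuse any layer that matches one from the pulled image
docker build --cache-from "$REPO:latest" --tag "$REPO:latest" .
```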
The complete buildspec leveraging this looks like this:
```yaml
version: 0.2

# You need to have these global environment variables set
# REPO_URL   = link to private ECR
# IMAGE_NAME = name of image to build
# IMAGE_TAG  = tag to build by default (latest)

phases:
  pre_build:
    commands:
      - REPO=$REPO_URL/$IMAGE_NAME
      - REPO_TAG=$REPO:$IMAGE_TAG
      - REPO_GIT=$REPO:$(git log -1 --format=%h)
      - echo Building $REPO
      - echo Region set to $REGION
      - echo Logging in to Amazon ECR...
      - aws --version
      - docker --version
      - aws ecr get-login-password --region $REGION | docker login --username AWS --password-stdin $REPO_URL
      - docker pull $REPO_TAG || true
  build:
    commands:
      - echo Build started on `date`
      - echo Building the Docker image...
      - docker build --cache-from $REPO_TAG --tag $REPO_TAG --tag $REPO_GIT .
  post_build:
    commands:
      - echo Build completed on `date`
      - echo Pushing the Docker images...
      - docker push $REPO_TAG
      - docker push $REPO_GIT
```
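The tag arithmetic in pre_build is worth spelling out: each build produces one moving tag and one commit-pinned tag, so a running container can always be traced back to its commit. It can be sanity-checked locally - the values below are placeholders for the project's environment variables:

```shell
# Placeholder values; in CodeBuild these come from the project's
# environment variables
REPO_URL=123456789012.dkr.ecr.eu-west-1.amazonaws.com
IMAGE_NAME=myapp
IMAGE_TAG=latest

REPO=$REPO_URL/$IMAGE_NAME    # the repository itself
REPO_TAG=$REPO:$IMAGE_TAG     # moving tag (latest)
REPO_GIT=$REPO:abc1234        # in the buildspec: $(git log -1 --format=%h)
echo "$REPO_TAG"
```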
Note 1: docker pull ... || true lets the build continue if there is no image present yet (the very first build has nothing to pull).
Note 2: As of now, Docker on CodeBuild is version 19.x and does not support docker push -a (push all tags at once), so I have to push both tags separately.
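Should more tags accumulate, the two pushes generalize to a loop (a sketch; Docker 20.10+ clients can do docker push --all-tags $REPO instead):

```shell
# Push every tag produced by the build; layers already present
# in the registry from earlier pushes are skipped automatically
for tag in "$REPO_TAG" "$REPO_GIT"; do
  docker push "$tag"
done
```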