Moving Forward with These CI/CD Best Practices
Originally published by New Context.
In order to deliver tested and trusted code quickly and reliably, you need to automate your continuous integration and continuous delivery (CI/CD) solution. In our previous article, we discussed how and when to test throughout a CI/CD pipeline. In our continuing series on CI/CD, we are now going to discuss some of the best practices to refine your CI/CD solution.
Continuous integration and continuous delivery (CI/CD) is a way to automate your code workflow to provide greater speed and reliability. Continuous integration is a methodology that automatically tests and, if desired, builds your code whenever it is merged into your version control system. Continuous delivery is the next step, which automatically prepares your code for deployment at the push of a button or on a specified schedule. Continuous delivery can even be extended to continuous deployment, which automatically deploys the code if all tests pass.
The CI/CD world is vast and there are many and varied ideas about how best to tackle this methodology. Which ways are best? In this article, we will discuss some of the CI/CD best practices to better refine your approach.
Single Source of Truth
There are a couple of important points to hit when it comes to having a single source of truth. First and foremost, your code should be stored in a version control system that everyone uses. The main branch of your repository is what your environment tags (e.g., development, stage, production) are based on. If anyone on the team wants to know what is in production, they should have to look no further than the repository.
But the single source of truth concept extends far beyond using version control. Your CI/CD pipeline should also be a single source of truth. All of your merges, testing, and deployments should go through it. Don’t allow any one-offs, manual deployments, or shadow IT. Once implemented, this is the way.
Throughout everything you do, be security conscious. For example, don’t leave your private keys (including access tokens) lying around. Keys need to be stored in secrets management software (e.g., Vault, Docker secrets, cloud-provided solutions).
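As a minimal sketch of what this looks like from the code's side, assume your secrets manager or CI platform injects each secret into the process environment at run time. The variable name `DEPLOY_TOKEN` used in the note below and the helper names are hypothetical; the point is only that the key never appears in the repository or in logs.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not injected into the environment."""


def load_secret(name: str) -> str:
    """Read a secret from the process environment instead of source code.

    Assumes the CI/CD platform or secrets manager has injected the value
    under the given (hypothetical) variable name before this code runs.
    """
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"secret {name!r} is not set")
    return value


def redact(secret: str, visible: int = 4) -> str:
    """Return a log-safe form of a secret showing only its last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

A deployment script would call `load_secret("DEPLOY_TOKEN")` and fail loudly if the value is missing, rather than falling back to a key committed alongside the code.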
Your CI/CD pipeline should run your tests in ephemeral environments (e.g., Docker containers, ephemeral VMs). This will help ensure that your tests are idempotent, meaning that you won’t run into issues because of artifacts from previous tests and that you’ll get fewer false positives. You should also make sure that your environments match each other as closely as possible. Any differences between environments should be extracted into an environment configuration set and tested for. Finally, if the testing environments stick around after the testing phase, they leave a larger footprint for attackers to come after, not to mention the possibility of your keys persisting.
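The same idea can be illustrated at a much smaller scale than a container or VM. The sketch below uses a throwaway directory as a stand-in for an ephemeral environment: every test run gets a fresh workspace, and the workspace is destroyed afterward. The function names and the sample test are hypothetical.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace():
    """Yield a fresh, empty directory and delete it when the block exits."""
    with tempfile.TemporaryDirectory(prefix="ci-test-") as tmp:
        yield Path(tmp)


def run_test_in_isolation(test):
    """Run a test callable in a brand-new workspace.

    Returns the test result and the (now deleted) workspace path.
    """
    with ephemeral_workspace() as workspace:
        result = test(workspace)
    return result, workspace


def sample_test(workspace: Path) -> bool:
    """Hypothetical test: passes only if no artifact from an earlier run exists."""
    marker = workspace / "artifact.txt"
    started_clean = not marker.exists()
    marker.write_text("left behind by this run")
    return started_clean
```

Because the artifact written by one run is deleted with its workspace, running `sample_test` twice through `run_test_in_isolation` passes both times. Nothing from the first run is left for the second run (or an attacker) to find.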
When contributing to a project, work on a single issue at a time. Don’t try to push through multiple things at once. Keep your commits small. Too often we see massive commits with no tests written. This is asking for trouble. The more code that is submitted, the harder it is to track down issues.
Minimize branching. Do your work in a branch and delete it when it’s merged. If possible, embrace the trunk-based development ideology.
The whole point of this philosophy is to reduce the amount of manual intervention to only what’s necessary. If people have to constantly babysit the CI/CD pipeline, it’s not continuous. Manual steps also have the side effect of leaving different versions of code in the pipeline at the same time, which makes it difficult to determine which is the desired version.
As we discussed previously, sometimes you have to have manual gates and checks. That’s fine, but only if they’re necessary for the organization. And once you have the pipeline working, you can work toward having reliable, trusted automation that can assert control and remove some manual steps.
In larger projects, it might not be necessary to build new artifacts for all of the code, but just for what has changed. For example, there’s no reason to build a complex Docker image every single time; create an image and store that in a container registry. Then, each time you push your code, you can use this pre-built image and not have to waste time rebuilding it when it hasn’t changed. This same mentality should go for all portions of your code: build artifacts and store them in a repository (e.g., Artifactory) and use them across all of your stages and environments.
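One common way to decide whether something has changed is to derive a key from the contents of the build inputs and only rebuild when that key is new. The sketch below assumes that approach; `ArtifactStore` is a hypothetical in-memory stand-in for a real artifact repository or container registry, not the API of any particular product.

```python
import hashlib
from typing import Callable, Dict


def content_key(files: Dict[str, bytes]) -> str:
    """Derive a cache key from the names and contents of the build inputs."""
    digest = hashlib.sha256()
    for name in sorted(files):  # sorted so the key does not depend on ordering
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(files[name])
        digest.update(b"\0")
    return digest.hexdigest()


class ArtifactStore:
    """In-memory stand-in for an artifact repository or container registry."""

    def __init__(self) -> None:
        self._artifacts: Dict[str, str] = {}
        self.builds = 0  # how many real builds were actually needed

    def get_or_build(
        self, files: Dict[str, bytes], build: Callable[[Dict[str, bytes]], str]
    ) -> str:
        """Return the stored artifact for these inputs, building only if missing."""
        key = content_key(files)
        if key not in self._artifacts:
            self._artifacts[key] = build(files)
            self.builds += 1
        return self._artifacts[key]
```

With this shape, pushing code that leaves the Dockerfile untouched reuses the stored image, and only a change to the inputs triggers a rebuild.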
Let the Developers Develop
We’ve talked about this before so let’s keep it brief: get out of the way. Remove as much red tape as you possibly can. When the developer is ready to push their code, let them. If the code fails tests, fix it. Get feedback quickly to the developer. This is what failing fast looks like. This is good. This is healthy. And when the code is tested and has succeeded, move it along safely into production.
On a related note, let your developers run tests locally before committing. This will instill confidence and save money by not constantly hitting your CI platform.
Be conscious of how long it takes for each step of your pipeline to run. We’ll go over metrics and monitoring in the next section; for now, just keep in mind that everything you test for adds time to the overall deployment. Seconds build up to minutes and minutes to hours.
One thing you can do to optimize your testing pipeline is to run your fastest tests first. You can also run some testing in parallel. For example, let’s say we’re deploying a Rails app which is going to read information from a database. The Rails app runs in a container which we create with a Dockerfile, and the database is deployed with Terraform. These are independent processes so we can have the Terraform testing and deployment run at the same time as our Dockerfile testing and deployment. If Terraform always had to run first, even when nothing has changed, you’d have to wait for it to run before your Rails tests could run.
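Both ideas can be sketched in a few lines. The example below assumes each test suite comes with an expected duration (which you would take from your own pipeline metrics) and that the Terraform and Dockerfile jobs really are independent. All names are hypothetical, and in practice your CI platform's own job scheduling would do this work.

```python
from concurrent.futures import ThreadPoolExecutor


def order_fastest_first(suites):
    """Sort (name, expected_seconds, callable) tuples by expected duration."""
    return sorted(suites, key=lambda suite: suite[1])


def run_sequential_fail_fast(suites):
    """Run suites fastest first and stop at the first failure."""
    results = []
    for name, _, check in order_fastest_first(suites):
        passed = check()
        results.append((name, passed))
        if not passed:
            break  # no point paying for the slow suites after a fast failure
    return results


def run_independent_in_parallel(pipelines):
    """Run independent jobs (a dict of name -> callable) at the same time."""
    with ThreadPoolExecutor(max_workers=max(1, len(pipelines))) as pool:
        futures = {name: pool.submit(job) for name, job in pipelines.items()}
        return {name: future.result() for name, future in futures.items()}
```

Here a failing five-second lint check stops the run before a two-minute integration suite ever starts, and the Terraform and Dockerfile jobs no longer wait on each other.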
The fundamental principle here is that it shouldn’t take longer to merge code into production than it took to write the change. Once your pipeline is in place, you will be able to test earlier and shorten your deployment cycle. This is a major win.
Measuring and Monitoring Progress
This is one of the most important aspects of CI/CD. If you’re not monitoring, how will you know what’s working, what’s not, and what can be improved? Let’s go over some of the things you should be monitoring for:
- Code coverage: What’s your code coverage? We talked earlier about how your code coverage shouldn’t be 100%, but what’s an appropriate percentage? This is something you will need to determine and redetermine throughout the life of your project.
- Test success rate: This can be used as an indicator of both code quality and test quality. It shouldn’t be an indicator of whether or not the code is ready for production deployment. Remember, when the tests are passing, they should never fail again in the future. If they do, the code isn’t ready.
- Support tickets: How many support tickets are you getting for each of your environments?
- Deployment metrics: How often are you deploying code? It’s good to see week-over-week how many deployments you have. The more frequently you deploy to production, the more experience and confidence you’ll gain. Additionally, how many deployment failures are there?
- Mean time to recovery (MTTR): When issues happen, how long does it take to get things back to a stable state?
- Performance: How long are your tests taking? Builds? Approvals? Does the amount of time make sense? What can you do to get things through quicker? How long are queries taking? Page loads?
- Percent of code changed: How much code are you changing with each deployment? This will generally be a positive number, but it’s important to remember that negative change isn’t a bad thing.
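A few of these metrics are simple arithmetic once you have the raw events. The sketch below assumes you can export incident start and recovery times, deployment outcomes, and deployment timestamps from your own tooling. The function names and data shapes are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (recovered - started) over a list of (started, recovered) pairs."""
    if not incidents:
        return timedelta(0)
    total = sum((end - start for start, end in incidents), timedelta(0))
    return total / len(incidents)


def failure_rate(outcomes):
    """Fraction of deployments that failed (True = success, False = failure)."""
    if not outcomes:
        return 0.0
    return outcomes.count(False) / len(outcomes)


def deployments_per_week(deploy_times):
    """Count deployments per ISO (year, week) for week-over-week comparison."""
    counts = {}
    for moment in deploy_times:
        year, week, _ = moment.isocalendar()
        counts[(year, week)] = counts.get((year, week), 0) + 1
    return counts
```

For example, two incidents that took 30 and 90 minutes to resolve give an MTTR of one hour. What matters is tracking how these numbers move over time, not any single reading.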
Moving Forward with These CI/CD Best Practices
CI/CD is more than just a workflow and a methodology: it’s a mentality. It’s creating good engineers who naturally think with this mindset. CI/CD is not just the processes themselves. If you give the best processes to bad engineers, they’ll find a way to screw it up. If you give an awful process to good engineers and don’t get out of their way and let them make changes, they’ll leave.