Deploying Azure Pipelines agents as containers to Kubernetes

A common problem I have seen across the teams I’ve worked on that use Azure Pipelines for building and releasing code is a shortage of pipeline agents to handle the growing number of builds and releases that need to run simultaneously. Most teams start with Microsoft-hosted agents, which are super straightforward since no setup is required. However, they come with a few downsides:

  • They only allow one parallel job at any given time for private projects unless you purchase more parallel jobs (public projects get 10 free parallel jobs).
  • They use the Standard_DS2_v2 VM size, which gives you only 2 vCPUs, 7 GiB of memory and 8000 IOPS. They also come with at least 10 GB of storage.
  • They come with a large list of installed software that rarely matches what you need, which is usually a very small subset of what they provide. See the software list for Windows and Linux.

To overcome those issues you would usually set up your own self-hosted agents, where there is no limit on parallel jobs and you decide what VM size to use and what software to install. But this comes with its own downsides:

  • You have to figure out the entire provisioning of the VM yourself, which can involve a lot of error-prone steps if done manually. If you don't do it manually, you have to figure out how to automate the quick creation and deletion of VMs depending on team needs.
  • Provisioning a VM can take several minutes, and even longer if setup scripts run after provisioning.
  • You have to keep the VM well maintained and updated.

One way to find a middle ground among all these issues is to turn the agents into Docker containers and have Kubernetes orchestrate their provisioning. Yes, you still have to stand up and maintain a k8s cluster, but you get a bunch of benefits:

  • You can easily scale up/down the number of agents as needed. No parallel job limits
  • You get to choose the k8s node VM size
  • You get to pick and customize the Docker image to use so it only has the software you need
  • You let k8s deal with all the provisioning stuff. It will ensure the required number of agents is always there
  • Provisioning is pretty fast, especially after provisioning the first agent

A bunch of people have come to this conclusion already, so when I found this open source project a while ago I gave it a try and it worked pretty well. However, since then the Azure DevOps team has stopped supporting the Docker image used in that project, and you are now expected to come up with your own. On top of that, the old VSTS has gone through a few changes. So I forked the project, updated a few things to match the latest guidelines, and added more guidance on how to create the Docker image and the Kubernetes cluster.

To get started you can go to my azure-pipelines-kubernetes-agents GitHub repo and follow the steps described there. Here I’ll just summarize what you’ll end up doing to quickly come up with your own k8s hosted Azure Pipelines agents:

  1. Create a Personal Access Token (PAT) with the Agent Pools (read, manage) scope
  2. Create your pipelines agent pool
  3. Create and publish your pipelines agent Docker image (a sample is provided in the repo)
  4. Create your k8s cluster. The repo provides steps for Azure Kubernetes Service (AKS)
  5. Install the Helm chart
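Steps 3 and 4 could look roughly like the sketch below. It is not copied from the repo: the function names, the image name, resource group, cluster name and location arguments are placeholders made up for illustration, and the node count of 2 is an arbitrary choice. It also assumes Docker and the Azure CLI are installed and that you are already logged in to your container registry and to Azure.

```shell
# Step 3 (sketch): build and publish the agent image.
# Usage: publish_agent_image <repository:tag> [build-context]
publish_agent_image() {
  image="${1:-}"
  context="${2:-.}"   # defaults to the current directory
  if [ -z "$image" ]; then
    echo "usage: publish_agent_image <repository:tag> [build-context]" >&2
    return 1
  fi

  docker build -t "$image" "$context" &&
    docker push "$image"
}

# Step 4 (sketch): create a small AKS cluster and point kubectl at it.
# Usage: create_aks_cluster <resource-group> <cluster-name> <location>
create_aks_cluster() {
  group="${1:-}"
  cluster="${2:-}"
  location="${3:-}"
  if [ -z "$group" ] || [ -z "$cluster" ] || [ -z "$location" ]; then
    echo "usage: create_aks_cluster <resource-group> <cluster-name> <location>" >&2
    return 1
  fi

  az group create --name "$group" --location "$location" &&
    az aks create --resource-group "$group" --name "$cluster" \
      --node-count 2 --generate-ssh-keys &&
    az aks get-credentials --resource-group "$group" --name "$cluster"
}
```

For example, `publish_agent_image myuser/azpagent:ubuntu-16.04 ./docker` followed by `create_aks_cluster my-rg my-aks westus2`, where every value is a placeholder to replace with your own. The repo remains the reference for the exact steps.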

So, for instance, once I had the pipelines pool and the k8s cluster created and the Docker image published, this is all I did to provision my pipelines agents:

helm install --set image.repository=julioc/azpagent --set image.tag=ubuntu-16.04 --set azpToken=[your token here] --set azpUrl= --set azpPool=MyPool -f helm-chart\azp-agent\values.yaml azp-agent helm-chart\azp-agent
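If you would rather not hard-code the token and URL in a script, one option is to read them from environment variables. The sketch below wraps the same install in a small function. The function name and the variable names `AZP_TOKEN`, `AZP_URL` and `AZP_POOL` are made up for this example, and the paths use forward slashes on the assumption that you run it from the repo root in a Unix-style shell.

```shell
# Sketch: same helm install as in this post, with the values taken from
# environment variables. AZP_TOKEN, AZP_URL and AZP_POOL are names invented
# for this example; set them to your PAT, your Azure DevOps URL and your pool.
install_agents() {
  if [ -z "${AZP_TOKEN:-}" ] || [ -z "${AZP_URL:-}" ] || [ -z "${AZP_POOL:-}" ]; then
    echo "error: set AZP_TOKEN, AZP_URL and AZP_POOL first" >&2
    return 1
  fi

  # Chart location relative to the repo root (Unix-style path separators).
  chart_dir="helm-chart/azp-agent"

  helm install \
    --set image.repository=julioc/azpagent \
    --set image.tag=ubuntu-16.04 \
    --set azpToken="$AZP_TOKEN" \
    --set azpUrl="$AZP_URL" \
    --set azpPool="$AZP_POOL" \
    -f "$chart_dir/values.yaml" \
    azp-agent "$chart_dir"
}
```

With the three variables set in your shell, calling `install_agents` runs the install, and it refuses to run if any of them is missing.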

The agents then show up in the pool management page.

And I was able to start running pipelines on those agents right away.

And if I needed to scale up to, say, 10 pipeline agents, I could just do this:

kubectl scale statefulset/azp-agent --replicas 10
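If you scale often, it can help to wait for the new pods and list them in one go. The helper below is only a sketch: `scale_agents` is a made-up name, and it assumes the chart's StatefulSet is called `azp-agent` (as in the command above) and uses the default RollingUpdate strategy, which `kubectl rollout status` needs for StatefulSets.

```shell
# Sketch: scale the agent StatefulSet, wait for the pods, then list them.
# Usage: scale_agents <replica-count>
scale_agents() {
  replicas="${1:-}"

  # Reject anything that is not a non-negative integer before calling kubectl.
  case "$replicas" in
    ''|*[!0-9]*)
      echo "usage: scale_agents <replica-count>" >&2
      return 1
      ;;
  esac

  kubectl scale statefulset/azp-agent --replicas="$replicas" &&
    kubectl rollout status statefulset/azp-agent &&
    kubectl get pods
}
```

Calling `scale_agents 10` scales to 10 agents, and `scale_agents 0` takes the pool down to zero when nobody is building.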

After a few minutes (depending on your node VM size), the additional agents show up in the pool management page.

So check it out and let me know if you have any comments or questions.