Learn How to Orchestrate Your Infrastructure Fleet with Chef Provisioning

Aziro Marketing

Oct 24


Chef Provisioning is a relatively new member of the Chef family. It can be used to build infrastructure topologies using the new machine resource. This blog post shows how this is done.

You bring up and configure individual nodes with Chef all the time. Your standard workflow would be to bootstrap a node, register it with a Chef server, and then run chef-client to install software and configure the node. You would rinse and repeat these steps for every node that you want in your fleet. Maybe you have even written a nice wrapper over Chef and Knife to manage your clusters. Until recently, however, Chef did not have any way to understand the concept of a cluster or fleet.

So if you were running a web application with some decent traffic, there would be a bunch of cookbooks and recipes to install and configure: web servers, a DB server, a background processor, a load balancer, etc. Sometimes, you might have additional nodes for Redis or RabbitMQ. So let us say your cluster consists of three web servers, one DB server, one server that does all the background processing (like generating PDFs or sending emails), and one load balancer for the three web servers. Now if you wanted to bring up such a cluster for multiple environments, say "testing", "staging", and "production", you would have to repeat the steps for each environment; not to mention, your environments could be powered by different providers: production and staging on AWS, Azure, etc., while testing could possibly run on local infrastructure, maybe in VMs. This is not difficult, but it definitely makes you wonder whether you could do it better: if only you could describe your infrastructure as code that comes up with just one command.
That is exactly what Chef Provisioning does. Introduced in Chef version 12, it lets you describe your cluster as code and build it at will, as many times as you want, on various types of clouds, virtual machines, or even bare metal.

The Concepts

Chef Provisioning rests on two main pillars: the machine resource and drivers.

Machine Resource

A "machine" is an abstract representation of a node in your infrastructure topology. It could be an AWS EC2 instance or a node on some other cloud provider. It could be a Vagrant-based virtual machine, a Linux container, or a Docker instance. It could even be a real, physical bare-metal machine. The "machine" resource and other related resources (like machine_batch, machine_image, etc.) can be used to describe your cluster infrastructure.
Each "machine" resource describes what it does using standard Chef recipes. The general convention is to describe your fleet and its topology using "machine" and the other resources in a separate file. We will see this in detail soon, but for now, here is how a machine is described.
#setup-cluster.rb
machine 'server' do
   recipe 'nginx'
end

machine 'db' do
   recipe 'mysql'
end

A recipe is one of a "machine" resource's attributes. Later, we will see a few more of these along with their examples.

Drivers

As mentioned earlier, with Chef Provisioning you can describe your clusters and their topologies and then deploy them across a variety of clouds, VMs, bare metal, etc. For each such cloud or machine type that you would like to provision, there are drivers that do the actual heavy lifting. Drivers convert the abstract "machine" descriptions into physical reality. Drivers are responsible for acquiring the nodes, connecting to them via the required protocol, bootstrapping them with Chef, and running the recipes described in the "machine" resource. Provisioning drivers need to be installed separately as gems.
The following shows how to install the AWS driver and select it via an environment variable on your system.

$ gem install chef-provisioning-aws
$ export CHEF_DRIVER=aws
Running chef-client on the above recipe will create two instances in your AWS account, as referenced by your settings in "~/.aws/config". We will see an example run later in the post.
The driver can also be set in your knife.rb if you prefer. Here, we set the chef-provisioning-fog driver for AWS.
driver 'fog:AWS'
It is also possible to set the driver inline in the cluster recipe code.
require 'chef/provisioning/aws_driver'
with_driver 'aws'
machine 'server' do
   recipe 'web-server-app'
end
In the following example, the Vagrant driver is set via the machine's driver attribute, with a driver URL as the value. In this case, "/opt/vagrantfiles" will be searched for Vagrantfiles.
machine 'server' do
   driver 'vagrant:/opt/vagrantfiles'
   recipe 'web-server-app'
end
It's a good practice to keep driver details and cluster code separate, as it lets you use the same cluster descriptions with different provisioners by just changing the driver in the environment.
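For instance, a small sketch like the one below reads the driver from the environment and falls back to a local Vagrant driver URL; the fallback path and recipe name are hypothetical.

require 'chef/provisioning'

# Pick the driver from the environment so the same cluster description can
# target Vagrant locally and AWS (or another cloud) elsewhere; the fallback
# Vagrantfile directory below is only a placeholder.
with_driver ENV.fetch('CHEF_DRIVER', 'vagrant:/opt/vagrantfiles')

machine 'server' do
   recipe 'web-server-app'
end

Running the same recipe with CHEF_DRIVER=aws would then provision against AWS instead, without touching the cluster description.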
It is possible to write your own custom provisioning drivers. But that is beyond the scope of this blog post.

The Provisioner Node

An interesting concept you need to understand is that Chef Provisioning needs a "provisioner node" to provision all machines. This node could be a node in your infrastructure or simply your workstation. chef-client (or chef-solo / chef-zero) runs on this provisioner node against the recipe that defines your cluster. Chef Provisioning then takes care of acquiring a node in your infrastructure, bootstrapping it with Chef, and then running the required recipes on the node. Thus, you will see that chef-client runs twice: once on the provisioner node and then again on the node that is being provisioned.

The Real Thing

Let us dig a little deeper now. Let us first bring up a single DB server. Using knife, you can upload your cookbooks to the Chef server (you could do it with chef-zero as well). Here, I have put all my required recipes in a cookbook called "cluster", uploaded it to a Chef server, and set the "chef_server_url" in my "client.rb" and "knife.rb". You can find all the examples in the playground repository [2].

Machine

#recipes/webapp.rb
require 'chef/provisioning'

machine 'db' do
   recipe 'database-server'
end

machine 'webapp' do
   recipe 'web-app-stack'
end
To run the above recipe:
sudo CHEF_DRIVER=aws chef-client -r "recipe[cluster::webapp]"
This should bring up two nodes in your infrastructure — a DB server and a web application server as defined by the web-app-stack recipe. The above command assumes that you have uploaded the cluster cookbook consisting of the required recipes to the Chef server.

More Machine Goodies

Like any other Chef resource, machine can have multiple actions and attributes that can be used to achieve different results. A "machine" can have a "chef_server" attribute, which means different machines can talk to different Chef servers. The "from_image" attribute can be used to point at a machine image from which the machine is created. You can read more about the machine resource in the Chef Provisioning documentation [1].
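As a hedged illustration, the snippet below combines these attributes; the Chef server URL, image name, and recipe are hypothetical, and the exact shape of the chef_server hash may vary between Chef Provisioning versions.

machine 'reporting' do
   # Register this machine with a different Chef server (hypothetical URL)
   chef_server chef_server_url: 'https://chef.internal.example.com/organizations/ops'
   # Build the machine from a previously created machine_image
   from_image 'base_image'
   recipe 'reporting-stack'
end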

Parallelisation Using machine_batch

Now, what if you would like more than one web application instance in your cluster, say five? Run a loop over your machine resource.
1.upto(5) do |i|
   machine "webapp-#{i}" do
      recipe 'web-app-stack'
   end
end
The above code snippet, when run, should bring up and configure five instances in parallel. The "machine" resource parallelizes by default: if you describe multiple "machine" resources consecutively with the same action, Chef Provisioning combines them into a single resource ("machine_batch", more about this later) and runs them in parallel. This is great because it saves a lot of time.
The following will not parallelize because the actions are different.
machine 'webapp' do
   action :setup
end
machine 'db' do
   action :destroy
end
Note: if you put other resources between "machine" resources, the automatic parallelization does not happen.
machine 'webapp' do
   action :setup
end
remote_file '/tmp/somefile.tar.gz' do
   source 'https://example.com/somefile.tar.gz'
end
machine 'db' do
   action :setup
end
Also, you can explicitly turn off parallelization by setting "auto_batch_machines" to false in your Chef config (knife.rb or client.rb).
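As a sketch, that setting would sit in client.rb or knife.rb roughly like this:

# client.rb / knife.rb: provision consecutive machine resources sequentially
# instead of batching them automatically
auto_batch_machines false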
Using "machine_batch" explicitly, we can parallelize and speed up provisioning for multiple machines.
machine_batch do
   action :setup
   machines 'web-app-stack', 'db'
end
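Besides listing machine names, a machine_batch block can also wrap machine definitions directly; here is a sketch reusing the recipes from the earlier examples.

machine_batch do
   machine 'db' do
      recipe 'database-server'
   end
   1.upto(5) do |i|
      machine "webapp-#{i}" do
         recipe 'web-app-stack'
      end
   end
end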

Machine Image

It is even possible to define machine images using the "machine_image" resource, and those images can then be used by the "machine" resource to build machines.
machine_image 'web_stack_image' do
   recipe 'web-app-stack'
end
The above code will launch a machine using your chosen driver, install and configure the node as per the given recipes, create an image from this machine, and finally destroy the machine. This is quite similar to how the Packer tool launches a node, configures it, and then freezes it as an image before destroying the node.
machine 'web-app-stack' do
   from_image 'web_stack_image'
end
Here, the machine "web-app-stack", when launched, will already have everything from the "web-app-stack" recipe baked in. This saves a lot of time when you want to spin up machines that share common base recipes. Think of a situation where team members need machines with some common stuff installed, and different people install their own specific things as per requirement. In such a case, one could create an image with the basic packages (e.g., build-essential, ruby, vim), and individual machines could then use that image as their source for further work.
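A rough sketch of that workflow, where the 'base' recipe and the machine names are hypothetical:

# Shared base image with the common packages (build-essential, ruby, vim, ...)
machine_image 'base_image' do
   recipe 'base'
end

# A team-specific machine built on top of the shared base image
machine 'ci-runner' do
   from_image 'base_image'
   recipe 'ci-tools'
end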

Load Balancer

A very common scenario is to put a bunch of machines, say web application servers, behind a load balancer, thus achieving redundancy. Chef Provisioning has a resource specifically for load balancers, aptly called "load_balancer". All you need to do is create the machine nodes and then pass them to a "load_balancer" as below.
1.upto(2) do |node_id|
  machine "web-app-stack-#{node_id}"
end
load_balancer "web-app-load-balancer" do
   machines %w(web-app-stack-1 web-app-stack-2)
end

The above code will bring up two nodes, web-app-stack-1 and web-app-stack-2, and put a load balancer in front of them.

Final Thoughts

If you are using the AWS driver, you can set machine_options as below. This is important if you want to use customized AMIs, users, security groups, etc.

with_machine_options :ssh_username => '',
  :bootstrap_options => {
    :key_name => '',
    :image_id => '',
    :instance_type => '',
    :security_group_ids => ''
  }
If you don't provide the AMI ID, the AWS driver defaults to a certain AMI per region. Whatever AMI you use, you have to use the correct SSH username for that AMI [3].
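For example, a filled-in sketch for an Ubuntu-based AMI could look like the following; the key pair name, AMI ID, and security group ID are placeholders, not real values.

with_machine_options :ssh_username => 'ubuntu',      # must match the chosen AMI
  :bootstrap_options => {
    :key_name => 'my-provisioning-key',              # hypothetical key pair
    :image_id => 'ami-xxxxxxxx',                     # placeholder AMI ID
    :instance_type => 't2.micro',
    :security_group_ids => ['sg-xxxxxxxx']           # placeholder group ID
  }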
One very important thing to note is that there is also a Fog driver (chef-provisioning-fog) for various cloud services, including EC2, and the two drivers often use different names for the same parameters. For example, the chef-provisioning-aws driver, which depends on the AWS Ruby SDK, uses "instance_type", whereas the Fog driver uses "flavor_id". Security groups use the key "security_group_ids" in the AWS driver and take IDs as values, but the Fog driver uses "groups" and takes security group names as values. This can at times lead to confusion if you are moving from one driver to another.
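For comparison, a rough Fog-driver equivalent of the same options might look like this; the values are again placeholders, and the exact keys should be checked against the chef-provisioning-fog documentation.

with_driver 'fog:AWS'

with_machine_options :bootstrap_options => {
    :key_name => 'my-provisioning-key',
    :image_id => 'ami-xxxxxxxx',
    :flavor_id => 't2.micro',            # Fog's name for the instance type
    :groups => ['my-security-group']     # Fog takes security group names
  }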
At the time of writing this article, the documentation for the various drivers could use some help. The best way to understand them is to check the examples provided, run them, and learn from them; maybe even read the source code of the various drivers to understand how they work. Chef Provisioning was recently bumped to 1.0.0. I would highly recommend keeping an eye on the GitHub issues [4] in case you face some trouble.

References

[1] https://docs.chef.io/provisioning.html
[2] https://github.com/pradeepto/chef-provisioning-playground
[3] http://alestic.com/2014/01/ec2-ssh-username
[4] https://github.com/chef/chef-provisioning/issues

