Learn How to Orchestrate Your Infrastructure Fleet with Chef Provisioning
Chef Provisioning is a relatively new member of the Chef family. It can be used to build whole infrastructure topologies using the new machine resource. This blog post shows how this is done.
You bring up and configure individual nodes with Chef all the time. Your standard workflow would be to bootstrap a node, register it with a Chef server, and then run chef-client to install software and configure the node. You would rinse and repeat these steps for every node you want in your fleet. Maybe you have even written a nice wrapper over Chef and Knife to manage your clusters. Until recently, Chef did not have any way to understand the concept of a cluster or fleet.
So if you were running a web application with some decent traffic, there would be a bunch of cookbooks and recipes to install and configure: web servers, a DB server, a background processor, a load balancer, etc. Sometimes you might have additional nodes for Redis or RabbitMQ. So let us say your cluster consists of three web servers, one DB server, one server that does all the background processing (generating PDFs, sending emails, etc.), and one load balancer for the three web servers. Now if you wanted to bring up such a cluster for multiple environments, say 'testing', 'staging', and 'production', you would have to repeat the steps for each environment; not to mention, your environments could be powered by different providers: production and staging on AWS, Azure, etc., while testing could be on local infrastructure, maybe in VMs. This is not difficult, but it definitely makes you wonder whether you could do it better: if only you could describe your infrastructure as code that comes up with just one command.
That is exactly what Chef Provisioning does. Introduced in Chef version 12, it helps you describe your cluster as code and build it at will, as many times as you want, on various types of clouds, on virtual machines, or even on bare metal.
The Concepts
Chef Provisioning depends on two main pillars: the machine resource and drivers.
Machine Resource
A 'machine' is an abstract concept of a node from your infrastructure topology. It could be an AWS EC2 instance or a node on some other cloud provider. It could be a Vagrant-based virtual machine, a Linux container, or a Docker instance. It could even be a real, physical bare-metal machine. 'machine' and other related resources (like machine_batch, machine_image, etc.) can be used to describe your cluster infrastructure.
Each 'machine' resource describes whatever it does using standard Chef recipes. The general convention is to describe your fleet and its topologies using 'machine' and other resources in a separate file. We will see this in detail soon, but for now, here is how a machine is described.
#setup-cluster.rb
machine 'server' do
  recipe 'nginx'
end

machine 'db' do
  recipe 'mysql'
end
A recipe is one of the 'machine' resource's attributes. Later we will see a few more of these, along with examples.
Drivers
As mentioned earlier, with Chef Provisioning you can describe your clusters and their topologies and then deploy them across a variety of clouds, VMs, bare metal, etc. For each such cloud or machine that you would like to provision, there are drivers that do the actual heavy lifting. Drivers convert the abstract āmachineā descriptions into physical reality. Drivers are responsible for acquiring the node data, connecting to them via required protocol, bootstrapping them with Chef, and running the recipes described in the āmachineā resource. Provisioning drivers need to be installed separately as gems.
The following shows how to install the AWS driver and select it via an environment variable:
$ gem install chef-provisioning-aws
$ export CHEF_DRIVER=aws
Running chef-client on the above recipe will create two instances in your AWS account, using your settings in '~/.aws/config'. We will see an example run later in the post.
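A minimal '~/.aws/config' just needs a region; the region below is only an example, and credentials typically live alongside it in '~/.aws/credentials'.

# ~/.aws/config
[default]
region = us-east-1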
The driver can also be set in your knife.rb if you so prefer. Here, we set the chef-provisioning-fog driver for AWS:
driver 'fog:AWS'
It is also possible to set the driver inline in the cluster recipe code:
require 'chef/provisioning/aws_driver'
with_driver 'aws'

machine 'server' do
  recipe 'web-server-app'
end
In the following example, the Vagrant driver is set via the driver attribute, with a driver URL as the value. Vagrantfiles will be looked up in '/opt/vagrantfiles' in this case.
machine 'server' do
  driver 'vagrant:/opt/vagrantfiles'
  recipe 'web-server-app'
end
It's a good practice to keep driver details and cluster code separate, as it lets you use the same cluster descriptions with different provisioners by just changing the driver in the environment.
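For example, assuming the setup-cluster.rb file from above, the same description could be run in local mode against different providers just by switching the driver in the environment (the second command assumes the chef-provisioning-vagrant gem is installed and Vagrantfiles live in /opt/vagrantfiles):

$ CHEF_DRIVER=aws chef-client -z setup-cluster.rb
$ CHEF_DRIVER=vagrant:/opt/vagrantfiles chef-client -z setup-cluster.rb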
It is possible to write your own custom provisioning drivers. But that is beyond the scope of this blog post.
The Provisioner Node
An interesting concept you need to understand is that Chef Provisioning needs a 'provisioner node' to provision all machines. This node could be a node in your infrastructure or simply your workstation. chef-client (or chef-solo/chef-zero) runs on this provisioner node against the recipe that defines your cluster. Chef Provisioning then takes care of acquiring a node in your infrastructure, bootstrapping it with Chef, and then running the required recipes on it. Thus, you will see that chef-client runs twice: once on the provisioner node and then on the node that is being provisioned.
The Real Thing
Let us dig a little deeper now. Let us first bring up a single DB server. Using knife you can upload your cookbooks to the Chef server (you could do it with chef-zero as well). Here I have put all my required recipes in a cookbook called 'cluster', uploaded it to a Chef server, and set the 'chef_server_url' in my 'client.rb' and 'knife.rb'. You can find all the examples in the playground repository [2].
Machine
#recipes/webapp.rb
require 'chef/provisioning'

machine 'db' do
  recipe 'database-server'
end

machine 'webapp' do
  recipe 'web-app-stack'
end
To run the above recipe:

$ sudo CHEF_DRIVER=aws chef-client -r 'recipe[cluster::webapp]'
This should bring up two nodes in your infrastructure: a DB server and a web application server, as defined by the database-server and web-app-stack recipes. The above command assumes that you have uploaded the cluster cookbook, consisting of the required recipes, to the Chef server.
More Machine Goodies
Like any other Chef resource, machine has multiple actions and attributes that can be used to achieve different results. A 'machine' can have a 'chef_server' attribute, which means different machines can talk to different Chef servers. The 'from_image' attribute can be used to create a machine from a machine image. You can read more about the machine resource in the Chef documentation [1].
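For instance, here is a minimal sketch of the 'chef_server' attribute; the machine name, recipe, and server URL are hypothetical, and the attribute takes a hash describing the server to register the machine with:

machine 'reporting' do
  # Hypothetical server URL: register this machine with a different Chef server
  chef_server :chef_server_url => 'https://chef.example.com/organizations/ops'
  recipe 'reporting-stack'   # hypothetical recipe
end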
Parallelisation Using machine_batch
Now, what if you would like more than one web application instance in your cluster, say five? Run a loop over your machine resource:
1.upto(5) do |i|
  machine "webapp-#{i}" do
    recipe 'web-app-stack'
  end
end
The above code snippet, when run, should bring up and configure five instances in parallel. The 'machine' resource parallelizes by default: if you describe multiple 'machine' resources consecutively with the same actions, Chef Provisioning combines them into a single 'machine_batch' resource (more about this later) and runs them in parallel. This is great because it saves a lot of time.
The following will not parallelize, because the actions are different:
machine 'webapp' do
  action :setup
end

machine 'db' do
  action :destroy
end
Note: if you put other resources between 'machine' resources, the automatic parallelization does not happen.
machine 'webapp' do
  action :setup
end

remote_file 'somefile.tar.gz' do
  source 'https://example.com/somefile.tar.gz'
end

machine 'db' do
  action :setup
end
Also, you can explicitly turn off parallelization by setting 'auto_batch_machines' to false in your Chef config (knife.rb or client.rb).
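In knife.rb or client.rb this would be a one-liner, written in the usual method-call style of Chef config files:

# knife.rb (or client.rb)
auto_batch_machines false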
Using 'machine_batch' explicitly, we can parallelize and speed up provisioning for multiple machines:
machine_batch do
  action :setup
  machines 'web-app-stack', 'db'
end
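'machine_batch' can also wrap complete 'machine' definitions; here is a sketch using the machines from the earlier example:

machine_batch do
  machine 'webapp' do
    recipe 'web-app-stack'
  end
  machine 'db' do
    recipe 'database-server'
  end
end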
Machine Image
It is even possible to define machine images using the 'machine_image' resource, which can then be used by the 'machine' resource to build machines.
machine_image 'web_stack_image' do
  recipe 'web-app-stack'
end
The above code will launch a machine using your chosen driver, install and configure the node as per the given recipes, create an image from this machine, and finally destroy the machine. This is quite similar to how the Packer tool launches a node, configures it, and then freezes it as an image before destroying the node.
machine 'web-app-stack' do
  from_image 'web_stack_image'
end
Here, the machine 'web-app-stack', when launched, will already have everything in the recipe 'web-app-stack'. This saves a lot of time when you want to spin up machines that share common base recipes. Think of a situation where team members need machines with some common stuff installed, and different people then install their own specific things as required. In such a case, one could create an image with the basic packages (e.g., build-essential, ruby, vim), and that image could serve as the source machine image for further work, as sketched below.
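A sketch of such layering, with hypothetical image and recipe names; 'machine_image' also accepts 'from_image', so one image can build on another:

machine_image 'base_image' do
  recipe 'essentials'        # hypothetical recipe: build-essential, ruby, vim, etc.
end

machine_image 'team_image' do
  from_image 'base_image'    # build on top of the base image
  recipe 'team-stack'        # hypothetical team-specific recipe
end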
Load Balancer
A very common scenario is to put a bunch of machines, say web application servers, behind a load balancer, thus achieving redundancy. Chef Provisioning has a resource specifically for load balancers, aptly called 'load_balancer'. All you need to do is create the machine nodes and then pass the machines to a 'load_balancer', as below:
1.upto(2) do |node_id|
  machine "web-app-stack-#{node_id}"
end

load_balancer 'web-app-load-balancer' do
  machines %w(web-app-stack-1 web-app-stack-2)
end
The above code will bring up two nodes, web-app-stack-1 and web-app-stack-2, and put a load balancer in front of them.
Final Thoughts
If you are using the AWS driver, you can set machine_options as below. This is important if you want to use customized AMIs, users, security groups, etc.
with_machine_options :ssh_username => '',
  :bootstrap_options => {
    :key_name => '',
    :image_id => '',
    :instance_type => '',
    :security_group_ids => ''
  }
If you don't provide the AMI ID, the AWS driver defaults to a certain AMI per region. Whatever AMI you use, you have to use the correct ssh username for the respective AMI [3].
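For illustration, here is a sketch with hypothetical values for an Ubuntu AMI, whose default ssh user is 'ubuntu'; the key pair name, AMI ID, and security group ID below are placeholders:

with_machine_options :ssh_username => 'ubuntu',   # Ubuntu AMIs use the 'ubuntu' user
  :bootstrap_options => {
    :key_name => 'my-keypair',                    # placeholder key pair name
    :image_id => 'ami-xxxxxxxx',                  # placeholder Ubuntu AMI ID
    :instance_type => 't2.micro',
    :security_group_ids => 'sg-xxxxxxxx'          # placeholder security group ID
  }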
One very important thing to note is that there also exists a Fog driver (chef-provisioning-fog) for various cloud services, including EC2, so there are often different names for the parameters that you might want to use. For example, the chef-provisioning-aws driver, which depends on the AWS Ruby SDK, uses 'instance_type', whereas the Fog driver uses 'flavor_id'. Security groups use the key 'security_group_ids' in the AWS driver, which takes IDs as values, but the Fog driver uses 'groups' and takes the names of the security groups as its values. This can at times lead to confusion if you are moving from one driver to another.
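For comparison, a sketch of equivalent bootstrap options for the Fog driver, using the parameter names mentioned above (the values are placeholders):

with_machine_options :bootstrap_options => {
  :flavor_id => 't2.micro',            # Fog's name for the instance type
  :groups => ['my-security-group']     # Fog takes security group names, not IDs
}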
At the time of writing this article, the documentation for the various drivers leaves something to be desired. The best way to understand them is to check the examples provided, run them, and learn from them; maybe even read the source code of the various drivers to understand how they work. Chef Provisioning recently got bumped to 1.0.0, so I would highly recommend keeping an eye on the GitHub issues [4] in case you run into trouble.
References
[1] https://docs.chef.io/provisioning.html
[2] https://github.com/pradeepto/chef-provisioning-playground
[3] http://alestic.com/2014/01/ec2-ssh-username
[4] https://github.com/chef/chef-provisioning/issues