smeeb

Reputation: 29477

Chef Configuration Constructs by Example

I am trying to wrap my head around Chef and its many configuration constructs:


So we have:

This is a bit overwhelming. After reading pretty deep into the Chef docs, I have the following understanding of everything:

A Node (devmyapp01) is a machine that Chef will manage configuration for. That Node belongs to an Environment (myapp-dev) and it has a Run List which is a set of Roles (mysql-database). Each Role has a Recipe which itself can have 0+ parameterizable Attributes which may be different across the different Environments. For instance, the mysql-database role may have a Recipe which contains a MAX_TABLE_SIZE Attribute, which is the max size a particular table can grow to. Perhaps in DEV it is set to 256 MB but in PROD it is 16 GB, etc. However, this is different than a Data Bag which, like an Attribute, belongs to a Recipe, but instead of being a key-value pair, is basically a JSON ball. A Cookbook is a collection of Recipes that somehow transcends Roles. A Template is a templated Cookbook that allows some kind of additional layer of parameterization/customization.

Now I'm sure my understanding is either flat out wrong or at least somewhat misled. Can some battle-weary Chef veteran take each of the concepts above and give a specific, concrete example of each in actual use? If you'd like to stick with my MySQL database example, what might the different Nodes, Run Lists, Roles, Recipes, Attributes, Environments, Cookbooks, Templates and Data Bags look like for a Chef configuration managing a MySQL DB? If I could see an actual, practical, concrete example of all these constructs I might actually be able to wrap my head around Chef :-).

Upvotes: 0

Views: 788

Answers (2)

Tensibai

Reputation: 15784

What is a cookbook?

A cookbook is made of several parts, each corresponding to a directory:

  • attributes: Ruby files that define the default attributes used later in the recipes.
  • recipes: Ruby files that define resources (directories, files, services) and the desired state for each.
  • files: static files you wish to deploy on hosts (a login banner, for example).
  • resources, providers and libraries: used to define custom resources or helpers for use in recipes.
  • templates: file templates, a way to define files whose content depends on the host. You can pass variables to a template and use the node attributes inside it. For example, setting the number of MySQL threads to 3 times the number of CPUs could be computed as node['cpu']['real']*3.
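To make the template bullet concrete, here is a minimal sketch that renders an ERB template with plain Ruby's standard `erb` library, outside of Chef. The `FAKE_NODE` hash, the `thread_concurrency` setting and the `my.cnf.erb` file name are assumptions for illustration only; in a real cookbook the ERB source would sit in the templates directory and Chef would provide `node` itself.

```ruby
require 'erb'

# Stand-in for the node attributes that ohai would gather on a real host.
FAKE_NODE = { 'cpu' => { 'real' => 4 } }.freeze

# ERB source as it might appear in templates/my.cnf.erb (hypothetical file name).
MY_CNF_TEMPLATE = <<~CNF
  [mysqld]
  thread_concurrency = <%= node['cpu']['real'] * 3 %>
CNF

# Render the template with a local variable named `node`, mimicking what a
# Chef template sees.
def render_my_cnf(node)
  ERB.new(MY_CNF_TEMPLATE).result_with_hash(node: node)
end

puts render_my_cnf(FAKE_NODE)
```

The same template gives a different file on a host with a different CPU count, which is the whole point of templates over static files.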

At the root of the cookbook you'll find a mandatory file named metadata.rb, which defines the cookbook's name, its version and its dependencies.

What are dependencies?

Let's take the database and mysql cookbooks.

The mysql cookbook defines specific resources such as mysql_user, mysql_database and so on. The database cookbook uses these resources, so it depends on the code located in the mysql cookbook.

That's why you'll find a line like depends 'mysql', '>= 5.0.0' in its metadata.rb. This line tells chef to load the mysql cookbook in version 5.0.0 or higher, if available on the chef-server, when the database cookbook is loaded.
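As a rough sketch, the metadata.rb of the database cookbook could look like the fragment below. Only the depends line comes from the text above; the version number is a made-up placeholder.

```ruby
# metadata.rb at the root of the database cookbook (illustrative values)
name    'database'
version '1.0.0' # hypothetical version number

# Pull in the mysql cookbook, version 5.0.0 or newer.
depends 'mysql', '>= 5.0.0'
```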

What is a node?

Short one: a computer on which chef-client is run. Long one: a target system where chef-client is run to bring the system into the desired state. The node is also the object in chef that stores the run_list and the attributes about the node (RAM, CPU, JDK version, MySQL version, etc.).

The attributes on a node come from different sources: automatic attributes gathered by ohai (RAM, number of CPUs, disks, OS type, etc.) are merged with attributes from the loaded cookbooks, attributes from roles and attributes from the environment the node belongs to. See the chef docs on attributes for the details on this part.
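The merge can be pictured with a small plain-Ruby sketch. This is a deliberate simplification, not Chef's implementation: it only covers the default precedence level (cookbook attributes file, then environment, then role) with ohai's automatic attributes on top, and it ignores the normal and override levels entirely. All attribute names and values are invented for the example.

```ruby
# Recursively merge nested hashes; values in `high` win over values in `low`.
def deep_merge(low, high)
  low.merge(high) do |_key, low_val, high_val|
    low_val.is_a?(Hash) && high_val.is_a?(Hash) ? deep_merge(low_val, high_val) : high_val
  end
end

# Merge attribute sources listed from lowest to highest precedence.
def merge_attributes(*sources)
  sources.reduce({}) { |merged, source| deep_merge(merged, source) }
end

# Hypothetical attribute sources for one node.
COOKBOOK_DEFAULTS    = { 'mysql' => { 'port' => 3306, 'max_connections' => 100 } }.freeze
ENVIRONMENT_DEFAULTS = { 'mysql' => { 'max_connections' => 500 } }.freeze
ROLE_DEFAULTS        = { 'mysql' => { 'bind_address' => '0.0.0.0' } }.freeze
OHAI_AUTOMATIC       = { 'cpu' => { 'real' => 4 } }.freeze

NODE_ATTRIBUTES = merge_attributes(
  COOKBOOK_DEFAULTS, ENVIRONMENT_DEFAULTS, ROLE_DEFAULTS, OHAI_AUTOMATIC
)

p NODE_ATTRIBUTES
```

Here the environment raises max_connections above the cookbook default, while the port set by the cookbook survives untouched because nothing with higher precedence sets it.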

What is the runlist?

The runlist is an attribute of the node which lists the recipes and roles we want to apply to this node.

What is a role?

A role is a kind of helper: it gives you a way to define attributes and a runlist to apply to one or many nodes. The consensus is to avoid setting attributes in roles, as roles are not version-controlled on the server side and generally apply across environments.

An easy-to-understand drawback is a MySQL password change on a setup that is already live. Say you have 3 servers, in dev, QA and PROD. If you make the change in the role, it will be applied to all environments at once, when you probably wanted to roll it out to dev, then QA, and then PROD once the tests are OK.

The workaround is to use a cookbook to do the same job: use depends in the metadata.rb and include_recipe in this wrapper cookbook to define the runlist, and use this cookbook's attributes files to set the common attributes as you would in a role.
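A wrapper cookbook along those lines might be sketched as follows. The cookbook name my-mysql, its version, and the attribute shown are placeholders, and the three files are shown in one block purely for brevity.

```ruby
# --- my-mysql/metadata.rb (hypothetical wrapper cookbook) ---
name    'my-mysql'
version '0.1.0'
depends 'mysql'

# --- my-mysql/attributes/default.rb ---
# Common settings that would otherwise live in the role.
default['mysql']['port'] = 3306 # illustrative attribute name

# --- my-mysql/recipes/default.rb ---
# Plays the part of the role's runlist.
include_recipe 'mysql::default'
```

Because the wrapper is a cookbook, it carries a version number, so a change to it can be promoted environment by environment instead of landing everywhere at once.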

What is an environment?

An environment is a logical group of nodes. You can follow dev/QA/PROD as your working environments, or group by kind of system (web-servers/db-servers), or mix both (Dev_web-servers, Dev_db-server, and so on). A node can belong to only one environment.

An environment can hold attributes too, usually a DNS server or an SMTP server specific to this environment, etc. The same warning applies as for roles: environments are not version-controlled, but the scope is narrower here since an environment targets a logical group of nodes.

The main benefit of environments is cookbook version constraints. You can control which version of which cookbook is available in each environment. This comes in handy when you're working on a new version of a cookbook but don't want it applied to all your servers. If you change some mysql parameter in the attributes file of a my-mysql cookbook, you'll want to control when the change reaches each environment. The constraint helps you there: QA and PROD pin the my-mysql cookbook to version A while your dev environment is allowed to use version B.
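For illustration, a PROD environment written in the Ruby DSL could pin the cookbook like this. The file name, version number and SMTP host are invented; only the idea of constraining my-mysql comes from the text above.

```ruby
# environments/prod.rb (hypothetical file)
name 'prod'
description 'Production nodes'

# Pin the cookbook: only version 1.0.0 may be used in this environment.
cookbook 'my-mysql', '= 1.0.0'

# Environment-specific attributes, e.g. the local SMTP relay.
default_attributes 'smtp' => { 'server' => 'smtp.prod.example.com' }
```

A dev environment file would look the same with a looser or newer constraint, which is how version B can be tried in dev while PROD stays on version A.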

What is a DataBag?

As its name implies, a databag is a store of data. It's a group of JSON files, each being a DataBagItem whose JSON is converted to a Mash when loaded, to be used in recipes.

Databags are meant to store read-only data. Updating a databag from a recipe is dangerous: each item is saved as a whole, so two nodes trying to write the same object at the same time will hit a race condition and one change will be lost.

The main purpose of databags is to store common objects (a list of admin users, etc.) that you don't wish to set in a cookbook or role.
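As a small sketch of the admin users case, assume a databag named admins with one item per user. The bag name, item id and fields are all hypothetical.

```ruby
# data_bags/admins/alice.json would hold (hypothetical item):
#   { "id": "alice", "shell": "/bin/bash" }

# In a recipe: load the item read-only and use its fields.
alice = data_bag_item('admins', 'alice')

user alice['id'] do
  shell alice['shell']
end
```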

So all glued together

In this paragraph I try to give a simple view of the documentation on the chef-client run.

We have a node running chef-client. It asks the chef-server (or reads the command line) to learn its runlist. This runlist is then expanded (the roles are searched for the recipes to load). In this early stage, ohai is executed to gather the automatic attributes.
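Runlist expansion can be pictured with a few lines of plain Ruby. This is only a toy model, not Chef's code: the role names and recipes are invented, roles are kept in a simple hash, and there is no protection against roles that include each other in a loop.

```ruby
# Hypothetical role definitions: role name => that role's own runlist.
ROLES = {
  'my_org_common' => ['recipe[users::default]', 'role[monitoring]'],
  'monitoring'    => ['recipe[nagios::client]']
}.freeze

# Expand a runlist into a flat, ordered list of recipes, following nested
# roles. A recipe that appears more than once is kept only at its first position.
def expand_run_list(run_list, roles = ROLES)
  run_list.flat_map do |entry|
    if (match = entry.match(/\Arole\[(.+)\]\z/))
      expand_run_list(roles.fetch(match[1]), roles)
    else
      # Accept both "recipe[name]" and a bare recipe name.
      [entry[/\Arecipe\[(.+)\]\z/, 1] || entry]
    end
  end.uniq
end

p expand_run_list(['role[my_org_common]', 'recipe[mysql::default]'])
```

The order of the result matters, since recipes are compiled and converged in that order.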

After that, all the cookbooks are loaded from the server (or from disk in solo/local mode), taking version constraints and dependencies into account, and all the attributes files are read.

Next the recipes are compiled: in this phase the recipes' ruby code is executed and a collection of resources is built. At this point nothing has changed on the machine.

Once all recipes have been compiled, the resource collection is ready. Chef goes over it and tries to bring each resource to the desired state, i.e. the convergence phase.

The directories will be created if they don't exist. Templates will be rendered and then compared to their target; if they don't match, the target will be replaced. Services are checked for their state: if the desired state is :start and the service is stopped, chef will try to start it.
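The split between compiling and converging can be mimicked in plain Ruby. The sketch below is a toy model with an invented Resource struct, not Chef's resource classes; it only shows that building the collection changes nothing, and that converging acts solely where the current state differs from the desired one.

```ruby
# A toy resource: a name, the desired state and the current state.
Resource = Struct.new(:name, :desired, :current) do
  # Converge: act only when the current state differs from the desired one.
  def converge!
    return :up_to_date if current == desired

    self.current = desired
    :updated
  end
end

# Compile phase: running the "recipe" only builds the collection; nothing
# is changed yet.
def compile
  [
    Resource.new('directory[/etc/mysql]', :present, :absent),
    Resource.new('service[mysql]', :started, :started)
  ]
end

# Converge phase: walk the collection in order and bring each resource to
# its desired state, reporting what happened per resource.
def converge(collection)
  collection.map { |resource| [resource.name, resource.converge!] }.to_h
end

collection = compile
p converge(collection) # first run: the directory is created
p converge(collection) # second run: nothing left to do
```

Running the converge step twice shows the idempotence Chef aims for: once the machine matches the desired state, a further run makes no changes.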

Once the convergence is done, the node state will be saved on the chef-server.

Upvotes: 2

Iain_b

Reputation: 1053

I'll have a go at this.

Firstly, I'll go through your definitions of the high-level chef concepts and try to shed some light. I think your understanding of node and environment is fine.

A run_list is not a set of roles, necessarily. It is an ordered list of recipes and roles to be executed. A role can be thought of as a class of node, though a node can have multiple roles. It defines a run_list and a set of attributes. When a role is put in the run list all the recipes are added to the node run_list and the attributes applied to the node. Note there is a complex order of precedence for attributes as they can be defined in multiple places (see the chef docs on attributes).

A cookbook should configure a piece of software. In order to do so you can define attributes (data), templates (inject data into files e.g. config files), recipes (where you specify what needs to be done to install the software), and some others which I'll omit to keep it simple for now. Crucially, what you haven't mentioned are resources. A recipe is (should be) declarative so you declare actions that should be performed on resources. For example, create a template or restart a service.

Data bags are a little strange. Firstly, they do not belong to a recipe. They are global. They cross-cut cookbooks and environments. There are plain and encrypted data bags available. I would advise you to think before using data bags. The main reason to use them would be to store sensitive data like passwords, though there are newer tools for handling such data (too much to cover). When it comes to structure, data bags contain data bag items. The items are the JSON objects, and can be stored as JSON.

Now, a quick mysql-focused example. We could have a node dev_db_01 which could belong to the dev environment. For argument's sake we could run chef-client with a run_list of a my_org_common role and a mysql default recipe. That is one role and one recipe. The role could contain various recipes. It is common enough to have a base role run on all servers which configures universal things: users/access rights, package managers, whatever. Then your mysql::default recipe would install mysql. Mysql is the cookbook which contains the recipe (incidentally I'm not looking at the open source mysql cookbook for this example, but if you're managing mysql you should), a set of default attributes and some templates. You could override these attributes in a role or in an environment. You could also store passwords in an encrypted data bag (simplistic, but secure data management in chef is a topic in its own right). When you run the recipe it could render a template to generate a config file where the config values are parameterised by chef attributes. You can also install the mysql packages and start the service using the package and service resources.
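A recipe of that shape might be sketched as below. It is an in-house illustration, not the open source mysql cookbook: the package name, file paths, template name and attribute key are all assumptions, and package and service names vary by platform.

```ruby
# recipes/default.rb of a hypothetical in-house mysql cookbook

# Install the server package (the package name varies by platform).
package 'mysql-server'

# Render the config file from templates/my.cnf.erb, passing an attribute in.
template '/etc/mysql/my.cnf' do
  source 'my.cnf.erb'
  variables(max_connections: node['mysql']['max_connections'])
  notifies :restart, 'service[mysql]'
end

# Make sure the service is enabled at boot and running.
service 'mysql' do
  action [:enable, :start]
end
```

The recipe stays declarative: each block names a resource and its desired state, and the value of node['mysql']['max_connections'] can differ per role or environment without touching the recipe.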

I hope that helps. I've glossed over a lot of detail but I think that's necessary for a high level overview. There are more concepts/tools/practices in chef but I think you need to read/write some simple cookbooks to get a feel first.

Upvotes: 2
