As Figure 13-1 shows, Chef Server provides a way to store shared, global data between nodes using data bags.
A data bag is a container for items that represent information about your infrastructure that is not tied to a single node. Data bags contain information that needs to be shared among more than one node. For example:
- Shared passwords
- License keys for software installs
- Shared lists of users and groups
Without data bags, nodes have no way to share data with each other directly, as shown in Figure 13-2. And even though chef-client does send a copy of the node attribute data to Chef Server after a successful Chef run, Figure 13-3 shows there is no way for other nodes to access this information directly.
Data bags are the only built-in mechanism Chef provides to store and access shared data between nodes.
Figure 13-4 shows what a data bag looks like. Each data bag contains a list of items. Each item is a JSON-formatted collection of name-value pairs, and every item in a data bag is expected to have exactly the same schema. However, the schema can differ between two different data bags. String values are quoted; integer values are not. Values can also be lists of strings or integers.
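The structure described above can be sketched in plain Ruby. This is only an illustration: the items, their field names, and the same_schema? helper are made up for the example and are not part of Chef.

```ruby
require "json"

# Hypothetical items for a data bag; the field values are invented.
ITEMS = JSON.parse(<<~ITEMS_JSON)
  [
    { "id": "alice", "uid": 2000, "shell": "/bin/bash", "groups": ["staff", "dev"] },
    { "id": "bob",   "uid": 2001, "shell": "/bin/bash", "groups": ["staff"] }
  ]
ITEMS_JSON

# Hypothetical helper: true when every item exposes the same set of keys.
def same_schema?(items)
  items.map { |item| item.keys.sort }.uniq.length == 1
end

# Quoted JSON values parse as String, unquoted numbers as Integer,
# and JSON lists as Array.
VALUE_TYPES = ITEMS.first.transform_values(&:class)

puts "same schema: #{same_schema?(ITEMS)}"
puts VALUE_TYPES
```

A check like this is a convention you enforce yourself; nothing in the sketch implies that Chef Server validates the schema for you.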
Basic Command Line Data Bag Usage with Knife
Let’s start by performing a search query with knife on the command line. Let’s say we want to make sure that our employees alice and bob have local user accounts created on all nodes. We want to store this user list in a data bag, as we want to be able to add new employees to this list and have accounts created for them automatically. Data bags are the perfect solution for this problem because the list of users is global data that we want to share between nodes.
Use the chef-playground directory you created in Chapter 11. Use the same dual command prompt setup you used there. Start the chef-zero server on an open port in one window. We will be using port 9501 in the examples in this chapter:
Then, in the other window, make the chef-playground directory the current working directory. You don’t need to knife upload nodes for this chapter, but it doesn’t hurt if you’ve done it already.
Make sure the chef-playground directory is the current working directory:
Create a data_bags directory in chef-playground. Also, create a new data bag called users. It’s simply a matter of creating a new directory underneath data_bags.
Linux/Mac OS X:
Windows:
Similar to what we did in Chapter 11 to create node data, create some items in your data bag by creating a .json file for each item. In this case, we want to create data bag items for a user named alice and a user named bob. Create the files alice.json and bob.json as shown in Example 13-1 and Example 13-2. Each data bag item contains key-value pairs with data relevant to a Unix user. String values are quoted; integer values are not. We’ll feed this data into some Chef recipe code next, so we can show you how to make a data-driven cookbook.
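As a rough illustration of what such an item file contains, the following Ruby sketch builds one item as a hash and prints it as JSON. The field values and the item_json helper are assumptions made for this example; the actual values are the ones given in Example 13-1.

```ruby
require "json"

# Hypothetical account details for one user item.
ALICE = {
  "id"      => "alice",
  "comment" => "Alice Example",
  "uid"     => 2000,
  "gid"     => 0,
  "home"    => "/home/alice",
  "shell"   => "/bin/bash"
}.freeze

# Hypothetical helper: render an item the way it would appear in a .json file.
def item_json(item)
  JSON.pretty_generate(item)
end

# Strings come out quoted ("shell": "/bin/bash"); integers do not ("uid": 2000).
puts item_json(ALICE)
```

The printed text is what you would save as data_bags/users/alice.json if you chose to generate item files instead of writing them by hand.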
To create a data bag named users on the Chef server, run the command knife data bag create as follows:
To create data bag items, use the knife data bag from file command. knife data bag from file assumes that the .json files are in a subdirectory with the specified data bag name under the directory data_bags:
To search data bags, use the name of the data bag in the index parameter to knife search. In this case, our data bag name is users, and the query "*:*" matches every item. The following command will search for the list of users we created in a data bag:
You can add specific key-value pairs in the query part of the knife search command line. The following query will return the item where id == alice:
The same search query variants we used in Chapter 12 for nodes also apply to data bags. The query fields are just slightly different, as they are no longer node attributes. For example, the following query would return the items where the id is alice OR bob:
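The effect of an OR query on the id field can be imitated in plain Ruby, which may help when reasoning about what a search returns. This is a sketch only: the USERS list (including the extra user carol) and the match_any helper are hypothetical and do not use the Chef search API.

```ruby
# Hypothetical items standing in for the users data bag.
USERS = [
  { "id" => "alice", "shell" => "/bin/bash" },
  { "id" => "bob",   "shell" => "/bin/bash" },
  { "id" => "carol", "shell" => "/bin/zsh" }
].freeze

# Hypothetical helper: keep the items whose field equals any wanted value,
# which is what a query of the form field:a OR field:b selects.
def match_any(items, field, wanted)
  items.select { |item| wanted.include?(item[field]) }
end

matches = match_any(USERS, "id", %w[alice bob])
puts matches.map { |item| item["id"] }.join(", ")
```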
Just as we covered in Chapter 12, search results can be filtered with the -a parameter. For example, -a shell returns the value only for the user’s shell:
Creating Local Users Based on Data Bag Items in a Recipe
So far, we’ve created a list of users as data bag items, but we also want local accounts created for them. Let’s write a Chef cookbook that creates the user accounts. We’ll be using Test Kitchen and chef_zero to write our code, just like we did in Chapter 12.
Make sure chef-playground/cookbooks is the current working directory.
Linux/Mac OS X:
Windows:
Then generate a users cookbook in the chef-playground/cookbooks directory.
Chef Development Kit:
Chef Client:
Edit the .kitchen.yml and make sure you are using the chef_zero provisioner and our favorite basebox image as shown in Example 13-3. Notice there is a new addition to the provisioner: stanza, data_bags_path:. The data_bags_path: setting is a relative path pointing to the chef-playground/data_bags directory we created with our test data in the last section, similar to what we did with the node test data in Chapter 12.
All files in the data_bags_path: directory tree get uploaded to the chef-zero server as data bags. In production, data bags are populated with data that is not packaged with the cookbook itself. In other words, any data used for cookbook testing is normally located outside the main cookbook directory structure. In Example 13-3, we store our test data in chef-playground/data_bags, not under the subtree for the users cookbook within cookbooks/users.
Let’s write a recipe that queries the list of users in our users data bag and creates a local user for each item. You can use the search() method to perform the data bag query, just like you did for nodes in Chapter 12. Plus, you can make use of the Chef user resource to create a user based on the information contained in the data bag.
Enter in the code for recipes/default.rb that matches Example 13-4.
We’re using the each do construct similar to the code we wrote in Chapter 12, this time iterating through each item of the data bag and returning the data bag contents in user_data. user_data is a hash containing the key-value pairs in the data bag item.
The user statement within the search() block is a Chef resource. The user resource creates a local user on the node. It takes the following attributes:
- comment: One (or more) comments about the user
- uid: The numeric user identifier
- gid: The identifier for the group
- home: The location of the home directory
- shell: The login shell
The code reads these values from the user_data map and passes them to the Chef user resource.
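The shape of such a recipe can be tried outside Chef with a few stand-ins. In the sketch below, FAKE_USERS_BAG, FakeUserResource, DECLARED_USERS, and the search and user methods are all hypothetical scaffolding written for this example; in a real recipe Chef supplies search() and the user resource, and only the loop at the bottom would appear in recipes/default.rb. The user details are invented.

```ruby
# Hypothetical data standing in for the users data bag.
FAKE_USERS_BAG = [
  { "id" => "alice", "comment" => "Alice Example", "uid" => 2000,
    "gid" => 0, "home" => "/home/alice", "shell" => "/bin/bash" },
  { "id" => "bob", "comment" => "Bob Example", "uid" => 2001,
    "gid" => 0, "home" => "/home/bob", "shell" => "/bin/bash" }
].freeze

# Hypothetical stand-in that records attributes instead of creating a user.
class FakeUserResource
  attr_reader :name, :properties

  def initialize(name)
    @name = name
    @properties = {}
  end

  %i[comment uid gid home shell].each do |property|
    define_method(property) { |value| @properties[property] = value }
  end
end

DECLARED_USERS = []

# Stand-in for Chef's search(); ignores its arguments and returns the fake bag.
def search(_index, _query)
  FAKE_USERS_BAG
end

# Stand-in for Chef's user resource; evaluates the block against the recorder.
def user(name, &block)
  resource = FakeUserResource.new(name)
  resource.instance_eval(&block)
  DECLARED_USERS << resource
end

# The recipe body: declare one user resource per data bag item.
search("users", "*:*").each do |user_data|
  user user_data["id"] do
    comment user_data["comment"]
    uid user_data["uid"]
    gid user_data["gid"]
    home user_data["home"]
    shell user_data["shell"]
  end
end

DECLARED_USERS.each { |u| puts "#{u.name}: #{u.properties}" }
```

Because the loop only reads whatever the query returns, adding an item to the data bag yields another declared user without touching the recipe code.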
Run kitchen converge. If all goes well, Test Kitchen should upload the cookbook code to the sandbox environment and create the data bag entries in a chef-zero instance. It should then run the cookbook code that performs a query for our user data bag items and creates corresponding users with the user resource:
Verify Users
Let’s verify that the users actually got created in our sandbox environment. Log in to the sandbox environment, and run getent passwd to verify that our users exist. Then make sure you exit back out to the host command prompt:
Local users alice and bob should now be created with the appropriate user data.
You can also add a new item to the users data bag collection. Let’s add a new user called eve. Go back to the root chef-playground directory and create a new file called chef-playground/data_bags/users/eve.json as shown in Example 13-5.
Make sure the root chef-playground directory is your current working directory, and run knife data bag from file to add eve to the global list of users:
Make your users cookbook directory the current working directory.
Linux/Mac OS X:
Windows:
Run kitchen converge and then kitchen login to check whether the new user account got created. Make sure you exit back out to the host prompt when you are done.
You should notice that an account for eve got created. Your recipe is data driven, based on the list of users maintained in the users data bag. Whenever that list changes, a node will pick up the change on its next scheduled Chef run. You didn’t have to change the recipe to get a new user account created.
Encrypted Data Bags
Data bag items can be encrypted with a shared key in order to store private information on Chef Server in a secure fashion. Examples of secrets that you might want to store in an encrypted data bag include:
- SSL certificates
- SSH keys
- Passwords
- License keys
Because node attributes are in plain text and can be searched—even though other nodes can’t change another node’s attributes—node attributes are not secure. Encrypted data bags are a great option, even when you want to secure an attribute for just one node, though secrets aren’t usually node specific.
Figure 13-5 shows more detail about how encrypted data bags work. When a data bag item is created with knife data bag create, a file containing a shared key is passed on the command line. The shared key is used as the password to encrypt the data bag item contents. When a node wants to decrypt the data bag item and access the secret in plain text, it must also pass the same shared key on its knife data bag operations.
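The shared-key idea can be sketched with Ruby's standard OpenSSL bindings. This is a generic illustration of symmetric encryption, not Chef's actual encrypted item format: the cipher choice, the derive_key step, the helper names, and the sample item are all assumptions made for the example.

```ruby
require "openssl"
require "json"

# Assumption of this sketch: derive a 32-byte AES key by hashing the secret.
def derive_key(secret)
  OpenSSL::Digest::SHA256.digest(secret)
end

# Encrypt one value with the shared secret; returns base64 text fields.
def encrypt_field(value, secret)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.encrypt
  cipher.key = derive_key(secret)
  iv = cipher.random_iv
  ciphertext = cipher.update(JSON.generate([value])) + cipher.final
  { "iv" => [iv].pack("m0"), "data" => [ciphertext].pack("m0") }
end

# Decrypt one value; only works with the same shared secret.
def decrypt_field(blob, secret)
  cipher = OpenSSL::Cipher.new("aes-256-cbc")
  cipher.decrypt
  cipher.key = derive_key(secret)
  cipher.iv = blob["iv"].unpack1("m0")
  plaintext = cipher.update(blob["data"].unpack1("m0")) + cipher.final
  JSON.parse(plaintext).first
end

# Encrypt every field except id, which stays readable.
def encrypt_item(item, secret)
  item.to_h { |key, value| [key, key == "id" ? value : encrypt_field(value, secret)] }
end

def decrypt_item(item, secret)
  item.to_h { |key, value| [key, key == "id" ? value : decrypt_field(value, secret)] }
end

# Hypothetical item and secret, for illustration only.
SAMPLE_ITEM = { "id" => "example", "api_key" => "not-a-real-key" }.freeze
SAMPLE_SECRET = "shared secret read from a key file"

encrypted = encrypt_item(SAMPLE_ITEM, SAMPLE_SECRET)
puts encrypted["id"]
puts decrypt_item(encrypted, SAMPLE_SECRET) == SAMPLE_ITEM
```

Both sides use the same secret, which is exactly why every party that needs the plain text must hold a copy of the key.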
Let’s try working with an encrypted data bag item. Make sure the root chef-playground directory is your current working directory.
First, generate a password to be used as a shared key. Enter the following command line, which generates a 512-byte random key and saves it to the file encrypted_data_bag_secret:
When symmetric key encryption is used, the password is typically a random key generated by a machine instead of a human. So we used the openssl tool to generate a 512-byte random key. To represent the binary data contained in the key as an ASCII string, we tell openssl to use base64 encoding. Further, since the output of openssl contains linefeeds by default, and linefeeds differ depending on the platform, we use the translate (tr) command to remove any linefeed characters from the secret key. Removing linefeed characters ensures that the bytes in the random key will be the same even if the platform is different.
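For comparison, the same kind of key can be produced in Ruby. This is a sketch of an equivalent approach rather than the command used in this chapter; the generate_shared_key helper name is made up for the example.

```ruby
require "securerandom"

# Hypothetical helper: random bytes, base64-encoded on a single line.
# The "m0" pack directive emits base64 without any linefeeds, so no
# separate step is needed to strip them.
def generate_shared_key(byte_count = 512)
  [SecureRandom.random_bytes(byte_count)].pack("m0")
end

key = generate_shared_key
puts key.length                      # number of base64 characters
puts key.unpack1("m0").bytesize      # number of random bytes encoded
```

Writing the result to a file such as encrypted_data_bag_secret would then be a matter of File.write, taking care not to append a trailing newline.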
For some test data, let’s create a .json file that contains the API key to access our credit card payment system. This is definitely something we want to keep from prying eyes. In addition to the required id: field, we’ll add an api_key field to store the key.
First, create a new directory under chef-playground/data_bags to hold our api_keys data bag.
Linux/Mac OS X:
Windows:
Now create the file chef-playground/data_bags/api_keys/payment.json by using the code provided in Example 13-6.
Create the data bag using the following command line:
When data bag items are encrypted, use the --secret-file command line option to pass in the shared key. Create the encrypted data bag item in the api_keys data bag from the payment.json file that we just created, with the following:
So is the data item encrypted on Chef Server? Let’s see. Try using the knife data bag show command, but don’t pass the shared key:
Looks encrypted, doesn’t it? You don’t see our plain-text api_key anywhere in the output. The only thing that is plain text is the id: field. The id: field cannot be encrypted, because the server uses this field to index and search for the associated encrypted data.
NOTE
If you want to decrypt the item data, just use the --secret-file parameter as follows. Now the data bag item is shown in plain text:
There is one problem with using encrypted data bags for which Chef Software does not provide a built-in solution. How does a node get the secret key? In order for the node to decrypt the secret, it must have a copy of the shared key. Unfortunately, there is no central place to access encrypted keys, as storing encryption keys on the same system where the data resides violates all the core principles of computer security. So when you use encrypted data bags, you must find a solution to the key distribution problem.
chef-vault
Kevin Moser came up with the idea for chef-vault in 2013 while working at Nordstrom. Kevin devised a clever solution to the key distribution issue for encrypted data bags by reusing the public/private key pairs Chef already uses for nodes to implement a key encapsulation scheme. When the data bag item is created, a shared key is generated on the node. Then, for each node that needs access to it, the shared key is encrypted with the node’s public key, creating an encrypted version of the shared key in an encapsulated payload. This encrypted version of the key is stored on Chef Server.
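The key encapsulation idea can be sketched with Ruby's OpenSSL bindings. This illustrates the general technique only and is not chef-vault's implementation or storage format: the node names, key sizes, and the encapsulate and recover helpers are assumptions made for the example.

```ruby
require "openssl"
require "securerandom"

# Hypothetical nodes, each with its own RSA key pair. In Chef, a node
# already holds a private client key and the server holds the public half.
NODE_KEYS = {
  "node-a" => OpenSSL::PKey::RSA.new(2048),
  "node-b" => OpenSSL::PKey::RSA.new(2048)
}.freeze

# The symmetric key that protects the data bag item contents.
SHARED_KEY = SecureRandom.random_bytes(32)

# Encrypt the shared key once per node, using only that node's public key.
def encapsulate(shared_key, public_keys)
  public_keys.transform_values { |pub| pub.public_encrypt(shared_key) }
end

# A node recovers the shared key from its payload with its private key.
def recover(payload, private_key)
  private_key.private_decrypt(payload)
end

PUBLIC_KEYS = NODE_KEYS.transform_values(&:public_key)
PAYLOADS = encapsulate(SHARED_KEY, PUBLIC_KEYS)

NODE_KEYS.each do |name, private_key|
  puts "#{name} recovered key: #{recover(PAYLOADS[name], private_key) == SHARED_KEY}"
end
```

Only the encapsulated payloads need to be stored centrally; the shared key itself never has to be copied to a node by hand, which is the distribution problem this design sidesteps.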
Before we can play with chef-vault in our chef-playground setup, we need to register a legitimate client key for our devhost node with the chef-zero server.
Right now, if you run knife client list, note that devhost isn’t in our list of clients. Chef Server doesn’t know it has a client key that allows it to store data on the server from its chef-client runs, nor does it know that devhost is a node. Both of these conditions are required for a node to access encrypted data with chef-vault:
NOTE
The node list must be blank for this exercise to work, as the fake nodes we created in Chapter 11 do not have accompanying client keys. If you see nodes listed, stop and restart the chef-zero server to clear them out.
Generate a new private/public client key pair for your Development Workstation, which we call devhost in chef-playground/.chef/knife.rb. Since chef-zero doesn’t check the contents of the chef-playground/.chef/devhost.pem file, it doesn’t matter if we regenerate it. However, regenerating the client key ensures that a matching public key is stored on Chef Server, which does matter. Run the following command to regenerate the client key.
The --admin option lets the client run the APIs behind the knife client show and knife node commands on nodes other than its own. By default, knife client create displays the client info in an editor to allow tweaking before a client.pem is generated. In our case, however, the defaults are fine, so we just pass in --disable-editing. The --file option writes the client.pem out to the specified filename:
Now if you run knife client list, the devhost machine shows up:
We also need to associate a node with our client key, so run the following:
Now devhost shows up as a node as well. This is what happens when a node is bootstrapped: a client key is generated and the node is registered with Chef Server. We are just simulating this process in chef-zero by hand:
We’re going to create a new encrypted data bag for storing root passwords, which will be managed by chef-vault. Create the directory chef-playground/data_bags/passwords to store the .json file we will be creating next.
Linux/Mac OS X:
Windows:
Create the file shown in Example 13-7 in chef-playground/data_bags/passwords/mysql.json. It stores the MySQL database root user password.
chef-vault installs a knife plugin to manage encrypted data bags. It exposes chef-vault commands via knife vault. Enter the following command to create an encrypted data bag item with a secret managed by chef-vault:
You must specify users or nodes that have valid client keys using the --search and --admins parameters. We have to use both in this example because we didn’t set up the admin user to have a valid client key. If you run knife client list, the admin user isn’t present.
When you are using Chef Server, you must use the option --mode client.
The command line options for knife vault are a little different from the options for knife data bag. See the comprehensive documentation on knife vault command line options.
The most important takeaway you should get from this example is that chef-vault can encrypt data only if Chef Server has valid client keys. This can be hard to configure in a chef-zero setup. What we’ve done so far in this section is just enough to get chef-vault working with chef-zero for a demo.
So is our data bag encrypted? Let’s perform a check using knife data bag show, similar to what we did in the previous section on encrypted data bags:
Looks encrypted to us!
Summary
We covered data bags in this chapter. Data bags are a powerful feature of Chef Server that let you store global information that can be shared among nodes. We also presented an overview of how data bag contents can be secured with encrypted data bags. Unfortunately, because encrypted data bags use symmetric key encryption, Chef has no built-in way to distribute the shared keys to the nodes that need to decrypt the data. We showed how chef-vault helps address this key distribution issue.