As Figure 13-1 shows, Chef Server provides a way to store shared, global data between nodes using data bags.
A data bag is a container for items that represent information about your infrastructure that is not tied to a single node. Data bags contain information that needs to be shared among more than one node. For example:
- Shared passwords
- License keys for software installs
- Shared lists of users and groups
 
Apart from data bags, Chef provides no mechanism to share data between nodes, as shown in Figure 13-2.
And even though chef-client does send a copy of the node attribute data to Chef Server after a successful Chef run, Figure 13-3 shows there is no way for other nodes to access this information directly.
Data bags are the only built-in mechanism Chef provides to store and access shared data between nodes.
Figure 13-4 shows what a data bag looks like. Each data bag contains a list of items. Each item is a JSON-formatted collection of name-value pairs, and every item in a data bag is expected to have exactly the same schema. However, the schema can differ between two different data bags. String values are quoted; integer values are not. Values can also be lists of strings or integers.
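The item structure described above can be sketched in plain Ruby. This is only an illustration: the field names and values below are assumptions made up for the example, not the contents of the chapter's own example files, and the schema check is a convenience written for this sketch rather than something Chef performs.

```ruby
require 'json'

# Hypothetical items for a "users" data bag. The field names and values are
# illustrative assumptions, not the book's Example 13-1 or Example 13-2.
ITEMS_JSON = {
  'alice' => '{"id": "alice", "comment": "Alice Jones", "uid": 2000, ' \
             '"shell": "/bin/bash", "groups": ["staff", "dev"]}',
  'bob'   => '{"id": "bob", "comment": "Bob Smith", "uid": 2001, ' \
             '"shell": "/bin/bash", "groups": ["staff"]}'
}.freeze

# Parse each JSON document into a Ruby hash.
items = ITEMS_JSON.transform_values { |text| JSON.parse(text) }

# Items in one data bag are expected to share the same set of keys.
schemas = items.values.map { |item| item.keys.sort }.uniq
same_schema = schemas.length == 1

puts "keys: #{schemas.first.join(', ')}"
puts "same schema for all items: #{same_schema}"
```

Quoted JSON values parse to Ruby strings, unquoted numbers to integers, and bracketed values to arrays.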
Basic Command Line Data Bag Usage with Knife
Let’s start by performing a search query with knife on the command line. Let’s say we want to make sure that our employees alice and bob have local user accounts created on all nodes. We want to store this user list in a data bag, as we want to be able to add new employees to this list and have accounts created for them automatically. Data bags are the perfect solution for this problem because the list of users is global data that we want to share between nodes.
Use the chef-playground directory you created in Chapter 11. Use the same dual command prompt setup you used there. Start the chef-zero server on an open port in one window. We will be using port 9501 in the examples in this chapter:
Then, in the other window, make the chef-playground directory the current working directory. You don’t need to knife upload nodes for this chapter, but it doesn’t hurt if you’ve done it already.
Make sure the chef-playground directory is the current working directory:
Create a data_bags directory in chef-playground. Also, create a new data bag called users. It’s simply a matter of creating a new directory underneath data_bags.
Linux/Mac OS X:
Windows:
Similar to what we did in Chapter 11 to create node data, create some items in your data bag by creating a .json file for each item. In this case, we want to create data bag items for a user named alice and a user named bob. Create the files alice.json and bob.json as shown in Example 13-1 and Example 13-2. Each data bag item contains key-value pairs with data relevant to a Unix user. String values are quoted; integer values are not. We’ll feed this data into some Chef recipe code next, so we can show you how to make a data-driven cookbook.
To create a data bag named users on the Chef server, run the command knife data bag create as follows:
To create data bag items, use the knife data bag from file command. knife data bag from file assumes that the .json files are in a subdirectory with the specified data bag name under the directory data_bags:
To search data bags, use the name of the data bag in the index parameter to knife search. In this case, our data bag name is users, and the query "*:*" matches every item. The following command will search for the list of users we created in a data bag:
You can add specific key-value pairs in the query part of the knife search command line. The following query will return the item where id == alice:
The same search query variants we used in Chapter 12 for nodes also apply to data bags. The query fields are just slightly different as they are no longer node attributes. For example, the following query would return the items where the id is alice OR bob:
Just as we covered in Chapter 12, search results can be filtered with the -a parameter. For example, -a shell returns only the value of each user’s shell:
Creating Local Users Based on Data Bag Items in a Recipe
So far, we’ve created a list of users as data bag items, but we also want local accounts created for them. Let’s write a Chef cookbook that creates the user accounts. We’ll be using Test Kitchen and chef_zero to write our code, just like we did in Chapter 12.
Make sure chef-playground/cookbooks is the current working directory.
Linux/Mac OS X:
Windows:
Then generate a users cookbook in the chef-playground/cookbooks directory.
Chef Development Kit:
Chef Client:
Edit the .kitchen.yml and make sure you are using the chef_zero provisioner and our favorite basebox image as shown in Example 13-3. Notice there is a new addition to the provisioner: stanza, data_bags_path:. The data_bags_path: value is a relative path pointing to the chef-playground/data_bags directory we created with our test data in the last section, similar to what we did with the node test data in Chapter 12.
All files in the data_bags_path: directory tree get uploaded to the chef-zero server as data bags. In production, data bags are populated with data that is not packaged with the cookbook itself. In other words, any data used for cookbook testing is normally located outside the main cookbook directory structure. In Example 13-3, we store our test data in chef-playground/data_bags, not under the subtree for the users cookbook within cookbooks/users.
Let’s write a recipe that queries the list of users in our users data bag and creates a local user for each item. You can use the search() method to perform the data bag query, just like you did for nodes in Chapter 12. Plus, you can make use of the Chef user resource to create a user based on the information contained in the data bag.
Enter in the code for recipes/default.rb that matches Example 13-4.
We’re using the each do construct similar to the code we wrote in Chapter 12, this time iterating through each item of the data bag and returning the data bag contents in user_data. user_data is a hash containing the key-value pairs in the data bag item.
The user statement within the search() block is a Chef resource. The user resource creates a local user on the node. It takes the following attributes:
- comment: One (or more) comments about the user
- uid: The numeric user identifier
- gid: The identifier for the group
- home: The location of the home directory
- shell: The login shell
 
The code reads these values from the user_data hash and passes them to the Chef user resource.
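The pattern described above can be tried outside of Chef with a small simulation. This is a sketch, not the chapter's Example 13-4: the search and user methods below are stand-ins written for this illustration (in a real recipe Chef supplies both), FakeUser and DECLARED are hypothetical helper names, and the item values are made up.

```ruby
# Stand-in for Chef's search(): returns hypothetical data bag items.
# In a real recipe, Chef provides search() and queries the server.
def search(_index, _query)
  [
    { 'id' => 'alice', 'comment' => 'Alice Jones', 'uid' => 2000,
      'gid' => 100, 'home' => '/home/alice', 'shell' => '/bin/bash' },
    { 'id' => 'bob', 'comment' => 'Bob Smith', 'uid' => 2001,
      'gid' => 100, 'home' => '/home/bob', 'shell' => '/bin/bash' }
  ]
end

# Stand-in for the Chef user resource: it records the attributes set in
# the block instead of creating an account.
class FakeUser
  attr_reader :name, :attrs

  def initialize(name)
    @name = name
    @attrs = {}
  end

  %i[comment uid gid home shell].each do |attr|
    define_method(attr) { |value| @attrs[attr] = value }
  end
end

# Collects every simulated resource so the result can be inspected.
DECLARED = []

def user(name, &block)
  resource = FakeUser.new(name)
  resource.instance_eval(&block)
  DECLARED << resource
end

# The recipe pattern itself: one user resource per data bag item.
search('users', '*:*').each do |user_data|
  user user_data['id'] do
    comment user_data['comment']
    uid user_data['uid']
    gid user_data['gid']
    home user_data['home']
    shell user_data['shell']
  end
end

DECLARED.each { |u| puts "#{u.name}: uid=#{u.attrs[:uid]} home=#{u.attrs[:home]}" }
```

Only the final each loop resembles recipe code. Everything above it exists so the loop can run without a Chef server.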
Run kitchen converge. If all goes well, Test Kitchen should upload the cookbook code to the sandbox environment and create the data bag entries in a chef-zero instance. It should then run the cookbook code that performs a query for our user data bag items and creates corresponding users with the user resource:
Verify Users
Let’s verify that the users actually got created in our sandbox environment. Log in to the sandbox environment, and run getent passwd to verify that our users exist. Then make sure you exit back out to the host command prompt:
Local users alice and bob should now be created with the appropriate user data.
You can also add a new item to the users data bag collection. Let’s add a new user called eve. Go back to the root chef-playground directory and create a new file called chef-playground/data_bags/users/eve.json as shown in Example 13-5.
Make sure the root chef-playground directory is your current working directory, and run knife data bag from file to add eve to the global list of users:
Make your users cookbook directory the current working directory.
Linux/Mac OS X:
Windows:
Run kitchen converge and then kitchen login to check whether the new user account got created. Make sure you exit back out to the host prompt when you are done.
You should notice that an account for eve got created. Your recipe is data driven, based on the list of users maintained in the users data bag. Whenever that list changes, a node will pick up the change on its next scheduled Chef run. You didn’t have to change the recipe to get a new user account created.
Encrypted Data Bags
Data bag items can be encrypted with a shared key in order to store private information on Chef Server in a secure fashion. Examples of secrets that you might want to store in an encrypted data bag include:
- SSL certificates
- SSH keys
- Passwords
- License keys
 
Because node attributes are in plain text and can be searched—even though other nodes can’t change another node’s attributes—node attributes are not secure. Encrypted data bags are a great option, even when you want to secure an attribute for just one node, though secrets aren’t usually node specific.
Figure 13-5 shows more detail about how encrypted data bags work. When a data bag item is created with knife data bag create, a file containing a shared key is passed on the command line. The shared key is used as the password to encrypt the data bag item contents. When a node wants to decrypt the data bag item and access the secret in plain text, it must also pass the same shared key on its knife data bag operations.
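The shared-key idea can be sketched with Ruby's OpenSSL bindings. This is an illustration of symmetric encryption in general, not Chef's actual storage format: the cipher choice, the SHA-256 key derivation, the encrypt_item and decrypt_item helpers, and the sample item are all assumptions made for this example.

```ruby
require 'openssl'
require 'json'

# Derive a 256-bit AES key from the shared secret (an assumed derivation).
def aes_key(secret)
  OpenSSL::Digest::SHA256.digest(secret)
end

# Encrypt every value except "id" with AES-256-CBC and a fresh random IV.
def encrypt_item(item, secret)
  item.each_with_object({}) do |(key, value), out|
    if key == 'id'
      out[key] = value # left readable so the item can still be looked up
      next
    end
    cipher = OpenSSL::Cipher.new('aes-256-cbc').encrypt
    cipher.key = aes_key(secret)
    iv = cipher.random_iv
    data = cipher.update(JSON.generate('v' => value)) + cipher.final
    # pack('m0') is strict base64 with no line feeds.
    out[key] = { 'iv' => [iv].pack('m0'), 'data' => [data].pack('m0') }
  end
end

# Reverse the process; this only works with the same shared secret.
def decrypt_item(item, secret)
  item.each_with_object({}) do |(key, value), out|
    if key == 'id'
      out[key] = value
      next
    end
    cipher = OpenSSL::Cipher.new('aes-256-cbc').decrypt
    cipher.key = aes_key(secret)
    cipher.iv = value['iv'].unpack1('m0')
    plain = cipher.update(value['data'].unpack1('m0')) + cipher.final
    out[key] = JSON.parse(plain.force_encoding('UTF-8'))['v']
  end
end

secret = 'example-shared-secret'                        # placeholder secret
item = { 'id' => 'payment', 'api_key' => 'not-a-real-key' } # made-up item

encrypted = encrypt_item(item, secret)
decrypted = decrypt_item(encrypted, secret)

puts "id stays readable: #{encrypted['id']}"
puts "round trip matches: #{decrypted == item}"
```

Both sides must hold the same secret, which is why getting that secret to each node becomes a problem of its own.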
Let’s try working with an encrypted data bag item. Make sure the root chef-playground directory is your current working directory.
First, generate a password to be used as a shared key. Enter the following command line, which generates a 512-byte random key and saves it to the file encrypted_data_bag_secret:
When symmetric key encryption is used, the password is typically a random key generated by a machine instead of a human. So we used the openssl tool to generate a 512-byte random key. Because the key is binary data, we tell openssl to use base64 encoding to represent it as an ASCII string. Further, since by default the output of openssl contains linefeeds, which are different depending on the platform, we use the translate (tr) command to remove any linefeed characters from the secret key. Removing linefeed characters ensures that the bytes in the random key will be the same even if the platform is different.
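For comparison, the same kind of key can be produced in Ruby. This is a sketch of an equivalent approach, not the command used in the chapter; the variable names are chosen for this example.

```ruby
require 'securerandom'

# 512 random bytes from the operating system's secure random generator.
raw_key = SecureRandom.random_bytes(512)

# 'm0' is strict base64: no line feeds are inserted, so there is nothing
# platform specific to strip afterwards.
encoded_key = [raw_key].pack('m0')

puts "encoded length: #{encoded_key.length}"
puts "contains line feeds: #{encoded_key.include?("\n")}"
```

Because the encoder never emits line feeds here, the step that strips them with tr has no counterpart in this version.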
For some test data, let’s create a .json file that contains the API key to access our credit card payment system. This is definitely something we want to keep from prying eyes. In addition to the required id: field, we’ll add an api_key field to store the key itself.
First, create a new directory under chef-playground/data_bags to hold our api_keys data bag.
Linux/Mac OS X:
Windows:
Now create the file chef-playground/data_bags/api_keys/payment_system.json by using the code provided in Example 13-6.
Create the data bag using the following command line:
When data bag items are encrypted, use the --secret-file command line option to pass in the shared key. Create the encrypted item in the api_keys data bag from the .json file that we just created, with the following:
So is the data item encrypted on Chef Server? Let’s see. Try using the knife data bag show command, but don’t pass the shared key:
Looks encrypted, doesn’t it? You don’t see our plain-text api_key anywhere in the output. The only thing that is plain text is the id:. The id: field cannot be encrypted, because the server uses this field to index and search for the associated encrypted data.
NOTE
If you want to decrypt the item data, just use the --secret-file parameter as follows. Now the data bag item is shown in plain text:
There is one problem with using encrypted data bags for which Chef Software does not provide a built-in solution. How does a node get the secret key? In order for the node to decrypt the secret, it must have a copy of the shared key. Unfortunately, there is no central place to access encrypted keys, as storing encryption keys on the same system where the data resides violates all the core principles of computer security. So when you use encrypted data bags, you must find a solution to the key distribution problem.
chef-vault
Kevin Moser came up with the idea for chef-vault in 2013 while working at Nordstrom. Kevin devised a clever solution to the key distribution issue for encrypted data bags by reusing the public/private key pairs Chef already uses for nodes to implement a key encapsulation scheme. When the data bag item is created, a shared key is generated on the node. Then, for each node that needs access to it, the shared key is encrypted with the node’s public key, creating an encrypted version of the shared key in an encapsulated payload. This encrypted version of the key is stored on Chef Server.
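The key encapsulation idea can be sketched with Ruby's OpenSSL bindings. This illustrates the general technique only and makes no claim about chef-vault's internals or payload format; the node names, key sizes, and variable names are assumptions made for this example.

```ruby
require 'openssl'

# Hypothetical nodes, each with its own RSA key pair. In Chef, these pairs
# already exist because every client registers a key with the server.
node_keys = {
  'node1' => OpenSSL::PKey::RSA.new(2048),
  'node2' => OpenSSL::PKey::RSA.new(2048)
}

# The public halves, standing in for what the server holds.
public_keys = node_keys.transform_values(&:public_key)

# One random shared key that protects the secret data itself.
shared_key = OpenSSL::Random.random_bytes(32)

# Encapsulate: encrypt the shared key once per node with that node's public key.
wrapped = public_keys.transform_values { |pub| pub.public_encrypt(shared_key) }

# A node recovers the shared key with its own private key.
unwrapped = node_keys['node1'].private_decrypt(wrapped['node1'])

puts "node1 recovered the shared key: #{unwrapped == shared_key}"
puts "wrapped copies differ per node: #{wrapped['node1'] != wrapped['node2']}"
```

Only the wrapped copies need to be stored centrally, so the shared key itself is never kept anywhere in plain form.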
Before we can play with chef-vault in our chef-playground setup, we need to register a legitimate client key for our devhost node with the chef-zero server.
Right now, if you run knife client list, note that devhost isn’t in our list of clients. Chef Server doesn’t know it has a client key that allows it to store data on the server from its chef-client runs, nor does it know that devhost is a node. Both of these conditions are required for a node to access encrypted data with chef-vault:
NOTE
The node list must be blank for this exercise to work, as the fake nodes we created in Chapter 11 do not have accompanying client keys. If you see nodes listed, stop and restart the chef-zero server to clear them out.
Generate a new private/public client key pair for your Development Workstation, which we call devhost in chef-playground/.chef/knife.rb. Since chef-zero doesn’t check the contents of the chef-playground/.chef/devhost.pem file, it doesn’t matter if we regenerate it. However, when we regenerate the client key, it ensures that a matching public key is stored on Chef Server, which does matter. Run the following command to regenerate the client key.
The --admin option lets the client run the APIs behind the knife client show and knife node commands on other nodes besides its own node. By default, knife client create displays the client info in an editor to allow tweaking before a client.pem is generated. In our case, however, the defaults are fine, so we just pass in --disable-editing. The --file option writes the client.pem out to the specified filename:
Now if you run knife client list, the devhost machine shows up:
We also need to associate a node with our client key, so run the following:
Now devhost shows up as a node as well. This is what happens when a node is bootstrapped: a client key is generated and the node is registered with Chef Server. We are just simulating this process in chef-zero by hand:
We’re going to create a new encrypted data bag for storing root passwords, which will be managed by chef-vault. Create the directory chef-playground/data_bags/passwords to store the .json file we will be creating next.
Linux/Mac OS X:
Windows:
Create the file shown in Example 13-7 in chef-playground/data_bags/passwords/mysql.json. It stores the MySQL database root user password.
chef-vault installs a knife plugin to manage encrypted data bags. It exposes chef-vault commands via knife vault. Enter in the following command to create an encrypted data bag item with a secret managed by chef-vault:
You must specify users or nodes that have valid client keys using the --search and --admins parameters. We have to use both in this example because we didn’t set up the admin user to have a valid client key. If you run knife client list, the admin user isn’t present.
When you are using Chef Server, you must use the option --mode client.
The command line options for knife vault are a little different from the options for knife data bag. See the comprehensive documentation on knife vault command line options.
The most important takeaway you should get from this example is that chef-vault can encrypt data only if Chef Server has valid client keys. This can be hard to configure in a chef-zero setup. What we’ve done so far in this section is just enough to get chef-vault working with chef-zero for demonstration purposes.
So is our data bag encrypted? Let’s perform a check using knife data bag show, similar to what we did in the previous section on encrypted data bags:
Looks encrypted to us!
Summary
We covered data bags in this chapter. Data bags are a powerful feature of Chef Server that let you store global information that can be shared among nodes. We also presented an overview of how data bag contents can be secured with encrypted data bags. Unfortunately, because encrypted data bags use symmetric key encryption, there is no built-in way to distribute the shared keys to the nodes that need to decrypt the data. We showed how chef-vault helps address this key distribution issue.



