This blog has moved
I've migrated the blog to madstop.com, including copying over most of the posts. Blogspot was just getting to be too difficult to write in, and I wanted the blog to be on that domain, rather than blogspot.com.
Now that the Web 2.0 Conference is over, it's time to get back to Puppet development. The main problem I'm still trying to solve is service management. I'm not quite sure there's a single abstraction that will work, since each and every *nix does service startup slightly differently: some use a single script for starting all services, some make links for everything but then use shell-script config files (ugh) to determine whether a service actually starts, and some use something like Solaris's SMF or Mac OS X's launchd to do it for you.
Trying to do service management has convinced me that the real answer is to not worry about getting it right just yet, since it's going to take many iterations to get it right on all of the platforms. Instead, I'm going to focus on making Puppet easy to update, so that as things work better, it'll be easy for existing customers to get that better functionality.
I need to come up with some kind of versioning system within Puppet; I'm thinking that the Puppet framework itself will have one version, and then each primitive (e.g., 'file', 'package', etc.) will have its own version. This might not work out that well in the long run, since we'll often update a primitive for one platform (e.g., add a new packaging type) without updating it for the other platforms, but it seems to be an acceptable compromise for now.
The first step is to come up with a way of registering and checking versions (which should be pretty easy, since I'm already registering all of the primitives); then I just need to add a 'versioncheck' method or something to the primary configuration protocol, and finally some kind of self-update mechanism (yay!).
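To make the idea concrete, here's a minimal sketch of what registration and checking might look like; everything in it (the names, the version numbers, the shape of the check) is hypothetical, not Puppet's actual API:

FRAMEWORK_VERSION = "0.1.0"   # made-up framework version

PRIMITIVES = {}   # primitive name -> version

def register(name, version):
    # Record each primitive's version as it registers itself.
    PRIMITIVES[name] = version

register("file", "1.2")
register("package", "1.0")

def versioncheck(client_versions):
    # Return the primitives where the server's version differs,
    # so the client knows what a self-update should fetch.
    return {name: version for name, version in PRIMITIVES.items()
            if client_versions.get(name) != version}

# A client running an older 'package' primitive would see:
print(versioncheck({"file": "1.2", "package": "0.9"}))
# {'package': '1.0'}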
Hmmm.
I have to say that overall the conference is amazing, but I'm disappointed that the Web 2.0 concepts are not being broken down structurally as much as I'd hoped.
So, Blogspot has already lost two of my posts, and it forces me to write in HTML instead of a simpler markup language like reStructuredText, so I'm writing in reStructuredText on the side and sending the rst2html output up to Blogspot. That's why the HTML looks insanely bad.
I'm writing this on the plane to the Web 2.0 conference.
I now have basic tagging working in Puppet. Tags are generated on the server when a configuration is provided, and they're stored with the individual objects in the configuration. For instance, here is what some of the tags look like in a short example configuration on my OS X laptop:
file(/etc/issue):
    remotefile base nodebase tsetse puppet file /etc/issue
file(/etc/motd):
    base nodebase tsetse puppet file /etc/motd
file(/root):
    remotefile base nodebase tsetse puppet file /root
file(/tmp/screens/.):
    base nodebase tsetse puppet file /tmp/screens/.
file(/usr/local/scripts):
    remotefile base nodebase tsetse puppet file /usr/local/scripts
file(/var/spool/cron):
    base nodebase tsetse puppet file /var/spool/cron
symlink(/etc/resolv.conf):
    darwin nodebase tsetse puppet symlink /etc/resolv.conf
I'll break down the configuration that generated this admittedly basic list of objects and tags.
For each line, the last two tags are the object type and the path to the object (they could also be service names, user names, etc.). I think I should also add a tag for each of the files that mention the object, along with the respective lines from each file.
The tags associated with a specific configuration are sent with the configuration to the appropriate server, but they're also merged into a central repository, so as each node connects and generates its tag list, the central tag repository gets more comprehensive. (Each node's configuration needs to be compiled before its tag list can be generated, and the configuration requires information from the client before it can be compiled, so each node has to connect before its tags exist.)
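A rough sketch of that merge, with invented data structures (each node's compiled configuration yielding a map of objects to tag sets), might look like this:

central = {}   # object name -> every tag seen for it across all nodes

def merge_node_tags(node_tags):
    # Fold one node's compiled tag list into the central repository.
    for obj, tags in node_tags.items():
        central.setdefault(obj, set()).update(tags)

# Tags for /etc/motd as compiled for tsetse (from the listing above):
merge_node_tags({"file(/etc/motd)":
                 {"base", "nodebase", "tsetse", "puppet", "file", "/etc/motd"}})
# A second node contributes its own node tag to the same object:
merge_node_tags({"file(/etc/motd)":
                 {"base", "nodebase", "culain", "puppet", "file", "/etc/motd"}})

print(sorted(central["file(/etc/motd)"]))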
The tags are currently unused, but their mere existence has got me thinking:
Logging
I have been expecting that most people would just use syslog to pass logs around -- it's easy, it's pervasive, and Puppet has been developed specifically to be compatible with it. However, it would be difficult to shoehorn tags into syslog, or at least it would be annoying to get them back out; it would require pattern matching against specific lines, which is never pleasant. I'm thinking instead that I could write a log server capable of receiving these log messages and enhancing them to store the tags associated with the node that generated each message.
Imagine being able to trivially find every log message generated by any darwin node in a given time period, or the error messages from a specific network associated with DNS. Build a database of all of these log messages, with the fields and tags set up appropriately, slap a Rails interface on it, and you could get that pretty easily.
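Here's one way such a tag-aware log server could work; the record format and function names are made up for illustration:

import time

log_db = []   # standing in for a real database table

def receive(message, node_tags, level="notice"):
    # Store each message enriched with the sending node's tags.
    log_db.append({"time": time.time(), "level": level,
                   "tags": set(node_tags), "message": message})

def query(required_tags, level=None):
    # Find every message carrying all of the given tags.
    return [e for e in log_db
            if set(required_tags) <= e["tags"]
            and (level is None or e["level"] == level)]

receive("resolv.conf out of sync", {"darwin", "tsetse", "dns"}, level="err")
receive("motd synced", {"linux", "culain"})

# 'Every error message generated by any darwin node':
print(query({"darwin"}, level="err"))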
Metrics
Similarly, I have been expecting people to ship their Puppet metrics off to a specific performance app, but there probably aren't any apps out there that gracefully handle associating arbitrary tags with each metric. Build a simple server to accept the metrics that Puppet generates, and suddenly you can generate change-count reports on a specific server class like 'webserver', or see how many out-of-sync security directives there are in the DMZ.
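A toy version of such a tagged-metrics store, with field names I'm making up, could be as simple as:

metrics = []

def record(name, value, tags):
    metrics.append({"name": name, "value": value, "tags": set(tags)})

def total(name, required_tags):
    # Sum a metric across everything carrying all of the given tags.
    return sum(m["value"] for m in metrics
               if m["name"] == name and set(required_tags) <= m["tags"])

record("changes", 4, {"webserver", "culain"})
record("out_of_sync", 2, {"dmz", "fw1", "security"})
record("out_of_sync", 1, {"dmz", "fw2", "security"})

print(total("changes", {"webserver"}))            # change count for the 'webserver' class
print(total("out_of_sync", {"dmz", "security"}))  # out-of-sync security directives in the DMZ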
Ticketing
I know I'd like to have each of these tags stored with each ticket generated from Puppet (yes, I hope to eventually have Puppet autogenerate tickets). Query for all outstanding tickets on the web farm, or any tickets related to the recent Solaris upgrade.
Reference
With a web-based annotation system for the configuration, you could use tags as a kind of wiki keyword system -- selecting a tag in a log message automatically finds the definition of that tag (as a server class or functional component or node or whatever).
One of the things unaddressed in the current proof-of-concept system is that nodes will often want to generate their own tags. In particular, I can see adding an 'unmanaged' tag to objects that get mentioned in a configuration (e.g., as a requirement) but that are not managed, or storing the sync state as at least a local tag for an object (e.g., error, synced). Should these tags make their way to the central system, or should users at least be able to get access to them?
The idea of nodes injecting tags brings up the possibility of using tags as a means of storing state, or rather, turning the state of an object into just another tag. Again, this would be useful mostly for reports, but it could certainly help quickly find all nodes in an error state, along with the objects associated with those nodes, and it would be especially useful if it were updated live. It might make sense to just support this as a live query against the system; it should be pretty lightweight, at least compared to, say, querying the actual configuration.
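State-as-a-tag makes that live query almost trivial; in this sketch the state labels ('synced', 'error') and the structures are all invented:

objects = {
    "file(/etc/motd)":  {"tsetse", "file", "synced"},
    "file(/etc/issue)": {"tsetse", "file", "error"},
    "service(apache)":  {"culain", "webserver", "error"},
}

def in_state(state):
    # The live query: every object whose tag set includes the state tag.
    return [name for name, tags in objects.items() if state in tags]

print(in_state("error"))   # all objects (and hence nodes) currently failing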
This most likely makes sense to use as either a last resort or as a verification system. If you've got a ticketing system for reporting errors, Puppet could automatically generate tickets when an object fails, and then the ticket system could automatically verify that the object is no longer in an error state before it allows the user to close the ticket.
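That verification loop might look something like this, again with hypothetical names throughout:

open_tickets = {}   # object name -> the tags attached to its ticket

def object_failed(obj, tags):
    # Autogenerate a ticket, carrying the object's tags, when it fails.
    open_tickets[obj] = set(tags) | {"error"}

def try_close(obj, current_state):
    # Refuse to close the ticket until the object has left the error state.
    if current_state == "error":
        return False
    open_tickets.pop(obj, None)
    return True

object_failed("service(apache)", {"culain", "webserver"})
print(try_close("service(apache)", "error"))    # False: still failing
print(try_close("service(apache)", "synced"))   # True: verified fixed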
One of my primary design goals in Puppet is to encourage and support configuration sharing. Configuration management will never advance as a field while every organization has to create its own definition for every server class, because those definitions, like servers managed without automation, present unnecessary and often arbitrary variation. So I know that collaboration will be critical to the future of Puppet.
How can tagging affect that collaboration? Would it be beneficial if people published the tag lists associated with each of the objects they manage? Could seeing someone else's tags on an object provide new ideas for organization or configuration? Would people who are unwilling to share their whole configuration be willing to just share their tags?
Could tags be used to discover a node's existing configuration? Maybe provide a few important tags through autodiscovery, like operating system, host name, and network, and then use that to start collecting a larger configuration over time.
This seems pretty damn unlikely.
Bjork or Goldthwait? Sorry, couldn't resist. Just like in cfengine (although I definitely took the long way around), classes in Puppet can generally be considered as booleans, i.e., tags. There's a heckuvalot of additional semantics associated with these tags compared to, say, a Flickr tag, since adding a tag causes work to happen on the system in question, but it can still be considered just a tag.
As the configurations get both more complex and more dynamic (changing often either because it is designated to do so by users, or because it is automatically reacting to network state, time, etc.), it might make the most sense to have a static configuration in one place that does not mention nodes at all, and then almost a scratch space for the nodes, where you can dynamically pin tags on a host and watch the chaos ensue.
At the very least, if you had a configuration that was changing constantly, you would almost definitely benefit from a map that showed the tags on a given host.
Is it worth deemphasizing the boolean aspect of tags and adding typing instead? For instance, is it worth differentiating server class tags from component tags? Or, like so much else, does it just make sense to intelligently pick class and component names, so that it's relatively apparent?
Obviously the easy solution is to start without typed tags, but it's something to keep in mind. I intuitively think that a lot of the power of tags comes specifically from their simplicity, and adding complexity would probably tend to decrease the flexibility.
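For what it's worth, even without real typed tags, a naming convention gets most of the way there; the prefixes here are invented:

plain = {"webserver", "apache", "culain"}                       # boolean tags
typed = {"class:webserver", "component:apache", "node:culain"}  # same tags, typed by convention

def of_type(tags, kind):
    # Pull out just the tags of one type, e.g., the server classes.
    return {t.split(":", 1)[1] for t in tags if t.startswith(kind + ":")}

print(of_type(typed, "class"))   # {'webserver'}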
I am attending the Web 2.0 Conference this week, so I'm preparing for it by thinking about how it can apply to Puppet. I'm guessing that I'm ignoring a significant portion of what's considered to be a crucial aspect of Web 2.0, because I'm focusing on the immense value-add of tagging objects and then providing simple interfaces for sharing and browsing tags, which I think of as the 'tag and flatten' method. That is, rather than building up specific, static hierarchies based on whatever categories or criteria, just tag every object in the hierarchy with whatever details you think matter and then flatten the hierarchy, letting the tags themselves draw the patterns out.
A good example of self-discovered patterns is Flickr's tag clusters, which are algorithmically determined groups of related tags -- that is, tags determined to be related by 1) individuals tagging individual items with whatever they want, and then 2) computers assessing those tags and finding sets of tags that seem "related", via whatever definition the algorithm is using.
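The simplest version of that assessment is plain co-occurrence counting: call two tags related if they show up together on enough items. Flickr's actual algorithm is surely more sophisticated; the data and threshold here are just for illustration:

from collections import Counter
from itertools import combinations

items = [
    {"apache", "webserver", "culain"},
    {"apache", "webserver", "dmz"},
    {"postfix", "mailserver", "dmz"},
]

pair_counts = Counter()
for tags in items:
    for a, b in combinations(sorted(tags), 2):
        pair_counts[(a, b)] += 1

# Tags that co-occur at least twice are deemed 'related'.
related = [pair for pair, n in pair_counts.items() if n >= 2]
print(related)   # [('apache', 'webserver')]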
How can Puppet take advantage of this? I mentioned it briefly in my letter to the O'Reilly Radar, but it's pretty simple. One way to talk about Puppet's goals is that it empowers sysadmins to normalize object specifications across an entire network. Given any configurable element on any machine on your network -- file, user, package, cron job, IP address -- that individual element should be mentioned only once in your configuration, and then every host or host class that needs that element (e.g., to install a package or provision a user) just imports that portion of the specification.
The top-down way of looking at this importing is that it presents a bit of a hierarchy, from server, to server class, to service, to element, except that it's not one hierarchy, it's a myriad of hierarchies, one for every node and every element. Try to draw a map of these relationships in anything resembling a real hierarchy and you will soon need more dimensions than string theory. Tag that same set of hierarchies, though, and flatten them, so that each configurable element is tagged with the host names, services, and server classes that mention it, and you are no longer forced into one way of seeing the data. You can draw out whatever structures you think are appropriate, and sufficient tools should be able to draw them out for you, just like Flickr's tag clusters.
If you do this for your whole network, you could go to this tagged-and-flattened list of elements and quickly determine which hosts have a given user on them, or which server classes require Apache. And because you have these tags on the elements themselves, you could subsequently tag those elements' data with the same tags. Centralized Apache logs are great, but they lose a lot of implicit information. Imagine a log centralizer that also tagged each log message with all of the Apache element's tags (localized to the specific host, of course -- that is, the tag list would not include every host name with Apache in it or all of the server classes that use Apache, just the tags related to Apache running locally) -- go to your log database and see all Apache logs related to a specific server class, or a specific server, or network, or whatever you want.
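Flattened, the whole network reduces to a single list of elements and their tags, and the questions above become simple lookups; all of the names below are invented:

elements = {
    "package(apache)": {"webserver", "culain", "host2"},
    "user(luke)":      {"nodebase", "culain", "host2", "host3"},
}

def with_tag(tag):
    # Every element mentioned by a given host, class, or service.
    return [name for name, tags in elements.items() if tag in tags]

# 'Which hosts have a given user on them' reads straight off the tags:
print(elements["user(luke)"])

# And the flattened list can be sliced by any tag, e.g., one host:
print(with_tag("culain"))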
Of course, those add-on tags require quite a bit of additional infrastructure -- standard means of referring to an individually configurable element (hah! just try to standardize the definition of an element, I dare you!), plus standard ways of retrieving the tags on them, and then tools that actually do so. I'm guessing that this is not worth it for many purposes or many organizations, but I think it is worth it for some, and I also think it adds a helluvalot more value than is necessarily obvious.
Puppet is not quite ready to support this level of pervasive tagging, at least partially because I've been stupidly focused on making it actually do work, but... maybe this is what I will do in the 36 hours or so I have between now and when the conference starts.
Here's a letter I sent to O'Reilly Radar, related to my attendance at the Web 2.0 conference:
# Define a reusable 'apache' component: the service, the package, and
# the configuration files, all specified in one place.
define apache {
    service { apache: running => true }
    package { apache: install => latest }
    file { "/etc/apache":
        source => "puppet://server/source",
        recurse => true
    }
}

import "apache" # the definition above

# A server class is just a collection of components.
class webserver {
    apache {}
}

# Map the class onto a specific node.
case $hostname {
    culain: { webserver {} }
}