<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-17147343</id><updated>2011-12-14T21:02:27.045-06:00</updated><title type='text'>Configuration Management</title><subtitle type='html'>Tales of the development of Puppet, a configuration management tool, along with anything else I happen to come across related to configuration management.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>14</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-17147343.post-112985756711588614</id><published>2005-10-20T20:18:00.000-05:00</published><updated>2005-10-20T20:19:27.120-05:00</updated><title type='text'>This blog has moved</title><content type='html'>I've migrated the blog to &lt;a href="http://madstop.com"&gt;madstop.com&lt;/a&gt;, including copying over most of the posts.  
Blogspot was just getting to be too difficult to write in, and I wanted the blog to be on that domain, rather than blogspot.com.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112985756711588614?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112985756711588614/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112985756711588614' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112985756711588614'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112985756711588614'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/10/this-blog-has-moved.html' title='This blog has moved'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112923955357246705</id><published>2005-10-13T16:39:00.000-05:00</published><updated>2005-10-13T16:40:25.216-05:00</updated><title type='text'>Self-updating</title><content type='html'>&lt;div class="document"&gt;&lt;br /&gt;&lt;p&gt;Now that the &lt;a class="reference" href="http://web2con.com"&gt;Web 2.0 Conference&lt;/a&gt; is over, it's time to get back to &lt;a class="reference" href="http://reductivelabs.com/projects/puppet"&gt;puppet&lt;/a&gt; development.  
The main problem I'm still trying to solve is service management -- I'm not quite sure there's a single abstraction that will work, since each and every *nix does service startup slightly differently, either using a single script for starting all services, or making links for everything but then using shell script config files (ugh) to determine whether a service actually starts, or using something like Solaris's SMF or Mac OS X's launchd to do it for you.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Trying to do service management has convinced me that the real answer is to not worry about getting it right just yet, since it's going to take many iterations to get it right on all of the platforms.  Instead, I'm going to focus on making Puppet easy to update, so that as things improve, it'll be easy for existing customers to get that better functionality.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I need to come up with some kind of versioning system within Puppet; I'm thinking that the Puppet framework itself will have one version, and then each primitive (e.g., 'file', 'package', etc.) will have its own version.  
This might not work out that well in the long run, since we'll often update a primitive for one platform (e.g., add a new packaging type) without updating it for the other platforms, but it seems to be an acceptable compromise for now.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;The first step is to come up with a way of registering and checking versions (which should be pretty easy, since I'm already registering all of the primitives), then I just need to add a 'versioncheck' method or something to the primary configuration protocol, and finally have some kind of self-update mechanism (yay!).&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Hmmm.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112923955357246705?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112923955357246705/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112923955357246705' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112923955357246705'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112923955357246705'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/10/self-updating.html' title='Self-updating'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112864406847803212</id><published>2005-10-06T19:09:00.000-05:00</published><updated>2005-10-06T19:14:28.486-05:00</updated><title type='text'>Web 2.0 so far</title><content type='html'>I have to say that overall the conference is amazing, but I'm disappointed that the Web 2.0 concepts are not being broken down structurally as much as I'd hoped.&lt;br /&gt;&lt;br /&gt;This is a huge marketplace, though, especially for me -- all of these social software companies are going to be building relatively large hosting setups, and they don't want to suffer from those setups any more than they have to.  I've already talked to tons of potential customers, and am basically out of cards.&lt;br /&gt;&lt;br /&gt;What I haven't been able to get, though, is an in-depth discussion of applying the principles of Web 2.0 to something very different, or even what the principles are.  I've also been pretty disappointed that not many of the panels are talking about the structural aspects of Web 2.0 and instead are talking a lot more about the market aspects.  In fact, it seems like quite a few of the more lengthy panelists really don't understand why they're on the stage or what this conference is supposed to be about.  
I've seen a few questions already that seemed pretty obvious but met blank faces.&lt;br /&gt;&lt;br /&gt;It now looks relatively unlikely that I'll be able to delve deep into the technology and principles before I leave, but I would love to find someone who's really interested in the principles themselves and then spend an hour or so hashing it out.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112864406847803212?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112864406847803212/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112864406847803212' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112864406847803212'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112864406847803212'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/10/web-20-so-far.html' title='Web 2.0 so far'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112848706161506259</id><published>2005-10-04T23:35:00.000-05:00</published><updated>2005-10-04T23:37:41.616-05:00</updated><title type='text'>HTML Sucks</title><content type='html'>So, blogspot has already lost two of my posts, and it forces me to write in HTML instead of a simpler markup language like restructured text, so I'm writing in restructured text on the side and sending the rst2html output up to 
blogspot.  That's why the html looks insanely bad.&lt;br /&gt;&lt;br /&gt;I don't expect to be on this site that much longer, considering that 1) it's a pretty big pain to post, and 2) it is fond of losing posts, especially using Camino.&lt;br /&gt;&lt;br /&gt;Anyone have any better recommendations?  Especially something that lets me write in a simplified markup like ReST, rather than having to write real HTML?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112848706161506259?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112848706161506259/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112848706161506259' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112848706161506259'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112848706161506259'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/10/html-sucks.html' title='HTML Sucks'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112848659840504430</id><published>2005-10-04T23:29:00.000-05:00</published><updated>2005-10-04T23:35:11.860-05:00</updated><title type='text'>Pervasive Tagging</title><content type='html'>&lt;div class="document"&gt;&lt;br /&gt;&lt;p&gt;I'm writing this on the plane to the Web 2.0 conference.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I now have basic tagging working 
in Puppet.  Tags are generated on the server when a configuration is provided, and they're stored with the individual objects in the configuration.  For instance, here is what some of the tags look like in a short example configuration on my OS X laptop:&lt;/p&gt;&lt;br /&gt;&lt;pre class="literal-block"&gt;&lt;br /&gt;file(/etc/issue):&lt;br /&gt;    remotefile base nodebase tsetse puppet file /etc/issue&lt;br /&gt;file(/etc/motd):&lt;br /&gt;    base nodebase tsetse puppet file /etc/motd&lt;br /&gt;file(/root):&lt;br /&gt;    remotefile base nodebase tsetse puppet file /root&lt;br /&gt;file(/tmp/screens/.):&lt;br /&gt;    base nodebase tsetse puppet file /tmp/screens/.&lt;br /&gt;file(/usr/local/scripts):&lt;br /&gt;    remotefile base nodebase tsetse puppet file /usr/local/scripts&lt;br /&gt;file(/var/spool/cron):&lt;br /&gt;    base nodebase tsetse puppet file /var/spool/cron&lt;br /&gt;symlink(/etc/resolv.conf):&lt;br /&gt;    darwin nodebase tsetse puppet symlink /etc/resolv.conf&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;p&gt;I'll break down the configuration that generated this admittedly basic list of objects and tags.&lt;/p&gt;&lt;br /&gt;&lt;ul class="simple"&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;tsetse&lt;/em&gt; The laptop name&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;base&lt;/em&gt; The base class; all nodes are members&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;nodebase&lt;/em&gt; The, um, base node; all nodes inherit from it, and it basically just loads the node's operatingsystem class and the base class.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;puppet&lt;/em&gt; This is the top-level collection of the entire configuration.  
This tag may not stay, since it would currently be on every single object.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;remotefile&lt;/em&gt; A simple wrapper function to encapsulate my primary method of copying files around my network.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;darwin&lt;/em&gt; The laptop's operating system class (based on the output of &lt;cite&gt;uname -s&lt;/cite&gt;)&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;p&gt;For each line, the last two tags are the object type and the path to the object (for other object types, the last tag could be a service name, user name, etc.).  I think I should also add a tag for each of the files that mention the object, along with the respective lines from each file.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;The tags associated with a specific configuration are sent with the configuration to the appropriate server, but they're also merged into a central repository, so as each node connects and generates its tag list, the central tag repository gets more comprehensive.  (Each node's configuration needs to be compiled before its tag list can be generated, and the configuration requires information from the client before it can be compiled, so a node needs to connect before its tags are available.)&lt;/p&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="possible-uses" name="possible-uses"&gt;Possible uses&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;The tags are currently unused, but their mere existence has got me thinking:&lt;/p&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Logging&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I have been expecting that most people would just use syslog to pass logs around -- it's easy, it's pervasive, and Puppet has been developed specifically to be easily compatible with it.  However, it would be difficult to shoehorn tags into syslog, or at least it would be annoying to get them back out; it would need to be based on pattern matching a specific line, which is never pleasant.  
I'm thinking instead that I could write a log server capable of receiving these log messages and then enhancing them to store the tags associated with the node that generated the message.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Imagine being able to trivially find every log message generated by any darwin node in a given time period, or the DNS-related error messages from a specific network.  Build a database of all of these log messages, with the fields and tags set up appropriately, slap a Rails interface on it, and you could get that pretty easily.&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Metrics&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Similarly, I have been expecting people to ship their Puppet metrics off to a specific performance app, but there probably aren't any apps out there that gracefully handle associating arbitrary tags with each metric.  Build a simple server to accept the metrics that Puppet generates, and suddenly you can generate change count reports on a specific server class like 'webserver' or on how many out-of-sync security directives there are in the DMZ.&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Ticketing&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I know I'd like to have each of these tags stored with each ticket generated from Puppet (yes, I hope to eventually have Puppet autogenerate tickets).  
Query for all outstanding tickets on the web farm or any tickets related to the recent Solaris upgrade.&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Reference&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;With a web-based annotation system for the configuration, you could use tags as a kind of wiki keyword system -- selecting a tag in a log message automatically finds the definition of that tag (as a server class or functional component or node or whatever).&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="tag-injections" name="tag-injections"&gt;Tag injections&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;One of the things left unaddressed in the current proof-of-concept system is that servers will often want to generate their own tags.  In particular, I can see adding an 'unmanaged' tag to objects that get mentioned in a configuration (e.g., as a requirement) but that are not managed, or storing the sync state as at least a local tag for an object (e.g., error, synced).  Should these tags make their way to the central system, or should users at least be able to get access to those tags?&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="state-tags" name="state-tags"&gt;State tags&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;The idea of nodes injecting tags raises the possibility of using tags as a means of storing state, or rather, turning the state of an object into just another tag.  Again, this would be useful mostly for reports, but it could certainly help quickly find all nodes in an error state, along with the objects associated with those nodes, and this would be especially useful if it were updated live.  
It might make sense to just support this as a live query against the system -- it should be pretty light-weight, at least compared to, say, querying the actual configuration.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;This most likely makes sense to use as either a last resort or as a verification system.  If you've got a ticketing system for reporting errors, Puppet could automatically generate tickets when an object fails, and then the ticket system could automatically verify that the object is no longer in an error state before it allows the user to close the ticket.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="collaborative-tagging" name="collaborative-tagging"&gt;Collaborative tagging&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;One of my primary design goals in Puppet is to encourage and support configuration sharing.  Configuration management will never advance as a field when every organization has to create its own definition for every server class, because those definitions, like servers managed without automation, present unnecessary and often arbitrary variation.  So, I know that collaboration will already be critical to the future of Puppet.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;How can tagging affect that collaboration?  Would it be beneficial if people published the tag lists associated with each of the objects they manage?  Could seeing someone else's tags on an object provide new ideas for organization or configuration?  Would people who are unwilling to share their whole configuration be willing to just share their tags?&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="configuration-discovery" name="configuration-discovery"&gt;Configuration discovery&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;Could tags be used to discover a node's existing configuration?  
Maybe provide a few important tags through autodiscovery, like operating system, host name, and network, and then use those to start collecting a larger configuration over time.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;This seems pretty damn unlikely.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="tags-or-classes" name="tags-or-classes"&gt;Tags or classes?&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;Bjork or Goldthwait?  Sorry, couldn't resist.  Just like in cfengine (although I definitely took the long way around), classes in Puppet can generally be considered as booleans, i.e., tags.  There's a heckuvalot of additional semantics associated with these tags compared to, say, a Flickr tag, since adding a tag causes work to happen on the system in question, but it can still be considered just a tag.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;As configurations get both more complex and more dynamic (changing often either because they are designated to do so by users, or because they are automatically reacting to network state, time, etc.), it might make the most sense to have a static configuration in one place that does not mention nodes at all, and then almost a scratch space for the nodes, where you can dynamically pin tags on a host and watch the chaos ensue.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;At the very least, if you had a configuration that was changing constantly, you would almost definitely benefit from a map that showed the tags on a given host.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="typed-tags" name="typed-tags"&gt;Typed tags&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;Is it worth deemphasizing the boolean aspect of tags and adding typing instead?  For instance, is it worth differentiating server class tags from component tags?  
Or, like so much else, does it just make sense to intelligently pick class and component names, so that it's relatively apparent?&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Obviously the easy solution is to start without typed tags, but it's something to keep in mind.  I intuitively think that a lot of the power of tags comes specifically from their simplicity, and adding complexity would probably tend to decrease the flexibility.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112848659840504430?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112848659840504430/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112848659840504430' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112848659840504430'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112848659840504430'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/10/pervasive-tagging.html' title='Pervasive Tagging'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112837575374723005</id><published>2005-10-03T16:40:00.000-05:00</published><updated>2005-10-03T16:48:39.886-05:00</updated><title type='text'>Tag and flatten</title><content type='html'>&lt;div class="document"&gt;&lt;br /&gt;&lt;p&gt;I am attending the &lt;a class="reference" 
href="http://www.web2con.com/"&gt;Web 2.0 Conference&lt;/a&gt; this week, so I'm preparing for it by thinking about how Web 2.0 can apply to Puppet.  I'm guessing that I'm ignoring a significant portion of what's considered to be a crucial aspect of Web 2.0, because I'm focusing on the immense value-add of tagging objects and then providing simple interfaces for sharing and browsing tags, which I think of as the 'tag and flatten' method.  That is, rather than building up specific, static hierarchies based on whatever categories or criteria, just tag every object in the hierarchy with whatever details you think matter and then flatten the hierarchy, letting the tags themselves draw patterns out.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;A good example of self-discovered patterns is &lt;a class="reference" href="http://flickr.com"&gt;Flickr&lt;/a&gt;'s tag clusters, which are algorithmically determined groups of related tags -- that is, they are tags that are determined to be related by 1) individuals tagging individual items with whatever they want, and then 2) computers assessing those tags and finding sets of tags that seem &amp;quot;related&amp;quot;, via whatever definition the algorithm is using.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;How can Puppet take advantage of this?  I mentioned it briefly in my &lt;a class="reference" href="http://config-mgmt.blogspot.com/2005/09/web-20-and-puppet.html"&gt;letter to the O'Reilly Radar&lt;/a&gt;, but it's pretty simple.  One way to talk about Puppet's goals is that it is empowering sysadmins to normalize object specifications across an entire network.  
Given any configurable element on any machine on your network -- file, user, package, cron job, IP address -- that individual element should be mentioned only once in your configuration, and then every host or host class that needs that element (e.g., to install a package or provision a user) just imports that portion of the specification.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;The top-down way of looking at this importing is that it presents a bit of a hierarchy, from server, to server class, to service, to element, except that it's not one hierarchy, it's a myriad of hierarchies, one for every node and every element.  Try to draw a map of these relationships in anything resembling a real hierarchy and you will soon need more dimensions than &lt;a class="reference" href="http://en.wikipedia.org/wiki/String_theory"&gt;string theory&lt;/a&gt;.  Take that same set of hierarchies, though, and tag-and-flatten them, so that each configurable element is tagged with the host names, services, and server classes that mention it, and you are no longer forced into one way of seeing the data.  You can draw out whatever structures you think are appropriate, and sufficiently capable tools should be able to draw them out for you, just like Flickr's tag clusters.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;If you do this for your whole network, you could go to this tagged-and-flattened list of elements and quickly determine which hosts have a given user on them, or which server classes require Apache.  And because you have these tags on the elements themselves, you could subsequently tag those elements' data with the same tags.  Centralized Apache logs are great, but they lose a lot of implicit information.  
Imagine a log centralizer that also tagged each log message with all of the Apache element's tags (localized to the specific host, of course -- that is, the tag list would not include every host name that has Apache on it or all of the server classes that use Apache, just the tags related to Apache running locally) -- you could go to your log database and see all Apache logs related to a specific server class, or a specific server, or network, or whatever you want.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Of course, those add-on tags require quite a bit of additional infrastructure -- a standard means of referring to an individually configurable element (hah!  just try to standardize the definition of an element, I dare you!), plus standard ways of retrieving the tags on it, and then tools that actually do so.  I'm guessing that this is not worth it for many purposes or many organizations, but I think it is worth it for some, and I also think it adds a helluvalot more value than is immediately obvious.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Puppet is not quite ready to support this level of pervasive tagging, at least partially because I've been stupidly focused on making it actually do work, but... 
maybe this is what I will do in the 36 hours or so I have between now and when the conference starts.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112837575374723005?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112837575374723005/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112837575374723005' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112837575374723005'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112837575374723005'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/10/tag-and-flatten.html' title='Tag and flatten'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112794981330209981</id><published>2005-09-28T18:21:00.000-05:00</published><updated>2005-10-12T23:25:32.420-05:00</updated><title type='text'>Web 2.0 and Puppet</title><content type='html'>Here's a letter I sent to O'Reilly Radar, in relation to my attendance of the Web 2.0 conference:&lt;br /&gt;&lt;br /&gt;I just announced the beta release of my software startup's main product, Puppet (http://reductivelabs.com/projects/puppet), which is a GPLed configuration management solution written in Ruby (no, the point of this email isn't a pitch, it's to ask two questions related to Web 2.0).  
At this point Puppet is analogous to cfengine, although I believe I've created a significantly superior product, especially Puppet's language.  I actually wrote a couple of cfengine articles for onlamp.com last year and I spent three years doing cfengine consulting, along with spending 4 months trying (and failing) to rewrite cfengine's parser, so I know cfengine and its language pretty well.&lt;br /&gt;&lt;br /&gt;The reason that I'm writing the O'Reilly Radar about Puppet is that I have plans to significantly develop the Puppet client to create a kind of Puppet mesh network, and while I am convinced that there is some value-add to doing this that's analogous to the value-add in Web 2.0 sites, I can't quite pin it down.&lt;br /&gt;&lt;br /&gt;I'm attending the Web 2.0 conference in October, and I'd like to show up with, at the least, some extra contacts so that my time in the hallway track (as it's called at LISA) is a bit more valuable, but I'd especially love to have a bit of a dialog about this idea before I show up, so that the time at the conference is especially valuable.&lt;br /&gt;&lt;br /&gt;So, what am I thinking of?  Here are some of the important aspects of the setup:&lt;br /&gt;&lt;br /&gt;* Each puppet daemon will be modeling the entire configuration of the server on which it's running, using higher-level elements like packages, services, and files.  
In addition to the normal elements, though, the client will also be modeling all of the relationships between objects -- if a service requires a file, the client will know it.&lt;br /&gt;&lt;br /&gt;* Each daemon will also be doing significant monitoring and record-keeping on the client and will have information available on what work it has done -- what packages it has installed or upgraded, which files it had to fix permissions or ownership of, which services had to be restarted, etc.&lt;br /&gt;&lt;br /&gt;* Daemons will eventually also be able to model relationships between different servers, although that's not going to happen until probably 2.0.  So, what we have now is a mesh capable of modeling not just a single host's configuration but that of the entire network, including hopefully all of the interrelationships between hosts and the services on the hosts.  In addition, we have historical information about what things previously looked like and what we've had to do to keep the configuration correct.&lt;br /&gt;&lt;br /&gt;How can we throw some Web 2.0 goodness into the picture?  Well, I'm not really sure myself, at least partially because the definition of Web 2.0 doesn't seem very clear, but it does kind of necessarily imply humans visiting websites, and here we have neither humans nor websites.  So we first have to ask whether it makes sense even to talk about Web 2.0 without those two key features; or rather, it makes sense to ask whether the general principles of Web 2.0 extend beyond the web and into general connectedness.&lt;br /&gt;&lt;br /&gt;I think they do.  
Let's take a simple example:  Puppet has classing capabilities, where you collect objects and name the collection:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;    define apache {&lt;br /&gt;        service { apache: running =&gt; true }&lt;br /&gt;        package { apache: install =&gt; latest }&lt;br /&gt;        file { "/etc/apache":&lt;br /&gt;            source =&gt; "puppet://server/source",&lt;br /&gt;            recurse =&gt; true&lt;br /&gt;        }&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Like most things, it makes sense to create the configuration in this kind of hierarchical style, but like most things, we want to be able to get more out of the configuration than simple hierarchy.  Let's take the flickr route, then, and consider each of these elements to be tagged with 'apache', and then maybe also tag them each with the name of each server to which the 'apache' definition is applied.  This doesn't seem too useful to start with, but if we extend it all the way up -- every element on a system is tagged with each class or definition that includes it, and since both classes and definitions can be hierarchical, this could be pretty big:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;    import "apache" # the definition above&lt;br /&gt;&lt;br /&gt;    class webserver {&lt;br /&gt;        apache {}&lt;br /&gt;    }&lt;br /&gt;&lt;br /&gt;    case $hostname {&lt;br /&gt;        culain: { webserver {} }&lt;br /&gt;    }&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;This would result in each of the objects in the apache definition getting tagged with 'apache', 'webserver', and 'culain'.&lt;br /&gt;&lt;br /&gt;Let's go one better, though; this is marginally interesting on one client, but let's extend it to the whole network:  Let's normalize all configuration elements across the entire network.  
Let's do the same tag-and-flatten to every element on every node on the whole network (ignore where we do this for now, whether on a central server or whatever) -- you now have, continuing with our example, a single apache package element (or maybe one for each major rev), tagged with every host that has apache installed along with a tag for every class or definition that refers to an apache package.&lt;br /&gt;&lt;br /&gt;Now take this tagged-and-flattened list and make it available to every node, and add some CLI tools to access it.  Now you can connect to any machine on the network and query for tags related to any element, and you'll get back all kinds of metadata -- what hosts also have that element, what classes care about it, what elements depend on it, that kind of thing.&lt;br /&gt;&lt;br /&gt;Would this be useful?  I can only think it would be useful, even with just that.  But then take this and start doing all kinds of weird things, like looking for clusters like flickr's tag clusters -- I can basically guarantee you that you'll find all kinds of interesting patterns and clusters in this flattened-and-tagged list.&lt;br /&gt;&lt;br /&gt;So, my two questions to the O'Reilly Radar team are:&lt;br /&gt;&lt;br /&gt;1) Is this Web 2.0?&lt;br /&gt;&lt;br /&gt;2) Are any of you interested in having a bit of a conversation about this?  If not, is there a forum that it makes sense to bring this to?  
I think Puppet will be a useful and popular tool regardless of whether I can apply Web 2.0 to it, but if I can take all of this information and really do interesting things with it, I think I could have a great tool, and I think I could seriously affect how systems are managed and monitored.&lt;br /&gt;&lt;br /&gt;No idea if you're interested, but feel free to post this on the blog if you think it would generate interesting discussion.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112794981330209981?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112794981330209981/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112794981330209981' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112794981330209981'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112794981330209981'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/web-20-and-puppet.html' title='Web 2.0 and Puppet'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112794967753280781</id><published>2005-09-28T18:10:00.000-05:00</published><updated>2005-09-28T18:21:17.536-05:00</updated><title type='text'>Instance overrides get more complicated...</title><content type='html'>Okay, things just got more complicated.&lt;br /&gt;&lt;br /&gt;It's pretty straightforward if none of 
the objects in question have any relationships.  You destroy the implicit objects, and swap in the explicit ones.&lt;br /&gt;&lt;br /&gt;It gets much more complicated with things like dependencies:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;file { "/etc": recurse =&gt; true, owner =&gt; root }&lt;br /&gt;service { apache: subscribe =&gt; file["/etc/apache/apache.conf"] }&lt;br /&gt;file { "/etc/apache": owner =&gt; apache, recurse =&gt; true }&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;In this situation, the "/etc/apache" declaration should override the "/etc" declaration, but we &lt;span style="font-style:italic;"&gt;definitely&lt;/span&gt; don't want it to happen in such a way that we lose the subscription by the apache process.&lt;br /&gt;&lt;br /&gt;So, we can't just destroy the old files and plop in the new ones; we have to go in and replace old values with new values.  But we have to do it recursively -- the overriding files can be whole trees of files, not just individual files.  And, importantly, their children are implicit, which means we can't just use normal overriding mechanisms, or it would look like one implicit file overriding another.&lt;br /&gt;&lt;br /&gt;Maybe the solution is to switch from "hard" links like I'm using now with subscriptions (i.e., when I make a subscription I actually store a reference to the object) to "symbolic" links (i.e., just record the type and name).  
This would allow me to swap new objects in without screwing up the subscriptions.&lt;br /&gt;&lt;br /&gt;That sounds like the best option; I've already had one really nasty bug that was caused by this storing of references (if I created an object that subscribed to something but didn't get completely created, I removed the object but didn't destroy the subscription, which resulted in the object staying around -- very bad, and very difficult to track down).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112794967753280781?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112794967753280781/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112794967753280781' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112794967753280781'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112794967753280781'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/instance-overrides-get-more.html' title='Instance overrides get more complicated...'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112794840655752463</id><published>2005-09-28T17:52:00.000-05:00</published><updated>2005-09-28T18:00:06.563-05:00</updated><title type='text'>Stupid services, I'm moving on</title><content type='html'>I'm giving up on service management for a few days -- I think 
I need to stew on it or something.&lt;br /&gt;&lt;br /&gt;Instead I'm going to take a crack at handling overriding object values, along with maybe the value intersections I mentioned a couple days ago, which will need to involve parameter values as arrays.&lt;br /&gt;&lt;br /&gt;Fortunately I should be able to keep them pretty separate.  I'm first going to create an 'implicit' attribute, which will be set on all instances created through recursion (e.g., recursing through '/etc' will result in the 'implicit' flag being set on all resulting files except '/etc' itself).  Any conflicts between implicit and explicit instances will always be entirely won by the explicit instance -- that is, no intersection will be sought.&lt;br /&gt;&lt;br /&gt;But implicitness cannot handle cases where two instances that are either both implicit or both explicit collide.  So the next step will be to use intersections for handling these conflicts.  That is, the values will be the shorter list of all values allowed by all specifications.  The only problem with this method is that it could introduce ordering, since when it comes time to fix a problem, you have to pick the answer, and I was planning on using the value specified first, but whose first?  For instance, take this configuration:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;file { "/etc/funtest": owner =&gt; [root, adm] }&lt;br /&gt;file { "/etc/funtest": owner =&gt; [adm, root] }&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The value list is the same, but the order is different.  How do I pick which value to set the owner to?  
It's essentially unsolvable at this point, but I don't have to worry about it until I've already got the implicit stuff working, thankfully.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112794840655752463?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112794840655752463/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112794840655752463' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112794840655752463'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112794840655752463'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/stupid-services-im-moving-on.html' title='Stupid services, I&apos;m moving on'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112778813623831130</id><published>2005-09-26T21:27:00.000-05:00</published><updated>2005-09-26T21:28:56.243-05:00</updated><title type='text'>Autoinst</title><content type='html'>Someone on #puppet on irc.freenode.net pointed out &lt;a href=http://autoinst.tigris.org/&gt;Autoinst&lt;/a&gt;.  Anyone out there use it?  
How does it compare to, say, &lt;a href=http://eq.rsug.itd.umich.edu/software/radmind/&gt;Radmind&lt;/a&gt;?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112778813623831130?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112778813623831130/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112778813623831130' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112778813623831130'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112778813623831130'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/autoinst.html' title='Autoinst'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112778493907556175</id><published>2005-09-26T20:20:00.000-05:00</published><updated>2005-10-04T23:32:26.813-05:00</updated><title type='text'>Specificity, priorities, and intersections, Oh my!</title><content type='html'>&lt;div class="document"&gt;&lt;br /&gt;&lt;p&gt;I'm writing this on the plane to the Web 2.0 conference.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I now have basic tagging working in Puppet.  Tags are generated on the server when a configuration is provided, and they're stored with the individual objects in the configuration.  
For instance, here is what some of the tags look like in a short, example configuration on my OS X laptop:&lt;/p&gt;&lt;br /&gt;&lt;pre class="literal-block"&gt;&lt;br /&gt;file(/etc/issue): remotefile base nodebase tsetse puppet file /etc/issue&lt;br /&gt;file(/etc/motd): base nodebase tsetse puppet file /etc/motd&lt;br /&gt;file(/root): remotefile base nodebase tsetse puppet file /root&lt;br /&gt;file(/tmp/screens/.): base nodebase tsetse puppet file /tmp/screens/.&lt;br /&gt;file(/usr/local/scripts): remotefile base nodebase tsetse puppet file /usr/local/scripts&lt;br /&gt;file(/var/spool/cron): base nodebase tsetse puppet file /var/spool/cron&lt;br /&gt;symlink(/etc/resolv.conf): darwin nodebase tsetse puppet symlink /etc/resolv.conf&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;p&gt;I'll break down the configuration that generated this admittedly-basic list of&lt;br /&gt;objects and tags.&lt;/p&gt;&lt;br /&gt;&lt;ul class="simple"&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;tsetse&lt;/em&gt; The laptop name&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;base&lt;/em&gt; The base class; all nodes are members&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;nodebase&lt;/em&gt; The, um, base node; all nodes inherit from it, and it basically just loads the node's operatingsystem class and the base class.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;puppet&lt;/em&gt; This is the top-level collection of the entire configuration.  This tag may not stay, since it would currently be on every single object.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;remotefile&lt;/em&gt; A simple wrapper function to encapsulate my primary method of copying files around my network.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;em&gt;darwin&lt;/em&gt; The laptop's operating system class (based on the output of &lt;cite&gt;uname -s&lt;/cite&gt;)&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;p&gt;For each line, the last two tags are the object type and the path to the object (they could also be service names, user names, etc.).  
I think I should also add a tag for each of the files that mention the object, along with the respective lines from each file.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;The tags associated with a specific configuration are sent with the configuration to the appropriate server, but they're also merged into a central repository, so as each node connects and generates its tag list, the central tag repository gets more comprehensive.  (Each node's configuration needs to be compiled before its tag list can be generated, and the configuration requires information from the client before it can be compiled, so each node needs to connect to get its tags.)&lt;/p&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="possible-uses" name="possible-uses"&gt;Possible uses&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;The tags are currently unused, but their mere existence has got me thinking:&lt;/p&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Logging&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I have been expecting that most people would just use syslog to pass logs around -- it's easy, it's pervasive, and Puppet has been developed specifically to be easily compatible.  However, it would be difficult to shoehorn tags into syslog, or at least it would be annoying to get them back out; it would need to be based on pattern matching a specific line, which is never pleasant.  I'm thinking instead that I could write a log server capable of receiving these log messages, and then enhancing the log messages to store the tags associated with the node that generated the message.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Imagine being able to trivially find every log message generated by any darwin node in a given time period, or the error messages from a specific network associated with DNS.  
Build a database of all of these log messages, with the fields and tags set up appropriately, slap a Rails interface on it, and you could get that pretty easily.&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Metrics&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Similarly, I have been expecting people to ship their Puppet metrics off to a specific performance app, but there probably aren't any apps out there that gracefully handle associating arbitrary tags with each metric.  Build a simple server to accept the metrics that Puppet generates, and suddenly you can generate change count reports on a specific server class like 'webserver' or on how many out of sync security directives there are in the DMZ.&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Ticketing&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;I know I'd like to have each of these tags stored with each ticket generated from Puppet (yes, I hope to eventually have Puppet autogenerate tickets).  Query for all outstanding tickets on the web farm or any tickets related to the recent Solaris upgrade.&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;p class="first"&gt;&lt;strong&gt;Reference&lt;/strong&gt;&lt;/p&gt;&lt;br /&gt;&lt;p&gt;With a web-based annotation system for the configuration, you could use tags as a kind of wiki keyword system -- selecting a tag in a log message automatically finds the definition of that tag (as a server class or functional component or node or whatever).&lt;/p&gt;&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="tag-injections" name="tag-injections"&gt;Tag injections&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;One of the things unaddressed in the current proof-of-concept system is that servers will often want to generate their own tags.  
In particular, I can see adding an 'unmanaged' tag to objects that get mentioned in a configuration (e.g., as a requirement) but that are not managed, or storing the sync state as at least a local tag for an object (e.g., error, synced).  Should these tags make their way to the central system, or should users at least be able to get access to those tags?&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="state-tags" name="state-tags"&gt;State tags&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;The concept of nodes injecting tags brings up the concept of maybe using tags as a means of storing state, or rather, turning the state of an object into just another tag.  Again, this would be useful mostly for reports, but it could certainly help quickly find all nodes in an error state, along with the objects associated with those nodes, and this would be especially useful if it were updated live.  It might make sense to just support this as a live query against the system -- it should be pretty light-weight, at least compared to, say, querying the actual configuration.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;This most likely makes sense to use as either a last resort or as a verification system.  If you've got a ticketing system for reporting errors, Puppet could automatically generate tickets when an object fails, and then the ticket system could automatically verify that the object is no longer in an error state before it allows the user to close the ticket.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="collaborative-tagging" name="collaborative-tagging"&gt;Collaborative tagging&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;One of my primary design goals in Puppet is to encourage and support configuration sharing.  
Configuration management will never advance as a field when every organization has to create its own definition for every server class, because those definitions, like servers managed without automation, present unnecessary and often arbitrary variation.  So, I know that collaboration will already be critical to the future of Puppet.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;How can tagging affect that collaboration?  Would it be beneficial if people published the tag lists associated with each of the objects they manage?  Could seeing someone else's tags on an object provide new ideas for organization or configuration?  Would people who are unwilling to share their whole configuration be willing to just share their tags?&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="configuration-discovery" name="configuration-discovery"&gt;Configuration discovery&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;Could tags be used to discover a node's existing configuration?  Maybe provide a few important tags through autodiscovery, like operating system, host name, and network, and then use that to start collecting a larger configuration over time.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;This seems pretty damn unlikely.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="tags-or-classes" name="tags-or-classes"&gt;Tags or classes?&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;Bjork or Goldthwait?  Sorry, couldn't resist.  Just like in cfengine (although I definitely took the long way around), classes in Puppet can generally be considered as booleans, i.e., tags.  
There's a heckuvalot of additional semantics associated with these tags compared to, say, a Flickr tag, since adding a tag causes work to happen on the system in question, but it can still be considered just a tag.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;As the configurations get both more complex and more dynamic (changing often either because they are designated to do so by users, or because they are automatically reacting to network state, time, etc.), it might make the most sense to have a static configuration in one place that does not mention nodes at all, and then almost a scratch space for the nodes, where you can dynamically pin tags on a host and watch the chaos ensue.&lt;/p&gt;&lt;br /&gt;&lt;p&gt;At the very least, if you had a configuration that was changing constantly, you would almost definitely benefit from a map that showed the tags on a given host.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="section"&gt;&lt;br /&gt;&lt;h1&gt;&lt;a id="typed-tags" name="typed-tags"&gt;Typed tags&lt;/a&gt;&lt;/h1&gt;&lt;br /&gt;&lt;p&gt;Is it worth deemphasizing the boolean aspect of tags, and adding typing instead?  For instance, is it worth differentiating server class tags from component tags?  Or, like so much else, does it just make sense to intelligently pick class and component names, so that it's relatively apparent?&lt;/p&gt;&lt;br /&gt;&lt;p&gt;Obviously the easy solution is to start without typed tags, but it's something to keep in mind.  
I intuitively think that a lot of the power of tags comes specifically from their simplicity, and adding complexity would probably tend to decrease the flexibility.&lt;/p&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112778493907556175?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112778493907556175/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112778493907556175' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112778493907556175'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112778493907556175'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/specificity-priorities-and.html' title='Specificity, priorities, and intersections, Oh my!'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112777252685837195</id><published>2005-09-26T16:50:00.000-05:00</published><updated>2005-09-26T17:09:53.330-05:00</updated><title type='text'>Object specificity</title><content type='html'>One of the things I've been struggling with is how to handle conflicts within Puppet.  
Some conflicts should generate an error:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;class base {&lt;br /&gt;   file { "/etc/apache": owner =&gt; root }&lt;br /&gt;}&lt;br /&gt;class webserver {&lt;br /&gt;   file { "/etc/apache": owner =&gt; httpd }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;include base, webserver&lt;br /&gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;There's no way to resolve this conflict, since the system can't decide which is more important.  Other times, it's very clear how to handle conflicts:&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;file { "/etc": mode =&gt; 644, recurse =&gt; true }&lt;br /&gt;file { "/etc/shadow": mode =&gt; 440 }&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;In this case, the first statement affects /etc/shadow implicitly but the second one does so explicitly, so it's pretty clear who should win.&lt;br /&gt;&lt;br /&gt;It can get a bit sketchier, though:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;class base {&lt;br /&gt;   file { "/etc/apache": owner =&gt; root }&lt;br /&gt;}&lt;br /&gt;class solaris inherits base {&lt;br /&gt;   file { "/etc/apache": owner =&gt; httpd }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;include solaris&lt;br /&gt;&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;In this case, it seems somewhat clear that the solaris specification should override the base specification, but, well, we don't really have any way to know whether that's acceptable or not -- it could be that the base specification was a security requirement and any deviations would break policy.&lt;br /&gt;&lt;br /&gt;At this point, I'm adding an 'implicit?' 
method to objects, which will test whether they were explicitly specified (by testing what their parent object is) or whether they were specified through some kind of recursion process.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112777252685837195?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112777252685837195/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112777252685837195' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112777252685837195'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112777252685837195'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/object-specificity.html' title='Object specificity'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112775754191081323</id><published>2005-09-26T12:58:00.000-05:00</published><updated>2005-09-26T12:59:01.913-05:00</updated><title type='text'>Service status</title><content type='html'>I'm in the process of converting my &lt;a href=http://www.cfengine.org&gt;cfengine&lt;/a&gt; configuration to a Puppet configuration, and I realized that, amazingly, Apache 2 on Debian does not provide a &lt;span style="font-style:italic;"&gt;'status'&lt;/span&gt; command in its init script.  
Even worse, when you call it with &lt;span style="font-style:italic;"&gt;status&lt;/span&gt; as an argument, it prints a usage message but then exits with an exit code of 0, which means that I can't tell that it failed.  &lt;br /&gt;&lt;br /&gt;So, I figure, ok, maybe apache2ctl would do it for me.  Nope, it tries to hit the local server's &lt;span style="font-style:italic;"&gt;status.cgi&lt;/span&gt; script, but even if it gets a 404 it still exits with a return code of 0.  This means, basically, that Apache 2 on Debian provides no way out of the box to automatically tell whether the service is running at all, much less running well, which kind of shoots my whole service management plan in the head. I was planning on relying on &lt;span style="font-style:italic;"&gt;status&lt;/span&gt; commands on init scripts, but I obviously can't do that.&lt;br /&gt;&lt;br /&gt;It looks like most platforms are moving to some kind of service manager -- Solaris 10's service manager is certainly highly touted, and Fedora (and, one assumes, Red Hat) has a 'service' command that does something similar.  So, I need to redo service management so that it uses the local service manager if one is available, and if not then does what it can.&lt;br /&gt;&lt;br /&gt;At this point I'm thinking about either allowing users to specify status commands (e.g., they would specify a command that searched through the process table), or maybe have a boolean 'hasstatus' flag that services support, so you could specify whether the init script should be used to get a status or whether Puppet should look in the process table.  I don't think either of these is a particularly good solution.&lt;br /&gt;&lt;br /&gt;Another idea I had, which is what I will probably go with, is to focus less on service status and more on whether a service is enabled or not.  
This would rely a bit too much for comfort on services being started at boot time and then never having any trouble (although, really, monitoring should probably be used to discern trouble there), but it does seem to fit in better with how sysadmins actually work.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112775754191081323?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112775754191081323/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112775754191081323' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112775754191081323'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112775754191081323'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/service-status.html' title='Service status'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-17147343.post-112775535322972818</id><published>2005-09-26T12:01:00.000-05:00</published><updated>2005-09-26T12:22:33.233-05:00</updated><title type='text'>Introduction</title><content type='html'>My name is Luke Kanies, and I'm developing a configuration management tool called &lt;a href="http://reductivelabs.com/projects/puppet"&gt;Puppet&lt;/a&gt; for my software startup, &lt;a href="http://reductivelabs.com"&gt;Reductive Labs&lt;/a&gt;.  
I'm in the unfortunate position of doing most of the design on my own, and I have stinkloads of conversations with myself about how to do so, so I've decided to create this blog as a way of recording and sharing those mental conversations.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/17147343-112775535322972818?l=config-mgmt.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://config-mgmt.blogspot.com/feeds/112775535322972818/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=17147343&amp;postID=112775535322972818' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112775535322972818'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/17147343/posts/default/112775535322972818'/><link rel='alternate' type='text/html' href='http://config-mgmt.blogspot.com/2005/09/introduction.html' title='Introduction'/><author><name>Luke Kanies</name><uri>http://www.blogger.com/profile/10424526871234503195</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
