Deployment And Change Management - A Framework

A couple days after I posted my last article, I went to del.icio.us to see if anyone had bookmarked it. A user named timbaileylondon had, with this note - "We need to get on top of this. The last one in particular made me cringe." Too true on both counts.

A few weeks ago I was trying to chase down a bug in content copy, because it was causing us problems with some content types we were trying to move. I had never dealt with FAPI much before, and it was a real eye-opener discovering I could script content type exports and imports easily in a few lines of code, and then further realizing I could actually script any form the same way. I started thinking about how many Drupal objects can be scripted quickly this way - node load/save, user load/save, any form, etc. It was then that I started thinking that maybe all we need to tackle the deployment problem is a) a way to abstract each of these pieces of data and b) a way to move them from server to server. Maybe Drupal doesn't need an overarching deployment solution, it just needs some glue.
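In case you haven't played with it, the whole trick is a few lines of drupal_execute(). A minimal sketch, with the form ID and field names recalled from memory (check content_copy's form definition for the real ones):

<?php
// Minimal sketch of scripting a form in Drupal 5 with drupal_execute().
// The form ID and field names are assumptions from memory - check
// content_copy's form definition for the real ones.
$form_values = array(
  'type_name' => '<create>',   // create a new content type on import
  'macro' => $export_macro,    // the export string content_copy produced
);
drupal_execute('content_copy_import_form', $form_values);
?>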

I have written a proof-of-concept framework to work through some of these ideas. It is structured similarly to the Services module. Services provides a base core of functionality into which you can plug server modules (XMLRPC, JSON, REST, etc.) and service modules (node services, user services, etc.) Deployment also has three parts:

  • Deployment API - This implements the concept of a deployment plan. You create a deployment plan and add objects to it which will be deployed (content types, views, etc.) When the time comes, these items are pushed to a server you specify via XMLRPC with an API key. The data stored about each item is extremely minimal, relying largely on the implementers to supply object-specific knowledge. Currently this is deploy.module.
  • Deployment Implementers - Individual modules implement the deployment API to add the data they need to a deployment plan, and expose that ability to the front end. Currently I have content_copy_deploy.module and system_settings_deploy.module.
  • Deployment Services - Services modules which contain the knowledge to receive deployed data and do what is appropriate on the destination server. Currently I have content_copy_service.module and system_settings_service.module. One great thing about this is that regardless of what happens with the greater deployment framework, these services exist on their own as useful code which will be contributed back to the community. (A sketch of how these pieces fit together follows below.)
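To make the division of labor concrete, here is a hypothetical sketch of the flow. The function and hook names are my illustration of the pattern, not the actual deploy.module API:

<?php
// Hypothetical sketch - deploy_add_to_plan() and hook_deploy_send()
// are illustrative names, not the real deploy.module API.

// An implementer registers an item with a plan. Note the item is
// identified by a text key, not a snapshot of its data.
deploy_add_to_plan($plan_id, array(
  'module' => 'content_copy_deploy',
  'key' => 'story',
));

// When the plan is pushed, deploy.module asks each implementer to
// package its item, then sends the result to the remote service.
function content_copy_deploy_deploy_send($item) {
  return content_copy_export($item['key']);  // illustrative call
}
?>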

So let's look at an example. Let's say I have a launch coming up where I will need to push live two content types, plus pathauto information related to those content types. First I will create a new deployment plan, "My Big Site Launch", and indicate its destination URL and required API key (please note this API key will self-destruct before you ever read this.) This functionality is all provided through deploy.module.

Now I need to add my content types and settings to this plan. You can see below that there is a new tab on the content types listing page called deploy, and clicking it allows you to choose a content type and a deployment plan to add it to.

This is all surfaced through content_copy_deploy.module. Note there was no hacking of contrib or core modules in anything you see here. I have similarly exposed a deployment plan dropdown on all forms that make use of system_settings_form through system_settings_deploy.module, as you can see below in the pathauto settings screen. I'd like to acknowledge Daniel Kudwien and his Journal module here, as it was his code that provided the inspiration for how to handle system_settings_forms.
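For the curious, the form_alter trick amounts to something like this. This is a sketch using Drupal 5 conventions; deploy_get_plans() is a hypothetical helper returning the available plans:

<?php
// Sketch: expose a deployment plan selector on every settings form,
// without touching core or contrib. deploy_get_plans() is an
// illustrative helper, not a real API call.
function system_settings_deploy_form_alter($form_id, &$form) {
  // In Drupal 5, forms built by system_settings_form() carry this
  // submit handler, which makes them easy to recognize.
  if (isset($form['#submit']['system_settings_form_submit'])) {
    $form['deploy_plan'] = array(
      '#type' => 'select',
      '#title' => t('Add these settings to deployment plan'),
      '#options' => array(0 => t('- None -')) + deploy_get_plans(),
    );
    $form['#submit']['system_settings_deploy_submit'] = array();
  }
}
?>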

Having added my items to the plan, you can see a listing of them below.

On the destination server (specified above) I have enabled content_copy_service.module and system_settings_service.module.

Note the destination server doesn't even need deploy.module or any of its related modules at all! The only thing deploy.module offers is a way to collect data together about items to deploy, and a way to send them all out in one mouse-click. The services receive the data pushed forward by deploy.module and do with it whatever is needed.
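Under the hood the push is just Services-style XML-RPC. Roughly, the sending side looks like this; the endpoint path and method name are my illustration of the pattern, not the exact API:

<?php
// Sketch of the push side. The 'content_copy.import' method name,
// endpoint path, and key handling are illustrative assumptions.
$result = xmlrpc($destination_url . '/services/xmlrpc',
  'content_copy.import', $api_key, $exported_type);
if ($result === FALSE) {
  drupal_set_message(t('Push failed: @message',
    array('@message' => xmlrpc_error_msg())), 'error');
}
?>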

So all I need to do now is click the Push button and everything will go on its way. I wanted to have a screencast to show the whole process to drive the point home, but I couldn't get it together in time. Hopefully after the weekend I can get all the pieces together to make that happen.

This framework is extremely powerful because it is a) modular and b) flexible. In my eyes, it is very Drupal-like in all the best ways. The maintainer of any piece of core or contrib could easily write their own deployment and services code and be in business, allowing their data to be pushed around. Anyone who needs these services can either release their own modules to do it, or even better, submit patches to the modules in question to bring everything together. While I am mostly focused on the admin area, there is no reason why this framework couldn't be used to push content around as well. Services already includes node and user services which allow for saving and loading of those objects remotely.

While this is all exciting (well, it is to me anyways), this is far from a complete solution right now. There are a great many gaps that need to be filled and questions to be answered:

  • Security - My original version of this code used xmlrpc.php, which was a disaster security-wise. Services includes API keys, which helps, but I would want someone with a stronger background in security to look this over before I deemed it fit for use.
  • Dependencies - Do the appropriate modules live on the destination server? Are they the proper versions? Do all the nodereferenced content types exist?
  • Ordering - Related to the above, how do we make sure deployment happens in the right order? If content type foo nodereferences content type bar, you need to push out content type bar first. If pathauto has settings for content types foo and bar, both types need to go before the settings do. It would be nice if the framework itself knew about these dependencies as well (see the sketch after this list).
  • Versioning - There is no concept of versioning: is the content on my dev server newer or older than the content on my live server?
  • Rollback - There is no automated way to roll changes back, unless you count "take a mysqldump before you begin deployment, and if something goes wrong restore it."
  • Drupal 5 - This code is currently D5-only and thus in grave danger of becoming irrelevant sooner rather than later.
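On the ordering point, a simple dependency sort over a plan's items would probably do. Here's a sketch, where deploy_item_dependencies() is a hypothetical helper that returns the keys of the items a given item depends on:

<?php
// Sketch: order plan items so that dependencies deploy first.
// deploy_item_dependencies() is a hypothetical helper returning the
// keys an item depends on (e.g. content type bar for foo).
function deploy_order_items($items) {
  $ordered = array();
  while ($items) {
    $progress = FALSE;
    foreach ($items as $key => $item) {
      // An item is ready once all of its dependencies are ordered.
      $unmet = array_diff(deploy_item_dependencies($item), array_keys($ordered));
      if (empty($unmet)) {
        $ordered[$key] = $item;
        unset($items[$key]);
        $progress = TRUE;
      }
    }
    if (!$progress) {
      return FALSE;  // circular or missing dependency
    }
  }
  return $ordered;
}
?>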

The holy grail would be integrating a full-featured module with SVN/CVS checkout and packaging, for full one-click scripted deployment.

So there's obviously a lot of work to be done, both on the above problems and the fact that the code even as it exists now needs some shoring up. Some of the above I have ideas about, some not so much. Hell, Workhabit might release Autopilot tomorrow and it will solve everyone's problems and render this whole conversation moot (although I suspect there will be room for multiple solutions in this sphere.) However, my plan right now is to spend a week or two getting this set of functionality into shape and release it as an alpha. I will probably release the services first, because they're the closest to being done.

I am actively interested in hearing what people think of this idea conceptually, and I'm also looking for collaborators on this project. The more brains the better. Let me know in the comments or the contact form!

Comments

This is the best idea I've seen yet to tackle this sticky issue. We'll have to stew this one over a bit.

This is very cool stuff. I've been struggling with how to set up a proper dev/test/prod Drupal environment for a while now. This looks like an important step. I'd love to help out if I can; I'm still new to Drupal coding though! I can maybe do some documentation or that screencast you wanted! Drop me an e-mail if interested.

I might take you up on that screencast offer, plus it would give me a good chance to have someone else try it out.

After a cursory glance through, this seems a very promising approach. Some of the issues you raise I think can be seen as 'nice features' (for example, Rollback) and as you point out workarounds exist. I will also be bookmarking this and will be following developments with some interest.

heyrocker:

We spent a bit of time the other day in #drupal-support discussing this very fact. I am very excited to see what comes from this, and am interested in collaborating. We need something like this pretty bad on a few of the Drupal projects I am working on.

Drop me an email, or I'll see you on IRC. I have no way to get in touch right now since you didn't leave your email!

You say about deploy.module: "The data stored about each item is extremely minimal."

Maybe it could be replaced by the RDF API so it would provide more data. In that case maybe you could also handle ordering?

Actually you want to store a minimal amount of data. For instance, for content copy, all I store is the name of the content type I'm pushing out. This is a good thing because if changes happen after I create my plan, I don't have to worry about updating the data in the table behind the scenes. I already know how to deal with ordering, I just have to get around to it.

I think this really has potential. I'll add two more items to your "gaps" list.

  • Don't tie the URL to the deployment plan. Some sites have staging and production environments, and they should be able to run the deployment plan against either server.
  • Audit. We need to log who did what deployment and when.

Audit is definitely a must, you're right. What would be nice for the URLs actually is if you could preset servers and API keys in advance, and choose which one you want to push to ad hoc from a select list. That's going on the To Do list.

Separating content from configs for deployment purposes is one of our biggest bugbears. This approach looks like a super start to a solution!

I think we should have some hooks/an API that allows a module itself to extract/push its data/settings to this system.

Yes this is exactly what deploy.module exposes. You add data into the system, then push it out, then it gets extracted by the service on the other side. It will become more apparent when I release the code. Stay tuned!

This is a really great looking start. I can say that this is a real pain point with the work that we do in Drupal. I currently insist that all the dev work we do be done on the dev server, but having an easy way to roll out to live would be wonderful! Also, having a way to roll back changes would be great. That way if you see that the new changes screwed up something, you could quickly revert them.

I'll be interested to see how this goes.

Why did you change the name from "My Big Site Launch" to "My Big Deployment"? Seriously.

I did the screenshots after I wrote the text, and I'm a terrible proofreader.

We have been working on this (while waiting for AutoPilot ;) ) and we have set up a solution which looks like yours but ... we are doing it manually.
We are actually using the update_N functions with a combination of content_copy, views_default_view [0], macros and some Drupal API functions (such as drupal_install_module) and so forth and so on.
With that we are able to handle heavy upgrades, but there are still some problems that you could also have with your framework. Of course your solution has the (immense) advantage of being automatic. Our solution is a pain in the ass (to say it frankly).

  • Reference to numeric ids: let's say that you need to create a nodequeue that will be referenced by a view. In an ideal world we have a deploy nodequeue and a deploy views. The fact is that the view refers to the nodequeue by its numerical id, given by the db_next_id() function. If everything goes smoothly you create your nodequeue (nqid = 1 for example) on the dev, then you create your view which refers to nqid = 1. Deployment: creation of nodequeue nqid = 1, creation of the view referring to nqid = 1. So far so good. But let's say that the developer creates a nodequeue, then deletes it, then creates a new one: nqid = 2. He creates the view (which refers to nqid = 2) and deployment: creation of the nodequeue (nqid = 1 because it is the first in this db), then import of the view which is still referring to nqid = 2!

    I can see 2 solutions to this problem:

    • Take the id from dev to prod. It solves the problem of the reference but it brings a new one: id collision. It could be solved by the even/odd id trick (BTW the original post you were looking for is here http://drupal.org/node/181128, and they are using this solution at http://www.france24.fr).
    • Bring the concept of dependency into the deployment process. Say that you have a view that depends on a nodequeue (it could be anything else, like a node or a user, btw). If in your deploy plan you can say that step B (deploy view) depends on step A (creation of the nodequeue), you can then abstract the nqid in the view to be the nqid of the nodequeue previously created. We are doing it by hand (= in code: create the nodequeue, retrieve its nqid with some SQL, put it in a variable, change the exported view to refer to this variable instead of the original nqid; a sketch of this pattern follows the list). This step could be automated but it would be quite complex, given the number of different possible dependencies.
    • Actually there is a third solution: deploy in 2 steps. First create all the objects that need to be referenced and deploy them (sometimes they could even be created directly on the prod server). Synchronise your devel db from your prod one. Then create the views/menus etc. that refer to the objects created in the first step, and deploy this second batch.
  • Managing changes in the db: this means that while you are developing your "new features" you can't synchronise your dev db from your production one. Or if you do, it will:
    1. erase your deployment plan and its components (but ok, the table could be saved and re-imported into the fresh import)
    2. erase all the modifications that have been done in the db (creation of new content types, new views, etc.)

    If, instead of keeping the changes in the db, they are "frozen" in code, it is very easy to sync the dev from the prod: dump your prod, import it into the dev db, apply your update_N functions et voila. Of course, you have to *write* your update functions, which is boring and time consuming.

  • AJAX/AHAH: asynchronous methods are making their way into the Drupal admin in D6 (which is a nice thing). Asynchronous calls don't rely on system_settings_form or on drupal_execute (correct me if I am wrong, I am saying this without looking at the code). A deployment implementer would need to be provided for every page that has AHAH functionality (blocks and menu for example).
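To make our manual dependency pattern concrete, it is roughly this (table and column names from memory; check nodequeue's schema):

<?php
// Rough sketch of our manual fix: deploy the nodequeue first, look
// up the qid it received on this server, then rewrite the exported
// view to use it. Table/column names are from memory - verify them
// against nodequeue's schema.
$nqid = db_result(db_query(
  "SELECT qid FROM {nodequeue_queue} WHERE title = '%s'", 'My queue'));
// Replace the dev server's hard-coded id in the exported view code.
$view_code = str_replace('nqid = 2', 'nqid = ' . $nqid, $view_code);
?>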

Given that (sorry for the length of the post), your solution looks very cool and is probably the most promising one that I know of.

We would be very happy to share with you the work we have done and contribute to your module.
Feel free to contact me by email if you want to.

[0] which can be overridden by the administrators and then re-imported by the developers

Thanks for the long thoughtful comment.

Numeric IDs are a big problem, although so far I have avoided them. Both my existing modules (content copy and system settings form) rely on a text-based identifier. This makes them very easy to implement. I suspect a lot of other admin interfaces will be similar (for instance, FAPI forms are always text-identifiable.) However when it gets into nodes and users and comments, things do indeed become much more tricky. I really dislike either of the core hacks to force IDs into a non-conflicting state. Your third solution is similar to what we're doing now, but you really have to have a well-thought-out plan in order for that to work. It's very easy to mess up. I have been thinking about ways to do more automated synchronization through my module, but as you say it's very complicated.

One thing I have considered recently is adding a CCK GUID field to all my content types. Nodes would then be uniquely identified regardless of where they originated, and synchronization becomes far easier. This could most likely be adapted for users and comments as well.
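Roughly what I have in mind, with field_guid as a hypothetical CCK text field and plain PHP for the id generation:

<?php
// Sketch: stamp a node with a globally unique id before its first
// save, so the same content can be recognized on any server no
// matter what nid it gets locally. field_guid is a hypothetical
// CCK text field.
function mymodule_ensure_guid(&$node) {
  if (empty($node->field_guid[0]['value'])) {
    // uniqid() plus the originating host keeps ids unique across sites.
    $node->field_guid[0]['value'] = uniqid('', TRUE) . '@' . $_SERVER['HTTP_HOST'];
  }
}
?>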

As far as your second point, for now I'm not planning on putting db schema changes into this. There is a (yes, tedious) process for this already and it works well and I'm more worried about things we don't have good solutions for.

I don't know enough about how the AHAH stuff is implemented at all, but I believe it is still FAPI behind the scenes, because there needs to be a way for the forms to degrade gracefully for non-JavaScript-enabled users. I hope so anyways!

I will be contacting a bunch of people when I get this turned into a proper project. I'll definitely be in touch then.

We had a discussion at DrupalCon about many of these topics. It seems many of us are trying to solve these issues, in many different ways.

There's a lot to like about your approach, but one problem I think of has to do with IDs. Let's say your task is to replace the site's front page. So you create a new node on your development box, say it ends up being node/42. Then you go to settings and make the site frontpage "node/42". When you run the deployment script on the production server, it creates an identical node, but the nid is different, say 1234; and the frontpage will become node/42, which will be the wrong page.

You may think that's easily solved with an alias or what have you, but the problem will occur whenever any Drupal object refers to any other: for example the id of the user who created the node, a node reference field, even vocabularies and terms. In the latter case, your deployment script might create a vocabulary with vid 42, then add 10 terms to it. Later the deployment script would create a new vocabulary that gets ID 123, then proceed to add 10 terms to vocabulary 42, whatever that happens to be on the other server.

Most of my focus so far has been on admin settings of various parts. We do all our content creation on our live site, and thus we never have to push nodes live. This is helpful for us but obviously not an ideal situation. I have some ideas going forward about how to deal with the problem of distributing nodes (and their widely varied references) however it will require a great deal of thought and experimentation. This will probably not get into the 1.0 of my modules.

I will say this - I reject outright any solution that involves hacking core, and all of my thought is going towards creating a solution that is modular, extensible, and GUI driven.

Dave makes a good point about nodes, particularly referenced nodes. Point taken about settings being a great rollout feature in itself, but if someone is building "about" pages, or front pages and the like, then it seems like it would be important to be able to "roll out" those as well. It would be really great if a rollout of nodes worked with the revisioning system :)

Maybe if it just focused on deploying only a few crucial nodes, like anything that's a menu item... or the frontpage? Then maybe it could do it, keeping a reference map of the old and new node ids and updating any references accordingly... hmmmm

Frontpage might be easy... say, import the referenced node, get its new node id, and then put the correct node id in the front page setting? Anyway, great ideas here, I look forward to playing with it.

Awesome! I already tinkered with a similar idea, called "Sync" module. I think you've done the ground-work for this with Deployment now. :)

Currently, I'm developing Migrator module, which aims to convert any existing (non-Drupal) site to Drupal. To accomplish this, Migrator creates an object id mapping table to ensure that external ids can be altered to newly created Drupal ids (e.g. for users, user roles, nodes, taxonomy terms, and so on) before creating the imported objects in Drupal. IMHO, a similar id mapping mechanism could work out for Deployment, too, since you already need to process the pushed data on the remote site.
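The mechanism is basically two small functions around a mapping table; sketched here with hypothetical names:

<?php
// Sketch with hypothetical names: translate an id from the source
// system to the id the object received locally, recording new pairs
// as objects are created.
function migrator_map_id($type, $source_id) {
  return db_result(db_query(
    "SELECT local_id FROM {migrator_id_map} WHERE type = '%s' AND source_id = %d",
    $type, $source_id));
}

function migrator_record_id($type, $source_id, $local_id) {
  db_query("INSERT INTO {migrator_id_map} (type, source_id, local_id) VALUES ('%s', %d, %d)",
    $type, $source_id, $local_id);
}
?>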

Yes, this could work out well. I am currently pondering a lot of possibilities for node deployment. Currently I don't plan to have this in my first release but it is the primary goal for a v2. I will watch your project with interest, maybe we can talk more about this when I start thinking about my next release.

Just brainstorming...

How about looking at it from a slightly different angle, still starting from your idea of service: how about using a standard synchronization engine (such as http://www.funambol.com/opensource/ or something similar) connecting to multiple Drupal database instances through services. The instances could be DEV + QA + PROD, or PROD1 + PROD2 + PROD3, etc.

When content is added / changed / deleted in one store, it would be replicated on the others.

Now this is pretty interesting. I did not know about this project before. We will still have the problem of how to identify pieces of data uniquely, but reading this is starting to spawn some ideas. Thanks for the link.

I first heard of Funambol when reading an article, "The Holy Grail of Synchronization". The article describes an approach to synchronizing several distributed PIM (Personal Information Management) databases, but I felt it could be applied to other data entities (I understand Funambol implements SyncML in an extensible architecture).

As you say, one of the main issues here (shared by most or all synchronization solutions) is to identify corresponding objects in distributed environments/namespaces.

See also:
http://en.wikipedia.org/wiki/Syncml
http://internetducttape.com/2006/08/11/the-holy-grail-of-synchronization...

(sorry for the very long URL)

Reading this thread has given me an idea. Imagine three servers, called Dev, Stage, and Prod. Dev is where the code magic happens, and where Heyrocker's deploy.module would be installed to push changes up to Stage.

Stage is where the QA happens, but not where content is added. All content changes have to happen to the Prod server, and here's why.

# Content flows downhill
Users and admins create content on Prod. A brand new, never-before-thought-of module utilizes NodeAPI to create duplicate content on Stage and/or Dev whenever a node is created or edited on Prod. Thus, 100% of content is added to (or deleted from) Prod and flows downhill. Node IDs remain sequential and perfectly in sync.
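That mirroring module could be little more than this (a sketch: 'node.save' follows the Services module's node service naming, and mirror_targets is a hypothetical variable holding the downstream URLs):

<?php
// Sketch: mirror every node save on Prod down to Stage/Dev.
// 'node.save' follows Services' node service naming; the endpoint
// path, key handling and mirror_targets variable are illustrative.
function mirror_nodeapi(&$node, $op) {
  if ($op == 'insert' || $op == 'update') {
    foreach (variable_get('mirror_targets', array()) as $url) {
      xmlrpc($url . '/services/xmlrpc', 'node.save', (array) $node);
    }
  }
}
?>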

# Code flows uphill.
All changes to code, including views and content types, happen on Dev, and are pushed to Stage for QA and testing, and then on to Prod for final deployment. Any content type changes that occur will be reflected on Stage first before showing up on the live site, so they can be tested and tweaked. Thus, Stage becomes both a clearinghouse for new functionality *and* a mirrored copy of the Prod server, but it's tweakable until showtime.

# Keeping track of the /files
An rsync over SSH would need to be set up to keep the files directories in sync. Prod's /files need to be rsynced by Stage, and Stage's /files need to be rsynced by Dev. In this manner, with a shell script executed on a cron of say 10 minutes, your Dev box's referenced /files will never be more than ten minutes behind your Prod box's, even though the nodes themselves will already exist because they were created at the same instant on both Stage and Dev.

This is definitely the ideal and is actually very close to how we do it on nwsource.com. Our editorial staff works on production and we rsync uploaded files around. The big problem with moving nodes around is primary keys - it is very very difficult to guarantee that any nid will be available from one environment to another. And then of course if they are different, you have to worry about the foreign keys out to roles, taxonomy, etc etc etc. I still don't know the best way to handle that. I have lots of ideas but they all have problems (some more political than technical.) Hopefully after I release the module for real (next week if we're lucky!) I will be getting a group together to discuss these issues and come up with a plan of action. Contact me if you're interested!

Hello,

Has source code been published for this module? If you could even put something pre-alpha up, it would help me out a lot, as I'm looking at change management for nabuur.com.

Thanks!

Very Nice.

Actually, I should have done better digging before my last weekend's work: http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/alex_b/port/

Take a look at this implementation of a "port" and see how similar our concepts are:
http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/alex_b/port...

:)

The port module does a subset of what the deployment framework does. The notion of 'symmetry' between export and import functions is very powerful and can be used for more than deployment - and this actually brings me to my point: I'd like to suggest breaking this functionality out of deploy module.

With port.module I came from a different angle: while building an install profile I found it tedious and wasteful that I had to export and generate data structures on all ends of Drupal and then wind up with a set of associative arrays in my profile file where only I knew where they came from. The ideal solution would internally _know_ which export function matches which import function.

This simple bit of information about which import function matches which export function allows us to do various things, like the push deployment that the deploy module implements, or generating install functions like the installer profile wizard of the Install profile API (http://drupal.org/project/install_profile_api), or just keeping a site configuration in version controlled code.

This simple notion of symmetrical import/export function pairs on a per-module level, together with wide usage (deployment, install profiles, version control, etc.), could be a great driver for more and more import/export function implementations for site structure and content. This is going to be very important, because nobody will be able to build all of the import/export functions on their own, but as a community we can. A very wide basis of use cases for this core functionality is therefore crucial...
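To illustrate, a module might declare its pairs through something like this (hook_port() is my illustrative name, not port.module's actual API):

<?php
// Illustrative only - hook_port() is not port.module's real API.
// A module declares matched export/import callbacks so any consumer
// (deployment, install profiles, version control) knows which
// function undoes which.
function mymodule_port() {
  return array(
    'content_type' => array(
      'export' => 'content_copy_export',   // illustrative callback names
      'import' => 'content_copy_import',
    ),
  );
}
?>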

Looking forward to your thoughts,

cheers and again: awesome module,

Alex

We really really really need this! I'd love to take a look at some early development stages. I agree with the gaps identified. In terms of rollback, it may be possible to do a rollback if we approach it as one would approach undoable changes in any document editing interface. Here are the basics we would have to consider:

Assumptions:

  1. Individual changes would have to be reversible:
    • Would need to prepare an SQL statement which would undo each individual SQL statement included in the deployment, e.g. a DELETE for every INSERT, a REPLACE for every REPLACE (see the sketch below)
    • Would need to retain a backup copy of each file to be replaced in the deployment
  2. Changes would have to be serial, must be recorded in a journal/log, and the order of operations must be strictly maintained.
  3. Compound changes would need to be defined which are "atomic", i.e. they must be done completely, otherwise an automatic rollback will occur and failure will be the result.
  4. A hierarchical model of deployments would need to be defined to enable the above.

With these basic assumptions we can imagine a deployment which may be entirely atomic, or may contain several subpackages, each of which is atomic, so that several independent packages can be included; if one fails it can be retried (perhaps even automatically, some number of times?) or an alert can be sent to the administrator. Assuming that the integrity of the deployment itself is solid, "undo" operations can be defined for every possible type of supported change, and the order of operations can be guaranteed, then it follows that a deployment should be able to be rolled back automatically.
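A minimal sketch of the journaled-undo idea, assuming a hypothetical {deploy_journal} table (with an auto-increment jid) and plain Drupal db functions:

<?php
// Minimal sketch, assuming a hypothetical {deploy_journal} table.
// Every change records its inverse; rollback replays the journal in
// reverse order.
function deploy_journal_execute($deployment_id, $do_sql, $undo_sql) {
  db_query($do_sql);
  db_query("INSERT INTO {deploy_journal} (deployment_id, undo_sql) VALUES (%d, '%s')",
    $deployment_id, $undo_sql);
}

function deploy_rollback($deployment_id) {
  // jid is assumed to be an auto-increment key, so DESC order undoes
  // the most recent change first.
  $result = db_query("SELECT undo_sql FROM {deploy_journal} WHERE deployment_id = %d ORDER BY jid DESC",
    $deployment_id);
  while ($row = db_fetch_object($result)) {
    db_query($row->undo_sql);
  }
}
?>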

Further considerations:

  1. All data should be serializable.
  2. Ensuring the integrity of the undo operations is an absolute necessity.
  3. Rollback failure is always a possibility, and this means corruption, so a full backup is recommended for all deployments.
  4. There should be a complete record of all changes, packaged up in a manner easily identifiable as a deployed package (perhaps even as basic as a tar of all the SQL statements executed and files transferred).
  5. It would be great to include some automated testing as pre-conditions or post-conditions of a deployment (i.e. a "smoke test").

Thanks for all the work you have put into this so far, I am very excited to see the results. I am not holding my breath for the Autopilot project, and this project has a corner on the best approach to this problem. I am not an experienced Drupal or PHP developer, but I have 12 years of experience with Java and configuration/build/release management, and would love to be a part of this project. Please let me know how I can help!

Hi,

did you ever release the code of your module, or is it just a private project?

The code was released but it was then pulled again due to a compatibility problem with the Services module. I am currently working on a version for Drupal 6 that is incredibly exciting and much more fully featured. Stay tuned for news soon!

Hi Greg,

Great effort, and like everybody else, I needed this yesterday :)

I'd like to give you a hand to get this done sooner (I build custom Drupal modules), so let me know!

Cheers,
Farez


