Burn-Down Charts: The Silent Value Killer

Spend any time with Agile development and you’ll come across the burn-down chart. The premise is simple – if a team committed to delivering 15 planned innovations (features, user stories, or whatever you want to call them), work gets done a little at a time, so the remaining work should “burn down” to zero at the end of the sprint.

While burn-down charts are effective tools for tracking how a team is progressing toward finishing planned work, a sinister effect takes hold in many organizations that learn to use them. Managers begin to focus primarily on the number of innovations delivered per sprint. Though they may be tracking other metrics, this number draws their attention because it is simple to understand and easy to communicate. It also feels like producing as many innovations as possible is getting the most value out of a team – common sense, right?

Unfortunately, the goal of a business is not to do work. The goal is to grow and create economic value. As anyone who’s read Continuous Delivery or The Lean Startup knows, most of the innovations you plan and deliver do not produce value as expected. Market and customer research, usability testing, and strategic planning are all useful tools, but the only proof of value is whether customers actually use your innovations.

To find out which ideas are good, you’ve got to release them and be prepared to throw away the innovations that didn’t meet customers’ needs. If you’ve invested in releasing in small batches and using a deployment pipeline, the delivery process is optimized for getting feedback. If you stop there, however, you’ve fallen victim to process ceremony.

Once you start releasing more often, you also need to optimize for changing course based on what you learn from the increased feedback. The bad news is it doesn’t look as good on a chart.

[Chart: Optimizing for Story Points]

The chart above depicts some of the effects on value for a team focused on innovation throughput as its primary goal. As each sprint progresses, the percentage of value stays about the same. Industry experience suggests that less than a third of the ideas you release will provide useful value to customers in established markets, and even fewer in new products. If the theories businesses have about what their customers want were usually true before delivery, we’d see more successful startups.

Since what was planned at the beginning of the project above doesn’t change from one sprint to the next, there are no opportunities to course correct until the end. And each new feature that is released increases the maintenance cost since complexity goes up along with feature count.

In this sad state of affairs, the trends should be obvious: an appallingly low rate of value, decreasing innovation throughput, and increasing maintenance costs. In situations like this, it is alarming to watch teams subconsciously game the numbers so the count of story points (effort toward features) looks like it is going up. This behavior is unlikely to change while performance measurement stays focused on innovation throughput.

[Chart: Optimizing for Value]

The chart above depicts a more typical set of activities for a team focused on value instead of rate of innovation. The highest throughput of new features occurs at the beginning: since you don’t yet know how your customer will receive the minimum viable product, the team can focus mostly on delivering the first few new features.

During the second sprint, the first release is available to customers and they are providing feedback. To prepare for the change in direction that inevitably comes with listening to this feedback, it is more important at this point that the features already delivered are of high quality. If any corners were cut, now is a great time to remedy that. The team might do some refactoring, and plans for which features to deliver next will need to change to accommodate the feedback that was gathered.

When the third sprint rolls around, the team has discarded the delivered ideas that turned out to be wrong, made sure that what shipped in the prior sprint is stable, and is now working to deliver features closer to what the customer said they wanted. If the feedback was reliable (customers ask for things they don’t really want all the time!), the value produced during this sprint will potentially be higher.

At the fourth sprint, the team has a steady stream of feedback and is working on a combination of new ideas and enhancements to adapt what was released to the customer’s desires. The number of features delivered is still lower than when the project started, though, because some refactoring may be needed to keep quality high after the potentially dramatic design changes made to release the most valuable features. From here on out, it’s a see-saw of releases that increase value and intermediary releases where the team vigorously realigns the priority and structure of work to accommodate what they are learning. That’s right folks – finding market fit is messy. But this is what needs to be done to make real money in this industry.

Now some of you might be thinking “Can’t I just measure the time spent refactoring and maintaining so we can combine these together and have a clean burn-down chart? I need to show that our resources are fully utilized!” And yes, you could do this, but why would you want to? I will tell you why – a lack of trust. When management overseeing a new product or initiative wants to micro-measure all aspects of the delivery process to ensure they are getting the most throughput, there is an assumption that someone is not working as hard as they could be, and that this can be identified on a chart to “save money”.

The problem is that any process measurement system can be gamed. The only true measurement of progress in a business is profit, and the way to arrive at these moments of high growth is through the adaptive, difficult-to-quantify, collaborative approach in the second chart. Having a team able to work towards value in this way requires brutal honesty about what didn’t work, variation from one sprint to the next, and above all – trust.

So go ahead and estimate features you will deliver each sprint, and use the burn-down chart to see how the team is performing so they can do better at estimating. But stop penalizing teams for delivering a different number of features (or user stories) each sprint. You just might figure out the ideal features for your customers – and keep your staff around for the long haul ahead.

How to Evaluate Application Architects

About 3 years into my career, I was promoted to the coveted role of “Application Architect” at a manufacturing software subsidiary of a Fortune 500 company. At the time I was a force of youth and passion – but I hadn’t yet developed the skills necessary to really shine in the position. I was successful in many ways, but I also missed many opportunities that are crystal clear looking back at that early period of my career.

I served as an Architect in several positions after that, but it wasn’t until I started working for Catapult as a consultant that I learned the soft skills necessary to really be effective. In my 9 years of consulting I’ve had the benefit of working alongside client architects who were great, and others in need of some help. Though the industry still has no standard definition of what an Architect should do (don’t believe what those certifications try to tell you), there are a few considerations that lend themselves to great outcomes in the position.

Catching Process Breakdowns

Any architect who has worked with a team to deliver a product or service will have stories to tell about what makes a great process for delivering software, and what doesn’t. When talking to a potential Architect, I look for how they used technology to prevent low-quality deliverables from getting further down the release pipeline.

We’ve probably all worked on a project where the team established a process for testing, code reviews, and managing configuration changes in each of the environments the product gets deployed to before customers see it. It all sounded dandy until someone realized team members weren’t following the process – but by then it was too late. How does the individual use technology to cause deliverables to fail fast when they aren’t acceptable?

A Realistic Estimation Mindset

One of the hardest things about being a developer is estimating. The more experience we have with a technology, and the better we understand what constitutes acceptable deliverables, the more accurate the estimate. Most projects allow some user stories to be given to the team with too much uncertainty in the technology or holes in acceptance criteria – how does the individual deal with this? A common response to this situation is to get clarification from the product owner – but what happens when there is still significant risk? I like to hear how individuals communicate timeboxes for research or prototyping of an aspect of the unfamiliar technology and how they would determine when too much time is spent on it.

Sharing Pattern Decisions

We’ve probably all worked with a technical “leader” of some sort who has to be the person to set the pattern that will be used for error handling, validation, messages on a service bus, or a host of other common needs in modern applications. Architects are called to lead – otherwise you end up with the “rockstar” personality who can’t scale and becomes a bottleneck, poorly utilizing the whole team. While a good Architect will show the team patterns during decisions and help them think through problems they might see with them, a great Architect inspires team members to propose patterns themselves and then helps them champion these, giving full credit to the team members.

Mastery of CRUD

The vast majority of modern applications are heavily backed by data, and an application with a poorly designed data model can exhibit poor performance, high cost to introduce new features, and excessive business logic. Great Architects understand the requirements of products as they emerge and help the team make sound decisions about how to model relationships between data and modify it. They also understand REST not as a technology, but as a principle. I run across a surprising number of Architects who have not developed this skill.

Making Time for the Team

An experienced and effective Architect will not allow themselves to be assigned so much work that no time remains to support their team. They know that quality deliverables require support and that this has a cost – and will argue for it to the folks approving the budget with appropriate vigor. As a leader of one or more teams, an Architect cannot be effective if they are annoyed by questions from more junior team members. Rather, some of the best Architects I’ve worked with love to teach, and know that by building up the value of their subordinates they can improve the velocity of the team more than by focusing on themselves.

Expert Communicator and Facilitator

Almost any Architect can create a PowerPoint presentation or write a blog post, but is what they convey at the appropriate level of detail for their audience? Ask an Architect questions about how they explained an aspect of their implementation of a product to customers and other non-technical personnel. How did they get consensus when stakeholders disagreed?

Evaluate their face-to-face communication style – many architects have a hard time listening because they are already designing a solution to what’s being said while the conversation is still happening. A great Architect takes notes during conversations and strikes a balance between too much detail and not enough. Evaluating this during an interview or conversation is difficult, but most Architects who know how to do this will mention it when the topic comes up.

Patterns as Necessary

The ivory castles Architects build can become wonders of intellectual masturbation at times, leading the rest of the team to ruin through excessive application of the single responsibility principle. I look for Architects who know how to select the simplest possible way to meet the requirements, introducing new patterns only when absolutely necessary. I’ve run across one too many projects where the Architect is the only one following a pattern they set. When I talk to other team members, they just don’t understand the pattern and don’t have time to follow it. If an Architect cannot write a simple Wiki topic and communicate the value of an aspect of the architecture, chances are it’s unnecessary.

Courage to be Honest

The pressure on most Architects to agree to deadlines, estimates, and functionality from the business is incredibly high. While the best Architects I’ve worked with have an affinity for the business and immense respect for the company’s business model, they aren’t “yes men”. Saying yes to an unreasonable ask simply to avoid confrontation is not only irresponsible but also earns you a reputation for being out of touch with the true effort to complete work. The best Architects set appropriate expectations for a sustainable pace of work – and spend the time necessary to argue why their methods are important for the good of the products. They help explain why they cannot make progress on a task in terms the business understands.

Hopefully some of these topics will help you when evaluating your next hire or partner who will be architecting one or more of your applications. What other key traits make individuals successful in this position? Share your comments below! Thanks for reading.

IT deployment standards: why now?

Much like what is happening in the “big data” space, challenges with interoperability in the IT deployment tool market are converging to create an opportunity for standardization. Before I go any further, let me introduce two terms that have no official meaning in the industry but will make this article easier for you to read:

Deployment Providers – These are command-line utilities, scripts, or other self-contained processes that can be “kicked off” to deploy a single aspect of an overall solution. Examples include a code compiler, a command-line utility that executes database scripts, or a script that provisions a new node in VMware or on Amazon Web Services.

Deployment Orchestrators – These are higher level tools that oversee the coordinated execution of the providers. They often include the ability to add custom logic to the overall flow of a deployment, and combine multiple providers’ activities together to provide an aggregate view of the state of a complex solution’s deployment. Some open source examples are Puppet, Chef, and Powerdelivery; there are many commercial solutions as well.

With those two definitions out of the way, let me begin by describing the problem.

Do we really need deployment standards?

First, as more companies find they are unable to retain market leadership without increasing release frequency and stability, deployment technologies that differ wildly in operation create a significant loss in productivity as personnel move between teams (or jobs). Much like Ruby on Rails disrupted the web application development market by demonstrating that a consistent directory structure and set of naming conventions leads to less variability in software architecture and reduced staff on-boarding costs, so too does the IT deployment space need solutions for reducing complexity in the face of economic pressure. As developers and operations personnel collaborate to better deliver value to their customers, a consistent minimum level of functionality from Deployment Orchestrators, regardless of the technology stack they’ve chosen, provides the opportunity to reduce IT delivery costs that impact both of these job roles.

Secondly, as the number of technologies that need to be deployed increases exponentially, more organizations struggle to get the same quality of deployment capabilities from each of them. While Deployment Orchestrators often include off-the-shelf modules or plugins for deploying various assets, the external teams (often open source contributors or commercial vendors) that create the Deployment Providers rarely have a financial incentive to ensure continued compatibility as those providers change. This places the burden of compatibility on the Deployment Orchestrator vendor, who must constantly revise their modules or plugins to support changes originating from the downstream teams that deliver the Deployment Providers. This coordination will grow increasingly difficult as the deployed technologies are themselves released more frequently through industry adoption of Continuous Delivery.

Standards create their own problems

Whenever one speaks of standardization, there are a number of legitimate concerns. If history repeats itself (and it often does), the leading Deployment Orchestration vendors will lobby hard to have any emerging standards align best with their own product’s capabilities – and not necessarily with what is right for the industry. Additionally, as standards are adopted, they must often be broken for innovation to occur. Lastly, a committed team of individuals with no loyalty to any vendor and a shared financial incentive must be in place to ensure the standard is revised quickly enough to keep pace with the changes needed by the industry.

Scope of beneficial IT deployment standardization

What aspects of IT deployment might benefit from a standard? Below is a list of some that I already see an urgent need for in many client deployments.

Standard format for summary-level status and diagnostics reporting

When deployment occurs, some Deployment Providers produce better output than others, and a standard format in which these tools could emit at least summary-level status or diagnostics for Deployment Orchestrators to pick up would be advantageous. Today most Deployment Providers’ deployment activities involve scripting and executing command-line utilities that perform the deployment. These utilities often generate log files, but a Deployment Orchestrator must understand the unique log format of every Deployment Provider that was executed in order to provide an aggregate view of the deployment.

If Deployment Providers could generate standard status files (in addition to their own custom detailed logging elsewhere) that contain at least the computing nodes upon which they took action, a summary of the activities that occurred there, and links to more detailed logs, Deployment Orchestrators could render the files in the format of their choice and enable deep linking between summary-level and detailed diagnostics across Deployment Providers.

More Deployment Orchestrators are beginning to provide this capability, but they must invest significantly to adapt the many varying log formats into something readable where insight can occur without having to see every detail. A standard with low friction to adhere to would encourage emerging technologies to support it as they first arrive in the market, lowering the burden of surfacing information on Deployment Orchestration vendors so they can invest in delivering features more innovative than simple integration.
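
To make this concrete, here is a minimal sketch of what such a standard status document could contain, modeled in C#. No such standard exists today, so every name below is hypothetical:

// Sketch of a standard deployment status document a Deployment Provider
// could emit alongside its own detailed logs. All names are hypothetical.
using System;
using System.Collections.Generic;

public class DeploymentStatusDocument
{
    public string ProviderName { get; set; }      // which Deployment Provider ran
    public DateTime StartedUtc { get; set; }
    public DateTime FinishedUtc { get; set; }
    public List<NodeActivity> Nodes { get; set; } // every node acted upon
}

public class NodeActivity
{
    public string NodeName { get; set; }     // the computing node
    public string Summary { get; set; }      // summary of activities performed there
    public Uri DetailedLogLink { get; set; } // deep link to provider-specific logs
}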

Standard API for deployment activity status

When Deployment Providers are invoked to do their work, they often return status codes that indicate to the Deployment Orchestrator whether the operation was successful. Deployment Providers don’t always use the same codes to mean the same thing, and some will return a code known to indicate success when the only way to detect a problem is by parsing a log file. Having a consistent set of status codes, and being able to certify a Deployment Provider’s compatibility with them, would improve the experience for operations personnel and reduce the complexity of troubleshooting failures.
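
As a sketch only, a certified vocabulary of status codes might look like the following enum; the names and values are illustrative, not part of any actual standard:

// Illustrative shared status codes that Orchestrators could rely on
// without parsing each provider's log files.
public enum DeploymentStatusCode
{
    Succeeded = 0,             // all activities completed
    SucceededWithWarnings = 1, // completed, but the log warrants review
    Failed = 2,                // nothing was applied to the node
    PartiallyApplied = 3,      // failed midway; a rollback may be required
    EnvironmentNotReady = 4    // preconditions on the node were not met
}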

Standard API for rolling back a deployment

Deployment in an enterprise or large cloud-hosted product often involves repeating activities across many computing nodes (physical or virtual). If a failure occurs on one node, in some situations it may be fine to simply indicate that this node should be marked inactive until the problem is resolved, while in other situations the deployment activities already performed on prior nodes should be rolled back.

Rollbacks of deployments are typically executed in reverse order and need to provide each Deployment Provider with the state needed to know what is being rolled back; today’s Deployment Orchestration vendors don’t make this easy, and significant investment in scripting is necessary to make it happen. A standard API through which Deployment Providers could receive notification and state when a rollback occurs, and then take the appropriate action, would allow for consistent behavior across more industry solutions and teams.
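
Such an API might be as small as the interface sketched below, where the orchestrator hands each provider the state it recorded during the original deployment. Again, the names are hypothetical:

// Hypothetical contract a Deployment Provider could implement to take
// part in orchestrated rollbacks.
using System.Collections.Generic;

public interface IRollbackParticipant
{
    // Called in reverse order of the original deployment. The state
    // dictionary holds whatever the provider recorded when it deployed
    // (previous file versions, connection strings, and so on).
    void Rollback(string nodeName, IDictionary<string, string> deploymentState);
}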

Standard API for cataloging, storing, and retrieving assets

When a failed deployment must be rolled back, Deployment Orchestrators must have access to assets like compiled libraries, deployment packages, and scripts that were generated during a prior deployment to properly revert a node to its previous state. Additionally, it is often desirable to scale an existing deployment out onto additional nodes after the initial deployment has completed, to meet additional capacity demands.

Depending on the number and size of the assets to store, the number of revisions to retain, and the performance needs of the solution, anything from a simple network share to dedicated cloud storage might fit the bill. A standard API abstracting the underlying storage – one that Deployment Providers can use to store their assets and Deployment Orchestrators can use to retrieve the correct versions – would enable organizations to select storage that meets their operational and financial constraints without being locked in to a single vendor’s solution.
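
A sketch of what that abstraction might look like, with hypothetical names:

// Hypothetical storage abstraction: providers publish versioned assets,
// and orchestrators retrieve the exact versions needed for a rollback
// or a scale-out onto new nodes.
using System.Collections.Generic;
using System.IO;

public interface IDeploymentAssetStore
{
    void Store(string assetName, string version, Stream content);
    Stream Retrieve(string assetName, string version);
    IEnumerable<string> ListVersions(string assetName);
}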

Standard API for accessing a credential repository

In addition to publishing assets as a result of deployment activities, Deployment Providers often need to access security credentials that have permission to modify the infrastructure and components being configured and deployed to. On Linux and OS X this is often handled with SSH keys, while on Windows it’s a combination of Kerberos and (in double-hop scenarios through Windows PowerShell) CredSSP. Rather than each deployment implementing a custom method for locating keys on the correct servers, a credential repository where these keys can be stored and then securely accessed by trusted nodes would simplify the management of security configuration needed for Deployment Providers to do their work with the right auditability.
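
Again as a hypothetical sketch, the surface area of such a repository could be quite small:

// Hypothetical credential repository API; the backing store could be
// anything from an SSH key directory to an enterprise secrets vault.
public interface ICredentialRepository
{
    // Returns material (an SSH key, a service account password, etc.)
    // that a trusted node may use; each access is audited.
    DeploymentCredential GetCredential(string nodeName, string purpose);
}

public class DeploymentCredential
{
    public string UserName { get; set; }
    public byte[] Secret { get; set; } // key or password material
}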

Summary

With these standards in place, Deployment Orchestration vendors would still have plenty of room to innovate: in visualizing the state of nodes in a delivery pipeline, in rendering status in their own style and format for the devices they wish to support, and in coordinating these APIs to provide performant solutions across the Deployment Providers that adopt such a standard. For a team bringing a new Deployment Provider to market, adoption of their technology would improve if they had the option of adhering to a standard ensuring it can be orchestrated consistently with the rest of the solution for which their technology is but one part. When standard APIs such as these are made available by a single Deployment Orchestration vendor without first being established as at least a proposed standard, it is very difficult to motivate the broader community that creates Deployment Providers to do any work to support such proprietary APIs.

What do you think? Has the industry tried various aspects of this before outside of a single vendor? Has the market matured since then where the general audience might now see the value in this capability? Would it be too difficult to provide an API in enough languages and formats that orchestration could occur regardless of the platform? Let me know your comments below!


Continuous Delivery is about failing faster

If releasing to customers once a month, once a week, and even several times a day were easy – everyone would be doing it. Getting IT assets that generate revenue released faster is clearly an advantage. However, the biggest return on investment of Continuous Delivery is a little more subtle.

If you build it, they don’t always come

When a business has an idea for an investment in IT, they do market research, calculate costs and returns, forecast customer adoption, and get stakeholder buy-in – but often wear rose-colored glasses without even knowing it. This isn’t any different from a start-up pitching its first business plan. Even with conservative estimates, many assumptions are made and there’s really no way to know how the investment will pay off until it gets delivered. There are no sure bets in product management.

As a product manager or business owner, several of the ideas you have for your customer will turn out to be useless to them.

Over the 19 years that I’ve been working with companies to deliver IT products and services, I’ve never been satisfied with the waste I see in the processes many organizations use to bring ideas to their customers. So much time is spent prioritizing, evaluating costs, scheduling, and tweaking the efficiency of resources, but a healthy dose of humility is often the missing ingredient – and the most costly to ignore. When sure bets turn into failures, the finger is often pointed first at the cost of the investment and the decisions of engineering. The reality is that as a product manager or business owner, several of the best ideas you have worked out with your customer will turn out to be useless to your larger audience. Customers may even tell you it’s the best thing they have ever seen, sign up to be early adopters, and give you positive feedback on what you’re thinking of charging. But by the time you release it to them, you’ve spent heaps of capital, and it can cost you your job (or your equity) to find out you were wrong.

Taking risks in the market is necessary for a business to thrive. Innovation in small businesses drives the economy of the world, despite what larger corporations would have you believe. When a tiny idea explodes to have a big impact, we all benefit as new opportunities are created that result in exceptional economic growth. But even large organizations can have innovative, disruptive ideas flourish as pockets of entrepreneurial spirit win internal support to try something new. The barrier that must be overcome is risk aversion. With the biggest opportunities for growth requiring us to take some chances, why would anyone want to go about delivering their IT offerings using a process that penalizes change and the risk of unexpected outcomes?

Stop the bleeding

“Mr Corleone is a man who insists on hearing bad news at once” – The Godfather

The good news is that it doesn’t have to be this way. When a product that requires physical manufacturing is getting ready for market, everything has to be solid in the design, and any changes along the way can bankrupt the idea. If a mold for carbon fiber parts costs millions and a design change is needed after the first one has been manufactured, the entire cost of the mold is lost. IT assets aren’t at all like this, though companies continue to think in this mindset. The inventory in an IT asset is information, and information is cheap and easy to change.

Despite the low cost of information as inventory, the time invested to create and change that information is still a concern. If an IT asset is delivered to the customer and found to lack market fit, the more that was invested in the idea, the more is lost. Knowing this, we should seek delivery processes that let us know as soon as possible that what we’re delivering is not on target, and that enable us to make changes with the lowest possible loss. Many organizations employ usability studies and surveys to validate that an idea is viable before delivering it, and this is certainly a great practice. However, I’ve seen customers rave about a design on paper or a prototype – and then hate the product once it was delivered to them. You really need to deliver an actual, working idea to measure adoption in the market and get feedback.

The challenge

To deliver IT assets more frequently requires organizational change, relentless communication, investment in some technology, and the ability to handle the increased volume of customer feedback that comes with doing so. This isn’t something that can be done all at once, and a challenge I deal with when helping clients is balancing the momentum of the changes needed against not being too disruptive to existing business. When I’ve seen teams attempt this without outside help, politics and inexperience with cross-functional collaboration can doom it to being seen as just another pet project of an ambitious employee who didn’t understand “our culture”.

So many ways to fail (and be profitable)

When a team uses Continuous Delivery to move towards frequent releases for their customers, they invest in the creation of a deployment pipeline – which is essentially an automated release process. This is more than an automated build, which may compile some code and deploy it somewhere. Does someone ping the web servers after every release to make sure things are up? Automate it. Does someone change a setting in a configuration file to publish a mobile app to the right store? Automate it. Do a group of three people have to approve a build before it goes into production? Automate moving the build downstream once their electronic approval is recorded.
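
For instance, the “ping the web servers” check might be automated with something as small as this sketch (the endpoint URLs are placeholders for your own):

// A minimal post-release smoke test: fail the pipeline if any site is down.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class SmokeTest
{
    public static async Task<int> Main()
    {
        var endpoints = new[]
        {
            "https://example.com/health",   // placeholder URLs
            "https://example.com/api/ping"
        };
        using (var client = new HttpClient())
        {
            foreach (var url in endpoints)
            {
                var response = await client.GetAsync(url);
                if (!response.IsSuccessStatusCode)
                {
                    Console.Error.WriteLine($"FAILED: {url} returned {(int)response.StatusCode}");
                    return 1; // a non-zero exit code stops the pipeline
                }
            }
        }
        Console.WriteLine("All endpoints healthy.");
        return 0;
    }
}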

With the pipeline in place, there are dedicated stages that each build of your IT assets go through prior to being released to customers. Along the way, an increased level of automated scrutiny is placed on it to fail as soon as a problem indicating that this version is not ready for prime time is found. Here are just some of the ways a delivery pipeline can help you be more profitable by causing failures to occur earlier in the release process.

Failure: Defect found in production
Cost: Must roll back the system to a good state, interrupt new work, and repair customer confidence.
Remedy: Automate acceptance tests in the deployment pipeline that run in environments prior to production.

Failure: Flawed design pattern identified
Cost: All assets that used the pattern must be changed and re-tested.
Remedy: Initially release a small number of assets that use the pattern to customers.

Failure: Reduced desire for a feature
Cost: Lost investment in what was built.
Remedy: Invest only in a minimum viable feature before releasing to customers.

Failure: Slower output from the team than expected
Cost: Must adjust forecast delivery dates and improve the capability of the team.
Remedy: Don’t commit to dates for sprints far in advance, measure team output more frequently, and get training or make hires.

Failure: Insufficient performance at capacity
Cost: Must purchase new hardware, or optimize IT code and assets.
Remedy: Automate performance acceptance tests in a capacity testing environment.

These are just a few examples of how a delivery pipeline, and a team that has been organized to continuously deliver IT assets while aligned with their customer, can save money by failing faster. The cost of finding a defect prior to production is less than the cost of finding it there. The cost to change a design pattern with low penetration in the overall architecture is lower than after it has been applied broadly. The cost to find or train resources to better support the needs of a team is less than unknowingly utilizing their low throughput over the lifetime of a product’s development. All of these early identifications of failure add up to dramatic savings in the overall cost of finding that critical market fit, and of working in a sustainable fashion to support the needs of the business.

By reducing the cost of failure, the ability to handle risk goes up and the opportunities for efficiency are numerous. If your organization is not taking steps towards a more fluid release process, it is only a matter of time until your competition’s agility will enable them to blow past any market advantage you currently have.

How frequent releases help you satisfy and retain your best employees

Think back to the last time you released to your customers. There was probably a brief feeling of satisfaction, hopefully a validation from the customer that you delivered what they wanted, and your team learned a thing or two about how effective they are at deploying and testing the changes that were delivered with the release. Soon afterwards, the team gets to start all over again and these lessons are forgotten.

If you think about a long term relationship, when two people haven’t talked for a while they can get nervous. “Does he still remember what I said about what we were going to do?”. “I wonder if they still feel the same way about me?”. This phenomenon is also present in product development, and the longer your team goes before releasing, the bigger impact these psychological (and measurable) effects have on the profitability of your business goals.

Keeping delivery staff satisfied

When a change is delivered to the customer that meets their needs, the team gets a big boost in motivation. The team is thanked and hopefully rewarded for their efforts, and sales and marketing have a great new story to tell. During this period of “delivery afterglow”, staff are intrinsically motivated to work harder as they feel a responsibility for and purpose to their job. After a while however, staff can revert to their instincts to question how important their job really is, leading to the “I’m just a cog in a wheel” mentality. “Was what we delivered really that big of a deal?”. “I wonder if we can repeat that success again?”.

When release cycles are short enough, this feeling of satisfaction becomes constantly present and creates the environment where people love to do their job not just because of their compensation, but because of the satisfaction they get out of doing it through regular positive feedback. An environment like this is contagious – staff outside of the successful team want to learn their methods and repeat their success, and staff on the successful team are happy to talk about it with friends and potential customers.

What have you done for me lately?

From the customer’s point of view, they experience similar emotions that impact the profitability of your delivery efforts. When a long period has elapsed since the customer last received a release with changes, they may begin to wonder if you still understand their vision. “Do they know what’s changed in my market since the last time we met?”. “I hope they understood what I said!”. “I wish I could talk to them more without bothering them!”. When release cycles are long, this risk of changing priorities and incorrect assumptions carries a higher cost. When a manufacturing plant releases a defective part to the next station, quality gates in the next process often catch it and stop the line from moving it forward. In product development information is our inventory, and our customers are the quality gate that matters most.

If an incorrect assumption is made about a change or feature and not validated with the customer early, hundreds of other changes can be based on this incorrect assumption and they are all impacted if the customer finds them to not be valid once released to them. Engineers can lose motivation dramatically if they release a big feature that took a long time to implement only to find that it all has to be reworked. It makes both economic and behavioral sense to release more frequently to ensure that the relationship between the team and its customers is aligned as often as is reasonable.

Practice makes perfect

When releasing changes to the customer, there are delivery costs that are only incurred at the time of release. These usually include things like deployment, user acceptance testing, updating user documentation, and gathering feedback. When release cycles are long, these activities are infrequent, so there is low motivation to get better at them. If a team only releases to their customer once every 6 months, they feel the pain of these activities rarely and are willing to see them as a necessary evil that isn’t worth the time to improve. When releases are more frequent, the cost of manual or inefficient delivery processes is more apparent, and staff can more clearly see the need to make them as efficient as possible.

Optimizing release process costs is doubly profitable: it reduces the cost of performing the process, and it shortens the time to return on investment because a smaller percentage of each cycle’s effort is taken up by these processes, enabling more value to be delivered per release. It also increases staff job satisfaction, because more time is spent delivering value that is the direct result of the innovation inherent in creating IT assets, not drudging through excessive process overhead that nobody was motivated to optimize.

Motivating for excellence

A final important consideration that impacts staff retention and job satisfaction is the opportunity that frequent releases create for evaluating competence. When release cycles are long, staff can report being done with tasks, but this is not truly verifiable until the work is released to the customer. Many organizations realize the inherent value in retaining top talent but struggle to know how best to help them grow. One of the best ways to help our employees be more effective is to have regular checkpoints during which to measure both quantitative and qualitative outcomes. An IT asset release is a perfect time to do this.

The SCRUM methodology encourages teams to hold a sprint retrospective meeting during which the team can be candid about their successes and opportunities for improvement since the last iteration. When iterations of effort are reported as “complete” at the end of each sprint but not released to customers, unchecked problems with deployment and missed alignment with customer needs are not caught, and can leave a team with a misplaced perception of success. This perception is then corrected when the release actually occurs, and the reset between what was perceived and what is real can be harsh and demotivating.

Rather than delaying the inevitable, releasing to customers more often before starting new work gives leaders a better gauge for where staff are doing well and where their skills may need help. An effective manager will use this period both to reward and compliment staff on their improvement and to be courteous in helping them see where they need to improve. Though this increased confrontation with competence can initially be met with uneasiness, it soon becomes a regular part of work, and employees begin to expect honest feedback and to be complimented and rewarded when they improve.

Top 5 business myths about Continuous Delivery

When a team decides to try reducing the time it takes for their ideas to get to their customers (cycle time), there are a few new technical investments that must be made. However, without business stakeholders supporting the changes in a SCRUM approach that delivers frequent releases, decisions and planning are driven by gut feel rather than quantifiable outcomes. The following are the top 5 myths I encounter (and often address when coaching) among staff who are not solely technically-focused as they begin adopting Continuous Delivery.

#5: By automating deployment, we will release more profitable ideas.

Automating the deployment of IT assets to eliminate low-value activities like manual configuration and deployment (with their risky, error-prone human intervention) certainly can eliminate wasted capital in the process of releasing IT offerings, and is a key practice of Continuous Delivery. However, if releases remain infrequent, the cost of delaying their availability to customers adds risk: their viability in the market may no longer be what was theorized when they were planned.

Once release frequency improves, measurement of customer impact and proper work management (specifically, appropriate capacity planning and calculating the cost of delay for potential features) must be done to ensure that ideas that turn out to be misses in the market stop being worked on as soon as they are identified as bad investments. It is this harmony of smart economic decisions about investing in an idea, combined with the technical benefits of an automated deployment pipeline, that transforms the profitability of an IT value chain.
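
To illustrate the kind of arithmetic involved, here is a tiny sketch of prioritizing by cost of delay; the numbers are invented:

// Illustrative only. If a feature would earn $10,000/week once live,
// every week it waits in the backlog costs $10,000. Dividing cost of
// delay by duration (often called CD3) favors short, valuable work.
public static class Prioritization
{
    // Higher CD3 scores should ship first.
    public static decimal Cd3(decimal costOfDelayPerWeek, decimal durationWeeks)
        => costOfDelayPerWeek / durationWeeks;
}

// Example: Cd3(10000m, 4m) == 2500, while Cd3(4000m, 1m) == 4000;
// the smaller feature ships first despite a lower total payoff.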

#4: We must automate 100% of our testing to have confidence in automating releases to production

Utilizing automated quality checks to ensure that changes to IT assets do not break existing functionality or dependent systems is certainly a good practice. A long manual test cycle is doubly problematic: it delays releases and adds risk since many teams try to get started on new work while testing is underway. When issues are found with a release candidate build or package being tested, engineers must stop what they are doing to troubleshoot and attempt to fix the problems.

On the flip side, automating the entire testing effort has its own risks: the team can cost the business large sums changing and maintaining tests whenever the design changes, which happens frequently in Continuous Delivery. Deciding on an appropriate test coverage metric and philosophy should be treated with importance, and testing should not appear in work estimates as separate line items – that only invites its removal in an attempt to cut costs. Cutting quality is often the final dagger in the throat of a struggling IT offering.

#3: The CFO requires us to release everything in the backlog to report costs

Many businesses treat IT investments as capital expenditures, since they can take advantage of amortization and depreciation to spread the cost of the investment over a longer time period. However, this assumes the investment provides a consistent return over the lifetime of its use to generate revenue. A SCRUM process for delivering IT assets aligns better with recording them as operating expenditures: a minimum viable offering is typically released with a low initial investment in the first few sprints, and the business makes ongoing “maintenance” changes to the offering as the priorities of the market and customer needs change. This is especially true today, with everything moving increasingly to cloud-based models for value consumption.

#2: We need a “rockstar” in each role to deliver profitable offerings

Many IT offerings that start with an idea are initially implemented by an expert in a particular technology or aspect of delivery, and the team leans on them early for implementation and expertise. As the complexity of a solution expands, the biggest drain on the profitability of a team is no longer the availability of experts or the high utilization of people’s time – it is the time work spends waiting in queues. There are several ways to reduce wait time when work with a high cost of delay is held up in a queue. The two methods I see deliver the most value are reducing the capacity utilization of team members, and enabling staff to work in more than one discipline.

When team members are highly utilized (planned capacity over 60%), there is no room for the highly-variable process of delivering IT offerings to absorb unknowns that were not identified during planning or design of a cycle of implementation. If the cost of delaying the availability of an idea is high, the cost increases further when the planned release date is missed. Rather than loading resources up to high capacity, leave them reasonable overhead to collaborate, tackle unforeseen challenges, and help each other if they finish early.

When team members are specialized, the probability of one member being blocked by another goes up dramatically. Work spends more time in a queue wasting money, not moving closer to being made available to customers where it can realize a return. Though you will always have team members with expertise in specific areas, resources who are willing to test, make informed product priority decisions, and help with deployment and automation are more valuable to an IT value stream than specialists. Use specialists when the work requires it, but scale more of your resources across multiple disciplines for sustainability.

#1: Until we release everything in the backlog, we won’t succeed with our customers

This myth is driven by the manufacturing mindset of treating IT offering delivery as though all features must be identified up front, and it misses the point of agile methods entirely. The backlog is a set of theories about what customers will find valuable at a given point in time. Any offering that takes more than one release to complete will have a working minimum viable product available to some audience, from which feedback can be gathered before it’s done.

Since the point of frequent releases is to get that feedback and let it impact the direction of the IT offering, planning to release everything in the backlog leaves no capacity for taking action on that feedback. If you only plan to release everything the business thinks is a good idea at the beginning of a project before letting customer feedback influence priorities, you are simply releasing milestones of planned up-front work – which is a classic waterfall delivery process.

Why you should use Migrations instead of Visual Studio 2010 Database Projects

If you work on an application that uses a database, chances are you have to deal with releasing new versions of your software that make changes to it. The SQL language provides comprehensive support for making these types of changes and can access even advanced features of your chosen database platform. Schema changes are typically made through CREATE and ALTER statements, and data movement is performed using SELECTs and INSERTs.

When releasing your software initially, deployment is straightforward as there is no existing data to deal with. As users exercise the features in your software, rows of data are added to tables, and future changes require more care to not destroy or make invalid changes to the existing data.

In the past, DBAs or developers with sufficient SQL programming knowledge have written scripts to make the changes necessary to update database assets that have existing data in them, paying special care to typical situations like adding a new NOT NULL column (you need to initialize it with data to enable the constraint), splitting one column into two, or splitting some columns of a large table out into a new detail table.

For years seasoned developers have used the following approach for making changes to the database:

  • Add a table with one row that stores the “version” of the database. This data is not really application data per se, but more like metadata that identifies the state the schema is in. This version is usually initialized to the lowest version where development starts, let’s say 1.0.0.0.
  • Create SQL scripts when you have changes that check this row. If the version of the database the script is running against is lower than the “version” of your script, make your changes.
  • When your script is done making the changes, increment the version number of the database row to its new version (1.0.0.1 for example).

The great thing about this approach is that it supports deploying changes to multiple versions of the same database. If you are “upgrading” a version 1.0.0.0 database and your latest version is 1.0.0.5, all scripts with numbers between these two versions run in ascending order. If you are “upgrading” version 1.0.0.3 to 1.0.0.5, scripts that apply to versions prior to 1.0.0.3 are skipped.
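
Here is a minimal C# sketch of that version-check pattern; the SchemaVersion table and the script collection are hypothetical, and batching concerns (like GO separators) are ignored:

// Runs, in ascending order, only the scripts newer than the database's
// recorded version, updating the version row after each one succeeds.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

public static class DatabaseUpgrader
{
    public static void Upgrade(string connectionString,
        IDictionary<Version, string> scriptsByVersion)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // Read the current "version" of the target database.
            var current = new Version((string)new SqlCommand(
                "SELECT VersionNumber FROM SchemaVersion", connection).ExecuteScalar());

            foreach (var entry in scriptsByVersion
                .Where(e => e.Key > current)
                .OrderBy(e => e.Key))
            {
                new SqlCommand(entry.Value, connection).ExecuteNonQuery();

                // Record the new version so future upgrades skip this script.
                var update = new SqlCommand(
                    "UPDATE SchemaVersion SET VersionNumber = @v", connection);
                update.Parameters.AddWithValue("@v", entry.Key.ToString());
                update.ExecuteNonQuery();
            }
        }
    }
}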

There are two gotchas with this approach:

  1. You need to test upgrading from any version you have in the field before deploying to that version. So if you are upgrading databases that can vary by 5 versions, you really need to test the upgrade process going from all 5 of these versions to the current version. This is more a consideration than a limitation as you always need to do this when supporting multiple upgrade paths for your software.
  2. Developers can make mistakes in their SQL script and look for the wrong version, or forget to update the version in the database to current if the operation is successful.

When Ruby on Rails was released, the ActiveRecord team, along with David Heinemeier Hansson, provided the then-emerging ruby community with a technology called migrations, which provides some extra help on top of this approach. Any time you want to change the database, you run a command at your operating system prompt that generates a new script prepended with a version number greater than that of the latest script in your source.

An example will help here.

  • You run the command “rails generate migration create_users” and a file 00001_create_users.rb is generated. You put code in here to both apply AND roll back changes related to the “users” table, for example.
  • You run the command “rails generate migration create_roles” and a file 00002_create_roles.rb is generated. Notice the tool recognized your latest script version and created a newer version automatically.

When you want to deploy to a database, you run another command “rake db:migrate” which tells the “rake” (ruby make) build engine to run all of your database migrations against the target database. The migrations engine automatically does the work of checking the target version of the database, running only those scripts that apply, and incrementing the database version to the latest one that succeeded.

This approach solves the problem of developers having to version things manually, and really simplifies deployment to multiple versions of a database. It also allows developers to incrementally make changes needed to support changes they are working on, without stepping on the toes of other developers.

Enter Visual Studio 2010 database projects

With the release of Microsoft Visual Studio 2010, another approach was provided to developers for managing database versions. This approach was made available by Microsoft’s acquisition of DBPro.

The VS 2010 DB project approach is to have a type of project in your solution containing scripts that can create all the artifacts in your database: create scripts for stored procedures, schemas, roles, tables, views, etc. The tool is sold as not requiring developers to know as much SQL programming; instead, they are given a treeview panel in the Visual Studio IDE (referred to as “Schema View” in the documentation). They can interact with this tree to add tables, rename columns, and make other trivial changes via a GUI, and these changes are then saved as new SQL scripts in the project.

When you deploy your DB project, an engine that is part of the build system in Visual Studio compares the target database with what a “new” database would look like based on the scripts in your project, and then generates a script to alter the target so its structure matches the project’s source code. The engine works much like Red Gate’s SQL Compare tool in that it is fairly intelligent about determining changes in schema and generating appropriate scripts.

At first glance, this seems like a superior solution: it gives point-and-click programmers more productivity, removes version management from the picture, and eliminates the need to manually create alter scripts. In practice, however, this approach by itself will not meet the requirements of most deployment cycles.

Microsoft released an ALM rangers guide to using Visual Studio 2010 database projects that is meant to be used as primary guidance for developers, DBAs, and architects looking at how to use best practices around VS 2010 DB projects. Part of this guide talks about “Complex Data Movement”, or what I will refer to here as “changing database assets containing data” because that’s really what they are talking about.

Unfortunately Microsoft’s solution for this “complex” scenario (which is common and routine, in my experience) is to subvert the diffing engine and revert to the use of temporary tables and pre/post build scripts – tricking the engine into thinking the schema doesn’t need to be changed, then fixing it up afterwards. This issue is described in the ALM Rangers guide, and also in Barclay Hill’s blog post here.

Jeremy Elbourn comments on the MSDN forums about why this approach actually makes maintaining database changes over time even more difficult than the migration approach in a real-world environment. Microsoft also recently announced the availability of database migration support in ASP.NET MVC 4 (but only if you are also using Entity Framework). These developments leave folks responsible for determining a database change management approach confused about where best practices are headed with respect to Microsoft’s vision.

It is my opinion that Visual Studio 2010 database projects should be avoided in favor of a migrations engine, for the following reasons:

  1. The success or failure of employing VS 2010 DB projects at real-world, enterprise-sized clients has yet to be demonstrated in any measurable capacity, and the technology is still relatively new. I’ve seen some press releases, but these are marketing announcements with no downloadable artifacts to evaluate. I have also been discussing the tradeoffs with a colleague who is using it on a single application for an enterprise client with many integrated applications.
  2. I tend to embrace tools that generate code for me or do work automatically only when they are comprehensive, well-understood, and have limited “gotchas”. Schema and data change management is a complex topic and the VS 2010 database project approach leads developers to think the solution is easy, while in practice it forces them to understand how the diffing engine works, the project structure and deployment lifecycle of a DB project build, and how to circumvent the diffing engine to change database assets containing data.
  3. The ALM guide proposes detecting existing schema state to determine when pre- or post-deployment scripts need to be run: “If this column exists, run this script”. This is an error-prone and naive approach. What if version 1 has the column, version 2 does not, and version 3 adds it back? This kind of check will fail. Ironically, the workaround is to come up with a custom versioning and incremental migration strategy for your pre/post build scripts anyway, which is a red flag to me that the design is flawed.
  4. VS 2010 database projects insulate developers from getting better at SQL, much like Web Forms did for HTML/CSS/JavaScript before ASP.NET MVC arrived on the scene. In my experience, developers are seriously lacking in database management skills and need to get better at all aspects of them. There are also several assets not supported by VS 2010 database projects (per the ALM Rangers guide) that need to be scripted manually anyway.
  5. The best time to write tests for changes being made to a database, and to review their impact, is when making the changes, while the structural impact is fresh in the developer’s mind. With the diffing tool, the generated alter scripts still need to be reviewed prior to deployment, especially if you don’t have a high-coverage functional and acceptance test suite to ensure the change broke nothing. Chances are you have an operations person reviewing the changes before running them in production, and without comprehensive testing you are relying on them to make sure the changes are appropriate. I hope you are working closely with operations during the entire development lifecycle in that situation!

Migrations work for all technologies and are simpler to understand and maintain

If you would like to use migrations today without adopting both ASP.NET MVC 4 and Entity Framework, ThoughtWorks created an open source tool, DBDeploy (with a corresponding .NET version, DBDeploy.NET), that they use with their clients and that handles this elegantly. The only difference between it and the rails migration approach is that rails migrations use a DSL for making changes, while DBDeploy uses SQL.

UPDATE (6/29/2012) I now recommend using RoundhousE as it has better support for more databases, uses .NET instead of Java, and gives you dedicated directories for stored procs, functions, and other assets that can get dropped and recreated each time without having existing data come into the picture.

Re-trusting check constraints in SQL doesn’t help for NULLABLE columns

I’ve been going through a large database for a client of mine, finding foreign key and check constraints that are marked “untrusted”. This happens when a relationship between two tables has rows with foreign key column values that have no match in the related table. When this happens, Microsoft SQL Server’s query optimizer can’t rely on the constraint to look up matches between the two tables when running queries, resulting in sub-optimal performance.

Unfortunately, as I discovered today, if the foreign key column accepts NULL you can still run a query to re-enable the check constraint without error, but it will remain marked “untrusted” in the catalog views and will not benefit from the query optimization available to trusted keys!
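
If you want to hunt these down yourself, here is a sketch that queries SQL Server’s sys.foreign_keys catalog view and attempts to re-trust each untrusted key; treat it as a starting point, not a finished utility:

// Finds untrusted foreign keys and re-enables them WITH CHECK, which
// forces re-validation of existing rows. Per the caveat above, keys on
// NULLABLE columns may remain untrusted even when this succeeds.
using System.Collections.Generic;
using System.Data.SqlClient;

public static class ConstraintTruster
{
    public static void RetrustForeignKeys(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // is_not_trusted = 1 marks keys the optimizer won't rely on.
            var find = new SqlCommand(
                "SELECT OBJECT_NAME(parent_object_id), name " +
                "FROM sys.foreign_keys WHERE is_not_trusted = 1", connection);

            var untrusted = new List<(string Table, string Key)>();
            using (var reader = find.ExecuteReader())
                while (reader.Read())
                    untrusted.Add((reader.GetString(0), reader.GetString(1)));

            foreach (var (table, key) in untrusted)
                new SqlCommand(
                    $"ALTER TABLE [{table}] WITH CHECK CHECK CONSTRAINT [{key}]",
                    connection).ExecuteNonQuery();
        }
    }
}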

Hopefully this helps someone out there to reduce the work you need to do when determining a data optimization strategy around dealing with existing untrusted checks.

The continuing saga of aligning .NET with rails – FactoryGirl.NET

While working on kinlighten, a side business I am starting up that runs on Ruby on Rails, I used a popular open source ruby framework for generating objects to use in tests. There are many popular frameworks available for rails, including factory_girl, machinist, and fabrication.

These frameworks assist with creating objects in your application’s domain model initialized with state appropriate for tests. They help by allowing multiple test cases to re-use the same object factories, reducing the lines of object initialization needed by tests.

The framework I used is factory_girl, and you use it by creating a single ruby file for each model’s factory. For example, if my domain had order, customer, and line_item models, I would have an order_factory, customer_factory, and line_item_factory. When I want an instance of an object from my factory in a test, I just call a single method to “build” one for me.

James Kovacs has created the FactoryGirl.NET class library to allow .NET developers to use factories as well. This is another of many recent steps (bundling and minification, migrations, etc.) to bring ASP.NET MVC productivity up to the level of rails.

In his framework, you would instead create a class for each model in your domain. So if you had an Order, Customer, and LineItem C#/VB class, you would have OrderFactory, CustomerFactory, and LineItemFactory classes that can be used to retrieve objects initialized with state appropriate for testing. These classes go into your Test class library project.
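
Here’s a sketch of what that looks like in a test project. I’m basing the Define/Build calls on the project’s README, so treat the exact signatures (and the Customer model) as assumptions:

// A factory defines the default state for a model once; tests then build
// instances and override only what they care about.
// (Add the library's using directive here; its namespace varies by version.)
public class Customer
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public static class CustomerFactory
{
    public static void Register()
    {
        // Define once per test run, e.g. in test fixture setup.
        FactoryGirl.Define(() => new Customer
        {
            Name = "Test Customer",
            Email = "test@example.com"
        });
    }
}

// In a test (API per the README; verify against the version you install):
//   var customer = FactoryGirl.Build<Customer>();
//   var named = FactoryGirl.Build<Customer>(c => c.Name = "Alice");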

Check out his article on creating FactoryGirl.NET, the github repository (it’s also on NuGet), and a great intro on how to use the original ruby version of factory_girl for more information. This is a first-class technology with heaps of success in the rails community, and I commend James for working to bring it to .NET.

Razor is sharp, but NHaml is still haiku for HTML 5 and JQuery

A colleague of mine recently told me about Razor, the view engine for ASP.NET MVC 3, and upon researching it and using it in a test project, I almost instantly came to compare it to NHaml, since I’ve been using HAML for several years doing rails on the side. What I found is that though Razor is the best view engine I’ve seen from Microsoft (on top of a great version 3 of ASP.NET MVC – nice job guys), I still believe NHaml’s syntax is significantly better suited for HTML applications, and even more so if they use JQuery.

Though Razor does a great job of requiring minimal characters to insert executable logic in between the markup it generates (and is basically equivalent to HAML in that respect), it does nothing to minimize the amount of code you have to write to express the HTML outside of those logic statements. NHaml is simply superior here when you are generating HTML, for this reason: it reduces markup to the minimal information needed to identify the JQuery or CSS selectors that elements have applied to them.

With NHaml you normally specify the name of an element without the angle brackets, but if the tag you want is a DIV element with an ID attribute, you can just specify the ID prefixed by a hash symbol and drop the DIV altogether.

<div id="blah"></div>

becomes:

#blah

This also works for CSS classes. It dramatically increases code readability because lines of code begin with the JQuery selector or CSS style name used to access them. When writing JavaScript or CSS, locating these elements in markup is much easier. And this is on top of the fact that NHaml drops the requirement for closing tags.

Here’s an example that I think illustrates the point. Let’s say I have a CSS style sheet with the following styles. Don’t worry about the attributes in the styles themselves; just look over the list of CSS selector names (container, banner, etc.) and think of looking at this file day to day:

#container { width: 100%; }
#banner { background-color: blue; } 
.topic { font-size: 14pt; }

Now here’s some HTML markup styled with selectors in the above sheet in HAML:

#container
  #banner

    .topic Hello

Here’s how you would generate the exact same markup using Razor:

<div id="container">
  <div id="banner">
    <div class="topic">Hello</div>
  </div>
</div>

Building on this let’s say we wanted to override the .topic style inline with some other styles, and throw in some inline JavaScript. Here’s HAML again:

:css
  .topic { font-weight: bold; }

:javascript
  alert('hello!');
#container
  #banner

    .topic Hello

and here’s Razor:

<style type="text/css">
  .topic { font-weight: bold; }
</style>
<script type="text/javascript">
  alert('hello!');
</script>
<div id="container">
  <div id="banner">
    <div class="topic">Hello</div>
  </div>
</div>

Hopefully you can see that the HAML is much easier to read, and it comes in at noticeably fewer lines of code in this example.

Here’s another great post from late last year that shows some comparisons of Razor and NHaml.
