Keep your agile promises by validating acceptance and reducing your sprint workload

Most project teams have tried some permutation of an agile or SCRUM process by now, and a consistent theme amongst those I see on consulting engagements is a failure to deliver the work done in a sprint to users before starting the next one. Continuous integration, standup meetings, and backlogs are usually present, and some will even try test-driven development. But at the end of the sprint, the work is still not ready to deliver to users.

At the end of a sprint, there is a meeting that takes place where stakeholders and the development team get together to review the work that was done. More often than not, the stakeholders like what they see to some extent, but find discrepancies between what they thought they were getting and what was actually implemented. In every case the reason this occurs is a failure to establish acceptance criteria prior to doing the work.

An agile or SCRUM process is usually sold to the business as a way to improve communication with the development team and to shift priorities more easily. Because the backlog can be reprioritized before work is assigned to the next sprint, it allows more flexibility than a waterfall approach, where release cycles typically run from several months to a year. Additionally, higher quality is usually promised to the business.

Inexperienced agile teams may read extreme programming and story-driven approaches and like the fact that requirements are sold as only needing to be short statements and not detailed use cases as happens on a waterfall project. They often take this to the extreme, in that a simple description of the work is established in the backlog, and this is the only agreement that can be pointed to with certainty at any point in the process.

A user story in an agile or SCRUM development team’s backlog should be a promise for a future conversation. When a story is scheduled into the next sprint and that sprint starts, the first activity should be a conversation between the business stakeholder most intimate with the story, the developer who is going to do the work, and a QA person who is responsible for validating acceptance of the work.

During this conversation, the goal is to establish acceptance criteria. This focus on criteria provides several benefits. First, it allows the business more time to provide details about the story that they may not have originally communicated when it was placed on the backlog. Second, it allows the developer a chance to communicate technical challenges with the business’ vision, and gives the two parties a chance to come to a compromise in design that will sufficiently meet the needs of both. Third, it enables QA to think about possible ways in which the story may be tested, which often leads the business and development representatives to further clarity. The fourth, which is the subject of this post, is to establish what constitutes successful completion of the work.

Doing so does not require a use case or a large document. Rather, the group should be able to walk away with a description in English of ways in which the user or system will interact with the software that, if successful, validate that the work was done correctly. The level of detail you go into with your description is up to your team, but the more detailed it is, the more sure you can be that what is completed at the end of the sprint will be ready for delivery to users. If you are building a calculator, for example, you may establish several mathematical calculations that have to succeed for it to be considered acceptable. Not every possible calculation needs to be present here, just enough that it would be difficult to meet the acceptance criteria and still deliver a low quality feature.

Once these acceptance criteria are established, it is of great importance that the person responsible for user acceptance, typically a QA representative, works with the developer to write automated acceptance tests (highly preferred) or come up with a manual testing process that can be used to verify them. This work can be done before the code is written (test-first acceptance) or during (parallel acceptance), but do not leave it until the end. It is highly important that developers are able to execute the acceptance tests, whether automated or manual, several times during the sprint to gauge their progress towards completing the work.
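As a minimal sketch of what automated acceptance tests for the calculator example might look like, assume the story conversation produced a handful of agreed calculations. The `calculate` function and the specific criteria below are hypothetical stand-ins invented for illustration, and Python is used only because it is compact; any language or test framework your team already uses would do.

```python
import operator

# Hypothetical feature under test: a four-function calculator.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(left, symbol, right):
    """Stand-in for the feature the developer delivers during the sprint."""
    if symbol not in OPERATIONS:
        raise ValueError(f"unsupported operation: {symbol}")
    return OPERATIONS[symbol](left, right)


# Acceptance criteria agreed on by the business, the developer, and QA.
# Each entry is (description, left, symbol, right, expected result).
ACCEPTANCE_CRITERIA = [
    ("adds two whole numbers", 2, "+", 3, 5),
    ("subtracts into negative numbers", 2, "-", 5, -3),
    ("multiplies by zero", 7, "*", 0, 0),
    ("divides with a fractional result", 1, "/", 4, 0.25),
]


def run_acceptance():
    """Return descriptions of the criteria that are not yet met."""
    failures = []
    for description, left, symbol, right, expected in ACCEPTANCE_CRITERIA:
        try:
            actual = calculate(left, symbol, right)
        except Exception as error:  # a crash counts as a failed criterion
            failures.append(f"{description}: raised {error!r}")
            continue
        if actual != expected:
            failures.append(f"{description}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    unmet = run_acceptance()
    print("accepted" if not unmet else "\n".join(unmet))
```

Because the criteria live in one list, a developer can run the script several times during the sprint and see exactly which agreed behaviors are still outstanding.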

Because these acceptance criteria are established up front, they help developers focus on delivering precisely that functionality. They also reduce the chatter that otherwise happens as a developer attempts to get clarification from the business about details that were not there at the beginning. That being said, if the team is inexperienced with defining acceptance, the first few sprints may result in two undesirable side effects.

The first of these is that since developers are not used to having to deliver acceptance tests along with the work itself, there is a good chance that too much work will be scheduled in the sprint, and some of it may be late. It is of great importance that the entire team (the business, QA, and development) accept this possibility and use it as a learning experience to discover what a reasonable amount of work to deliver in a sprint looks like when it has to be accepted and deliverable before the sprint is over. The next sprint will likely deliver less functionality in the same amount of time, but will be done in time for the end of the sprint. This is the difference between a working agile process and what I’ve heard others call “agilefall” or “waterscrum”: the cherry-picking of practices from an agile/SCRUM process while failing to deliver on its promises.

The second side effect that may be felt when first implementing this process change on your project is that there will still be some gaps between what was delivered and what the business expects. Let me be clear here – it is perfectly normal, and actually a great benefit of using an agile process, that the business can see something every 2 weeks (or however long your sprint is) and upon doing so, provide additional detail and changes that can be scheduled for the next sprint. However, the entire delivery team needs to get better at articulating what they do plan to deliver in a way that can be acceptance tested and is clear to the developer, so what is agreed upon is not open to interpretation.

This subtle difference is important – it is unrealistic and illogical for the business to attempt to hold developers accountable for not delivering functionality at the end of a sprint for which acceptance criteria could not be defined. If the business wants developers to do a better job at delivering what they want, they must improve their ability to articulate it, or simply embrace the great flexibility that comes with an agile process to allow them to figure out more about what exactly they want every two weeks.

One more change needs to occur to your process to allow for the work that is done in the sprint, now backed by acceptance criteria, to be delivered to users at the end. The developer should allow for time to meet with operations personnel or whoever maintains the various environments (development, acceptance, and production for example) to ensure that they can actually deliver it to users at conclusion of the end-of-sprint meeting. The business may still decide that it is not ready for users from a functional standpoint, but the goal should be for the functionality delivered in each sprint to be of high enough quality to deliver to users immediately following conclusion of the sprint should they desire. Refer to my post about the dangers of making production an island and not optimizing your build process for quick escalation from a user acceptance environment into production for more information on how your operations team can deliver deployment scripts along with the development team.

Think for a moment about the net result of all of this. At the end of a sprint, QA can demonstrate that the functionality delivered meets acceptance criteria that not only they or a developer came up with, but that the business agreed to as well. We’ve all seen the project where QA said a feature was tested but the business is upset with them and the developers for what was delivered. Developers also do not have to feel any anxiety that what they deliver will not be acceptable. They should however be comfortable with the fact that upon seeing the work, the business may want things changed or additional functionality put in. This is exactly why businesses usually agree to trying out an agile/SCRUM process in the first place.

Even though less functionality is delivered at the end of a sprint than your team may be used to due to having to include time for defining acceptance criteria, building automated or manual acceptance processes, and getting that functionality deployed into a user acceptance environment – the net result is that your business truly will be able to deliver new functionality to users every two weeks. This outcome alone is more important than any of the individual agile or SCRUM processes – that of continuously delivering real value to your users.

Put all your environment-specific configuration in one place for a pleasant troubleshooting experience

Most software applications leverage a variety of third party libraries, middleware products, and frameworks to work. Each of these tools typically comes with its own method of configuration. How you manage this configuration has an impact on your ability to reduce time wasted tracking down problems related to differences in the environment in which your application runs.

Configuration as it relates to delivering your software really comes in two forms. The first of these is environment-neutral configuration. This type of configuration is necessary for the tool or software to work and doesn’t change when used in your development, testing, production, or any other environment. The second type is environment-specific configuration and is basically the opposite.

When your application is delivered into an environment, whether on a server somewhere or your users’ devices, troubleshooting configuration problems is much easier if all environment-specific configuration is in one place. The best way to do this is to create a table in a database, or a single file, that stores name/value pairs. For example, the “ServerUrl” configuration setting might be set to “localhost” in the development environment, whereas in production it’s some domain name you probably purchased.

The problem with adopting this at first glance is that most tools have their own method of configuration, so to make this work you need to find a way to populate their configuration from this database or file. Do this with the following process:

  1. Create a table or file named “ConfigurationSettings” or “ApplicationSettings” for example, that holds the name/value pairs for environment-specific configuration. You can use nested pairs, or related tables if you need more complicated configuration.
  2. Create a build script for each environment (T-SQL, PSake, MSBuild, rake etc.) that populates the table or file with the values appropriate for it. If you have 4 environments, you will have 4 of these files or scripts.
  3. When you target a build at an environment, run the appropriate build script to overwrite the configuration values in that environment with the ones from the script. Note that I said overwrite, as you want to prevent people in the field from changing the configuration of your environment without doing a build. This is because configuration changes should be tested just like code.
  4. For each tool or asset that you want to configure, create a build script (PSake, MSBuild, rake etc.) that reads the values it needs by name from the table or file populated in step 3, and updates the configuration in the format needed. An example would be updating a web.config file’s XML data from the data in the table or file, or applying Active Directory permissions from the data in the table or file.
  5. Create a page, dialog, or view in your application that lists all of the data in the configuration table or file. This can be used by your personnel to easily see all the environment-specific configuration settings in one place.
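The five steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite table in place of your real database, a Python dictionary in place of per-environment build scripts, and a web.config-like `appSettings` XML fragment as the one tool format being populated. The setting names, the values, and the `app.example.com` domain are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Step 2: one set of values per environment (hypothetical names and values).
ENVIRONMENT_SETTINGS = {
    "development": {"ServerUrl": "localhost", "LogLevel": "Debug"},
    "production": {"ServerUrl": "app.example.com", "LogLevel": "Warning"},
}


def populate_settings(connection, environment):
    """Steps 1 and 3: create the table if needed, then overwrite its contents."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS ConfigurationSettings "
        "(Name TEXT PRIMARY KEY, Value TEXT NOT NULL)"
    )
    # Overwrite rather than merge, so manual edits do not survive a build.
    connection.execute("DELETE FROM ConfigurationSettings")
    connection.executemany(
        "INSERT INTO ConfigurationSettings (Name, Value) VALUES (?, ?)",
        sorted(ENVIRONMENT_SETTINGS[environment].items()),
    )
    connection.commit()


def read_settings(connection):
    """Step 5: the flat list a settings page or dialog would display."""
    rows = connection.execute(
        "SELECT Name, Value FROM ConfigurationSettings ORDER BY Name"
    )
    return dict(rows.fetchall())


def render_app_settings(settings):
    """Step 4: write the values in the format one tool expects (web.config-like XML)."""
    root = ET.Element("configuration")
    app_settings = ET.SubElement(root, "appSettings")
    for name, value in sorted(settings.items()):
        ET.SubElement(app_settings, "add", key=name, value=value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    populate_settings(db, "development")
    print(read_settings(db))
    print(render_app_settings(read_settings(db)))
```

The design choice worth noting is that every tool-specific format is derived from the single table, so the table remains the only place anyone needs to look when two environments behave differently.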

This may seem like a hoop to jump through considering Microsoft and other vendors already provide environment-specific configuration files for some of their technologies, but I still encourage you to do this for the following reasons:

  1. When something goes wrong in one environment that works in another, it is much faster to look at a page with a flat list of configuration settings than to look in source control at a bunch of files or scripts that can be anywhere in your source tree.
  2. When environment-specific configuration is stored in source control as scripts, you have an audit trail of how those changes have occurred over time in the history of each file.
  3. Whenever you need a new environment, you can simply create a new script with data for that environment and you already have an automated means of populating the configuration mechanisms used by all of the tools and libraries you leverage.
  4. When you need to provide environment-specific configuration for a new technology, you can script setting it up and not worry about whether it supports environment specific methods out of the box.

Pay off your technical debt by preferring API clarity to generation efficiency

I’ve built the technical aspects of my career on combining two kinds of technology. The first is technology from Microsoft, which is easy to sell into enterprises that require the confidence that comes from its extensive support contracts and huge market footprint. The second is open source technology that steers the direction of the industry ahead of the enterprise curve, and is eventually embraced by enterprises as well.

Microsoft has always provided powerful tools for developers in their Visual Studio product line. They focus on providing more features than any other vendor, and also on having the flexibility to allow developers to design their software with the patterns that make the most sense to them. Because of this, the community is full of discussion, and there are always new ways to combine their technologies together to do similar things – but with quite a bit of variance in the architecture or patterns used to get them done. It can be daunting as a new developer, or a new member of a team, to comprehend some of the architectural works of art that are created by well-intentioned astronauts.

After I learned my first handful of programming languages, I began to notice the things that were different between each of them. These differences were not logic constructs, but rather how easy or difficult it could be to express the business problem at hand. Few will dispute that a well designed domain model is easier to code against from a higher level layer in your application architecture than a direct API on top of the database – where persistence bleeds into the programming interface and durability concerns color the intent of the business logic.

In recent years domain specific languages have risen in popularity and are employed to great effect in open source projects, and are just starting to get embraced in Microsoft’s technology stack. A domain specific language is simply a programming interface (or API) for which the syntax used to program in it is optimized for expressing the problem it’s meant to solve. The result is not always pretty – sometimes the problem you’re trying to solve shouldn’t be a problem at all due to bad design. That aside, here are a few examples:

  • CSS – the syntax of CSS is optimized to express the assignment of styling to markup languages.
  • Rake/PSake – the syntax of these two DSLs is optimized for expressing dependencies between buildable items and for creating deployment scripts that invoke operating system processes – typically command-line applications.
  • LINQ – The syntax of Language Integrated Query from Microsoft makes it easier to express relationship traversal and filtering operations from a .NET language such as C# or VB. Ironically, I’m of the opinion that LINQ syntax is a syntactically cumbersome way to express joining relationships and filtering appropriate for returning optimized sets of persisted data (where T-SQL shines). That’s not to say T-SQL is the best syntax – but that using an OO programming language to do so feels worse to me. However, I’d still consider its design intent that of a DSL.
  • Ruby – the ruby language itself has language constructs that make it dead simple to build DSLs on top of it, leading to its popularity and success in building niche APIs.
  • YAML – originally “Yet Another Markup Language” (the name now officially stands for “YAML Ain’t Markup Language”), it is optimized for expressing nested sets of data, their attributes, and values. It’s not much different looking from JSON at first glance, but you’ll notice the efficiency when you use it more often on a real project if you’ve yet to have that experience.
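To make the Rake/PSake item above a little more concrete, here is a toy sketch of a task DSL whose only vocabulary is “a task and the tasks it depends on”. It is written in Python purely for illustration and is not Rake’s or PSake’s actual API; the names `task`, `run`, and the four example tasks are invented for this sketch.

```python
# A toy, Rake-inspired task DSL. This is NOT Rake's or PSake's API; the
# names `task` and `run` are invented for this sketch.
TASKS = {}


def task(*depends_on):
    """Register the decorated function as a task with the named prerequisites."""
    def register(function):
        TASKS[function.__name__] = (function, depends_on)
        return function
    return register


def run(name, log, done=None):
    """Run a task after its prerequisites, each at most once."""
    done = set() if done is None else done
    if name in done:
        return log
    done.add(name)
    function, depends_on = TASKS[name]
    for prerequisite in depends_on:
        run(prerequisite, log, done)
    function(log)
    return log


@task()
def clean(log):
    log.append("clean")


@task("clean")
def build(log):
    log.append("build")


@task("build")
def verify(log):
    log.append("verify")


@task("build", "verify")
def package(log):
    log.append("package")


if __name__ == "__main__":
    print(run("package", []))
```

The point is the declarations at the bottom: each one reads as a statement about the build rather than as general-purpose code, which is the quality the examples in the list share.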

Using a DSL leads to a higher cognitive retention of the syntax, which tends to lead to increased productivity and a reduced need for tools. IntelliSense, code generation, and wizards can all take orders of magnitude longer to use than simply expressing the intended action in a DSL’s syntax once you’ve got the most commonly expressed statements memorized, which is feasible because the keyword and operator set is small and optimized within the context of one problem. This is especially apparent when you have to choose a code generator or wizard from a list of many other generators that are not related to the problem you’re trying to solve.

Because of this, it will reduce your cycle time to evaluate tools, APIs, and source code creation technologies based not on how much code your chosen IDE or command-line generator spits out, but rather the clarity in comprehension, and flexibility of that code once written. I am all for code generation (“rails g” is still the biggest game changer of a productivity enhancement for architectural consistency in any software tool I’ve used), but there is still the cost to maintain that code once generated.

Here are a few things to keep in mind when considering the technical cost and efficiency of an API in helping you deliver value to customers:

  • Is the number of keywords, operators, and constructs optimized for expressing the problem at hand?
  • Are the words used, the way they relate to each other when typed, and even the way they sound when read aloud easy to comprehend by someone trying to solve the problem the API is focused on? Related to this is to consider how easy it will be for someone else to comprehend code they didn’t write or generate.
  • Is there minimal bleed-over between the API and others that are focused on solving a different problem? Is the syntax really best to express the problem, or just an attempt at doing so with an existing language? You can usually tell if this isn’t the case if you find yourself using language constructs meant to solve a different problem to make it easier to read. A good example is “Fluent” APIs in C# or VB.NET. These use lambda expressions for property assignment, where the intent of a lambda is to enable a pipeline of code to modify a variable via separate functions. You can see the mismatch here in the funky syntax, and in observing the low comprehension of someone new to the concept without explanation.
  • Are there technologies available that make the API easy to test, but have a small to (highly preferred) nonexistent impact on the syntax itself? This is a big one for me, I hate using interfaces just to allow testability, when dependency injection or convention based mocking can do much better.
  • If generation is used to create the code, is it easy to reuse the generated code once it has been modified?

You’ll notice one consideration I didn’t include – how well it integrates with existing libraries. This is because a DSL shouldn’t need to – it should be designed from the ground up to either leverage that integration underneath the covers, or leave that concern to another DSL.

When you begin to include these considerations in evaluating a particular coding technology, it becomes obvious that the clarity and focus of an API is many times more important than the number of lines of code a wizard or generator can create to help you use it.

For a powerful example of this, create an ADO.NET DataSet and look at the code generated by it. I’ve seen teams spend hours trying to find ways to backdoor the generated code or figure out why it’s behaving strangely until they find someone created a partial class to do so and placed it somewhere non-intuitive in the project. The availability of Entity Framework code first is also a nod towards the importance of comprehension and a focused syntax over generation.

Why continuously deliver software?

Since I adjusted the focus of my subject matter on this blog over the past couple of weeks, one of the main subjects I’ve been talking about is continuous delivery. This is a term coined in a book by the same name. I’m attempting to summarize some of the concepts in the book, with an emphasis on how the practices described in it can be applied to development processes that are in trouble. I’ll also discuss specific technologies in the Microsoft and Ruby communities that can be used to implement them.

If you really want to understand this concept, I can’t overemphasize the importance of reading the book. While I love blogs for finding a specific answer to a problem or getting a high level overview of a topic, if you are in a position to enact change in your project or organization it really pays to read the entire thing. It took me odd hours over a week to read and I purchased the Kindle version so I can highlight the important points and have it available to my mobile phone and browsers.

That being said, I want to use this post to clarify what continuous delivery is not, and why you would use it in the first place.

Continuous delivery is not

  • Using a continuous integration server (Team Foundation Server, CruiseControl.NET, etc.)
  • Using a deployment script
  • Using tools from Microsoft or others to deploy your app into an environment

Rather, the simplest description I can think of for this concept is this.

“Continuous delivery is a set of guidelines and technologies that, when employed fully, enable a project or organization to deliver quality software with new features in as short a time as possible.”

Continuous delivery is

  • Requiring tasks to have a business case before they are acted upon
  • Unifying all personnel related to software development (including operations) and making them all responsible for delivery
  • Making it harder for personnel to cut corners on quality
  • Using a software pattern known as a “delivery pipeline” to deliver software into production
  • Delicate improvements to the process used for testing, configuration, and dependency management to eliminate releasing low quality software and make it easy to troubleshoot problems

I’ll continue to blog about this and I still encourage you to read the book, but one thing that really needs to be spelled out is why you would want to do this in the first place. There are several reasons I can think of that might not be immediately apparent unless you extract them out of the bounty of knowledge in the text.

Why continuously deliver software?

When personnel consider their work done but it is not available to users:

  • That work costs money and effort to store and maintain, without providing any value.
  • You are taking a risk that the market or technologies may change between when the work was originally desired and when it is actually available.
  • Non-technical stakeholders on the project cannot verify that “completed” features actually work.

When you can reduce the time it takes to go from an idea to delivering it to your users:

  • You get opportunities for feedback more often, and your organization appears more responsive to its customers.
  • It increases confidence in delivering on innovation.
  • It eliminates the need to maintain hotfix and minor revision branches since you can deliver fixes just as easily as part of your next release.
  • It forces personnel to focus on quality and estimating effort that can be delivered, instead of maximum work units that look good on a schedule.

And lastly: when personnel must deliver their work to users before it can be considered done, it forces the organization to reduce the amount of new functionality they expect in each release; and to instead trade volume for quality and availability.

Refactoring to the realities of your delivery process

If you are a developer that writes code (yes, some don’t), you’ve inevitably been boxed into the “refactoring justification corner”. At some point you realize that a task you’ve been assigned affects more than just the code you thought it did, and that you’ve got a deeper design change to deal with.

Earlier in my career, when this would happen I was at product companies and we would just work overtime, get help from another resource, or be late. When this started to happen more often, we’d include “refactoring time” in our estimates. These were both insufficient approaches, and led to management looking at refactoring as “you didn’t do it right the first time”, and us feeling like we were doing something wrong. This is a manufacturing driven economy mindset, with fixed effort and materials, that doesn’t account for the reality of software projects. But we also had things to learn.

I see refactoring as falling into two distinct categories. How you react to it when it pops up, and which type of refactoring you are encountering has a big impact on your options.

Functional refactoring

The first of these, functional refactoring, occurs when code you originally thought didn’t need to be touched creeps into the picture to complete functional requirements. Basically, if you don’t do this refactoring, you can’t make the feature work. The tension here is stronger when you work directly for the company making the product, because being a dedicated resource you are usually thought of as an expert and held directly accountable for your actions on the company as a whole. You made a rough estimate, got into doing the work, and found the effect on design is bigger than you originally envisioned. Leaders who are uneducated as to the realities of the trade see this as you not doing your job correctly.

Since I started consulting 5 years ago, I have been lucky to work for an employer who understands that this is simply the nature of the beast, and we have processes in place to deal with it. When you aren’t intimately familiar with a codebase, or even a set of classes in a codebase you deal with all of the time, the rough estimate is just that – rough. As a development or project manager, part of your job is to instill the understanding into your culture that building software is not like building a house: we are using materials that are unproven, trying to meet requirements whose goals conflict with each other, relying on personnel whose skills are evaluated subjectively, and encountering architectural “works of art” at times. In our statements of work, we include this opportunity for changes in complexity during the engagement as something clients must acknowledge as a possibility.

When this happens on a consulting engagement, we ask ourselves: can I do the extra work without disrupting the estimates for my next tasks? If so we just do it. If not, we schedule time to meet with the client, and explain the situation. At this point, we offer an estimate for the additional work, and give them a chance to either pay for the change, or opt not to do it. On large projects, we will occasionally give clients this work without additional fees, but it only happens once or twice, regardless of the size. Otherwise we get into a situation where many small changes add up to one big chunk of unpaid work.

At a product company, your process needs to be in line with realities of the trade in much the same way. Personnel should know that software development is one of the most unpredictable jobs in the world, and that they must be prepared to allow for extra time to complete tasks that turn out to have a greater cost. To do this properly, the organization must embed this into its culture, and developers have to feel safe that they can communicate this without being reprimanded. If development leads say it’s OK to communicate discovered extra effort, but they ridicule their developers every time they do it, they lose their respect and will have a hard time keeping their trust with future mandates or cultural changes.

The bottom line is that the business should be able to make factual decisions on what they want to pursue without expecting heroics to save them when unplanned complexities occur. If a task that was estimated to take 1 week blows up into something that takes 4, divide the new work up into smaller units and throw what can’t get done that iteration back onto the backlog. If the business can’t afford the overall effort completely, assign the developer a new task and throw it on the bottom of the backlog again. Agile and SCRUM processes allow for businesses to react quickly to market and technical changes – they do not predict the future or prevent development teams from encountering unknown complexity.

Cross-cutting refactoring

The second type of refactoring we encounter relates to nonfunctional requirements or patterns.

Before you start iteration one of your project, your business analysts or customer stakeholders should have requirements for nonfunctional aspects of the system. These include things like max response time (pages should load in under 2 seconds), throughput (the system should support 1000 requests per second to page x without causing other performance requirements to be exceeded), auditing (all changes to data should include who made the change, when, and what was changed), and archiving strategies (when do we purge old data). Testers should be able to help developers create automated acceptance criteria at the beginning of the project that run during later phases of your build process to ensure these are being met. You’ll need to create a separate environment that is a clone of production to measure these accurately.
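As one hedged illustration of turning a nonfunctional requirement into an automated acceptance check, the sketch below tests the “pages should load in under 2 seconds” example. The `load_page` function is a hypothetical stand-in that merely sleeps; a real check would request the page from the production-like environment described above, and the number of attempts is an arbitrary assumption.

```python
import time

# From the example requirement: pages should load in under 2 seconds.
MAX_RESPONSE_SECONDS = 2.0


def measure_seconds(action, attempts=5):
    """Return the slowest observed duration of `action` over several attempts."""
    slowest = 0.0
    for _ in range(attempts):
        started = time.perf_counter()
        action()
        slowest = max(slowest, time.perf_counter() - started)
    return slowest


def meets_response_time(action, budget=MAX_RESPONSE_SECONDS):
    """True when even the slowest attempt stays within the budget."""
    return measure_seconds(action) < budget


def load_page():
    """Hypothetical stand-in; a real check would request the page itself."""
    time.sleep(0.01)


if __name__ == "__main__":
    print("accepted" if meets_response_time(load_page) else "too slow")
```

Running a check like this in a later phase of the build makes a performance regression fail visibly, the same way a broken functional acceptance test would.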

Cross-cutting refactoring can also occur when patterns are not established prior to starting the project (or as part of the first few iterations). Something as crucial as your validation approach, error handling approach, data access strategy, dependency injection integration points, and security model should be established before any other features start getting built.

The reason for this early priority on patterns and nonfunctional requirements is that refactoring to meet cross-cutting requirements is one of the most expensive kinds to encounter, because it typically impacts most code assets in one or more layers or silos of your system’s architecture. If you’ve already built 50 forms and now change (or come up with) your validation approach, you’ve got 50 existing assets that have to be massaged into following the pattern. The forms may have been implemented in a way that was simple and met the initial requirements, but is not sufficient given the new cross-cutting ones. If you establish these cross-cutting requirements up front, the pattern is available to follow from the start of any task that touches it, which reduces the opportunity for waste through existing incompatible implementations.

As a development lead or manager, it is your responsibility to ensure that time is spent identifying cross-cutting patterns as early as possible on the project. Leverage your business analysts for the nonfunctional requirements, and leverage your developers to identify the patterns. If you don’t do this, it is no fault of your developers that “visible progress” in the project may slow to a standstill while they implement these items as you introduce them into the backlog.

It is for this precise reason that nonfunctional requirements, and the establishing of patterns, need to be backlog items that can be prioritized in their own right. This gives the business the power to decide if it is more important to accept credit card payments (functional), or to allow 1000 simultaneous requests (nonfunctional). Refactoring is an important tool to be used as necessary when you know what kind you’re dealing with, and why it has occurred. The better you get at understanding the causes for it, the more comprehensively you can plan to ensure a smooth delivery cycle as the iterations of your project progress.

Forgotten oracles – is fortune telling controlling your value stream’s destiny?

I’m taking a break from my posts on continuous delivery to talk about a related trend that continues unabated. Ask yourself, have you ever heard any of these phrases uttered in your delivery process (or perhaps said them yourself)?

  • “That algorithm is simple and doesn’t require testing.”
  • “Only a few of our customers use that alternative configuration, it should work.”
  • “I’ve used this pattern hundreds of times. It will work fine for this application.”
  • “There’s enough information in the requirements for any developer to implement this. Don’t schedule any time to do more analysis.”
  • “This component is too simple to need a code review.”
  • “We only found a few bugs in regression last time. We can’t afford a full test cycle. Get it out there.”
  • “They are late on their deliverable, but it will only take a couple hours to integrate once we get it.”
  • “If that happens it’s a catastrophic event. There’s no way to test for it, just disclaim our liability in the license agreement.”
  • “Production isn’t very different from staging. We’ll just fix it in production if we find something that breaks.”
  • “Our customers aren’t going to use that case much anyway. Just focus on the happy path.”
  • “Our customers use all of these features. If we don’t include them all in the new version, our customers won’t use the product.”
  • “Can you believe this single request performance? We will scale to thousands of requests!”

These statements are sadly made all the time. They are a symptom of the pressure teams feel when the software delivery process is thrown together, and is not gated with automated metrics or process steps that can’t be skipped. It’s easy for someone to declare “all code will be 80% tested before we ship”, but if you don’t have an automated process for making sure that’s enforced, it can be circumvented as release time nears.
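
To make the “80% tested” example concrete, here is a minimal sketch of a gate that a build could run so the rule cannot be quietly skipped. The function name coverage_gate and the line counts are hypothetical; a real build would read the numbers from whatever coverage tool it already uses, and would pass the returned code to the build runner so a failure stops the release.

```python
REQUIRED_COVERAGE = 80.0  # the "80% tested" rule from the text


def coverage_gate(covered_lines: int, total_lines: int,
                  required: float = REQUIRED_COVERAGE) -> int:
    """Return a process exit code: 0 if the build may proceed, 1 if it must fail."""
    if total_lines <= 0:
        return 1  # nothing measurable is treated as a failure, not a pass
    percent = 100.0 * covered_lines / total_lines
    print(f"coverage: {percent:.1f}% (required: {required:.1f}%)")
    return 0 if percent >= required else 1


# Hypothetical numbers for illustration: 790 of 1000 lines covered.
exit_code = coverage_gate(covered_lines=790, total_lines=1000)
print("build allowed" if exit_code == 0 else "build blocked")
```

The point of the sketch is that the decision is made by the build, not by whoever is under the most deadline pressure that week.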

There’s no way to realistically automate every decision (and I dread the day someone tries to sell me on a technology for that one), but when we predict the future, part of taking that risk is measuring it and looking for trends. This is business intelligence at its core, applied to your development process.

To validate predictions, it’s necessary to capture metadata related to every task assigned to members of your team (that includes you, management) as well as keep a log of predictions and outcomes. This log should include the predictions that were made, who made them, and what process or design decisions were made on the basis of each prediction. It should also describe a metric that can be used to judge, in some measurable fashion, what happened as a result of acting on the prediction.

As an example, if a prediction is made that one feature is more important to users than another, measure the usage in your software and calculate the results for analysis at regular intervals over the lifetime of the product. This is harder to do with shrink-wrapped products, but it can be done. Microsoft and Google will often allow you to “opt in” to providing additional feedback about their products.

If a prediction is made that a certain feature is stable enough to not warrant exhaustive testing, add that prediction to the log and measure the time spent fixing bugs related to it. You may find yourself surprised at the cumulative waste caused by a bad design or process decision where the impact comes as small individual costs instead of one big noticeable bang. The old saying “death by 1000 cuts” applies here.
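
One possible shape for such a log is sketched below in Python. The field names and the sample entries are assumptions of mine, not a prescribed format; the only idea being shown is that each prediction records who made it, what was decided because of it, which metric will validate it, and (eventually) whether it held.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Prediction:
    statement: str                    # what was predicted
    made_by: str                      # who made the prediction
    decision: str                     # process or design decision taken because of it
    metric: str                       # how the outcome will be measured
    held_true: Optional[bool] = None  # None until the outcome has been validated


def accuracy(log: List[Prediction]) -> Optional[float]:
    """Fraction of validated predictions that held; None if nothing is validated yet."""
    validated = [p for p in log if p.held_true is not None]
    if not validated:
        return None
    return sum(1 for p in validated if p.held_true) / len(validated)


# Hypothetical entries for illustration.
log = [
    Prediction("Feature A matters more to users than feature B", "product owner",
               "Built A first", "usage counts per feature", held_true=True),
    Prediction("The import module is stable enough to skip exhaustive testing", "dev lead",
               "Skipped full regression", "hours spent fixing import bugs", held_true=False),
    Prediction("Integration will take a couple of hours", "manager",
               "No integration time scheduled", "actual integration hours"),
]

print(accuracy(log))
```

Reviewing this number per person or per kind of decision at regular intervals is the prediction equivalent of reviewing estimates against actuals.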

The goal of capturing these predictions is twofold. First, it’s important that if a prediction turns out to be false, we know about it. When predictions result in actions, and the results of those predictions are never validated, we lose valuable insight into our ability to consistently make good decisions, and may keep making the same bad ones. Agile teams regularly review team members’ ability to hit estimates to help them estimate better. Shouldn’t the same be the case for predictions that affect our delivery process?

The second goal of capturing these predictions is to open our eyes to the available data for insight into the delivery process that we may not even know is there. Once you start tracking the outcomes of your predictions, you start looking at information to validate the results that may have never been investigated before, and this gives you more tools to use to make future predictions. There’s a ton of data out there in your requirements tracking, defect tracking, source control, and operations monitoring tools being captured but its value has not been fully leveraged. If you’ve got the data, it just makes sense to use it.

When you make production an island, it takes a long time to get there

My post yesterday touched on one of the subjects related to software development that has really crystallized some of the process breakdowns I see in too many organizations out there. There is much time spent measuring developer output, but missing the overall cycle of going from idea to users. When organizations begin to measure this, the next step is to measure the activities within.

Of all the phases in a typical delivery cycle for software, the most costly in improperly automated environments is that of deploying to production. We spend hours writing unit tests, maybe some integration tests, and perhaps even writing a full automated acceptance suite but still significant time is spent getting that code to work right in its eventual “production” environment.

Some signs that this might be happening to you:

  • Deploying to production keeps folks working long past the planned duration, involves numerous personnel and is a high stress event.
  • Code that was accepted in test doesn’t work in staging or production.
  • Things that work in production after the latest deployment don’t work in the other environments, and an operations person has to be contacted to find out what they changed recently.

Before I go much further, let’s define what I mean by production. In an IT department with internal applications, production may be a farm of web servers and a database cluster servicing one instance of several applications used by the organization. For a shrink-wrapped product, production will be your users’ computers. The cost on cycle time of not properly testing your application in its environment before delivering it can be significant.

Since production environments are the backbone of a company’s IT, operations personnel (or those of your customers) have a motivation for keeping things as stable as possible. Developers, however, are motivated by their ability to enact change in the form of new features. This tends to create a conflict of interest, and most organizations’ answer is to lock down production environments so that only operations personnel can access them. An alternative strategy, one outlined in continuous delivery, is to start treating the work operations does related to setting up and maintaining their environment with the same rigor and process as the software being deployed to it.

Life before source control – are we still there?

Consider an example. An organization has 4 environments – development, test, staging, and production. Development is meant to be an environment in which programmers can make changes to the environment needed to support ongoing changes. Test should be the same environment, but with the purpose of running tests and manually checking out the application like a user would. Staging should be the final place your code goes to verify a successful deployment, and production simply a copy of staging. You may be thinking already “I can’t afford a staging environment that has the same hardware as production!”.

It’s acceptable for staging not to have the exact specifications of production, but you should minimally try to have two nodes for every scalable point in the topology. If production has a cluster of 4 databases, staging needs to have 2. If production has a farm of 10 web servers, staging needs to have 2. With this environment in place, you are still testing the scaled points in your architecture, but without the cost of maintaining an entire cluster. This is obviously easier to do with virtualization, but take care not to use a staging environment that is significantly more or less powerful than production if you are using it for capacity and performance testing. You cannot run a staging environment with half the servers of production, double the performance you measure there, and assume production will provide twice the capacity. Computing capacity does not scale in a linear fashion as one might assume.

Continuing with the example, consider what work would be like without source control. Whenever you changed your code, you would have to manually send that code to every other developer and repeat the change on each of their machines. Maybe you could make things a bit easier by creating a document that tells developers how to make the changes to their code. This is ridiculous, right? Sadly this is exactly how many organizations treat the environment. A change made in one environment is manually made in all the others, and the opportunity for both lag between those changes and human error is large.

Making the environment a controlled asset

The way out of this mess is to start thinking about the environment as a product that deserves the same process oversight as the software being deployed to it. We spend so much time making sure code developers write is tested, but it’s just as easy to break production by making one bad configuration change. To get around this, we need to change the way the environment is managed and leverage automation.

  1. Create baselines of environment operating system images for each node required by your application (database server, web server, etc.). These images should have the operating system, and any other software that takes a long time to install, already set up. Don’t have anything pre-configured in these images that can change from one environment (dev/test/prod etc.) to the next.
  2. Create deployment scripts that you can point to a network computer or VM using datacenter management software (Puppet, System Center etc.). These scripts should install the baseline image on the target computer. Work with operations to determine the best scripting technology to use for them. Operations personnel typically hate XML, but using PSake (a PowerShell deployment extension) or rake is usually acceptable.
  3. Create deployment scripts that run after the datacenter management step and configure the environment suitable for your software. This includes setting up permissions, adding users to groups, making configuration changes to your frameworks (.NET machine config, Java classpath, Ruby system gems etc.).
  4. Create configuration settings that are specific to each of your environments. This would optimally be one database table, XML, or properties file with the settings that change from one environment to the next. Put your database connection strings, load balancer addresses, web service URLs etc. in one place. I’ll do a future post on this point alone.
  5. Create deployment scripts that apply the configuration settings to the target environment.
  6. Store all of these assets in source control (other than maybe the OS images, which should be on a locked down asset repository or filesystem share).
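
Steps 4 and 5 above can be sketched in a few lines of Python. Everything here is illustrative: the setting names, the hostnames (which use the reserved .example domain), and the render_config function are my own assumptions, and in practice the settings would live in a file or table under source control rather than in the script itself.

```python
from string import Template

# Step 4: the values that differ per environment, kept together in one place.
SETTINGS = {
    "test": {"db_host": "db.test.example", "service_url": "https://svc.test.example"},
    "staging": {"db_host": "db.staging.example", "service_url": "https://svc.staging.example"},
}

# The application's configuration, with placeholders for anything environment-specific.
CONFIG_TEMPLATE = Template(
    "connection=Server=$db_host;Database=app\n"
    "service_url=$service_url\n"
)


def render_config(environment: str) -> str:
    """Step 5: apply one environment's settings to the shared template."""
    if environment not in SETTINGS:
        raise ValueError(f"no settings defined for environment: {environment}")
    return CONFIG_TEMPLATE.substitute(SETTINGS[environment])


print(render_config("staging"))
```

Because the template is shared and only the values vary, a change to the shape of the configuration is made once and reaches every environment on the next deployment.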

Once this is in place, you should be able to point to any computer or VM on your network that has been set up by IT to be remotely managed and target a build to it. The build should set up the OS image and run all your deployment scripts. From this point forward, the only way any change should be made to the environment is through source control.

This change provides us with a number of benefits:

  • Operations personnel improve their career skills by learning to write scripts to automate changing the environment and these can be reused in all of the other environments. If you want to change the configuration of the database for example, this change once made in source, will propagate to ALL environments that are deployed to from the same build.
  • Developers can look in source control to see the configuration of the environment. No more sending an email to operations to find out what varies in production from the other environments.
  • Deploying new builds will test the latest code, with the latest database changes, along with any environment changes. This is the only way to really test how your application will run in production. Any problems found in staging will also be found in production, so you get a chance to fix them without the stress doing so in production adds.

There are a couple more things to mention here. First, if you are deploying shrink-wrapped software, you probably have many target environments. To really deliver quality with as few surprises as possible for your customers, you should set up automated builds like this for each variation you might deploy to. Determine minimum hardware requirements for your customer, test at this minimum configuration, and also test any variances in environment. If you support two versions of SQL Server, for example, you really should be testing deployment on an environment with each of these versions.

One more thing – for organizations in which production settings are not to be made visible to everyone, simply have a separate source control repository or folder with configuration settings for production, and give your build the permissions to pull from that repository (just the configuration) when setting up a production node. Developers will still need elevated permissions or to coordinate with more-privileged operations personnel to find the answer to their questions about how production is set up, but the code for applying environment configuration settings to the other environments will be accessible via source control, simply with different values than production.

Once you have an automated mechanism for setting up and configuring your environment from a build, you need a way to piggyback that process on top of your continuous integration server. I’ll leave that for my next post.

Cycle time – the important statistic you probably aren’t measuring

When teams develop software, they use products from other vendors to aid them in following their chosen process. These products usually capture data during development that can be used to create reports or do analysis, giving some insight into the team’s capability. We can answer questions like “how long did this bug take to close?” or “how long after this work item was created, was it marked as completed?”.

The most common statistic analyzed in agile teams is “team velocity” which is a measurement for how much your team can get done in one iteration (sprint). Managers love this statistic because it helps them figure out how efficient a team is, and can be used to calculate potential rough estimates for future availability of some feature.

However there is a much more important metric to your business related to software development, and to measure it correctly we need to redefine or at least clarify a regularly misunderstood word in development processes, and that’s being “done”. Too many teams I encounter work like this:

  1. Business stakeholder has an idea
  2. Idea is placed in product backlog
  3. Idea is pulled off backlog (at some future iteration/sprint) and scheduled for completion
  4. Developer considers the task “done” and reports this in a standup meeting
  5. Developer starts work on the next task
  6. Tester finds bugs 2 weeks later
  7. Developer stops his current task, switches to the old one, and fixes bugs
  8. Months from now, someone does a production deployment that includes the feature, and users (as well as business stakeholders, unfortunately) see it for the first time

The duration of time that has elapsed between the first and last step above is known as cycle time. This is an important statistic because it measures the length of time that it takes to go from an idea, until that idea is available to users. Only when the last step is completed is a feature truly “done” and due to a lack of embedded quality and deployment verification in most processes, often a team or individual’s efficiency is determined by omitting everything after #4 above.
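
The arithmetic is simple, and a small Python sketch shows how different the picture looks depending on where you stop the clock. The dates and the function name cycle_time_days are hypothetical, chosen only for illustration.

```python
from datetime import date


def cycle_time_days(idea_logged: date, available_to_users: date) -> int:
    """Days from the idea being raised (step 1) to users having it (step 8)."""
    return (available_to_users - idea_logged).days


# Hypothetical dates for one feature.
idea_logged = date(2024, 1, 8)          # step 1: stakeholder has the idea
developer_done = date(2024, 2, 2)       # step 4: developer reports "done"
available_to_users = date(2024, 5, 6)   # step 8: production deployment

print(cycle_time_days(idea_logged, developer_done))       # what often gets measured
print(cycle_time_days(idea_logged, available_to_users))   # the actual cycle time
```

The gap between the two numbers is the time spent in testing, rework, and waiting for deployment, which is exactly the part that stopping at #4 hides.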

It doesn’t matter if your team has developed 20 new features if they aren’t available to users, and they can’t be made available without significant disruption to ongoing work until they have sufficient acceptance tests. This is similar to lean manufacturing, in which you have inventory on the shelf that isn’t being used but this costs something to create and store. We can optimize our cycle time by measuring and working to improve all aspects of the process within the start and end of a cycle.

Reducing cycle time is a key tenet of continuous delivery, which seeks to automate and gate all the phases in your development process with the goal of improving an organization’s efficiency at delivering quality features to its customers. To improve cycle time, there are many things you can do, but I’ll start by talking about analysis and acceptance.

Analyze and accept during the sprint

Many development teams attempt to do requirements analysis on features before or while they are on the backlog, but before they have been added to a sprint. This is a mistake for a couple of reasons:

  • It spends effort on a feature that has not been scheduled for implementation. The backlog is about waiting to act on work until the last possible moment, to reduce waste and embrace the reality that up-front design (waterfall) doesn’t work.
  • It encourages managers to cram as much into a sprint as possible, assuming all developers need to do is “write the code” and misses the cost of doing analysis in measuring overall efficiency.

In reality, a feature should be added to the backlog and prioritized there without effort being attached to it. When that item becomes high enough on the list to schedule for the sprint, it is assigned to a developer and they work with a business analyst or tester during the sprint to write acceptance tests for the feature. These acceptance tests should be automated when implemented, but a tester should be able to write in English a description for what constitutes sufficient acceptance. Developers write the tests first, and then write code to pass the tests using test-driven development approaches.

Often teams new to this approach will schedule too much to get completed in one sprint. This is a learning experience and over time, you will get better at scheduling smaller units of work into sprints, and describing features at a level of granularity necessary for completion by a single developer. During this adjustment period, be prepared that features added to a sprint, once analysis and acceptance is done, will often be identified as too large to complete in the sprint and need to be split up into smaller tasks on the backlog – only scheduling the ones that can be developed AND acceptance tested prior to the end of the current sprint.

This may seem like a trivial process nuance, but the goal is to continually deliver new features to your users as quickly and with as few defects as possible. This can only be done if the acceptance criteria for the feature are clear, and there is a repeatable means for verifying them. Automated acceptance is a must here, as manual testing means a longer cycle time.

Once you start accepting this definition of being done, you can start to look at all the pieces of your process that make up cycle time and optimize them. Managers and development leads love to suggest ways that developers can be more efficient, but they rarely look at opportunities for process improvement in business analysis, testing, and deployment. Often, these are more costly to cycle time than development itself, which tends to be limited in opportunities for optimization by the skill of your resources.

I’ll go into more detail about individual practices within your software delivery process that can reduce cycle time in future posts.

Foregoing assumed value in favor of rapid feedback

The goal of developing any software should be to provide functionality useful to the majority of its users.

While doing business analysis or writing user stories for a feature of a project (especially those that are an attempted re-design of an existing one), it is important (and exciting) to brainstorm, be visionary, and think up great ideas for how you can please your customer base. However when planning those features for release, it is tempting to attempt to complete all of those stories before making the feature available to users.

The reasoning behind this argument usually sounds something like “our customers have used the product for years with these features, and they will not use it if they are not all present”. Another spin on this is “our competitor has these features and we will not be competitive without them”. There are several flaws in this argument.

  1. The argument assumes that users are currently using all the features. Unless you are measuring the use of the feature in the field (Google Analytics etc.) and have data to back up this claim, it is highly unlikely that a compelling offering could not be made available to users with a smaller subset of features.

    This applies to competitive analysis as well. Comparing your planned features to an existing product sheet will simply align you with them, which can be a disaster if many of their features are unused by their customers and you will now be spending money building them too. It also reduces your ability to differentiate yourself from them.

  2. The argument assumes that users will not provide accurate feedback on their needs of the software. When you choose to implement the kitchen sink around a feature, what you’re really saying is, “I know more about the user’s needs than they do, so I will decide everything to offer them”.

    When you go this route you spend excessive time getting to market, excessive capital implementing features that may not even be used, and place release cycle pressure on yourself by having a larger workload – making it less likely that you will be in the relaxed mindset necessary to listen to your customers and be able to respond to requests for changes.

    It’s more efficient and realistic to simply release the smallest subset of those features necessary to make initial use of them available, measure usage and gather feedback, and give users exactly what they want once they’ve used the feature. While it’s true that this approach can result in designs that are different from what you originally envisioned, your vision is not as important as the successful adoption of a feature by its users.

  3. The argument weights delivering assumed value over used value. What this means is that by focusing development on robust implementation of features that have not been even initially deployed to users, the backlog and priorities are being driven on assumed need. Even if your customers tell you they need a feature, unless you are measuring that they are using it in the field, and they are providing you with feedback that they like it, you are taking a risk with the effort needed to implement it. It makes sense to reduce that risk so that if you deploy a feature that turns out to not be useful, the lost capital is minimal.

Where I’m going with this is that organizations should spend serious time reviewing their backlogs of features, working with user experience experts to come up with designs that deliver the smallest, simplest design that accomplishes what you think the user needs and then get it out there. It is always more viable to bolt on a feature that you verify is needed after an initial offering than to spend money on assumptions only to find that it was a waste.

The 7 Deadly Sins of Backlog (Mis) Management

When a team or organization decides to go agile, one of the key practices to follow is letting a backlog drive the rhythm and order of development tasks. You can read my previous posts on backlog management and sprint execution to get an overview of how some typical teams I’ve worked with have used it to great effect.

Unfortunately however, many teams fail to understand the nuances of the backlog and that using it effectively requires strict adherence to a set of rules, or all the flexibility that can be gained with agile processes is lost. This post attempts to highlight some mistakes I’ve seen made on teams adopting a backlog that can be fatal for those that fail to heed them.

#1 – Starting the next sprint without communicating and getting buy-in from all parties

This one should never occur if you have read my previous post, or practically any description of a SCRUM/Agile process, but it’s amazing how often it happens. Technical leads can assign tasks to the next sprint and communicate them to their direct superior, without that manager or resource also communicating them to the rest of the business. I’ve seen this happen where 2 weeks into the sprint executives are upset about their feature not being delivered in the current sprint. I’ve also seen this happen where folks higher up in the organization add tasks to the next sprint and commit the development team to them without giving the development team a chance to sign off as well.

It is absolutely critical that anyone who can make a decision that impacts the feature set, design, implementation, or testing of the software sign off somehow on the tasks slated for development in the next sprint. It’s preferable to have a document or approval somewhere since early agile adopters need to be held accountable to their commitments to see the process work, and improve their analysis, estimation, and scheduling skills as a result.

#2 - Calling sprint tasks done when they have not been tested

This one gets controversial since some teams feel that testing for a feature built in one sprint should be done in the next, but I’ve found that doing it this way often leads to two problems. First, developers tend not to test their code, figuring they have an entire sprint after that to find and fix bugs. Second, developers tend not to involve QA early enough in the process. As the design of a feature emerges during a sprint, QA resources should be involved regularly at key decision points, so they can formulate test cases as soon as requirements or stories are agreed upon, even before they are built.

Inevitably, bugs will be found in code, so developers should have their total workload for a sprint reduced to leave some percentage of their time for fixing bugs and communicating with QA. Optimally, a sprint should include enough time for tasks to be completed with test cases created just-in-time for each decision, as well as at least one round of full testing of the feature. This also assists in determining at the end of a sprint whether work remains to complete a task. In the case that it does, a new task should be added to the backlog to finish the work that is remaining. This allows the backlog to be prioritized such that completion of the task can be deferred if it becomes less important when the sprint concludes. This leads into the next point.

#3 – Failing to carry over unfinished tasks into the backlog

In teams just starting to adopt agile, many times they are coming from a prior culture of finger pointing and unrealistic goal setting that results in a protectionist mindset. Developers and managers first adopting agile will have a tendency to want to show all tasks added to a sprint complete at the end of that sprint. The thought is that if they don’t finish the tasks, they can’t show progress to the executives further up who may be skeptical of agile. Unfortunately, this is the wrong approach and sets a precedent for the unrealistic goal of having that happen all the time.

Part of following an agile process is providing transparency into problems with estimation, design, and planning and this can only be done if people are allowed to make mistakes and improve. A developer who fails to complete a task on time will learn to estimate better in future sprints if given the chance, or just learn to hide their failure better if forced to act as though their tasks are complete when they really aren’t. A developer who begins to implement a feature only to find it is more complicated than originally intended or the scope missed key things will also learn to do better design work if given the chance, or they will continue to do a shabby job at defining things if forced to act as though they can still meet the deadline for ill-defined tasks. For a healthy team following the backlog, it should be normal for sprints to end with some tasks incomplete at the end-of-sprint meeting. It is the job of the development manager to communicate this and the reasons for allowing it to his superiors.

#4 – Failing to revise estimates for tasks during a sprint

This is related to #3 above and also embraces the reality of the unpredictability of software development. When a task is assigned to a sprint without the waterfall type documentation such as requirements documents, screen mockups, class diagrams, etc. it is completely normal for a developer to discover that the original estimate they provided was too small to complete the task in that sprint. At this point there are a number of possible solutions.

If the task is determined to be much larger in scope than intended, it can be deferred to the backlog and flagged as requiring a re-estimate. The task on the top of the backlog that can fit into the current sprint for that developer can then be assigned to be worked on in its place.

The developer could also provide an estimate for the time it would take to scope the new work (not implement it) and upon completion of that task, work with their superior to determine whether that work can be completed within the sprint, or started in the next sprint depending on the cost of the design and/or discovery effort.

Lastly, the developer can break the work up into smaller tasks, each with its own estimate, and place them back on the backlog, allowing the team to determine which of those sub-tasks is most important and can fit within the current sprint. This must involve all parties (see #1) to take into account technical dependencies.

#5 – Changing the tasks in the current sprint during development

It’s not uncommon for a team to start a sprint, only to have an executive attend a key meeting with a vendor or a salesperson close a big deal where the feature they just envisioned or sold is now the hot item in their mind. Without a backlog, the tendency is to shift all efforts to this feature. This can mean cancelling an entire project when done in waterfall fashion. With the backlog in place however, changes in prioritization that occur after the sprint has started must be placed in the backlog and prioritized accordingly, and not replaced or added to a developer’s current workload. There are several reasons for this.

First, the effort spent understanding a feature before starting it, and communicating with people to get information about it, is lost when a shift like this occurs mid-sprint. That information usually has to be re-communicated when the task is picked up again. The shift also lowers the morale and rhythm of the team, because members no longer feel “safe” that they can finish their tasks in a sprint without disruption. Executives and development managers should feel free to go wild with great ideas for their software, but if an idea changes the current design or adds to it, it must go on the backlog. Even if there is an impact on the feature currently being built, that impact is easier to absorb as a future refactoring than a disruption to the understanding, rhythm, and predictability afforded by the “lock-in” of a fixed set of tasks pulled off the backlog for the sprint.

#6 – Waiting until just before the next sprint to re-prioritize the backlog

Once you’ve read the above “sins”, you can see that a sprint is a highly organic activity. During the sprint, tasks may carry over to the next one, or the needs of the business may change. It is important that development managers keep communicating with stakeholders and regularly review the backlog for correct prioritization at key points of the sprint. Whether it’s once a day, once a week, or halfway through the sprint, make sure you take time to look at the backlog. As the project is developed, it is normal for insight into future tasks to arise out of the development of current ones. Capitalize on this and keep the backlog as up to date as possible. If you don’t, you risk having the team wait for executives and management to decide what goes into the next sprint when it should already be underway.

#7 – Committing to release dates based on far-off estimated sprints

One cool technique I’ve seen involves creating a backlog for each sprint several cycles out. Let’s say you have a backlog containing 50 tasks, and you are about to start sprint #2. You determine that your team can complete approximately 5 units of work (tasks, features, hours, etc.) per sprint. So you assign 5 units’ worth of tasks to each of sprints #2, #3, #4, and #5. With a little calendar manipulation it’s easy to come up with an estimated release date. But I strongly caution you against doing so with this many sprints. As you add more sprints of estimated, scheduled tasks, your ability to predict the release goes down accordingly. So if you only have 2 sprints left to go, and your team’s burn rate is pretty predictable, you are much safer committing releases to the business than with a larger number like 5 sprints’ worth of work. This is simply due to the unknowns that are found during the development of each sprint. I have found in practice that it is not enough to come up with some blanket number (like 20%) to account for unknowns and apply it to all of the sprints. A sprint 2 cycles out may contain a feature that, once started, turns out to take 3 times as long as originally planned because its true complexity wasn’t understood when it was added to the backlog.
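To make the arithmetic concrete, here is a rough Python sketch of how the release window widens as more sprints remain. Everything specific in it is an assumption for illustration: the `release_window` and `sprints_needed` helpers are hypothetical, the two-week sprint length is a placeholder, and the idea that a team planned at 5 units per sprint might really land anywhere between 4 and 6 is a made-up range. It also models only velocity drift, not the case where a single feature turns out to be three times larger than planned, so treat it as a lower bound on the uncertainty.

```python
from datetime import date, timedelta
from math import ceil
from typing import Tuple


def sprints_needed(remaining_units: int, units_per_sprint: float) -> int:
    """Whole sprints required to finish the remaining work at a given velocity."""
    return ceil(remaining_units / units_per_sprint)


def release_window(
    start: date,
    remaining_units: int,
    low_velocity: float,
    high_velocity: float,
    sprint_days: int = 14,  # assumed sprint length
) -> Tuple[date, date]:
    """Return (earliest, latest) release dates for an assumed velocity range."""
    fewest = sprints_needed(remaining_units, high_velocity)
    most = sprints_needed(remaining_units, low_velocity)
    earliest = start + timedelta(days=sprint_days * fewest)
    latest = start + timedelta(days=sprint_days * most)
    return earliest, latest


if __name__ == "__main__":
    start = date(2024, 1, 1)  # arbitrary example date
    # 10 units is 2 sprints at the planned 5 per sprint; 25 units is 5 sprints.
    for units in (10, 25):
        earliest, latest = release_window(start, units, low_velocity=4, high_velocity=6)
        spread = (latest - earliest).days
        print(f"{units} units left: {earliest} to {latest} ({spread} days of spread)")
```

Under these assumed numbers, the spread between the earliest and latest release date is one sprint when 2 sprints of work remain and two sprints when 5 sprints of work remain, which is the effect described above: the further out you plan, the wider the range you are really committing to.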

Hopefully it is obvious by now that going agile does not mean razor-sharp predictability, consistently hit deadlines, and zero scope creep. Rather, it embraces the reality of software development: it favors planning for that reality and building a team that has tools at its disposal for dealing with change, instead of a set of unrealistic goals that sound great in a meeting but fall apart and cost both time and money in practice. The payoff, with a little practice, is continual delivery of tested, complete features and a team that knows how to adapt to any situation without panic.