The seduction of the two-week sprint

The more I look at teams that have two-week sprints, the more convinced I am that it is a bad idea. I know this is a controversial stance, but controversy has rarely stopped me from exploring an idea.

So let’s ask these questions: Are two-week iterations really a good idea? Is it a goal all teams should strive for? Is it OK for a company to mandate two-week iterations for all teams? The more teams I encounter, the more I start to see some problems with the two-week iteration, problems that no one seems to want to address.

Don’t get me wrong. For many teams, a two-week iteration is great. My goal here is to help you look under the covers and make sure that your two-week iterations are healthy ones, not ones that cause bigger problems than you might realize.

Below, I explore four points:

  • Whether it is OK to mandate two-week iterations.
  • The typical arguments in favor of two-week iterations.
  • The arguments against two-week iterations.
  • Recommendations on what to do with this information.

For the rest of this discussion, allow me to define two terms. A short iteration is a two- or one-week iteration, with two weeks being more common. A long iteration is a three- or four-week iteration, with three weeks being more common. OK, let’s dig in.

Is it OK to mandate a two-week iteration?

I remember when RUP (Rational Unified Process) advocated six- to nine-week iterations, then Scrum came along with its four-week iterations. Since then, many teams have gone down to two. SAFe (the Scaled Agile Framework) also advocates the two-week iteration, but in that case, it is just a suggestion, one that has worked well for most SAFe adopters. Now there are even teams that do one-week iterations!

As I mentor companies that are already doing agile, the two-week iteration is almost ubiquitous. Most of the clients I’ve worked with encourage two-week sprints as the default iteration for new teams, and some have even mandated it.

When I hear a mandate such as this, my hackles rise and my teeth grit. The word mandate has no place in my agile world. It undermines the idea of a high-trust environment and self-organizing teams. These are two concepts that many of my clients have a hard time adopting, especially as we go up the management chain. I like to teach my executives, managers, coaches, Scrum masters, and practitioners that “there are no musts and no can’ts in agile.” As an agile coach, I believe it’s my job to point out the risks and potential consequences of various choices, but it’s not my job to make decrees. This shows respect for and trust in the team, which usually lead to better results.

The consequences of a mandate

As an example, a team I once worked with was using agile beyond just the software teams. In one ceremony, someone said, “By the way, you must register for the picnic by Friday.” After the ceremony, the team members asked what I thought of the fact that they were using agile for non-software. I said it was great, but noted that they had used the word must in the meeting instead of pointing out the risk of not doing something. Frustrated, the woman who had said it asked, “What else should I have done? If they don’t register, they won’t get any food at the picnic! Oh, wait. I think I just got your point.” And she did! It is far more motivating to hear, “If you don’t register for the picnic by Friday, you won’t get any food at the picnic.” Subtle? Perhaps. But it illustrates the point.

So first and foremost, no, it is not OK to mandate two-week iterations. Instead, organizations should strive to point out the risks of a team not doing a two-week iteration. Interestingly enough, when they try, they often list debatable ideas such as “there is no ‘flat spot’ such as you get in a three-or-more-week iteration,” or the claim that longer iterations can lead to “catastrophic sandbagging” (a low-trust statement if ever I heard one!).

Beyond my initial, somewhat emotional reaction to the word mandate, the more interesting question is: Why was a two-week iteration seen as so critical by the organization's agile leadership team?

And what if the two-week iteration is, in fact, a bad idea for some teams? How can we tell if a two-week iteration is a good idea or a bad idea? Is it almost always good? Almost always bad? I have some recommendations and some strategies for answering these questions, but first, let’s look at some typical arguments in favor of two-week sprints.

Typical arguments in favor of two-week sprints

Short iterations lead to faster improvement

This is the most compelling argument I’ve heard for short iterations. Short iterations allow you to improve both your product and your process faster than long iterations. In terms of product, your stakeholders steer you more often due to more frequent demos. In terms of process, you reflect and adapt more often than with long iterations.

In addition, for new agile teams, it generally takes three iterations before the ceremonies all start becoming muscle memory. For short iterations, that is just three to six weeks away versus nine to twelve weeks for long iterations.

So to summarize, short iterations have these potential accelerators:

  • Pro 1: More frequent demos, which mean more opportunities to ensure the solution is aligned with real needs
  • Pro 2: More frequent retrospectives, which mean more opportunities to experiment and improve on the agile process
  • Pro 3: More frequent metric data points, which mean seeing trends sooner and taking corrective action sooner
  • Pro 4: More frequent agile ceremonies, which mean new teams will learn and internalize them faster

All four of these can certainly be true—but all four have some interesting counterpoints.

Pro 1: More frequent demos: There are three interesting problems here.

  • Problem 1: Stakeholder availability: Many teams find that their stakeholders can’t attend demos every two weeks. When some teams graph stakeholder attendance over time, it shows an overall drop in attendance versus long iterations. So we get fewer corrective opportunities instead of more! If your demos still inspire strong attendance, then your short iterations are looking good!
  • Problem 2: Abandoned demos: If demos feel overly frequent, teams might stop doing demos at all. If your team still feels that it can run relevant demos every one to two weeks, then this isn’t a problem.
  • Problem 3: Boring, overly technical demos: When talking about the content of a demo, teams with short iterations are more likely to report that their demos are of technical tasks, which aren’t very interesting to their stakeholders. Teams with long iterations are more likely to show demos that tie into the business more clearly. I’ll show you why this is later, in the “Terrible stories” section. If your demos are business/user-focused and exciting even when they’re frequent, then short iterations aren’t a problem.

Pro 2: More frequent retrospectives: There’s one issue here.

  • Problem 4: Too busy to innovate: Among the teams I have worked with that measure retrospective actions created and closed over time, those with short iterations tend to show lower values than those with long iterations. Analyzing why, I noticed a pattern: The short-iteration teams feel so much time pressure that they have no appetite for executing ideas from their retrospectives, which are lower priority. Long-iteration teams are generally more open to spending time on innovations. If your team is closing most of its retrospective actions even in short iterations, then there’s no problem.

Pro 3: More frequent metric data points: This is definitely true. However, I’ve noticed two interesting issues here.

  • Problem 5: No metrics: Teams doing short iterations are the least likely to generate metrics. So even though they could generate more metrics by having iterations more often, they don’t seem to do it! Yet long-iteration teams seem to have much more of an appetite for them. This is only a problem if your team isn’t generating metrics at the speed of its iterations.
  • Problem 6: False earned value: Earned value is all about how much value your team has earned versus planned. In waterfall, earned value is a mess, because teams earn value for closing tasks, not for producing valuable software. So lots of false positives are reported. Everything might look great, but it isn’t. Iterative development promised a better “earned-value metric” because value was earned only when valuable software was demoed, not when some technical tasks were done. With short iterations, however, teams often create stories that are not really valuable, so they are once again earning value for tasks. Bottom line: More granular metrics taken over short iterations can be as inaccurate as waterfall metrics were! If your metrics are truly for valuable software, then this isn’t a problem.

Pro 4: More frequent agile ceremonies

  • Problem 7: Bad habits: Short iterations can help you learn agile ceremonies faster, but their time pressure can also encourage you to form bad habits that are hard to cure. If you do short iterations and you’ve made it through Pros 1-3 without succumbing to any of the six problems above, then your team is probably not forming any bad habits.

The flat spot

Short-iteration advocates often say the middle weeks of long iterations have a “flat spot” where productivity goes down. Here are some exact quotes I’ve heard that accuse long iterations of having lower productivity:

  • “Short iterations have no flat spot, whereas the middle weeks of long iterations have a 70% higher chance of falling back to mini-waterfall within iterations.”
  • “Short iterations have a single weekend, acting like a mental pause—it refreshes. Longer iterations have multiple weekends that break psychological continuity.”
  • “Short iterations keep the pressure on. Long iterations begin the tendency toward ‘mañana’ syndrome, the ‘I can wait till next week because I have three or four weeks to go’ type of thinking.”
  • “Catastrophic sandbagging. A general human behavior is to save the hard tasks until the end. We see it all the time in two-week iterations. With longer iterations, procrastination increases, and since you've loaded more points into the iteration, pow! I have seen teams take seven to nine iterations to break the habit (double the three to four iterations it took for teams with two-week iterations).”

You get the idea. But when we share comments such as these with high-performing teams that use long iterations, they tend to laugh. Let’s dissect these statements.

  • “Short iterations have no flat spot, whereas the middle weeks of long iterations have a 70% higher chance of falling back to mini-waterfall within iterations.”

It is unlikely the speaker has actually measured this. Teams doing long iterations often see their highest productivity in the middle weeks. These are the weeks with no planning, demos, or retros to slow them down. Three-week-iteration teams especially claim that the middle week is their favorite because it offers a high-focus five-day run.

If you don’t believe me, run this experiment:

Simply compare the number of story points completed in three two-week iterations with two three-week iterations.
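Because both sides of that comparison span six weeks, the simplest way to read the result is as points completed per week. Here is a minimal sketch of the arithmetic. The function name and the sample numbers are placeholders I made up for illustration, not measurements from any team, and the comparison only makes sense for the same team using the same estimation scale.

```python
def points_per_week(points_per_iteration: list[float], weeks_per_iteration: int) -> float:
    """Average story points completed per calendar week across the given iterations."""
    total_weeks = len(points_per_iteration) * weeks_per_iteration
    return sum(points_per_iteration) / total_weeks


# Placeholder values only; substitute your own team's completed points.
two_week_points = [20, 22, 21]  # three two-week iterations (6 weeks)
three_week_points = [33, 36]    # two three-week iterations (6 weeks)

short_rate = points_per_week(two_week_points, 2)
long_rate = points_per_week(three_week_points, 3)
print(f"two-week iterations:   {short_rate:.1f} points/week")
print(f"three-week iterations: {long_rate:.1f} points/week")
```

Whichever way your own numbers fall, you will be arguing from data rather than from slogans.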

Also, note the cost of ceremonies. With short iterations, you hold one planning, demo, and retro session every one or two weeks, versus one every three or four weeks with long iterations. But be careful with this one. Planning for three or four weeks takes longer than planning for one or two weeks, as does the demo. The retrospective, however, tends to be the same duration for long and short iterations. So while the total cost of ceremonies is higher for short iterations, it is only a small amount higher.

Finally, look at the impact of a one-day holiday on a short iteration versus a long iteration. Your capacity takes a 10% to 20% hit per holiday during short iterations, but that is cut down to a mere 5% to 6.7% hit during long iterations! For a two-day holiday, the numbers double. The net effect of this is that short iterations are more difficult to plan when there is a holiday, whereas long iterations usually just ignore it. In other words, these teams have a more stable velocity per iteration. This is also true for sickness, personal holidays, and other individual days off. A missed day has a smaller effect on overall velocity per iteration. We’ve also seen coach absences having a higher impact on the velocity of short iterations than long iterations for similar reasons.
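Those percentages are just one lost day divided by the working days in the iteration. A quick sketch of the calculation, assuming a five-day work week and a holiday that takes the whole team out for the day (the function name is mine, purely for illustration):

```python
def holiday_capacity_hit(iteration_weeks: int, holiday_days: int = 1,
                         workdays_per_week: int = 5) -> float:
    """Percentage of an iteration's working days lost to holidays."""
    total_days = iteration_weeks * workdays_per_week
    return 100.0 * holiday_days / total_days


for weeks in (1, 2, 3, 4):
    hit = holiday_capacity_hit(weeks)
    print(f"{weeks}-week iteration: {hit:.1f}% of capacity lost to a one-day holiday")
```

Passing `holiday_days=2` shows the doubling for a two-day holiday.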

  • “Short iterations have a single weekend, acting like a mental pause—it refreshes. Longer iterations have multiple weekends that break psychological continuity.”

So one weekend is refreshing, but two or three are harmful? When measuring velocity for teams, we’ve seen no evidence of this.

  • “Short iterations keep the pressure on. Long iterations begin the tendency toward ‘mañana’ syndrome, the ‘I can wait till next week because I have three or four weeks to go’ type of thinking.”

In my experience, it can lead to the opposite mindset: “Whee! A week with no planning, demo, or retro! High velocity, here we come!” I’ve seen no evidence that it leads to procrastination.

  • “Catastrophic sandbagging. A general human behavior is to save the hard tasks until the end. We see it all the time in two-week iterations. With longer iterations, procrastination increases, and since you've loaded more points into the iteration, pow! I have seen teams take seven to nine iterations to break the habit (double the three to four iterations it took for teams with two-week iterations).”

I love this one. I mentioned above how it is a low-trust statement. Beyond that, the speaker states that this behavior crops up regardless of iteration length. RUP taught our teams to do high-risk work first, and DAD (Disciplined Agile Delivery) embraced this as well with its Risk-Value Lifecycle. The teams I work with break this habit in Iteration 1 if trained to think high-risk first. Which we do.

Bottom line: The flat spot seems more like a coaching issue than an iteration-length issue. If your short iterations have a stable velocity, then that’s fine. But long iterations are known for their stable velocities.

Arguments against two-week iterations

Terrible stories

This is probably the number one argument against short iterations.

Examine your stories. Are they “vertical” stories that deliver value to a production environment? If yes, then short iterations are great. But I find that most of the teams I engage have stories that are not valuable to the end user—they are valuable to the developers. Things such as “set up a database table,” “create a user interface widget,” “test component x,” or “get requirements for w” are important tasks, so they should be represented as tasks. But they aren’t stories. Watch for teams that don’t create tasks for their stories. Their stories may already be at the task level, so creating more tasks for them is confusing.

Take a look at the before-and-after example from a real team below. Do you understand the first three stories? How about the three stories after they chose to switch to vertical-valuable stories? Even someone with no domain experience can see which stories are valuable to end users. I see this on the vast majority of teams doing two-week sprints. Not all, but most.

Team X stories with mandated two-week sprints (examples from Sprint 6.2):

  • Story 1: CUBSO713—Modify mainframe to call CUMSI032
  • Story 2: IM exception handling to publish and update driver and DR program
  • Story 3: Testing of CUBSO713 using KISO708P

Team X stories after committing to vertical stories (examples from Sprint 6.5):

  • An ORS representative can view GS-GSD information
  • An ORS representative can link external applications leveraging task data
  • A PEAK representative can view stop refund information

The product owner was completely unable to prioritize the original stories, so the team suggested they should have a “technical product owner” rather than fixing the stories. After the team made the shift, the product owner found prioritizing the work extremely easy.

When I first suggested the team fix the stories, team members had plenty of arguments against it.

  • We’re a component team, so our stories have to be this way.
  • If we did stories as you suggest, they could never be done in two weeks!
  • How are all the other teams doing this? We know them. They can’t go to production in two weeks either! Their stories are just like ours.

So wow. All great points. But here is the scariest part. If teams write tasks as stories, then they can do two-week iterations, and their metrics will look good. But in fact, they are still doing waterfall, just with two-week planning sessions.

Also, beware of teams that split stories just before the demo to get partial credit on velocity. This should be more of an exception than the norm. And when a team does this, it should only get partial credit if it can demo part of the acceptance criteria. Then it should try to get better at planning and this type of splitting before the iteration starts, not at the end.

If your stories are vertical, valuable, and to-production every sprint, that’s great. And if you rarely split stories for partial credit and always split a story on acceptance criteria when you do, your short iterations are fine.

A rise in technical stories

I remember when a “technical story” was rare. With short iterations, we see a significant increase in technical spikes, technical stories, architectural runway, enabler stories, and other fancy constructs that basically mean “working ahead” instead of in-iteration. The truth is that these “technical stories, etc.” should just be tasks under a true story.

When teams had three or four weeks to work, they could get those tasks done and demo the story all within the long iteration. But with short iterations, if they put that task under a story, suddenly their stories would take more than one iteration, which breaks the definition: a story can be done within an iteration. So, in order to not break that rule, technical tasks have evolved into special technical stories far more often than before.

Now, think about how that affects the demo. Who wants to see these technical stories? A subset at best. Also, think about how that affects velocity metrics. Teams that count the points of these technical stories as velocity are basically claiming “earned value” for tasks! Sound familiar? Yup, that’s what we did in waterfall earned value: claiming value based on tasks instead of production releases. Yikes!

Teams that don’t set points on these technical types of stories have more accurate velocity/earned-value metrics. However, capacity planning takes a hit, since technical stories draw down the velocity. But like a black hole, they now do it invisibly.

Other teams put points on technical stories, but they only use them for capacity planning for the next iteration. They don’t count them in actual velocity or use stacked bar charts to distinguish them. Nor do they add an “impediments to velocity” slide that lists the technical stories and impediments that reduced velocity.

If you have few technical stories and mostly valuable stories and if you don’t count technical stories as velocity, then using short iterations will be fine.
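To make the distinction concrete, here is a rough sketch of reporting the two numbers separately. It assumes each completed story carries a flag saying whether it is a technical story. The `Story` record and its `technical` field are hypothetical and do not come from any particular tracking tool.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    points: int
    technical: bool  # True for spikes, enablers, runway, and similar work


def split_velocity(completed: list[Story]) -> dict[str, int]:
    """Separate points that delivered user value from points spent on technical work."""
    value_points = sum(s.points for s in completed if not s.technical)
    technical_points = sum(s.points for s in completed if s.technical)
    return {
        "value_velocity": value_points,        # the number to report as velocity
        "technical_points": technical_points,  # visible, but for capacity planning only
        "capacity_used": value_points + technical_points,
    }


# Placeholder stories and points, invented for illustration.
iteration = [
    Story("A representative can view refund information", 5, technical=False),
    Story("Spike: evaluate message queue options", 3, technical=True),
    Story("A representative can link external applications", 8, technical=False),
]
print(split_velocity(iteration))
```

Keeping the technical points visible avoids the black-hole effect, while keeping them out of velocity avoids claiming earned value for tasks.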

Impacted demos

All of the above stories are hurting demos. In companies that do this, I see decreases in demo excitement and attendance and increases in skipped demos. I also see some companies that stop doing demos altogether. This is all the result of poorly constructed stories.

If your demo attendance is high and the demo is considered to be of high value, then remaining on short iterations is fine.

Big requirements up front

Another problem is so-called story-grooming or story-refinement sessions. These happen well in advance of the iteration they will be developed in. That is moving teams closer to “big requirements up front,” a.k.a. waterfall. Back when RUP had six- to nine-week iterations, doing requirements in the same iteration was easy. Even the one-month Scrum iterations made that doable. But with two-week iterations, people are pulling requirements out to make it work, and pulling out architecture (technical stories, architectural runway, etc.) as well.

Worst of all is that the whole team is often not present at these story-writing ceremonies. The point of stories is to just be bookmarks for conversations. If the team isn’t present as the stories are created, they lose all that nonverbal requirements knowledge, which is the backbone of stories. Now stories start getting entire documents attached to them, rather than being trim little bookmarks.

If you do most of your story refinement (acceptance criteria, tasks, updated estimate) inside of the iteration where they are due, and those iterations are two weeks or shorter, then it’s fine if you continue using short iterations.

Abandoned engineering disciplines

Back in the waterfall and RUP days, we did a lot of disciplined engineering. UML sequence and class diagrams allowed our teams to integrate code on the first attempt. Today’s agile teams seem to have abandoned much of the engineering discipline that allowed us to create high-quality software. This includes abandoning disciplined design and architectural practices, which leaves bigger gaps between architectural constraints and designs.

Many of the teams I’ve coached never learned these techniques, and with two-week iterations, they had no chance to do so. If teams have a chance to build excellent engineering discipline, they will keep that discipline when they move to agile. Others succumb to the two-week deadline and start cutting corners.

If your engineering acumen is high before you start doing short sprints, there’s no problem.

Compromised quality

Our teams define quality with two major metrics: defects and value. In other words, a high-quality system has few defects and provides high value.

One pattern we are seeing with short iterations is an increase in defects and a decrease in value due to low-quality behaviors such as abandoning engineering practices.

Coupled with the previously mentioned lower demo attendance, abandoned demos, and overly technical demos, many short iterations are logging the worst quality metrics in both defects and value!

But if your short iterations have above-the-bar defect and value metrics, then you don’t have to worry.

Recommendations

So that was a lot of information. But at this point, you should be clear on what the issues are. In the end, you have to ask: What’s more important: short iterations, or potentially shippable software? I believe it’s the latter.

Next I will show you how to use the information we just explored. My recommendations are split into four sections:

  • Determine if your sprint length is good
  • Adjust your sprint length if needed
  • Get rid of component teams, at least virtually
  • Measure releases

Determine if your sprint length is good

Here is a summary of questions to help determine if your iteration length is helping or hurting your team.

You may need longer iterations if:

Stories ...

  • Are not valuable in the eyes of the business
  • Are not well formed (“A user can” or “As a role, I want to do an action so that I can benefit”)
  • Are too technical
  • Are horizontal component-oriented, not “vertical slices”
  • Resemble tasks but are called stories
  • Are lifecycle tasks (analyze this, design that, test X, deploy Y)
  • Involve spikes in every iteration
  • Can't be split well and can't be done in one iteration
  • Are often split late
  • Are started in one iteration and finished in the next
  • Have high scrap
  • Have high WIP (work in progress) and high lead time
  • Don't follow “INVEST” (independent, negotiable, valuable, estimable, small, testable)

Demos ...

  • Have low attendance
  • Have low perceived value
  • Are boring to your stakeholders
  • Are often canceled
  • Are very technical (except component teams)

Retrospectives ...

  • Generate actions that don't get implemented
  • Include experiments that are not measured

Team ...

  • Feels stressed out by iteration length

You may need shorter iterations if:

  • You’re demoing a lot of stories at a time, perhaps 12 to 15 or more
  • Your demos take longer than 90 minutes

Adjust your sprint length if needed

  1. Visualize your flow from idea to production.
  2. If that flow is four weeks or less, that’s your iteration length.
  3. If that flow is over four weeks, three thoughts:
    1. Look for ways to optimize. Can you get your flow to production down to four weeks?
    2. If the push to production is the bottleneck, change your definition of done to a preproduction environment and use the “potentially shippable” workaround.
    3. If test, architecture, requirements, etc. are your bottleneck, modernize your process.
  4. You can also use step three if your process is four weeks or less to see if you can increase your pushes to production. The most optimized companies, such as Amazon, can do 1,079 deployments per hour.
  5. If you absolutely can’t do any of this, some teams allocate tasks to two different iterations but only claim points for that story in the final iteration. While this is an anti-pattern, it’s still better than creating tasks as stories, and it still keeps the team focused on measuring value delivered.
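Steps 1 through 3 boil down to a small decision rule. The sketch below is one way to express it, assuming you have measured your idea-to-production flow in working days. Rounding up to whole weeks and the five-day week are my assumptions, and the function name is illustrative.

```python
import math
from typing import Optional

MAX_ITERATION_WEEKS = 4  # the four-week ceiling from step 2


def suggest_iteration_weeks(flow_days: float, workdays_per_week: int = 5) -> Optional[int]:
    """Suggest an iteration length in whole weeks from the measured flow.

    Returns None when the flow exceeds four weeks, which is the signal to
    work through step 3 (optimize, redefine done, or modernize the process).
    """
    weeks = max(1, math.ceil(flow_days / workdays_per_week))
    if weeks <= MAX_ITERATION_WEEKS:
        return weeks
    return None


for days in (4, 12, 20, 27):
    weeks = suggest_iteration_weeks(days)
    if weeks is None:
        print(f"{days} working days: over four weeks, so work through step 3 first")
    else:
        print(f"{days} working days: try a {weeks}-week iteration")
```

Treat the output as a starting point for a team conversation, not as a mandate.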

Get rid of component teams, at least virtually

Functional teams or feature teams are aligned with business or user-valuable work, so they can follow the above guidelines for stories as is. But component teams are generally creating a solution used not by customers or business people but by internal technical teams.

First, please know that feature teams are more desirable than component teams, since they are the ones that get true earned value. Also note that in companies with teams doing short iterations, I am seeing a rise in component teams, another side effect of time frames being too tight to create value.

The biggest problem with having component teams is they kill your flow. Each team has a backlog of work, prioritized however it sees fit. The true stories that span these teams still exist, but because they are not first-class citizens, they move much more slowly. Teams have to negotiate priorities among their various backlogs, and that leads to a higher cost of delay.

One technique to create real feature teams

  1. Write down the names of all the team members you have, regardless of what team they are currently on or what vendor they work for, etc.
  2. Using math (teams of no more than nine people), figure out the approximate number of teams you would have. Give each team a product owner and Scrum master. Then fill out the teams so they have all the roles they need to deliver from idea to production, while still staying under ten members.
  3. Or alternatively, have the product owners pitch their products to the team members and let them self-organize into teams aligned with the product owner they like.
  4. Those are your new teams! Give them fun names and prosper!
  5. Note: If you can’t reorganize your teams for real, do this exercise anyway and call them “logical teams.” Act as if you had permission, and then use them to create vertical stories and prosper!
  6. You don’t need permission to do this. You just need agreement from the team members, and then you just execute.
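The math in step 2 is a ceiling division followed by spreading people evenly. A minimal sketch, assuming the nine-person cap includes the product owner and Scrum master. The round-robin assignment is only a placeholder of mine. It ignores roles and skills, which you would still balance by hand or through the self-organizing approach in step 3.

```python
import math


def plan_teams(people: list[str], max_team_size: int = 9) -> list[list[str]]:
    """Split people into the fewest teams that keep every team at or under the cap."""
    if not people:
        return []
    team_count = math.ceil(len(people) / max_team_size)
    teams: list[list[str]] = [[] for _ in range(team_count)]
    # Deal names out round-robin so team sizes differ by at most one.
    for index, person in enumerate(people):
        teams[index % team_count].append(person)
    return teams


# Placeholder names, invented for illustration.
roster = [f"person_{n}" for n in range(1, 21)]  # 20 people
for number, team in enumerate(plan_teams(roster), start=1):
    print(f"Team {number}: {len(team)} members")
```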

If you absolutely can’t get rid of your component teams

You should still write well-formed stories and demo your work. But write stories that target your component’s users—typically, internal teams or internal roles. Since those parties will be using your component, the demos should still be very interesting to them. If they aren’t interested, that might be a problem!

Note that teams that don’t stick to one of the canonical story formats are much more likely to create tasks masquerading as stories. Component teams are even more prone to this. “A user can…” or the newer but more common “As a role, I want to do this action so that I can benefit” format will help to mitigate this habit.

If there is no external or internal user to tie your story to (see “A rise in technical stories” above), you may still want to do the “…so that I can benefit” part. But teams that just write whatever they feel like and call it a story are the most likely to be doing tasks instead of stories and thus are the most waterfall-like teams of them all.

Measure releases

The most important metrics an organization can use are release metrics. If you want to know if your move to agile is working, you need to measure release quality, predictability, productivity, and engagement. Going agile should lead to more frequent releases of better value. Iteration length doesn’t tell you if you’re doing more frequent releases, unless your iterations are always to production.

Consider one team that has four-week iterations and delivers every iteration, and another team that is on two-week iterations but delivers only once a quarter. For the second team, iteration length is irrelevant to how often it releases. If that team has good stories and demos, two weeks is fine. If it shows the symptoms above, let it pick a longer iteration. Either way, it only delivers once a quarter.

Free your team from the two-week default iteration

Short iterations can deliver great benefits in the form of faster learning cycles and higher-value solutions. But they can just as easily increase defects and team member stress while decreasing value, predictability, and productivity.

The metrics for two-week iterations can look great, but for reasons I mentioned previously, the metrics can be deceptive. For people who only look at the metrics and not the underlying quality of the stories (value delivery versus tasks completed), short iterations seem clearly better because they have higher “velocities.” And that’s the seduction. But if the story quality is, in fact, going down—if value is not truly being delivered but the measures seem great—we are being tricked into mandating two-week iterations at the expense of true success.

Trust your teams to decide their iteration length. Don’t mandate one. Ensure you use quality, predictability, and engagement metrics to gauge team health, not just productivity. And use the above test questions to ensure that your current iteration length is the right one for the team.

By the way, I’m sure you won’t find it surprising that my favorite iteration length is three weeks. It’s what my rookie teams use when we launch them. Unless they disagree, of course. Then I trust their choice and point out any risks I see.
