Even if you are on the right track, you will get run over if you just sit there.
– Will Rogers
As you can see from the title of this blog (or shall we get fancy and call it an article), it is about Agile Metrics. And as you can also glean from the title, this is yet another blog about the topic. However, I plan to be different – I’ll come at this topic purely from personal experience and reflection. Now I’m not saying the other writers do not reflect upon their personal experiences, but over the last eighteen months the teams I worked with have done an excellent job of defining and tracking metrics. These metrics do two things:
- Provide the stakeholders with insight into what the teams are doing, how well they are doing it, and what is coming when.
- Provide the team with measurements that can be monitored to identify and understand how the process of software development and deployment is working.
Both lead to consistency, trust, and continuous process improvement. Now don’t get me wrong, metrics alone don’t provide these things – but capturing metrics, measuring, and then taking stock of the measurements do help establish consistency, foster trust, and drive continuous improvement.
We chose six metrics to capture:
- Roadmap Effort Achieved. This equates to the enhancements, new features, or business value activities performed during the iteration, as opposed to defects and other technical tasks which, although required to sustain the product, don’t carry the value agreed upon with the Product Owner.
- Utilization. This is broken out into two metrics: (1) Ideal Utilization, which is the actual hours booked against all project activities divided by the team’s ideal hour capacity; and (2) Planned Utilization, which is the actual hours booked divided by the team’s planned capacity. The first measure generally falls between the mid 60s and low 70s in percentage terms, depending on the roles and the level of the team.
- Defects Raised. If this isn’t clear, then nothing is – just kidding, this metric tracks those defects that are born out of all touch points including level 3 support, testing, and those raised by developers that should have been caught during continuous integration cycles.
- Sprint Task Complete. This metric is similar to a sprint task burn down; however, it measures how close the team came to completing the tasks they signed up for. It does not include tasks introduced during the sprint – only those the team signed up for. The challenge we find with this measure is that if a team is small, the number can be skewed. However, this metric clearly indicates either too many cross-winds hitting the team, a lack of focus, or simply stories/tasks that are too large.
- Sprint Effort Complete. Effort relates to hours, so this gives a pretty good barometer of how close the team came to finishing the tasks. We measure this for two reasons: (1) a smaller team that completed only 1 of 3 tasks looked bad, when in reality it had finished 95% of the work and just needed another day or two to get it across the finish line; and (2) it allows us to understand the impact of estimates and cross-winds (a.k.a. support). The team still got recognized for the effort, but there was also recognition that more focus or a better breakdown of tasks might have helped complete what was committed.
- Estimate Accuracy. This type of metric is not unusual in most traditional PMOs; however, the goal here was to drive the behavior of breaking down tasks further and not estimating something you didn’t understand. This awareness helped the organization build features consistently and follow best practices surrounding analysis and design. We also found it valuable with prototyping, in that the team was able to break down the tasks in a fairly repeatable way and the estimates only got better during these periods.
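To make the six definitions above concrete, here is a minimal Python sketch of how they could be computed for a single sprint. It is not the tooling our teams used. The `Task` fields, the `sprint_metrics` function, and the sample numbers are all hypothetical, and three formulas are my assumptions because the descriptions above leave them open: Roadmap Effort Achieved is taken as roadmap hours over all booked hours, Sprint Effort Complete as estimated hours minus remaining hours over estimated hours (committed tasks only), and Estimate Accuracy as estimated over actual hours on finished tasks.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One sprint task. All field names are hypothetical."""
    estimated_hours: float
    actual_hours: float      # hours booked against the task
    remaining_hours: float   # 0 once the task is finished
    committed: bool = True   # False if the task was introduced mid-sprint
    roadmap: bool = True     # False for defects and other technical tasks


def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    return numerator / denominator if denominator else 0.0


def sprint_metrics(tasks, ideal_capacity_hours, planned_capacity_hours):
    """Compute the six metrics for one sprint (formulas partly assumed)."""
    booked = sum(t.actual_hours for t in tasks)
    roadmap_hours = sum(t.actual_hours for t in tasks if t.roadmap)

    # Only tasks the team signed up for count toward the completion metrics.
    committed = [t for t in tasks if t.committed]
    done = [t for t in committed if t.remaining_hours == 0]
    committed_estimate = sum(t.estimated_hours for t in committed)
    committed_remaining = sum(t.remaining_hours for t in committed)

    return {
        "roadmap_effort": ratio(roadmap_hours, booked),
        "ideal_utilization": ratio(booked, ideal_capacity_hours),
        "planned_utilization": ratio(booked, planned_capacity_hours),
        "defects_raised": sum(1 for t in tasks if not t.roadmap),
        "task_complete": ratio(len(done), len(committed)),
        "effort_complete": ratio(committed_estimate - committed_remaining,
                                 committed_estimate),
        # 1.0 means the finished tasks took exactly the estimated hours.
        "estimate_accuracy": ratio(sum(t.estimated_hours for t in done),
                                   sum(t.actual_hours for t in done)),
    }


# Made-up sprint: a small team finishes only 1 of 3 committed tasks.
tasks = [
    Task(estimated_hours=10, actual_hours=10, remaining_hours=0),
    Task(estimated_hours=20, actual_hours=19, remaining_hours=1),
    Task(estimated_hours=10, actual_hours=9, remaining_hours=1, roadmap=False),
]
metrics = sprint_metrics(tasks, ideal_capacity_hours=57, planned_capacity_hours=40)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The sample sprint mirrors the small-team case described above: Sprint Task Complete comes out at one third, while Sprint Effort Complete shows that 95% of the committed work was done. Counting every non-roadmap task as a defect is a simplification for the sketch; a real tracker would separate defects from other technical tasks.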
How have these metrics worked out?
These metrics have served us well; however, some of them can be frowned upon since they appear to be too focused on performance. What we’ve found is that you adapt the metrics as the teams mature and the inevitable organizational change occurs. For instance, Utilization was always a product of ideal hours – therefore, you generally saw 60-68% utilization for teams that performed project tasks and played a role in level 3 support. By plotting the utilization over time, we arrived at a median utilization rate of 67%. Once we felt this stayed level, we turned our focus to planned utilization – that is, are we planning our sprint capacity accurately and therefore filling up the buckets properly? We are an organization that uses story points; however, we’ve found velocity to be cyclical based on the phase of the project and possibly the opportunities brought forth by the business (e.g. several large deals get signed and we find ourselves involved in implementations that are not managed via Scrum, thus velocity collapses). Nonetheless, now that we’ve moved to looking at planned utilization, we are able to establish a level of consistency and trust with the business: when we say we have X capacity and can deliver by Y, the business is in lock-step saying the same thing because they too see the metrics.
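As a rough illustration of the "median, then wait until it stays level" idea, here is a small Python sketch. The function name `utilization_is_level`, the five-sprint window, the three-point tolerance band, and the sample history are all assumptions of mine for the example; in practice we judged "level" by eye from the plotted chart rather than with a fixed rule.

```python
from statistics import median


def utilization_is_level(history, window=5, tolerance=0.03):
    """Return (median, level) for a list of per-sprint ideal utilization rates.

    `level` is True when each of the last `window` sprints lies within
    `tolerance` of the median of the whole history. Both thresholds are
    illustrative assumptions. With fewer than `window` sprints there is
    not enough data to judge, so the result is (None, False).
    """
    if len(history) < window:
        return None, False
    mid = median(history)
    recent = history[-window:]
    level = all(abs(rate - mid) <= tolerance for rate in recent)
    return mid, level


# Made-up history of ideal utilization per sprint, not our teams' real data.
history = [0.60, 0.64, 0.68, 0.67, 0.66, 0.67, 0.68, 0.67]
mid, level = utilization_is_level(history)
print(f"median utilization: {mid:.0%}, level: {level}")
```

A check like this only makes sense once there are several sprints of data. With two or three points the median moves around too much to say anything.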
Well, we’ve learned that the data wasn’t useful in short cycles – it became useful once we had five or six sprints’ worth of data points, and only then could adjustments to process be made. Also, by this time, the reasons you would expect things to go sideways were very clear there on the charts (e.g. the defect rate went up prior to the release cycle, as feature development slowed and the focus turned to technical debt). We also learned that tracking estimate accuracy and utilization altered the teams’ behaviors – and not in a good way. The teams started finding ways to work the metrics in their favor. They did so even after being told that we weren’t measuring their performance based on these numbers, but using them to identify opportunities for improvement and to demonstrate successful delivery. We had folks who would sandbag their estimates, or underestimate, work late, and then log exactly the hours they had estimated.
We also learned the obvious and, frankly, what we wanted to see: once a team started to understand the metrics, they matured as a team. Part of this was a function of time and the improvement of the Scrum/Agile processes. The other part was the fact that we chose specific metrics to focus on – especially those surrounding productivity and quality. We started conducting group code review meetings and pairing up more on tasks that we deemed to have higher risk.
What to Look at Next
We need to start tracking business value better, possibly by using earned value. The challenge we have is that we generally don’t put a cost on projects – we are a small software firm that generally runs with department budgets only and assesses costs along lines of business only. It seems that we would make better decisions along the way if we kept costs and revenues in plain view during the development process.
We also need to look at measurements surrounding quality automation (unit test coverage and LOC test coverage).
I’m sure there are other areas, but now that we have a good handle on consistency and trust in delivering what we said we would, it is time to focus on business value and look deeper into quality. Both of these will help make the products and the business more successful.