With agile now the most popular approach to IT project management, the variety of projects it’s used for is extensive. Agile is no longer reserved for small, dynamic products. It has turned out that more complex, potentially critical projects can also benefit from agile practices. Nevertheless, some adjustments had to be made, for instance in how risks are managed. Risk management is a no-brainer in systems where human lives are at stake. However, I would like to address another type of project, the kind that most of us are probably involved in: products that will not cost lives if they malfunction, but might have some serious (usually financial) repercussions. Even though “risk” is not the most attractive of terms, it’s worth bringing it up during your agile meetings. And who knows, maybe being the killjoy at the planning phase will allow your team to actually achieve more joy by the end of the project?
The simple fact is that there’s always risk in a project (and agile projects are no different). Sometimes it’s unavoidable, sometimes manageable, but it’s there. Pretending we don’t see it (or indeed, not seeing it at all) is a dangerous strategy. However, once we acknowledge the risk, we’re able to make informed decisions and take responsibility – which is what allows our projects to thrive.
The idea of managing the potential threats to a project is a well-researched topic. The gold standard of risk management is ISO 31000 [1], which defines the following steps:
- Establishing the context
- Risk identification
- Risk analysis
- Risk evaluation
- Risk treatment
And on top of that, constant monitoring and review.
As one can assume, there are a lot of processes, templates, guidelines, and requirements connected with each step. Not very exciting, yes, but that’s good. You wouldn’t want to fly a plane with autopilot software delivered by someone who felt that risk evaluation should be fun.
While not all of us work with safety-critical software, that doesn’t mean our projects carry no threats. Not every system needs a systematic, highbrow risk policy, but all would benefit from thinking about potential threats early on and monitoring them along the way. Your team develops an app integrated with a payment system? Or maybe your system helps to plan drug administration for care home residents? Depending on the domain, you can choose how much attention should be devoted to risk management. In general, there are:
- Low risk apps. Examples are tourist guides, simple games, or weather reports. These are apps whose malfunction causes discomfort and might annoy users, but that’s it. If you work on such apps, there’s a high chance that you haven’t really put too much thought into the potential threats. And to be honest, there’s not much more work that should be done here. Focus on security, make checklists, write tests. Good advice can be found at owaspsamm.org/assessment/ and www.commoncriteriaportal.org/cc/
- Medium risk apps. These are the apps in which you store more delicate data, connected to the health or privacy of the users, or handle complex payments. They are systems whose malfunction can cause serious financial problems or influence users’ wellbeing and health.
- High risk apps. These are the systems that take people to the moon, or at least to the stratosphere, and the ones that will save your life if you need a complicated surgery. The safety-critical crème de la crème. You can find some interesting ideas at www.springer.com/gp/book/9783319702643 and arrow.tudublin.ie/cgi/viewcontent.cgi?article=1127&context=scschcomcon
All that follows below concerns these medium risk apps in an agile environment: projects where filling in a risk matrix and xls spreadsheets feels awkward, but some degree of risk awareness is still needed. You can use all of the suggested practices or just one, depending on the project’s needs.
Risk identification is the step where you think about all the things that can go wrong. The more visions of impending doom, the merrier (in a twisted, risk-oriented way), so a brainstorming technique works great here. Gather as many people engaged in the project as you can, especially the ones who always undermine any idea you come up with. Now it’s their time to shine and you can actually appreciate it.
When is the right time to organize such a meeting? If you’re using Scrum, such a discussion could be held once at least the first version of the Backlog exists, so that you have some idea about the functionality and the domain, but before the main architectural decisions have been made. When you have already thought about what you want the system to do, there comes the time to think about what you wouldn’t want it to do.
Listing all users and roles in the system can be a great start, much like with user stories. Reverse user stories, if you like. You can also start with the functionalities of the system. One of my favorite techniques is Hazard (or Abuser) Stories [2]. They can follow a pattern like:
“As a result of <definite cause>,
<uncertain event> may occur,
which would lead to <accident event>”
For example:
“As a result of a lost internet connection,
data about the drug dosage might not be refreshed for a long time,
which would lead to wrong drug administration.”
Or just follow the structure of the backlog items that you have in your project – more like anti-features.
You will notice shared patterns between the hazards described this way. Some causes will appear more often than others, as would the unwanted effects. These will be the elements you’ll need to pay more attention to.
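To make the shared patterns easier to spot, the hazard stories can be kept as structured data instead of free text. Below is a minimal sketch of that idea, not a prescribed tool: the `HazardStory` class, the `recurring` helper, and the sample hazards are all hypothetical and only mirror the three-part template above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class HazardStory:
    cause: str   # the <definite cause>
    event: str   # the <uncertain event>
    effect: str  # the <accident event>

    def render(self) -> str:
        """Fill the three-part hazard story template."""
        return (
            f"As a result of {self.cause}, "
            f"{self.event} may occur, "
            f"which would lead to {self.effect}."
        )


def recurring(stories, field, minimum=2):
    """Return (value, count) pairs for values of `field` shared by at least `minimum` stories."""
    counts = Counter(getattr(story, field) for story in stories)
    return [(value, n) for value, n in counts.most_common() if n >= minimum]


# Hypothetical hazards for the drug administration example.
stories = [
    HazardStory("a lost internet connection", "stale drug dosage data", "wrong drug administration"),
    HazardStory("a lost internet connection", "an unsaved administration record", "a double dose"),
    HazardStory("a mistyped resident ID", "a mixed-up care plan", "wrong drug administration"),
]

print(stories[0].render())
print(recurring(stories, "cause"))   # causes shared by several hazards
print(recurring(stories, "effect"))  # effects shared by several hazards
```

In this made-up list, the lost connection and the wrong drug administration each show up twice, so they would be the first candidates for extra attention.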
It’s a good idea to keep the list of hazards in the same tool that you use as your issue tracker. The goal is to store it somewhere people can actually see it. Since we use JIRA at Bright Inventions, I will refer to it as an example. JIRA allows you to create multiple boards per project, with different filters, so a dedicated Risk Board can be a good idea. You can use a customized issue type, like “Hazard” or “Risk factor”, and set the filters so that it shows up only on your Risk Board.
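A board filter in JIRA is written in JQL, so the split between the two boards can be as small as one condition on the issue type. The helper below is only a sketch that builds such filter strings: `board_filter` is a hypothetical name, “CARE” is a made-up project key, and it assumes the custom “Hazard” issue type already exists in your instance and that neither value contains quotation marks.

```python
def board_filter(project_key: str, issue_type: str = "Hazard", risk_board: bool = True) -> str:
    """Build a JQL string for a board filter.

    risk_board=True  -> only issues of the hazard type (for the Risk Board)
    risk_board=False -> everything except the hazard type (for the regular board)
    """
    operator = "=" if risk_board else "!="
    return f'project = "{project_key}" AND issuetype {operator} "{issue_type}"'


print(board_filter("CARE"))                    # filter for the Risk Board
print(board_filter("CARE", risk_board=False))  # filter for the development board
```

The resulting strings can be pasted into the filter of each board; how the hazards are then ordered or grouped is up to the team.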
Risk analysis & evaluation
Once you have your hazards listed, you need to take a step back and breathe deeply. It may seem like a lot of bad things can happen, and that can be quite overwhelming. However, the identified threats are not all equally likely to happen, nor do they all have the same consequences. Risk is defined as the probability that harm occurs multiplied by the severity of that harm, so a potentially life-threatening situation that’s highly unlikely to happen can have a smaller impact on the project than something far less dangerous that affects your system a few times a day. There is an appropriate matrix for that [3], but any scale would do. Think about these two dimensions and prioritize the hazards based on them; High, Medium, and Low would be enough.
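The “probability times severity” rule is easy to try out on a handful of hazards. The sketch below is one possible way to do it, under assumptions of my own: a three-point scale for both dimensions, arbitrary thresholds for the High, Medium, and Low buckets, and made-up hazards. Any other scale works as long as the team applies it consistently.

```python
# Assumed three-point scales for both dimensions.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(probability: str, severity: str) -> int:
    """Risk = probability of harm multiplied by severity of harm."""
    return LEVELS[probability] * LEVELS[severity]


def priority(score: int) -> str:
    """Bucket a score with assumed thresholds: 1-2 Low, 3-4 Medium, 6-9 High."""
    if score >= 6:
        return "High"
    if score >= 3:
        return "Medium"
    return "Low"


# Hypothetical hazards: (description, probability, severity).
hazards = [
    ("Overdose caused by a corrupted care plan", "low", "high"),
    ("Dosage list not refreshed while offline", "high", "medium"),
    ("Wrong avatar shown for a caregiver", "medium", "low"),
]

# Highest risk first.
ranked = sorted(hazards, key=lambda h: risk_score(h[1], h[2]), reverse=True)
for description, probability, severity in ranked:
    score = risk_score(probability, severity)
    print(f"{priority(score):<6} {score}  {description}")
```

With these numbers the frequent offline problem lands above the rare but severe one, which is exactly the effect described above. If that ordering feels wrong for your domain, adjust the scale or the thresholds rather than the hazards.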
Another thing to consider is how these risks are introduced into the system. No one will actually implement Hazard Stories on purpose. What will be implemented, though, is the User Story connected with the Hazard Story. This is the crucial link that should also be reflected in the issue tracker. For each hazard, there should be at least one feature which could unintentionally introduce it to the system. During implementation of this particular feature, we should be aware of its bad twin: the hazard task. If you’re using JIRA, it can be a good idea to even create some customized relations. User stories can also have subtasks dedicated just to overcoming the hazard, so connect them as well. They will be especially useful at the QA stage.
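The hazard-to-feature link can be checked mechanically once it is written down. The following sketch assumes the links have been exported from the issue tracker into a plain mapping; the issue keys and both helper functions are hypothetical and only illustrate the two questions worth asking.

```python
# Hypothetical issue keys: each hazard maps to the user stories that could introduce it.
introduced_by = {
    "HAZ-1": ["APP-12", "APP-15"],
    "HAZ-2": ["APP-15"],
    "HAZ-3": [],  # nobody has linked this hazard to a feature yet
}


def unlinked_hazards(links):
    """Hazards with no connected user story; these need another look."""
    return sorted(hazard for hazard, stories in links.items() if not stories)


def hazards_for_story(links, story):
    """The 'bad twins' to keep in mind while implementing or testing a story."""
    return sorted(hazard for hazard, stories in links.items() if story in stories)


print(unlinked_hazards(introduced_by))
print(hazards_for_story(introduced_by, "APP-15"))
```

The first list shows hazards that no feature owns yet, and the second is the kind of overview a developer or tester would want when picking up a story.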
Risk treatment & monitoring
There’s little value to a list of hazards if you’ll never look at it again. Treat your risk items as a part of the backlog, just with different purposes than the regular tasks. Think about them when planning sprints, and check the connected hazards before moving a development task to Done. Adding “Check connected hazards” as a step to the Definition of Done [4] is a good way to remind everyone about it and turn it into a habit.
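The “Check connected hazards” step can also be expressed as a small gate in front of the Done column. This is a sketch of the rule itself, not of any tracker integration: the `Task` class, its fields, and the issue keys are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    key: str
    connected_hazards: list = field(default_factory=list)
    reviewed_hazards: set = field(default_factory=set)
    status: str = "In Progress"


def move_to_done(task: Task) -> list:
    """Move the task to Done only when every connected hazard has been checked.

    Returns the hazards still waiting for a check; an empty list means the move succeeded.
    """
    pending = [h for h in task.connected_hazards if h not in task.reviewed_hazards]
    if not pending:
        task.status = "Done"
    return pending


task = Task("APP-15", connected_hazards=["HAZ-1", "HAZ-2"])
task.reviewed_hazards.add("HAZ-1")
print(move_to_done(task), task.status)  # HAZ-2 has not been checked yet

task.reviewed_hazards.add("HAZ-2")
print(move_to_done(task), task.status)  # nothing pending, so the task moves
```

In practice the same rule usually lives in people’s heads or in a checklist; writing it down like this mainly helps to agree on what “checked” means.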
As I mentioned before, you can assign development tasks to counteract the harm presented by the risk items. Sometimes you just need to add 2-Factor Authentication or handle an offline mode to stay confident. Sometimes though, you look at a hazard and can only spread your hands helplessly and say “Yes, this can indeed happen”.
There is an approach to reducing risk called ALARP: As Low As Reasonably Practicable [5]. Sometimes it would be too costly to do anything about a particular threat, or there are few options left anyway. Clearly informing the stakeholders about the residual risk can prove to be the only solution.
The crucial part is to be honest with yourself and your team about the risks. Team members should feel comfortable mentioning any concerns. Only then can you manage the risk in a meaningful way and feel collectively responsible for the final result. The agile way. Knowing the risks, then reducing the ones we can and ALARPing the rest is, in a way, a definition of responsibility. It’s not about avoiding any risk at all costs. You will never be 100% safe, but that’s life. Wear a helmet and ride the bike, fasten your seatbelt and drive the car, encrypt the connection and let the users enjoy your great system. In the end, you’ll have to suck it and see.
Senior Project Manager & Scrum Master at Bright Inventions. She is an agile romantic with a scientific mind and a zeal for learning. She divides her time between university teaching and enhancing project management processes. Believes in people-centered processes and the power of retrospective. Loves music, MTB rides and good food. Privately, a mum of two wild boys.