Tuesday, 5 August 2014

The Effects of Agile Development on BPM

As a BPM designer I come across the term "Agile Development" all the time, and its meaning seems to change from project to project. A common misconception is that using an Agile Development methodology lets you do the same amount of work in less time. This is obviously not the case! This post is not about the details of agile development methodologies, as there are many people more qualified to talk on that subject. It is about how some of these development practices can affect the way BPM projects are implemented. The post focuses on IBM BPM (formerly Lombardi TeamWorks) as this is my area of expertise, but I think the issues discussed will affect most BPM platforms.

Agile software development is a group of software development methods based on iterative and incremental development, where requirements and solutions evolve through collaboration between self-organizing, cross-functional teams - Wikipedia

Unfinished Processes

Because of the iterative nature of Agile Development, functionality is often not fully developed by the time it reaches a production environment. On several of the projects I have worked on, the timescales were such that a process made it to production with only its first half complete. This left a window of time to finish developing the second half while in-flight instances were making their way through the first half. Inevitably, some process instances would fly through the process and reach the unfinished part before the second piece of development was completed. To stop these process instances falling off the edge into oblivion, we used a simple holding pattern to halt them until the code was completed.

The holding pattern consists of a "Wait?" decision gateway followed by a "Wait" task with a message event attached to it. The "Wait?" decision gateway allows us to switch the holding functionality on or off depending on the state of the code on the other side. The switch is usually controlled by an EPV (exposed process value), so it can be edited live without the need for a snapshot release. If the holding functionality is switched on, the token will wait on the "Wait" task until either the task is completed or the attached message event is triggered. The message event uses the same correlation ID for ALL process instances, so we only need to fire it once to move every case on. We attach the message event to a task, rather than using it on its own, so that we can still progress individual cases by completing the task. The task also makes it easy to tell how many cases are waiting at this point in the process. It is important that the message event does not use Durable Subscription. With a durable subscription, a message that has already been fired is kept and delivered to tokens that arrive later, so your process instances will not be held again if you need to switch the holding functionality back on.
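The behaviour of the holding pattern can be sketched in a few lines of plain Python. This is only a model of the idea, not IBM BPM code: HoldSwitch stands in for the EPV, HoldingPoint for the gateway plus the "Wait" task, and all of the names (including the RELEASE_ALL correlation value) are made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical correlation value shared by EVERY process instance.
SHARED_CORRELATION_ID = "RELEASE_ALL"


@dataclass
class HoldSwitch:
    """Stand-in for the EPV: a flag that can be changed while the system runs."""
    hold_enabled: bool = True


@dataclass
class HoldingPoint:
    """Models the "Wait?" gateway followed by the "Wait" task."""
    switch: HoldSwitch
    waiting: set = field(default_factory=set)
    released: list = field(default_factory=list)

    def arrive(self, instance_id: str) -> None:
        # The "Wait?" gateway: hold the token, or let it straight through.
        if self.switch.hold_enabled:
            self.waiting.add(instance_id)
        else:
            self.released.append(instance_id)

    def complete_task(self, instance_id: str) -> None:
        # Completing the "Wait" task moves a single case on.
        self.waiting.remove(instance_id)
        self.released.append(instance_id)

    def fire_message(self, correlation_id: str) -> int:
        # Non-durable behaviour: only tokens waiting right now are released.
        # Nothing is stored, so tokens that arrive later are held again.
        if correlation_id != SHARED_CORRELATION_ID:
            return 0
        moved = sorted(self.waiting)
        self.released.extend(moved)
        self.waiting.clear()
        return len(moved)


switch = HoldSwitch(hold_enabled=True)
point = HoldingPoint(switch)

for instance in ("A", "B", "C"):
    point.arrive(instance)

point.complete_task("B")                                # progress one case by hand
waiting_before = len(point.waiting)                     # how many cases are held
moved = point.fire_message(SHARED_CORRELATION_ID)       # one message moves the rest

point.arrive("D")                                       # arrives after the message
```

In this model, instance "D" is held because the earlier message was not retained, which mirrors why Durable Subscription needs to be off. Setting switch.hold_enabled to False lets later arrivals pass straight through.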

Changing Tasks

Another side effect of iterative development methodologies is that functionality may change between deployments. What was originally a simple task may have evolved into a much more complex beast which requires more data inputs/outputs and more complex processing. It is often simple to develop the new functionality on top of what is currently available, but the in-flight instances usually need more care. For example, if the new improved task requires more data, this cannot simply be added as a new input, because instances which are currently sitting on this step will not have that data available.

In this situation I prefer to pass only the ID relating to the business data into each task, so that the information can be loaded fresh from the system of record each time. This also makes sure that the data being used is up to date at all times throughout the process, and avoids passing large amounts of business data around the process. The obvious downside is that tracking business data at the BPD layer is then impossible, and driving decision gateways from business data is more difficult. There are ways of getting around these issues but I am not going to discuss them here.

If this approach still leaves the in-flight instances in danger because the task has changed beyond recognition, we use a more extreme approach. In this situation the safest thing to do is to leave all current tasks on the old code, and only allow new tasks onto the latest version of the code. We do this by disconnecting the flow into the old task but leaving the task in the process. The flow is then attached to the new task. All tokens which are currently sitting on the old task continue to use the old code, while every token that arrives afterwards uses the new code.
The original task, which is now disconnected, can be safely deleted once all tokens have moved on. This approach also works well for changing sub-processes where you want old in-flight instances to keep using the same sub-process. I know that some people will be thinking "why not simply leave the instances un-migrated if you don't want them to use the latest version of the code?". Unfortunately this isn't always possible, as the latest snapshot may include fixes/changes to other areas of the process which those instances also need.
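To illustrate the ID-only approach described in this section, here is a minimal sketch in plain Python. It is not IBM BPM code. The SYSTEM_OF_RECORD dictionary, the load_order helper and the two task functions are all hypothetical names standing in for a real back-end lookup and two versions of the same task.

```python
from dataclasses import dataclass

# Hypothetical system of record: an in-memory dict standing in for a
# database or service call.
SYSTEM_OF_RECORD = {
    "ORD-1": {"customer": "Acme", "amount": 120.0},
}


def load_order(order_id: str) -> dict:
    """Fetch a fresh copy of the business data for the given ID."""
    return dict(SYSTEM_OF_RECORD[order_id])


@dataclass(frozen=True)
class Token:
    """An in-flight instance carries only the business key, not the data."""
    order_id: str


def review_task_v1(token: Token) -> str:
    """Original task: only needs the customer name."""
    order = load_order(token.order_id)
    return f"Review order for {order['customer']}"


def review_task_v2(token: Token) -> str:
    """Later iteration: needs more data, but the task input is unchanged."""
    order = load_order(token.order_id)
    priority = order.get("priority", "normal")
    return f"Review {priority} order for {order['customer']} ({order['amount']:.2f})"


# A token created while version 1 was live.
token = Token(order_id="ORD-1")
first_result = review_task_v1(token)

# The business data changes and version 2 of the task is deployed.
SYSTEM_OF_RECORD["ORD-1"]["priority"] = "high"
second_result = review_task_v2(token)
```

Because the token only ever held the ID, the instance that started under the first version can be served by the second version without any new task inputs, and it sees the current business data rather than a stale copy.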