Risk Prediction module
The risk prediction module adds machine learning to Release to predict whether a release or task will fail and how long it will take to execute. This topic describes how to use the risk prediction module. For more information about installing the module, see Install the Release Risk Prediction module.
The module shows the predictions in Release in two ways:
- A new risk assessor which brings attention to releases that have a high probability of failure.
- A new release or template Risk Forecast page where you can see details of predicted risks for the release and its tasks.
Risk assessor
When a release is created or updated, the module predicts its risk level. In this context, the risk level is defined by the probability that the release will have at least one failed or overdue task. If this probability is higher than a certain threshold, the release risk score is increased.
By default, the risk score of the release is increased by 35 points to “Attention needed”. If you use a custom risk profile, you can change these settings in the risk profile settings. For example, if you do not want the predictions to affect the risk assessments, you can drag the setting of “Release probable to fail or to be aborted” to 0.
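As an illustration only, the threshold behavior described above can be sketched as follows. The 0.5 threshold and the function name are assumptions made for the example; only the 35-point increase and the “Attention needed” label come from the settings described above.

```python
# Hypothetical sketch of the risk assessor behavior described above.
# The 0.5 threshold and the function name are assumptions; only the
# 35-point increase to "Attention needed" comes from the documentation.

ATTENTION_NEEDED_POINTS = 35   # default increase applied by the predictor
FAILURE_THRESHOLD = 0.5        # assumed probability threshold

def apply_prediction(base_score: int, p_failed_or_overdue_task: float) -> int:
    """Raise the release risk score when the predicted probability of at
    least one failed or overdue task exceeds the threshold."""
    if p_failed_or_overdue_task > FAILURE_THRESHOLD:
        return base_score + ATTENTION_NEEDED_POINTS
    return base_score

print(apply_prediction(10, 0.8))  # 45 -> flagged for attention
print(apply_prediction(10, 0.2))  # 10 -> unchanged
```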
Using the Risk Forecast page
To view the Risk Forecast page, go to any release or template and select Risk Forecast from the Show drop-down menu.
The Risk Forecast page shows a grid with the release, phases, tasks, and the predictions organized in two columns: Predicted Duration and Predicted Risk.
In a release in progress, the Risk Forecast page shows what the predicted risks were for that release before it started. You can view the predictions for finished releases and compare them with the actual history of the release. In a template, the Risk Forecast page shows the risks of a release that can be started from that template.
The duration is predicted only for tasks that are executed, for example:
- Manual tasks
- Script tasks
The duration is not predicted for container tasks, such as:
- Task groups
- Create Release tasks
A task’s duration is predicted as a weighted median of its similar tasks.
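For example, a relevance-weighted median over the durations of similar tasks could be computed along these lines. The weighting scheme and the sample numbers are assumptions; the module’s exact algorithm is not documented here.

```python
# Illustrative sketch of a weighted median over similar tasks' durations.
# The relevance weights and sample data are hypothetical.

def weighted_median(durations, weights):
    """Return the duration at which the cumulative weight reaches half of
    the total weight, considering durations in ascending order."""
    pairs = sorted(zip(durations, weights))
    total = sum(weights)
    cumulative = 0.0
    for duration, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return duration
    return pairs[-1][0]

# Durations (in minutes) of similar tasks and their relevance weights.
similar_durations = [12, 15, 20, 45]
relevance_weights = [1.0, 1.0, 0.5, 0.2]   # high, high, medium, low relevance
print(weighted_median(similar_durations, relevance_weights))  # -> 15
```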
The predicted risk column combines two different probabilities:
- that a release or task will have failures or delays
- that a release will be aborted or a task will be skipped.
If one of these probabilities is higher than a specific threshold, an orange or red icon is displayed on the row. When you hover over a cell, a tooltip message with more information is displayed.
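A rough sketch of how the two probabilities could map to the row icons is shown below. The threshold values and the rule of taking the higher of the two probabilities are assumptions used only for illustration.

```python
# Hypothetical mapping from the two predicted probabilities to a row icon.
# Threshold values are illustrative assumptions, not the module's settings.

ORANGE_THRESHOLD = 0.5
RED_THRESHOLD = 0.8

def row_icon(p_fail_or_delay: float, p_abort_or_skip: float) -> str:
    """Pick the icon color based on the higher of the two probabilities."""
    worst = max(p_fail_or_delay, p_abort_or_skip)
    if worst >= RED_THRESHOLD:
        return "red"
    if worst >= ORANGE_THRESHOLD:
        return "orange"
    return "none"

print(row_icon(0.3, 0.9))  # red
print(row_icon(0.6, 0.1))  # orange
```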
For more detailed information about the prediction, click on it to open a window that shows similar releases or tasks in the past, and a summary of information such as durations, failures, and delays.
Similar releases
You can click a release title in this window to open the relevant release. Use the information for each release to analyze whether something went wrong in the past and determine what actions to take to ensure that it does not happen again. Based on this information, you can improve the current release or the template used to start the release.
The list of “similar releases/tasks” is ordered by relevance, comparing multiple features with the current release/task.
Example: If you have not yet run a release from a specific template, but there are other releases in the archive located in the same folder that have the same number of phases and a similar set of tasks, those releases will appear as “similar releases”.
These predictions can help you to optimize your release pipeline. Since the predictions are based on historical data, they can provide strong insights into potentially problematic releases and tasks in your organization. A release that is predicted to be risky is likely to encounter problems while executing; for example it may be delayed, or aborted, or some tasks may need several restarts before they are successfully completed.
One possible way to manage releases with a higher risk prediction would be to monitor them closely while they are in progress, so that if any problems arise, the team can respond immediately.
A release that is predicted to be risky, or one that has a high chance of being aborted, is likely to contain tasks that have a high chance of being skipped or are likely to fail. By viewing similar tasks, you can get a better overview of why this happens. Perhaps certain tests are often skipped, or a certain deployment is prone to failures. In each case, you can see more detailed properties of those tasks - for example their input properties - to determine the likely cause of failures, such as a problematic deployment environment or a flaky test. This knowledge is valuable when deciding what action to take to improve the release chains.
Prediction mechanics
All the release content is evaluated by machine learning models to predict the risk level of the release and its tasks. Examples of analyzed release features:
- values of release variables
- number of phases, tasks, types of the tasks
- the template and folder used to start the release
- the method used to start the release: from a trigger or manually
- release title and description
- how long ago the release started
Examples of criteria analyzed for tasks:
- task type and if it is manual or automatic
- input property values (example: which environment it is deploying to)
- the kind of release and phase in which that task resides
- task title and description
- how long ago the task started
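To make these feature lists concrete, the following is a hedged sketch of what a per-release and per-task feature record might look like. The field names mirror the criteria listed above but are hypothetical; they are not the module’s internal schema.

```python
# Hypothetical feature records mirroring the criteria listed above.
# Field names are illustrative, not the module's internal schema.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ReleaseFeatures:
    variables: Dict[str, str]        # values of release variables
    phase_count: int                 # number of phases
    task_types: List[str]            # types of the tasks it contains
    template: str                    # template used to start the release
    folder: str                      # folder containing the release
    started_by_trigger: bool         # started from a trigger vs. manually
    title: str
    description: str
    age_hours: float                 # how long ago the release started

@dataclass
class TaskFeatures:
    task_type: str                   # e.g. "Script task"
    is_manual: bool                  # manual or automatic
    input_properties: Dict[str, str] # e.g. target deployment environment
    release_template: str            # the kind of release containing the task
    phase: str                       # the phase in which the task resides
    title: str
    description: str
    age_hours: float                 # how long ago the task started
```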
When displaying similar releases or tasks, the Relevance column shows how similar the release/task is to the one displayed in the list. Releases/tasks are compared by the features listed above.
- High relevance means that almost all properties are the same: it started from the same template and its structure did not change while executing.
- Medium relevance means that releases are quite similar but something has changed: either the template where they started was changed between the executions, or the releases just have a similar appearance in their templates or folders.
- Low relevance means that releases are different in several properties and might not be related to each other. Such releases are only shown when not enough relevant releases could be found in the archive.
The relevance of the neighboring tasks is taken into account when calculating the predictions. The greatest weight is given to highly relevant tasks, followed by tasks with medium relevance. Tasks with low relevance contribute less to the calculated prediction.
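For instance, the relevance-weighted contribution of neighboring tasks could be aggregated as in this sketch. The per-level weights and the simple weighted failure rate are assumptions chosen only to illustrate that highly relevant neighbors count for more.

```python
# Illustrative relevance-weighted aggregation of neighboring tasks' outcomes.
# The weights per relevance level are assumptions for the example.

RELEVANCE_WEIGHTS = {"high": 1.0, "medium": 0.5, "low": 0.2}

def predicted_failure_probability(neighbors):
    """neighbors: list of (relevance, failed) tuples for similar past tasks."""
    weighted_failures = sum(RELEVANCE_WEIGHTS[r] for r, failed in neighbors if failed)
    total_weight = sum(RELEVANCE_WEIGHTS[r] for r, _ in neighbors)
    return weighted_failures / total_weight if total_weight else 0.0

history = [("high", True), ("high", False), ("medium", True), ("low", False)]
print(round(predicted_failure_probability(history), 2))  # -> 0.56
```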
The predictions are made as if a release has not yet started.