Data Driven Scrum – 5 Key Questions

Data Driven Scrum (DDS) is a lean, agile framework for data science projects that addresses the key challenges teams have identified when using Scrum in a data science context.

This post covers five common questions that teams encounter when implementing DDS, and how to address each one.

1 How to integrate “Create – Observe – Analyze” within a project life cycle?

Each Backlog Item in DDS should answer a key question or improve an analysis / system in a testable way. The “Create / Observe / Analyze” structure within DDS is designed to ensure that, within each iteration, the team creates something, observes and measures its performance, and then analyzes the results. The goal is to determine whether the key question or improvement that motivated the Backlog Item was in fact successfully resolved. If it was not, the team analyzes why not and incorporates that insight into future iterations.
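As a rough illustration, the structure described above can be modeled as a Backlog Item whose tasks are each tagged with one of the three phases. This is only a sketch: the names `Phase`, `Task`, `BacklogItem`, `missing_phases`, and `is_finished` are hypothetical and not part of DDS itself, and the rule that an item needs at least one task per phase is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    CREATE = "create"
    OBSERVE = "observe"
    ANALYZE = "analyze"


@dataclass
class Task:
    description: str
    phase: Phase
    done: bool = False


@dataclass
class BacklogItem:
    # The key question or testable improvement that motivates the item.
    question: str
    tasks: List[Task] = field(default_factory=list)
    # Outcome recorded during analysis; None until the team has analyzed results.
    resolved: Optional[bool] = None

    def missing_phases(self) -> List[Phase]:
        """Return phases with no task yet, in Create / Observe / Analyze order."""
        covered = {task.phase for task in self.tasks}
        return [phase for phase in Phase if phase not in covered]

    def is_finished(self) -> bool:
        """True only when every phase has a task and every task is done."""
        return not self.missing_phases() and all(t.done for t in self.tasks)


# Hypothetical example item (assumed for illustration only).
item = BacklogItem("Does adding weather features reduce forecast error?")
item.tasks.append(Task("Train model with weather features", Phase.CREATE))
item.tasks.append(Task("Back-test on held-out data", Phase.OBSERVE))
print(item.missing_phases())  # the analyze phase has not been planned yet
```

A check like `missing_phases()` is one way a team could notice, during iteration planning, that an item has creation work but no plan for observing or analyzing the result.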

When using a sequential life cycle

For teams that execute each major life cycle phase of the project in its entirety before moving on to the next phase, the “Create / Observe / Analyze” structure provides an iteration-level life cycle within each of the major phases. In this construct, create/observe/analyze resembles the team’s overall data science project life cycle, but at a smaller scale: each create/observe/analyze iteration sits within a single project life cycle phase (e.g., one of the CRISP-DM phases) and produces incremental learning.

When using a life cycle in each iteration

In contrast to sequentially executing each major life cycle phase in its entirety, it is much more typical for teams to focus on quickly delivering one vertical slice within an iteration (i.e., going through the entire life cycle in one DDS iteration). In this situation, if the team is using CRISP-DM as its life cycle framework, the ‘create’ DDS component might consist of the first four phases of CRISP-DM (business understanding, data understanding, data preparation, modeling) and might, for example, produce a predictive model. But the work of the DDS team is not limited to creating a model; there also need to be tasks to understand how effective the model is. Hence, DDS requires an ‘observe’ phase (which might include the deployment phase of CRISP-DM), in which the team collects data on how the model is working. Note that the model does not have to be deployed for the team to observe results. The model might instead be observed via back-testing, in which case the observations would fall within the evaluation phase of CRISP-DM.

Ensure a focus on analyze

In any event, once data has been collected on how the model is working (via the observations), the team then needs to ‘analyze’ the observations. In other words, the team needs to collectively review the results (i.e., how the model is working) and determine what insights can be drawn from them. This can typically be thought of as part of the evaluation phase of CRISP-DM. The results of this ‘analyze’ phase will likely change the priority of some Backlog Items and/or lead to the creation of new Backlog Items.

As one can see, the DDS team might need to deploy a model for the observe phase, and then go back to the evaluation phase for the analyze phase (going back to a previous phase is not an issue when using CRISP-DM).

This simple example helps to explain DDS’s focus on create/observe/analyze, in that most data science life cycles do not have a clear and distinct focus on analyzing what has been created and using that analysis to refine future work items. 

2 How to handle “no deadlines / timelines”?

First, via Product Increments, the team can still have overall project milestones (e.g., release something in 3 months). These milestones set expectations, which helps the team prioritize its iterations, especially as a milestone deliverable approaches.

In addition, when using DDS, the team still meets on a regular basis (for example, in iteration review meetings), so everyone can see the progress being made. These meetings are calendar-based rather than tied to the end of a specific iteration. Each Backlog Item also has a high-level estimate, which helps set a rough expectation of when an iteration will be completed.

Finally, capability-based iterations are a more honest approach, because many iterations carry a lot of uncertainty about the amount of work required to complete them (i.e., the amount of work needed to answer that iteration’s hypothesis). Fixed-time iterations give stakeholders a false sense of certainty about when the work will be completed. In practice, data science teams that use time-boxed iterations often pad their estimates (e.g., only committing to what they think can be done in half the iteration).
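To make the idea of a rough expectation concrete, the sketch below turns high-level estimates into a completion window and compares it with a milestone date. It rests on assumptions that are not part of DDS: each estimate is expressed as a hypothetical (low, high) range in calendar days, iterations run one after another, and the function names and the three status labels are invented for illustration.

```python
from datetime import date, timedelta


def completion_window(estimates_days, start):
    """Return (earliest, latest) finish dates for iterations run back to back.

    estimates_days: list of (low, high) calendar-day estimates, one per
    Backlog Item. The range format is an assumption made for this sketch.
    """
    low_total = sum(low for low, _ in estimates_days)
    high_total = sum(high for _, high in estimates_days)
    return start + timedelta(days=low_total), start + timedelta(days=high_total)


def milestone_status(estimates_days, start, milestone):
    """Classify a milestone against the completion window."""
    earliest, latest = completion_window(estimates_days, start)
    if latest <= milestone:
        return "on track"   # even the pessimistic estimate fits
    if earliest > milestone:
        return "at risk"    # even the optimistic estimate misses
    return "uncertain"      # the milestone falls inside the window


# Hypothetical estimates for two Backlog Items.
estimates = [(3, 5), (5, 10)]
print(completion_window(estimates, date(2024, 1, 1)))
```

Reporting a window rather than a single date keeps the uncertainty visible to stakeholders instead of hiding it behind a fixed-length iteration.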

3 What to do about a build-up of technical debt?

What is technical debt?

Technical debt is the implied cost of additional rework caused by choosing an easy (limited) solution now instead of using a better approach that would take longer. To address the project’s technical debt, first the team needs to make the debt visible. The technical debt should be tracked in the Backlog Items to be done, and the technical debt as well as its impact on the team’s productivity should be highlighted in the iteration review meeting.

Reducing technical debt via a specific iteration

To reduce a large amount of technical debt, the team can pull a backlog item focused on technical debt reduction into an iteration, just as it would any other backlog item. In this case the backlog item should still be broken down into “Create / Observe / Analyze” tasks: the “create” step focuses on reducing the technical debt, while the observe and analyze steps focus on measuring and analyzing the impact of that reduction and on understanding the new capabilities the team might gain as a result. So, for example, if the team streamlined a data pipeline in the hope that a model could be trained on a larger data set in a shorter amount of time, the observation and analysis should focus on metrics around that goal and on potential uses of the new capabilities. This approach is most appropriate for larger tasks such as data migrations, changes in tooling or architecture, etc.
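For the pipeline example above, the observe and analyze steps could be as simple as comparing throughput before and after the change. The sketch below assumes the team has recorded, for each training run, the number of rows processed and the wall-clock seconds taken; the function names, the dictionary keys, and the sample numbers are all hypothetical.

```python
def throughput(rows, seconds):
    """Rows processed per second for one training run."""
    return rows / seconds


def compare_runs(before, after):
    """Summarize the effect of a pipeline change.

    before / after: dicts with 'rows' and 'seconds' keys (an assumed format).
    """
    before_rate = throughput(before["rows"], before["seconds"])
    after_rate = throughput(after["rows"], after["seconds"])
    return {
        "before_rows_per_s": before_rate,
        "after_rows_per_s": after_rate,
        "speedup": after_rate / before_rate,
    }


# Hypothetical measurements: a larger data set trained in less time.
summary = compare_runs(
    before={"rows": 1_000_000, "seconds": 500},
    after={"rows": 4_000_000, "seconds": 400},
)
print(summary)
```

The resulting numbers are the input to the analyze step, where the team decides whether the debt reduction met its goal and what the new capacity could be used for.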

Reducing technical debt by adding to an iteration

For ongoing or smaller-scale technical debt reduction, the team may instead include targeted technical debt elimination as part of an iteration that updates or improves an existing element that has been identified as having technical debt. For example, while adding to or improving an existing model, the team may also choose to improve the maintainability of that model.

Is accumulating technical debt OK?

To reduce the rate of accumulation of technical debt, the team should agree upon conventions and standards that all finished work should meet.  Some teams accomplish this by having a “Definition of Done”. These standards should be designed to find an optimal balance between ensuring that the team can rapidly iterate and prototype and limiting the rate of accumulation of technical debt. In short, it might make sense to accumulate some technical debt, but at some point, that debt must be addressed.

4 How could DDS teams work with other teams?

The DDS framework is a single-team framework that is designed to be compatible with the Scrum@Scale scaling framework, as well as SAFe (Scaled Agile Framework). To work with other teams that are coordinating via Scrum@Scale or SAFe, the DDS team exposes the necessary interfaces through DDS’s roles and artifacts.

Specifically, each DDS team has its own internal workflow, defined via its capability-based iterations, while its roles and artifacts provide the touch points with other Scrum teams (or other DDS teams). The table below summarizes these touch points. For example, the DDS Iteration Review meeting maps easily to the Scrum Sprint Review meeting, and both meetings can be held at the same frequency.

| Team Touchpoint | DDS | Scrum |
| --- | --- | --- |
| Metascrum Representation | Product Owner | Product Owner |
| Scrum of Scrums Representation | Process Expert | Scrum Master |
| Product / Release Feedback | Iteration Review | Sprint Review |
| Metrics and Transparency | Item Backlog / Taskboard | Product Backlog / Sprint Backlog |

Table: Integrating DDS and Scrum Teams

The following table reviews the key SAFe and Scrum@Scale events and shows which DDS role would represent the DDS team at these events.

| SAFe Events | SAFe Roles | DDS Rep | Scrum@Scale Roles | Scrum@Scale Event / Meeting |
| --- | --- | --- | --- | --- |
| ART Sync | Product Manager | Process Expert | Chief Product Owner | Scaled Daily Meeting |
| Planning (PI) | Product Manager | Product Owner | Chief Product Owner | Scaled Sprint Planning |
| Retrospective (PI) | Release Train Engineer | Process Expert | Scrum of Scrum Master | Scaled Retrospective |
| Measurement (PI) | Product Manager | Product Owner | Chief Product Owner | Scaled Sprint Review |
| System Demo (PI) | Release Train Engineer | Product Owner | Scrum of Scrum Master | Scaled Sprint Review |
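A team that wants to keep this mapping next to its tooling could record it as a simple lookup. The sketch below transcribes only the SAFe event and DDS representative columns from the table above; the names `DDS_REP_BY_SAFE_EVENT` and `dds_representative` are hypothetical.

```python
# DDS representative for each SAFe event, transcribed from the table above.
DDS_REP_BY_SAFE_EVENT = {
    "ART Sync": "Process Expert",
    "Planning (PI)": "Product Owner",
    "Retrospective (PI)": "Process Expert",
    "Measurement (PI)": "Product Owner",
    "System Demo (PI)": "Product Owner",
}


def dds_representative(safe_event: str) -> str:
    """Return the DDS role that attends the given SAFe event."""
    try:
        return DDS_REP_BY_SAFE_EVENT[safe_event]
    except KeyError:
        # Fail loudly rather than guess a role for an unlisted event.
        raise ValueError(f"No DDS representative defined for {safe_event!r}") from None


print(dds_representative("Planning (PI)"))
```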

5 If someone finishes their tasks earlier than others, what should that person do?

In DDS, if someone finishes their tasks while others are still working on the current iteration, that person should ideally help a teammate who still has work to do (i.e., the team’s goal is to finish the iteration as soon as possible). If helping does not make sense (e.g., all remaining tasks will be finished by the end of the day), the person can instead refine the item backlog or start a new iteration.
