Ad Hoc Data Science

ad hoc  (adv) – “for the particular end or case at hand without consideration of wider application”

Merriam-Webster Dictionary

High Reliance on Ad Hoc Processes

Without established methodologies for managing data science projects, teams often resort to ad hoc practices that are not repeatable, sustainable, or organized.

Such teams suffer from low project maturity: they lack continuous improvement, well-defined processes and checkpoints, and frequent feedback (Saltz, 2015). On a capability maturity model ranging from 1 (“ad hoc and chaotic”) to 5 (“anyone can reproduce it”), some data science teams sit between levels 2 and 3, but most are still at level 1, with project management undocumented and held only in the senior data scientist’s mind. A Capgemini study likewise finds a low level of project maturity for big data projects:

  • 45% of companies have a “clear roadmap with timelines and milestones”
  • 26% of companies have “well-defined criteria for use-case selection”
  • 33% of companies have “well-defined KPIs to measure the success of initiatives”

Does Ad Hoc work for Data Science?


Ad hoc processes give their users the freedom to decide how to tackle each problem as it comes up. Although developing these processes through trial and error sometimes requires extra effort, ad hoc might be appropriate for one-off projects coordinated by individuals and small teams (Saltz & Shamshurin, 2016). Moreover, by focusing on the particular project or task at hand without regard for its impact on other projects or areas of the organization, one can start working on a project with minimal administrative overhead and little concern for complying with procedures.

Weaknesses and Challenges

However, data science is rapidly maturing beyond siloed data scientists into a team sport involving professionals with diverse skillsets, beyond just data science (Spoelstra, Zhang, & Kumar, 2016). While ad hoc processes cannot and often should not be eliminated, over-relying on them often leads to numerous problems for data science projects and teams:

  • Thwarts Learning: A process is a “mechanism to improve on what you’re doing. You can measure specific inefficiencies and next time you perform the same process you can improve.” It is like “version control for your work product” (Spoelstra, Zhang, & Kumar, 2016). Without processes, learning might suffer.
  • Slows Information Sharing: Poor processes for storing, retrieving, and sharing documents waste time as people look for information and increase the risk of using the wrong version (Spoelstra, Zhang, & Kumar, 2016).
  • Delivering the “Wrong Thing”: Lack of effective processes to engage with stakeholders increases the risk that teams will deliver something that does not satisfy stakeholder needs (Sutherland, 2014) (Project Management Institute, 2017) (Domino Data Lab, 2017).
  • Stakeholder Frustration: Lack of processes to manage expectations may diminish stakeholder engagement (Project Management Institute, 2017).
  • Lack of Reproducibility: Further building on past projects might be “impossible given inconsistent preservation of relevant artifacts like data, packages, documentation, and intermediate results” (Domino Data Lab, 2017).
  • Poor Coordination: Coordination, defined as “the management of dependencies among task activities”, is the biggest challenge for data science projects (Espinosa & Armour, 2016). Poor processes decrease coordination and can result in confusion, inefficiencies, and errors.
  • Scope Creep: Without proper processes to determine what is included in and excluded from a project, the scope of the project may balloon out of control (Sutherland, 2014) (Project Management Institute, 2017).
  • Limited Project Monitoring and Control: Without proper insight into the project, management struggles to know when a project gets off track and needs intervention. Moreover, “unless you have an agreed methodology and enforce it, you won’t know who is cutting corners and with what consequences” (Vorhies, 2016).
  • Compromised Quality: Lack of coordinated data cleaning or quality assurance checks for data science projects can lead to erroneous results (Domino Data Lab, 2017) (Akred, 2015).
  • Forgetting a Step: Without process control, teams may simply forget to follow a critical step (Saltz, 2015).
  • Team Morale Killers: Working in confusing, chaotic environments can be frustrating, which may lower team members’ motivation and ability to focus.
  • Decreases Chances for Success: The Capgemini study found that organizations that planned their big data initiatives were roughly twice as likely to be successful as those that relied on ad hoc processes (below).


Following more mature project management approaches will not eliminate project management issues but can reduce them and increase the chances of success. Aside from small, simple projects executed by individuals or small teams, ad hoc is generally not an appropriate fit for data science projects.

Rather, your best bet is to develop a more mature process that combines a data science life cycle with an agile coordination framework.
