Why care about Data Science Project Management?

Data Science is booming…

Demand for data science continues to drive a hiring frenzy and fuel massive investment in the field. By 2020, the job market for data scientists is forecasted to grow 28% over 2017 (Columbus, 2017), and there will be an estimated 2.7 million job postings for data science and analytics roles. Revenue from big data and business analytics is growing 11.7% annually and will surpass $200 billion by 2020 (International Data Corporation, 2015). Investments are pouring in. According to a New Vantage survey of Fortune 1000 firms, 63% of firms expect to invest over $10 million in big data by 2017 (up from 24% in 2012), and 27% of firms expect to invest over $50 million (up from 5% in 2015).

…Yet it is being held back

Data science is emerging from the intersection of several up-and-coming technologies, and yet one of the most fundamental issues keeping this nascent field from reaching its true potential is not technical. Rather, like most endeavors, data science success depends on the effective execution of a project.

Yet little attention has gone toward data science project management as the spotlight shines on new technologies and capabilities. This has left data science teams struggling to implement their projects. John Akred, Co-founder and CTO of Silicon Valley Data Science, explains that “We’ve met a lot of data science teams that understand how to do the data science, but they don’t have any real method of managing the data science project” (Akred, 2016). Likewise, Mark Clerkin, Data Scientist for High Alpha, comments, “I’m scheduled in everything else I do but I don’t have a rigorous philosophy for [data science project management] because I haven’t found one” (Clerkin, 2017). Tyler Foxworthy, CEO and Chief Scientist at Vertex Intelligence, agrees that “there really isn’t a standard for how to manage and scale up data science teams” (Foxworthy, 2017).

Academic research in data science project management is similarly lacking. “As a new field, much has been written about the use of data science and algorithms that can generate useful results […] Unfortunately, less has been written about how a group could best work together to execute a data science project” (Saltz, Shamshurin, & Crowston, 2017). A review of 296 articles and posters in the proceedings of the 2014 IEEE Big Data conference found that only 8% mentioned “any aspect of the socio-technical challenge in doing a big data project” (Saltz, 2015), and the first controlled experiment on data science project management was not published until 2017 (Saltz, Shamshurin, & Crowston, 2017). While the technical aspects are certainly crucial to advancing data science as a field, repeatable and useful results are hard to achieve without management methodologies and tools (Saltz, 2015).

In practice, data science project management tends to rely too heavily on ad hoc practices, ignore teamwork aspects, or confuse data science with software engineering. In fact, “82% of the data scientists surveyed did not follow an explicit process. However, it is encouraging to note that 85% of the respondents thought that adopting an improved process methodology would improve the teams’ results” (Saltz et al., 2018).

The result is an industry that suffers from low project success rates, with Gartner (2017) estimating the big data project failure rate at 85%.

…But the outlook remains bright

However, by expanding the focus of data science to include proper team-based project management techniques, organizations will more effectively convert data science investments in time, talent, and technology into tremendous value.

References