What is an R&D Approach?
The data science process can also be viewed as a research endeavor that transitions into an engineering project. As such, some organizations suggest combining agile projects with traditional research methodologies. Google Brain and DemandJump are two companies that split data science into these two general buckets.
Research and Sprint
Google Brain, a deep learning research group at Google, does not subscribe to a rigid project management approach but loosely divides work into unstructured research time and semi-structured development sprints. Ryan Poplin, a Machine Learning Technical Lead who conducts genomics research, explains that most of the group's work is research-oriented and does not fit well into standard project management approaches (Poplin, 2017).
At Google Brain, project teams are very fluid and loosely defined. Individuals generally have broad freedom to change teams and tend to prioritize their own work based on their interests and the broader team needs. Teams, consisting of 6 to 8 scientists and engineers, meet on a quarterly basis to determine if their projects should continue; projects that do not receive votes are decommissioned. Project managers tend to be hands-off from the daily research and instead focus externally to collaborate with stakeholders. Because so much of their teams’ success depends on the quality and availability of data, the project managers devote much of their time to procuring data sets that meet the researchers’ needs. Much of this work is just tracked in spreadsheets (Poplin, 2017).
Occasionally, the researchers need to collaborate closely to produce a deliverable such as a proof of principle, which Poplin describes as “a smallish project to prove a concept.” To complete the deliverable, the team comes together in an intensive, output-focused two-week sprint; however, their concept of a sprint bears only some resemblance to Scrum’s definition of a sprint. Team members collaborate closely and hold daily standups to plan their work. The project manager takes a more hands-on approach: tracking work items, recording bugs, managing a burn-down chart, interpreting issues, helping the team execute its work, and holding team members accountable. Poplin says that without the project manager, they would not be able to execute effectively, as they would otherwise likely ignore project responsibilities and “just go back to the research” (Poplin, 2017).
Data Science Research and then Engineering Development
A similar hybrid approach that uses unstructured research and structured development cycles is employed at DemandJump, an Indianapolis startup that offers an artificial intelligence marketing platform. Tyler Foxworthy, Scientific Advisor at DemandJump, sees data science as distinct from engineering and believes that the two disciplines should be managed differently. He explains, “For any type of problem that is unknown, you need to have two batches of time – the research and then figure out how to productionize it.” He compares his work overseeing a data science team to that of a thesis advisor: he helps set up a problem for his team and provides guidance but otherwise allows them to conduct their own research during largely unstructured time. Foxworthy believes that “you can’t put a time box on open problems because you can’t schedule insights.” Rather, “it’s better to scope specific time for research.” Eventually, when the underlying problem becomes well-defined and solvable, it is time for the development phase. At that point, Foxworthy “gets the project manager and engineering involved because you should be able to scope the time.”
Data science is often a research endeavor that needs to be productionized, which makes research and development approaches appropriate. However, research phases are difficult to monitor and control, so these approaches require discipline from the researchers to stay focused on producing value and trust from management to give them that freedom. This hands-off project management approach during research could fall victim to the risks of low-maturity processes. Ryan Poplin admitted that the Google Brain approach isn’t for everyone but works well for them as highly motivated researchers whose work is usually individually focused (Poplin, 2017). Additionally, dividing the overall project into phases (research and then development) runs counter to the agile practice of delivering “vertical slices” of value frequently from the start of a project; rather, it somewhat mirrors the phased approach of waterfall, whereby value delivery is deferred until later project phases. In summary, these approaches can be effective but are perhaps best reserved for mature team environments whose work is primarily research-focused.