The following is an interview between Nick Hotz and Dr. Matthew Edwards.
NICD & Matthew’s Background
Nick: Can you describe your work at the National Innovation Centre for Data?
Matthew: At the National Innovation Centre for Data (or NICD as we’ve come to be known) I lead data science projects for large public and private sector organisations and work on research projects for the Advanced Research Centre at The Alan Turing Institute. The public and private sector organisations we work with at NICD come to us because they are interested in using cutting-edge data science tools and techniques but do not necessarily know where to start with them. The data science projects I lead on give them the opportunity to learn in a collaborative and supportive way.
Nick: What is your professional background? How did you get into managing data science projects?
Matthew: I gained my PhD in Statistics at Newcastle University through the Centre for Doctoral Training (CDT) in Cloud Computing for Big Data. The CDT has close links with NICD, so I already had a good understanding of the services that NICD provides and the breadth of work that it does. As these services are collaborative and educational, and I’ve always been interested in education and mentoring, I pursued a role at NICD. I was fortunate enough to secure one and have been here for over three years now.
Matthew’s DSTL Course Experience
Nick: How has the Data Science Team Lead course helped you?
Matt: I was interested in developing data science processes for NICD to standardise how we approach our collaborative projects and make us more efficient. In my search for resources on data science processes, I came across the Data Science Process Alliance (DSPA). The site included many useful blogs around the topics that I was interested in exploring and I noticed that they also provided a Data Science Team Lead (DSTL) course that covered many of the things I wanted to learn. After completing the course, I felt I had a much better understanding of data science processes and the key ideas behind many of them. This has strongly influenced my thinking and approach towards the development of data science processes.
Nick: Could you describe the DSPA training?
Matt: The course was composed of four modules, each including both video and text-based learning materials. Both could be worked through in small chunks, which made studying the course alongside a busy work schedule very easy. Each module concluded with a thirty-minute talk with you, which I found very useful as it gave me an opportunity to ask any questions that I had.
Nick: What was your favorite part of the course?
Matt: I found the talks with you that took place after each module the most valuable part of the course. Not only were they valuable for consolidating the material I had just studied, they were also very enjoyable. I was able to ask very specific questions about the course material and how it related to the data science projects that I lead at NICD.
Improving NICD’s Data Science Project Management
Nick: Before getting DSPA Team Lead certified, how did you manage data science efforts?
Matt: I initially led data science projects with more ad hoc processes. I developed these through trial and error: the processes that worked were developed further, while those that did not work as well were ultimately abandoned. Because I developed these ad hoc processes in a largely undocumented manner, they were not rolled out to others across the team.
Nick: After taking the DSPA Team Lead course, how have you refined NICD’s data science process?
Matt: I have provisionally proposed the use of CRISP-DM as our main data science workflow and Kanban as our go-to collaboration framework. I believe that the hierarchical structure of CRISP-DM, which is clearly organised into phases, tasks and outcomes, is great for communication and that the Kanban board is ideal for our level of collaboration. I have started to incorporate the CRISP-DM phases, tasks and outcomes and the Kanban board into the data science projects that I lead.
Nick: Could you provide a bit more information on your refined data science project management process?
Matt: I have begun incorporating the phases, tasks and outcomes of a modified CRISP-DM workflow into a project template that can be easily shared and used across the whole team to organise their project code.
Nick: Would you like to explain the framework that you developed?
Matt: Initially, the idea was to create a framework to help standardise the delivery of our data science projects across clients and to improve our overall efficiency and productivity. Interestingly, however, this has since developed into the idea that a framework for data science projects would not only improve the projects that we lead ourselves but could also improve the projects that our clients lead in the future if the framework is included as part of what we deliver to them.
Matt’s Parting Advice
Nick: What advice do you have for others looking to manage data science projects?
Matt: Learn as much as you can about the different types of data science processes. Organisations perform data science in different ways, and the types of workflows and collaboration frameworks that are required are likely to vary significantly from organisation to organisation. Incorporating processes into our data science projects incrementally has been very useful as it has given us the chance to develop processes that really suit our needs rather than processes that were originally developed for different organisations.
Nick: Is there anything else you would like to add?
Matt: I have found the DSTL course and the general learning of data science processes very enjoyable. I feel like I have improved considerably as a data science lead and am now able to approach our varied and often very challenging data science projects in a more streamlined and efficient manner. I have developed a lasting interest in data science processes and wish to pursue this further as part of my role at NICD.
To learn more, explore the Data Science Team Lead course overview or jump in and learn more through these articles: