Team

CRISP-DM for Data Science Teams: 5 Actions to Consider

While there is no standard process for a team to use when working on a data science project, CRISP-DM (CRoss-Industry Standard Process for Data Mining) is one framework that is often considered. Perhaps because of this, many websites describe the six phases of a CRISP-DM project, and […]

Agile

10 Data Science Project Metrics

Ironically, data science teams that are so intensely focused on model measurement often don’t measure their own project performance, which is problematic because… But wait! Data scientists measure all sorts of metrics. Of course, they closely monitor data science metrics and KPIs such as RMSE, F1 scores, or correlation coefficients. Such metrics are critical […]

Managing Data Science as a Research Effort

Similarities of Data Science and Research Efforts

In many ways, a data science project looks like a research project, in that both require significant effort exploring a problem that typically doesn’t have a known answer. For example, in data science, it’s often not clear where there is “value in the data”, which is similar to […]

Managing Data Science as Software Engineering

Similarities of Data Science and Software Engineering Projects

In many ways, data science looks like software engineering. Both require significant coding to address an underlying business problem or opportunity, which typically requires frequent stakeholder interaction. Furthermore, when a production data science model is required, just as for traditional software systems, there is a requirement to include […]

CRISP-DM

What is CRISP-DM?

The CRoss-Industry Standard Process for Data Mining (CRISP-DM) is a process model with six phases that naturally describes the data science life cycle. It’s like a set of guardrails to help you plan, organize, and implement your data science project.

Business understanding – What does the business need?
Data understanding – What data do we have / need? Is […]

Waterfall

What is Waterfall?

Waterfall, also referred to as the classic life cycle or traditional project management, originated in manufacturing and construction and was applied to software engineering projects starting in the 1960s. A waterfall project flows through defined phases such as those shown in the diagram to the right. Some waterfall models include variations of these […]

Traditional Approaches

Waterfall is the classic, highly structured project management approach that dates back to antiquity and was common in software 10–20 years ago. Realizing the need for a process specific to data mining, CRISP-DM was defined in the late 1990s. Both approaches could be applied to data science. Waterfall, traditional software development life cycle (SDLC), and predictive […]