What is SEMMA? The SAS Institute developed SEMMA as a process for data mining. Its name is an acronym of its five steps: Sample, Explore, Modify, Model, and Assess. The method can be used to solve a wide range of business problems, including fraud identification, customer retention and […]


KDD and Data Mining

What Is the KDD Process? Dating back to 1989, Knowledge Discovery in Databases (KDD) represents the overall process of collecting data and methodically refining it. The KDD Process is a classic data science life cycle that aspires to purge the ‘noise’ (useless, tangential outliers) […]

Data science life cycle per Domino Data Lab

10 Ways to Manage a Data Science Project – Part IV: Emerging Approaches

So are there emerging approaches that are native to data science? Microsoft’s Team Data Science Process (TDSP), Domino Data Lab’s Data Science Life Cycle, and the Data Science Process Alliance’s Data Driven Scrum (DDS) are approaches that are both data science native and agile. Each approach has its own pros and cons, but they share some fundamental principles.
