The following is an interview by Jeff Saltz with Hector Rangel.
Jeff: Can you tell me a bit about yourself?
Hector: I’m an industrial engineer and a business professional with 8 years of experience driving growth, change, and optimization for companies across various industries. I’m passionate about making a positive impact by helping people and companies reach their goals. I’ve been at Arena Analytics for almost 4 years.
Jeff: How about Arena Analytics – can you also provide some background on your company?
Hector: Arena Analytics is a consulting firm, based in Mexico City, specializing in innovation, process automation and artificial intelligence solutions applied to the improvement of business results. We provide solutions for both medium and large companies that enhance the strategy, processes, technology, and human talent of our clients. The use of AI models and Data Science in general is a key differentiator of our value proposition.
Our clients come mainly from retail, consumer goods and B2B industries but also telecommunications, manufacturing, hospitality and supply chain logistics. We use AI to solve challenges such as inventory optimization, pricing and promotion, sales forecasting, and other supply chain and commercial optimization opportunities. In short, like many AI/Data Science consulting organizations, we apply Machine Learning, Deep Learning, Bayesian Methods and Evolutionary Algorithms to tackle problems with clear goals and ROI definitions.
Jeff: What is your current role at Arena Analytics?
Hector: I lead efforts on how to apply data science across fields such as Supply Chain Planning, Market Basket Analysis, Portfolio Optimization, Business Analytics, and Go-to-Market Strategies.
Jeff: Before getting DSPA Team Lead certified, how did you manage data science efforts?
Hector: We typically used a modified waterfall approach. A key reason was that our clients wanted to know the cost and timeline of the project upfront. But this was a challenge: we wrote project proposals that promised complex models and specific results without full knowledge of the available data or how the models would behave. Perhaps not surprisingly, this led to many execution challenges for our data science projects – such as teams having to work overtime to meet commitments.
In fact, we tried many different approaches – daily meetings, weekly meetings, PowerPoint status reports, Gantt status reports – but nothing really improved our process and minimized our project execution issues.
In terms of team communication, everyone tried their best to communicate openly and frequently, but it was not done in an organized way. As a result, stakeholders gave feedback throughout the project to different members of the data science team, and this feedback was hard to properly integrate into the project. This was also partly because the roles of DS team members were assigned within the DS team but not socialized with the rest of the stakeholders.
Jeff: After taking the DSPA Team Lead course, how has Arena Analytics’s data science process been refined?
Hector: Well, first, we defined a set of lean agile principles. We also increased the team’s focus on considering all phases of the project, as well as on small, value-generating project improvements. In addition, the course made me realize that we needed to improve our communication and collaboration, both to ensure our work is useful for the customer and to communicate progress to all stakeholders.
As you can see, we took the key lean and agile concepts taught in the TL course and phrased them in a way that our entire team could internalize.
Jeff: Could you provide a bit more information on your refined data science project management process?
Hector: Yes, we now have a process with four clearly defined phases:
In phase 1, which we call Business understanding / project proposal, we explore the business problem with the client and make sure that our proposal for the work allocates time to explore information, test algorithms and models, and monitor the effectiveness of the solution. We also generate an initial list of possible experiments, which is stored in our item backlog. In addition, in this phase, we make sure that the client understands the challenges of the project, so their expectations are aligned with what might be possible.
In phase 2, which we call Project preparation and launch, we assign roles and prepare the first detailed version of the activities, or experiments, that the team will be performing. This list of possible experiments is prioritized by the product owner via an estimated potential business impact, combined with a very high level estimation of effort for that experiment.
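The phase-2 prioritization Hector describes – ranking candidate experiments by estimated business impact relative to a rough effort estimate – can be sketched in a few lines of Python. This is a minimal, illustrative sketch; the experiment names, scoring scales, and the simple impact-over-effort formula are assumptions for illustration, not Arena Analytics’ actual tooling.

```python
# Hypothetical sketch of a phase-2 backlog prioritization: each candidate
# experiment carries an estimated business impact and a very high-level
# effort estimate, and the product owner ranks by impact per unit of effort.
# All names and numbers below are illustrative.

from dataclasses import dataclass

@dataclass
class Experiment:
    name: str
    impact: float  # estimated potential business impact (e.g., on a 1-10 scale)
    effort: float  # rough effort estimate (e.g., person-weeks)

def prioritize(backlog):
    """Rank experiments by estimated impact per unit of effort, highest first."""
    return sorted(backlog, key=lambda e: e.impact / e.effort, reverse=True)

backlog = [
    Experiment("sales forecasting baseline", impact=8, effort=2),
    Experiment("inventory optimization model", impact=9, effort=6),
    Experiment("promotion uplift analysis", impact=5, effort=1),
]

for e in prioritize(backlog):
    print(f"{e.name}: score={e.impact / e.effort:.1f}")
```

In practice the product owner’s judgment would refine such a mechanical score, but a simple ratio like this makes the trade-off between impact and effort explicit and easy to discuss.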
In phase 3, which we call Project execution, we divide each experiment into one or more iterations, where each iteration consists of the life cycle phases after business understanding – acquire and prep data, analyze and model it, evaluate and test, deploy and monitor. Note that each iteration has a goal of creating something, observing the results of that creation and then analyzing the results to refine future iterations. The observe and analyze tasks are jointly done with the product owner and the data science team.
Our goal is that each iteration delivers incremental results to the client, and in each iteration, the team validates the scope and objective of the project. This might be just one of four phases, but it is where we spend the most time! In fact, it is where we use Data Driven Scrum, which is the driving force behind phase 3. I’ll add that our iterative execution is how we make sure that we are focusing on the key client needs and delivering what is actually useful to the client.
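The phase-3 iteration structure above – create something, observe the results, then analyze them to refine the next iteration – can be sketched as a simple loop. The functions below are stand-in stubs with made-up metrics, included only to show the loop’s shape; they are not Arena Analytics’ actual implementation.

```python
# Illustrative sketch of a phase-3 iteration loop in the Data Driven Scrum
# style described above. Each pass covers the post-business-understanding
# lifecycle steps, and the analyze step (done jointly with the product owner)
# feeds learnings into the next iteration. All functions are hypothetical stubs.

def create(experiment, learnings):
    """Acquire/prep data and build a model artifact for this iteration (stub)."""
    return {"experiment": experiment, "version": len(learnings) + 1}

def observe(artifact):
    """Evaluate and test the artifact; here, a fake improving metric (stub)."""
    return {"accuracy": 0.7 + 0.1 * artifact["version"]}

def analyze(results, goal=0.85):
    """Joint review with the product owner: did this iteration meet the goal?"""
    return {"metrics": results, "meets_goal": results["accuracy"] >= goal}

def run_experiment(experiment, max_iterations=4):
    learnings = []
    for _ in range(max_iterations):
        artifact = create(experiment, learnings)  # create something
        outcome = analyze(observe(artifact))      # observe, then analyze jointly
        learnings.append(outcome)
        if outcome["meets_goal"]:                 # stop once the goal is met
            break
    return learnings

print(len(run_experiment("sales forecasting")))  # iterations until the goal is met
```

The point of the sketch is the control flow: each iteration produces something observable, and the decision to continue, pivot, or stop is made from observed results rather than a fixed upfront plan.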
And finally, in phase 4, which we call Knowledge management / retrospective, we generate insights and document them as appropriate to streamline future projects. Note that our knowledge management phase focuses on improving future projects not just from a process perspective but also from a modeling perspective.
Jeff: You are DSPA Team Lead Certified – could you describe the DSPA training?
Hector: It was a useful course – it covered both the theoretical and practical aspects of Data Science Project Management. Specifically, the training covered key concepts such as what lean agility is and why it is useful within data science projects, as well as typical lifecycle frameworks, such as CRISP-DM, and why they are useful. It also covered the typical coordination frameworks, such as Scrum and Kanban, along with their strengths and weaknesses. Finally, the course covered Data Driven Scrum and how it could be used within an organization.
The training also validated my belief that our software development efforts are different from our data science efforts: our data science projects have more ambiguous requirements, greater difficulty in measuring project success, more uncertainty in time estimation (such as for data cleansing and model development), and, finally, more uncertainty about the quality of the available data – or even whether the data is relevant for the desired predictive model. This meant that our data science projects should follow a different process than our software development team’s process.
The mentoring sessions were a very valuable aspect of the course, as was the flexibility to go back over the material at any time.
Jeff: Thanks for your time – anything else you would like to add?
Hector: Not really, I think this covers it – but I will add that I do think that the course and certification would be useful for others who lead data science projects!
To learn more, explore training via our Data Science Team Lead course.
Master Data Science Projects
Data science projects are unique. That’s why data science leaders like Hector turn to the Data Science Process Alliance. Unlike other consultancies, we focus on just data science project delivery.
Learn to overcome data science’s unique challenges and to leverage its unique opportunities.