Should you have a centralized data science team or several decentralized teams?
There are numerous options for a data science team structure in mid- to large-sized organizations. Yet, many organizations struggle to decide among having:
- A single centralized data science team (also known as a “data science center of excellence” or as an “enterprise” or “shared services” team)
- Several decentralized data science teams which are smaller teams embedded within different parts of the organization
- Or…some hybrid data science team organizational structure
For a small firm, this might be a moot question as it might only be able to support a single team. However, for larger organizations (or startups that are laying a foundation for future growth), this question gets more interesting.
Let’s start by looking at the pros and cons of a centralized data science team.
Centralized Data Science Team
The Data Science Center of Excellence Model
A centralized data science unit contains nearly all the organization’s data scientists in a single organizational structure.
This group might have multiple teams with multiple managers, all reporting to a Chief Data Scientist (or similar title such as “Director of Data Science”, or “Chief Analytics Officer”).
Centralized teams tend to be bigger, which means they can include all the roles needed to deliver a full-fledged data product: data engineers, software engineers, product managers, project managers, and business analysts. If these roles are not part of the centralized organization, they sit in sister teams that are responsive to each other’s needs.
The chief data scientist, in conjunction with the senior leadership of the organization, sets the cross-organizational data science goals and strategies. Then data science leadership, in conjunction with functional management, product managers, and program managers, prioritizes data science projects across the organization, while engineers build and support centralized, shared infrastructure.
A centralized structure offers several advantages:
- Talent Management: Dedicated functional managers with data science backgrounds are better suited to lead data science teams. Because of the larger org, data scientists have clear promotion paths and ample on-the-job training opportunities, particularly for junior data scientists. With a wide variety of project types, data scientists can grow their skill sets across various domains without changing org placement.
- Systems Management: With data, cloud, software, security, DevOps, and similar engineers embedded into or sitting close to the data scientists, the team has dedicated talent who leverage best practices to effectively build, scale, and support AI-native products. This centralized approach also facilitates the discovery and adoption of common tools and infrastructure which helps keep the organization running efficiently on cutting-edge technologies, provides bargaining power during vendor contract negotiations, and simplifies operational maintenance.
- Organizational Strategy: Centralized (and ideally executive-level) leadership can drive an optimized organizational data science strategy by deploying data scientists on the company’s most important projects. Furthermore, the centralized view enables a cross-departmental strategy that is best for the entire organization.
- Political Influence: A cohesive unit (particularly one led by an executive) tends to more effectively communicate, define, and drive an impactful vision (as compared to smaller, siloed teams).
However, centralization also has its drawbacks:
- Support Spread Too Thin: Because a single org structure is looking across the entire organization, it might not be able to allocate enough resources to each individual business unit’s needs.
- Slow-Moving and Bureaucratic: The downside of central planning is that some business units might feel that their needs are not appropriately prioritized or that the planning process moves too slowly to respond to time-sensitive issues.
- Domain Knowledge: Being somewhat removed from the business units, data scientists might need time to gain in-depth understanding of the business unit’s domain space.
Decentralized Data Science Teams
Business Unit–Specific Data Science Teams
An alternative organizational structure is to go without a centralized data science org and to allow business units to develop their own data science functions.
Typically, these teams are smaller and less likely to be full stack, but they are more attuned to their business units’ needs.
Implemented correctly, decentralized teams overcome the weaknesses mentioned for centralized teams.
- Responsiveness and Agility: Because they are dedicated to their respective business functions, dedicated teams can respond to their business needs with minimal administrative burden.
- Business Unit Prioritization: Having a decentralized structure enables each business unit to more directly prioritize its own data science efforts.
- Domain Knowledge: Sitting alongside business operations staff and analysts, the data scientists can more quickly and deeply learn their domain space.
However, business-aligned teams might suffer from challenges that centralized teams can more easily overcome:
- Siloed Talent: Individual data scientists or small teams might feel isolated because they cannot vet their ideas with, or learn from, like-minded peers. This could lead to lower-quality results or employee turnover.
- Lack of Managerial Focus: Such teams might report to business managers who lack data science backgrounds and whose focus is spread across multiple business functions.
- Inability to Build Production Systems: Smaller teams likely won’t have engineers to work closely with. This might be okay if the team is focused on ad hoc analyses, but it might need to lean on IT for product-level systems. In practice, this rarely works smoothly because the handoff tends to suffer from the shortcomings of waterfall-style development.
- Lack of Economies of Scale: Even if each team were able to productize its work, the whole organization might be left with a myriad of infrastructures that increase operational maintenance costs and limit contract negotiation power with vendors.
- Siloed Knowledge: By focusing on just their given business units, data scientists are less likely to be aware of and take advantage of models and insights developed by other teams. Projects spanning multiple departments will be especially challenging.
- Organizational Data Science Strategy: Hah! Good luck!
Hybrid Data Science Teams
The choice is not binary. Hybrid structures can help take advantage of the relative strengths of centralization and decentralization but also introduce their own problems.
Two such structures follow.
Strong Centralized Hybrid
Like a centralized structure, a single organizational data science leadership team sets the organizational data science strategy. Its management team serves as functional managers to hire, develop, and promote data scientists. Sister (or embedded) teams of engineers enable production deployment.
However, the data scientists are assigned to (and might even sit with) various business units and focus on the same domain-specific problems. Breadth of knowledge can be gained by rotating data scientists among the various centralized sub-teams.
In short, the organization gets a centralized infrastructure, a common data science strategy, and effective talent management, and the business units get somewhat dedicated teams who are knowledgeable about their specific needs. The major downside is that communication overhead is higher and significant friction might pull data scientists in different directions if data science management and business management are not aligned.
Centralized Data Science Consultants
A more decentralized hybrid has the data scientists report up through the business units. Meanwhile, a centralized data science leader or leadership team consults with the individual teams to encourage best practices and facilitate cross-departmental knowledge sharing.
If empowered, such a central data science team could even set the organizational data science strategy; however, implementing that strategy might be challenging without a dedicated team. This centralized team or a centralized project management office can manage the entire project life cycle up through productization. IT operations could then maintain the deployed systems. In short, this compromise partially addresses the challenges from the centralized and decentralized models.
So…What works best?
The answer is highly dependent on the broader organizational structure. Centralized data science teams naturally work for organizations that might not be able to support multiple teams or whose business units struggle to hire data scientists. However, their ability to respond to various domain-specific needs tends to wither as the organization grows, which points toward a hybrid model or at least a stakeholder-focused and carefully coordinated centralized structure.
The only structure I’d advise against is a purely decentralized model. The incremental cost of developing at least some sort of centralization effort pays for itself.
The rest of this website: This post is part of the Team Management series which includes posts where you can:
- Discover How to Lead Data Science Teams
- Learn about the 8 Key Roles for Data Science Team
- Understand the difference between Data Science and Software Engineering
- Assess 10 Ethical Questions for data science
- Know Why you (probably) Need a Product Manager
- Explore how to apply CRISP-DM for Teams
- Get 5 Tips for Remote Data Science Teams
- Review Lessons from 20 Data Science Teams