Does a single centralized data science team or several decentralized teams work better?
Many organizations struggle to choose between a single data science “center of excellence” (sometimes known as an “Enterprise” or a “Shared Service” team), which is leveraged across the organization, and smaller teams embedded within different parts of the organization.
For a small firm, this might be a moot question as it might only be able to support a single team. However, for larger organizations (or startups that are laying a foundation for future growth), this question gets more interesting. Let’s start with centralized teams.
Centralized Data Science Org Structures
A centralized data science unit contains nearly all the organization’s data scientists in a single organizational structure. This group might have multiple teams with multiple managers, all reporting to a Chief Data Scientist (or similar title such as “Director of Data Science”, or “Chief Analytics Officer”). Organizations with full-stack teams may also include data engineers, software engineers, product managers, project managers, and business analysts in this centralized organization. If not, these roles sit in sister teams that are responsive to each other’s needs.
The chief data scientist, in conjunction with the senior leadership of the organization, sets the cross-organizational data science goals and strategies. Then data science leadership, in conjunction with functional management, product managers, and program managers, prioritizes data science projects across the organization, while engineers build and support centralized, shared infrastructure.
Centralized teams offer several strengths:
- Talent Management: Dedicated functional managers with data science backgrounds are better suited to hire and manage data scientists. Because of the larger org, data scientists have clear promotional opportunities with ample on-the-job training opportunities, particularly for junior data scientists. With a wide variety of project types, data scientists can grow their skillsets across various domains without changing org placement.
- Systems Management: With data, cloud, software, security, DevOps, and similar engineers embedded into or sitting close to the data scientists, the team has dedicated talent who leverage best practices to effectively build, scale, and support AI-native products. This centralized approach also facilitates the discovery and adoption of common tools and infrastructure, which helps keep the organization running efficiently on cutting-edge technologies, provides bargaining power during vendor contract negotiations, and simplifies operational maintenance.
- Organizational Strategy: Centralized (and ideally executive-level) leadership can drive an optimized organizational data science strategy by deploying data scientists on the company’s most important projects. Furthermore, the centralized view enables a cross-departmental strategy that is best for the entire organization.
- Political Influence: A cohesive unit (particularly one led by an executive) tends to more effectively communicate, define, and drive an impactful vision (as compared to smaller, siloed teams).
However, centralized teams also have weaknesses:
- Support Spread Too Thin: Because a single org structure is looking across the entire organization, it might not be able to allocate enough resources to each business unit’s needs.
- Slow-Moving and Bureaucratic: The downside of central planning is that some business units might feel that their needs are not appropriately prioritized or that the planning process moves too slowly to respond to time-sensitive issues.
- Domain Knowledge: Being somewhat removed from the business units, data scientists might need time to gain an in-depth understanding of each business unit’s domain space.
Business Unit–Specific Data Science Teams
An alternative organizational structure is to go without a centralized data science org and to allow business units to develop their own data science functions.
Implemented correctly, decentralized teams overcome the weaknesses mentioned for centralized teams:
- Responsiveness and Agility: Because they are dedicated to their respective business functions, these teams can respond to their business needs with minimal administrative burden.
- Business Unit Prioritization: Having a decentralized structure enables each business unit to more directly prioritize its own data science efforts.
- Domain Knowledge: Sitting alongside the business operations staff and analysts, the data scientists can more quickly and deeply learn their domain space.
However, business-aligned teams might suffer from challenges that centralized teams can more easily overcome:
- Siloed Talent: Individual data scientists or small teams might feel isolated because they are unable to vet their ideas with and learn from like-minded peers, which could lead to lower-quality results or employee turnover.
- Lack of Managerial Focus: Such teams might report to business managers who lack data science backgrounds and whose focus is spread across multiple business functions.
- Inability to Build Production Systems: Smaller teams likely won’t have engineers to work closely with. This might be okay if the team is focused on ad hoc analyses, but they might need to lean on IT for product-level systems. In practice, this rarely works smoothly because the handoff tends to fall into the shortcomings of waterfall-style development.
- Lack of Economies of Scale: Even if each team were able to productize its work, the whole organization might be left with a myriad of infrastructures that increase operational maintenance costs and limit contract negotiation power with vendors.
- Siloed Knowledge: By focusing on just their given business units, data scientists are less likely to be aware of and take advantage of models and insights developed by other teams. Projects spanning multiple departments will be especially challenging.
- Organizational Data Science Strategy: Hah! Good luck!
Hybrid Organizational Structures
The choice is not binary. Hybrid structures can help take advantage of the relative strengths of centralization and decentralization but also introduce their own problems.
Two such structures follow.
Strong Centralized Hybrid
Like a centralized structure, a single organizational data science leadership team sets the organizational data science strategy. Its management team serves as functional managers to hire, develop, and promote data scientists. Sister (or embedded) teams of engineers enable production deployment. However, the data scientists are assigned to (and might even sit with) various business units and focus on the same domain-specific problems. Breadth of knowledge can be gained by rotating data scientists among the various centralized sub-teams. In short, the organization gets a centralized infrastructure, a common data science strategy, and effective talent management, and the business units get somewhat dedicated teams who are knowledgeable about their specific needs. The major downside is that communication overhead is higher, and significant friction might pull data scientists in different directions if data science management and business management are not aligned.
Centralized Data Science Consultants
A more decentralized hybrid has the data scientists report up through the business units. Meanwhile, a centralized data science leader or leadership team consults with the individual teams to encourage best practices and facilitate cross-departmental knowledge sharing. If empowered, such a central data science team could even set the organizational data science strategy; however, implementing that strategy might be challenging without a dedicated team. This centralized team or a centralized project management office can manage the entire project life cycle up through productization. IT operations could then maintain the deployed systems. In short, this compromise partially addresses the challenges of both centralized and decentralized teams.
So…What works best?
The answer is highly dependent on the broader organizational structure. Centralized teams naturally work for organizations that might not be able to support multiple teams or when business units struggle to hire data scientists. However, their ability to respond to various domain-specific needs tends to wither as the organization grows, which favors a hybrid model or at least a stakeholder-focused and carefully coordinated centralized structure. The only structure I’d advise against is a purely decentralized model. The incremental cost of developing at least some sort of centralization effort pays for itself.