As organizations continue to expand their data analysis competencies, the data science function is becoming more of a team sport with numerous team roles within the data science group.
Of all these roles, the difference between the data analyst and data scientist roles is perhaps the most confusing. Indeed, the roles have similar responsibilities, and data analysts and scientists share many overlapping skills. On top of that, organizations have their own take on what these roles should do.
Jeff tackled some of these topics in his previous post on data analytics project management. This post expands on his post and aims to further clarify the distinction of these roles by exploring:
- What are the fields of data analytics and data science?
- What are data analytics vs data scientist roles and responsibilities?
- What skills do they need?
- What tools do they use?
- What are their career paths?
Data Analytics vs Data Science – The Fields
I’ll start by acknowledging that the distinction is confusing and that you’ll find a lot of different definitions out there. Many viewpoints generally are consistent but some actually contradict each other. For example, in researching this post, I found one article which viewed data science as a subset of data analytics. Yet, another saw data analytics as a subset of data science.
What is data analytics?
Simply put, data analytics is the field of collecting and making sense out of data. It goes back millennia – at least as far back as Ancient Egyptian census tabulations. However, the modern data analytics field emerged in the 1980s with the adoption of relational database systems and the release of Microsoft Excel in 1985.
Today, the data analytics field encompasses a wide variety of systems and analysis techniques that draw on statistics, information management systems, and productivity tools like spreadsheets. The goals are broad, but the most common objective is to make data accurate and accessible for analyses that drive decisions.
What is data science?
Data science is a more specialized field that emerged around the turn of the 21st century as companies began collecting more and more data. Companies like Google and Facebook saw these large data sets as an opportunity for competitive advantage.
Yet, traditional data analytics techniques struggled to scale with this advent of big data. Thus, the data science field emerged to generate actionable insights from the collection, preparation, analysis, visualization, and management of large and unstructured data sets.
The term “data science” is still somewhat vague and even controversial. In fact, many leaders such as famed statistician Nate Silver even argue that the field of “data science” is just a rebranding of statistics (Stats and Data Science Views, 2013). Others view the term as just a hyped term for data analytics.
What’s the difference?
However, I, as well as most people in the industry, recognize data science as a distinct discipline. Relative to data analytics, data science tends to:
- Focus more on predictions (versus the historical or current focus of data analytics)
- Leverage unstructured and big data sets
- Use more advanced algorithms (think neural net vs. computing the mean)
- Rely on the scientific method to prove hypotheses
- More heavily engage computer programming
- Be more research-focused
Data Analyst vs Data Scientist – Roles and Responsibilities
It’s no surprise that there are many differing views on what each of these roles does.
For example, I came across an article stating that data scientists don’t collaborate as much as data analysts. However, I cannot stress enough the importance for data scientists, as well as data analysts, to communicate and collaborate with their team members.
Another article argues that domain knowledge is more important for data analysts. This is contrary to the popular Venn diagram (above) of a data scientist which paints “domain knowledge” as one of the three intersecting domains a data scientist needs to master.
Regardless, much of the confusion lies with the fact that both roles indeed have significant overlap…
Overlap in Roles and Responsibilities
Both data analysts and data scientists:
- Clean, manipulate, and query structured data
- Explore datasets, often by producing new data structures and visualizations
- Convert raw data into information that can drive strategic action
- Conduct recurring and ad hoc analyses to support operational tasks
- Find and investigate outliers
- Calculate statistics such as averages, variances, or skewness
- Uncover current trends in data
- Assess the quality and efficacy of data sets
- Communicate their findings to both technical and non-technical teams
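To make a couple of these shared tasks concrete, here is a minimal sketch of computing summary statistics and flagging outliers using only Python's standard library. The function names (`describe`, `iqr_outliers`), the 1.5 × IQR rule, and the sample numbers are my own illustrative assumptions, not a prescribed method; in practice, analysts and scientists alike would often reach for a spreadsheet or a data library instead.

```python
import statistics


def describe(values):
    """Return the mean, sample variance, and skewness of a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    variance = statistics.variance(values)  # sample variance (n - 1 denominator)
    # Moment-based skewness: third central moment over (second central moment) ** 1.5
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": mean, "variance": variance, "skewness": skewness}


def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [x for x in values if x < low or x > high]


# Hypothetical daily order counts with one suspicious spike
daily_orders = [12, 15, 14, 13, 16, 15, 14, 90]
print(describe(daily_orders))
print(iqr_outliers(daily_orders))
```

The spike pulls the mean well above the typical day and produces a positive skew, which is exactly the kind of signal either role would want to investigate before reporting or modeling.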
So let’s now explore some differentiating factors…
Big Data Focus
You might notice that the first bullet in the above list names “structured data” as an overlapping focus for both data analysts and scientists. However, data scientist responsibilities extend beyond this, as they commonly wrangle and analyze big data sets whose sheer size or lack of structure exceeds the capabilities of structured relational databases. Some data analysts also work with big data, but their focus tends to be more on traditional structured data.
Time Horizon Focus
While both roles center around data, one of the biggest differences between data analysts and data scientists is the time horizons they focus on. In a personal interview, Ashwin Pingali, CTO of DataOwls, explains that “Data analysts tend to focus their analyses on historical patterns and current trends. Meanwhile, data scientists are more likely to extend these patterns to predict future trends. In other words, data analysts look at what happened, while data scientists try to predict what will happen and how to influence future patterns.”
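As a rough illustration of this difference in time horizon, the sketch below contrasts a descriptive summary of past values with a simple forward-looking estimate. It assumes a plain least-squares trend line as the "model", which is far simpler than what a data scientist would typically build; the function names and the sales figures are hypothetical.

```python
from statistics import fmean


def summarize(history):
    """Descriptive view: what happened (average level and most recent change)."""
    return {"average": fmean(history), "last_change": history[-1] - history[-2]}


def fit_trend(history):
    """Fit a least-squares line through the points (0, y0), (1, y1), ...

    Returns (slope, intercept).
    """
    xs = range(len(history))
    x_bar, y_bar = fmean(xs), fmean(history)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
    x_variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / x_variance
    return slope, y_bar - slope * x_bar


def forecast(history, steps_ahead=1):
    """Predictive view: extend the fitted trend past the last observation."""
    slope, intercept = fit_trend(history)
    return intercept + slope * (len(history) - 1 + steps_ahead)


# Hypothetical monthly sales figures
monthly_sales = [100, 110, 120, 130, 140]
print(summarize(monthly_sales))   # looks backward
print(forecast(monthly_sales))    # looks one month ahead
```

The first function answers "what happened?", while the second makes a claim about a month that has not occurred yet, which is where questions of model choice and validation begin.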
Reporting vs Modeling Focus
Data analysts are more likely to focus on developing operational metrics, financial reports, KPIs, and dashboards. Data scientists likewise focus part of their time on this but typically not as their end goal. Rather, it is a step (often referred to as “exploratory data analysis”) toward a broader goal: figuring out how best to model the data to achieve a specific outcome.
Data Analyst vs Data Scientist – Skills
Overlapping Skills (…and virtues)
Regardless of whether you’re a data analyst or scientist, you will need:
- A love for numbers
- A solid foundation in probability and statistics
- Strong communication skills to be able to inspire action based on data-driven analyses
- An inquisitive mind to ask questions (to stakeholders and to the data)
- Patience … both spend a lot of time cleaning data
- SQL to query (and occasionally set up) databases
So what separates a data scientist from an analyst? I asked this to Justin Butlion, Founder of projectbi.net. Justin explains that “A data scientist is more like a mathematician, software engineer or statistician than an analyst. A data scientist can much more easily work as a data analyst, than vice versa. The real work of data scientists is to solve complex challenges using algorithms, machine learning, advanced mathematics, and big data tech […]” which often extends beyond the skillsets of a data analyst.
Let’s dive into some of these…
Training in the Scientific Method
Routine data analyst tasks such as reporting on descriptive statistics or developing visualizations generally don’t require a strong scientific mindset. However, data scientists are on a mission to uncover or even prove something that is otherwise uncertain. To accomplish this, data scientists are experts at the full scientific method from identifying key questions to ask, to hypothesis testing, and to reporting on their conclusions.
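For readers who have not run a hypothesis test before, here is a small sketch of one, under my own assumptions: a two-sided permutation test for a difference in means between two groups. The function name, the number of permutations, and the sample data are all hypothetical, and a permutation test is only one of many ways to test a hypothesis.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns (observed_difference, p_value), where the p-value estimates how
    often a random relabeling of the pooled data yields a difference at least
    as large as the one observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = fmean(group_b) - fmean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = fmean(pooled[n_a:]) - fmean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical question: did the new checkout page change average order value?
control = [10, 11, 9, 10, 12, 10, 11, 9]
variant = [14, 15, 13, 16, 14, 15, 13, 14]
difference, p_value = permutation_test(control, variant)
print(difference, p_value)
```

The code is the easy part. The scientific work is in framing the question, choosing the null hypothesis before looking at the data, and reporting the conclusion honestly.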
A data analyst should have a basic understanding of various algorithms such as linear regression, logistic regression, and clustering techniques. But the expert-level skills of understanding which algorithm to apply, how to apply it, and how to measure the results are the hallmark of a data scientist.
Math and Stats
To effectively model data, data scientists are applied mathematicians with a focus on statistical learning. A lower bar of math skills is required by data analysts.
Data Analyst vs Data Scientist – Tools
As discussed in the responsibilities section, both roles manipulate data. Yet, how they do this tends to be different.
Data Analyst Tools
Data analysts tend to prefer Excel-based formulas, Visual Basic for Excel, R, and sometimes Python, or they rely on licensed no-code or low-code software like Tableau, Alteryx, or AutoML tools to get the job done. You’ll often see them use:
- Databases: SQL (advanced) and Relational Databases
- Visualization tools: Tableau, Power BI, Qlik
- Spreadsheets: Excel or Google Sheets
- Modeling Tools: SAS, RapidMiner; AutoML tools like H2O, Amazon SageMaker, or DataRobot
- Programming (often at intermediate level): Excel VBA, R, or Python
Data Scientist Tools
Data scientists likewise leverage many of these same tools. However, not all of these tools scale with big data. Moreover, data scientists tend to prefer building their own solutions. As such, data scientists generally prefer programming (and specifically Python) to manipulate and model data and (sometimes) to automate their predictions. Moreover, driven by the need for more computational power, data scientists commonly rely on cloud-based systems. Here’s an example list:
- Programming (more expert level): Python, Scala, Spark, R; also C, C++, or Java are common for tenured scientists coming from an engineering background
- Databases: Both NoSQL and traditional SQL Databases
- Cloud computing: AWS, Google Cloud, or Microsoft Azure
- Big data: Hadoop, Elastic MapReduce, Pig, Hive, Impala
- Visualization tools: Python libraries such as Matplotlib, Seaborn, or ggplot; might also use commercial-licensed software such as Tableau, Power BI, or Qlik
- Spreadsheets (generally lower emphasis): Excel or Google Sheets
- Modeling Tools (generally lower emphasis): SAS, RapidMiner, AWS SageMaker, H2O, DataRobot
Data Analyst vs Data Scientist – Career Paths
A bachelor’s degree is generally the bar of entry for a data analyst position. It’s common to find entry-level data analyst positions that have no or minimal years of work experience required. However, senior data analyst roles will ask for multiple years of experience and sometimes master’s degrees. Specific domain knowledge is often requested (e.g. a background in web analytics for a digital marketing data analyst).
The bar is higher for data science. Higher-level education is the most common path into the field as most companies expect a master’s (or even a Ph.D.) and/or at least a few years of work experience to get your foot in the door. This coincides with the view that data scientists need to have an overall higher level of quantitative training and ideally some research experience.
Yet, there are numerous paths to data science. For example, one of my former data science team managers had a drama degree but built up his data experience through a boot camp and on-the-job experience. Others go without formal educational programs, are self-taught, and demonstrate their experience through Kaggle competitions or their Git repositories.
Job titles are tricky as each organization has its own take on what to call each position. And it doesn’t help that many organizations tack the “data scientist” title onto a role for sake of making it sound “sexy” and to attract more applicants.
Common Data Analyst Titles
Some of the common data analyst title roles include business intelligence analyst, operational analyst, database analyst, financial data analyst, reporting analyst, or sometimes simply “analyst”. These more specific titles indicate an intersection with another field.
Common Data Scientist Titles
Meanwhile, common alternative job titles for data scientists include quantitative analyst, financial engineer, economist, physicist, or professor.
Data scientists command higher salaries. See the US-based salaries below of data analyst/scientist roles from the 2021 Robert Half Salary Guide:
|Title|25th percentile|50th percentile|75th percentile|95th percentile|
|---|---|---|---|---|
|Business Intelligence Analyst|92,000|115,750|139,000|189,250|
|Data Analyst/Report Writer|86,250|103,250|122,250|146,750|
|Data Warehouse Analyst|84,500|105,250|126,250|165,000|
|Data Reporting Analyst|64,500|79,000|95,000|120,750|
Glassdoor.com, in its “50 Best Jobs in America for 2021” report, finds an even more drastic difference in salaries between the roles, with the data scientist median base salary at $113,736 and the data analyst median at $70,000. Moreover, Glassdoor ranks data scientist as the #2 best job (behind Java developer), while data analyst comes in at #35.
If you’re still confused, I don’t blame you! There is indeed a lot of overlap in these fields and roles. However, there are numerous distinct differences. In short, data scientists tend to have:
- more programming capabilities
- greater ability to work with big data
- deeper knowledge in applying algorithms, statistics, and mathematical modeling
- more academic training
- more years of experience
- higher compensation
Yet, don’t take this to mean that data analysts are somehow inferior. Rather, data analysts play broad and critical roles both within a data science team and across the organization, just in a different capacity and with a different focus.
For those already working in corporate environments, understanding these differences can help you work better with your peers in those roles. For managers, it better equips you to hire the right person and manage team members serving in those roles. And for career entrants and students, understanding the difference can help you plot out a career path.