Why is it important for a data scientist to learn statistics?

Statistics is the art and science of extracting answers from data. It supports decision-making under uncertainty and, at times, under certainty as well.

Primarily, the purpose is to make good decisions from data. Decisions based on data tend to be more consistent than those based on opinion. Therefore, we should make decisions using models built on data.

Often we cannot collect data on the entire population because it is too large. Instead, we collect data from a sample and then try to learn something about the population by analyzing that sample.

Suppose our task is to find the average height of people in India. Is it possible to measure every person's height and then take the average? The answer is no: the Indian population is enormous, so measuring everyone is not practical. The alternative is to collect a nationwide sample, chosen randomly or according to defined criteria, and use it to estimate the average height.
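The sampling idea above can be sketched in a few lines of Python. This is only an illustration: the population is simulated, and the mean of 165 cm and spread of 8 cm are made-up values, not real figures for India. The function names `simulate_population` and `sample_mean` are hypothetical helpers written for this example.

```python
import random
import statistics


def simulate_population(size, mean_cm=165.0, sd_cm=8.0, seed=0):
    """Generate synthetic heights in cm. The mean and sd are assumptions."""
    rng = random.Random(seed)
    return [rng.gauss(mean_cm, sd_cm) for _ in range(size)]


def sample_mean(population, sample_size, seed=1):
    """Draw a simple random sample without replacement and average it."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.fmean(sample)


# A stand-in "population" that would be too large to measure in practice.
population = simulate_population(200_000)
true_mean = statistics.fmean(population)

# Measure only 1,000 people instead of everyone.
estimate = sample_mean(population, 1_000)

print(f"population mean: {true_mean:.2f} cm")
print(f"sample estimate: {estimate:.2f} cm")
```

In a run like this, the estimate from 1,000 measurements should land close to the mean of all 200,000 simulated heights, which is the point of sampling: a small, randomly chosen group can stand in for a population we cannot measure in full.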

Statistics has two main branches: descriptive statistics and inferential statistics.

  1. Descriptive statistics uses graphical and numerical procedures to summarize data and to transform data into information.
  2. Inferential statistics provides the basis for forecasts, predictions, and estimates, and is used to transform information into knowledge.

In short, descriptive statistics gathers, sorts, and summarizes data from samples, and inferential statistics uses those summaries to estimate population parameters.
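To make the distinction concrete, here is a minimal sketch using only the Python standard library. The sample is synthetic (100 simulated heights with an assumed mean of 165 cm and spread of 8 cm), and `describe` and `mean_confidence_interval` are hypothetical helper names. The interval uses a normal approximation, which is a simplifying assumption; for small samples a t-based interval would usually be preferred.

```python
import math
import random
import statistics
from statistics import NormalDist

# Synthetic sample of heights in cm (assumed values, for illustration only).
rng = random.Random(42)
sample_cm = [rng.gauss(165.0, 8.0) for _ in range(100)]


def describe(sample):
    """Descriptive step: summarize the sample itself."""
    return {
        "n": len(sample),
        "mean": statistics.fmean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation
    }


def mean_confidence_interval(sample, confidence=0.95):
    """Inferential step: estimate a range for the population mean.

    Uses the normal approximation: mean +/- z * stdev / sqrt(n).
    """
    summary = describe(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * summary["stdev"] / math.sqrt(summary["n"])
    return summary["mean"] - margin, summary["mean"] + margin


summary = describe(sample_cm)
low, high = mean_confidence_interval(sample_cm)

print(f"sample mean: {summary['mean']:.2f} cm, stdev: {summary['stdev']:.2f} cm")
print(f"approx. 95% interval for the population mean: {low:.2f} to {high:.2f} cm")
```

The first function only describes the 100 values in hand. The second goes one step further and makes a statement about the population those values were drawn from, which is what the inferential branch is for.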

Nowadays, many practitioners are moving towards deep learning, even though it behaves like a black box and data scientists rarely know what is happening inside. They tend to be more interested in the accuracy of the results: if a model gives good accuracy, they are happy to use it. These tools work well when we have an enormous amount of data. However, classical machine learning is still valuable when data and computing power are limited. In this case, statistics plays an essential role, and all data scientists should know the basics of statistics to analyze and defend their results.
