Distribution plots are used for univariate analysis. They display the distribution of a single continuous feature, which helps us judge properties such as its mean, variance, and standard deviation. There are three main types of distribution plot: the rug plot, the histogram, and the kernel density estimation (KDE) plot. This note covers each of these and works hands-on with a dataset called dm_office_sales.
Rug Plot
It is a one-dimensional distribution plot that simply draws a dash (tick) for every single value. The ticks are placed along the x-axis at the positions of the data values while the y-axis carries no information, or vice versa. In other words, only one of the two axes has values, and the plot is drawn from those values alone.
Histogram
It is constructed by binning the data and counting the number of observations in each bin. The objective is usually to visualize the shape of the distribution. The number of bins therefore needs to be large enough to reveal interesting features, yet small enough that the plot is not too noisy.
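The binning step can be sketched in a few lines of plain Python. This is a minimal illustration, not the routine a plotting library uses: the helper name histogram_counts and the sample values are hypothetical, and the values are not taken from dm_office_sales. It assumes equal-width bins between the minimum and maximum.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: a single bin holds everything.
        return [lo, hi], [len(values)]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin, so clamp the index.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical sample values, used only for illustration.
sample = [52, 55, 58, 61, 61, 63, 64, 70, 72, 90]

for n_bins in (2, 5):
    edges, counts = histogram_counts(sample, n_bins)
    print(n_bins, "bins:", counts)
```

With two bins almost every value lands in the first bin and the shape is hidden. With five bins the cluster around 61 to 64 and the isolated value at 90 become visible, which is the trade-off described above.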
Kernel Density Estimation (KDE)
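It is a method of estimating the probability density function of a random variable. In other words, it is a nonparametric way of estimating the shape of a continuous probability distribution directly from the data: a smooth kernel (commonly a Gaussian) is centered on each observation, and the kernels are averaged to give a smooth curve.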
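As a rough sketch of how such an estimate can be computed, the snippet below places a Gaussian kernel on every observation and averages them. The function name gaussian_kde, the bandwidth of 4, and the sample values are assumptions made for illustration. Plotting libraries typically choose the bandwidth automatically, which this sketch does not attempt.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density at a point x."""
    n = len(values)
    # Normalizing constant so the estimated density integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


# Hypothetical sample values, used only for illustration.
kde_sample = [52, 55, 58, 61, 61, 63, 64, 70, 72, 90]
density = gaussian_kde(kde_sample, bandwidth=4.0)

# The estimate is higher where observations cluster than where they are sparse.
print(density(61.0) > density(80.0))
```

A smaller bandwidth makes the curve follow individual observations more closely, while a larger one smooths them out, much like the choice of bin count for a histogram.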