Essentials of Data Science – Probability and Statistical Inference – Introduction to Probability Distributions

In the previous note on Probability and Statistical Inference, we covered the following important concepts of probability and statistics. First, we saw the basic theory of probability and how to model a random phenomenon while satisfying the axioms of probability. Next, we explored random variables …

Essentials of Data Science – Probability and Statistical Inference – Quantiles and Tschebyschev’s Inequality

In the previous note on Probability and Statistical Inference, we learned about expectations, moments, skewness, and kurtosis, which measure the central tendency, dispersion, symmetry, and peakedness of a probability distribution, respectively. Introduction: We define quantiles in terms of the distribution function. The value x_p for which the cumulative distribution function satisfies F(x_p) = p is called the p-quantile. Here, p is a value …
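As a rough illustration of the p-quantile idea in this excerpt, the sketch below works on a finite sample rather than a theoretical distribution. It assumes the empirical distribution function and the common convention "smallest observed value whose cumulative proportion is at least p", because exact equality with p is not always attainable for discrete data. The function names and the sample values are hypothetical and do not come from the original note.

```python
def empirical_cdf(sample, x):
    """Fraction of observations less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def p_quantile(sample, p):
    """Smallest observed value whose empirical CDF is at least p.

    Assumption: this is one of several quantile conventions; libraries
    often interpolate between neighbouring observations instead.
    """
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    ordered = sorted(sample)
    n = len(ordered)
    for i, x in enumerate(ordered):
        # (i + 1) / n is the share of observations up to position i
        if (i + 1) / n >= p:
            return x


# Hypothetical sample, used only for illustration
data = [2, 4, 4, 5, 7, 9, 10, 12]
median = p_quantile(data, 0.5)
lower_quartile = p_quantile(data, 0.25)
```

Here `p_quantile(data, 0.5)` plays the role of the median, and `empirical_cdf(data, median)` can be used to check that the cumulative proportion at that value is at least 0.5.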

Essentials of Data Science – Probability and Statistical Inference – Skewness and Kurtosis

In the previous note on Probability and Statistical Inference, we learned about the expectation and moments of the probability distribution of a random variable, which describe the central tendency and variability of its values, respectively. In this note, we extend the concept of moments and study further characteristics, namely the shape and peakedness, of the probability …
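To make the link between moments and shape a little more concrete, here is a minimal sketch that computes moment-based skewness and kurtosis for a sample. It assumes the standard definitions (third and fourth central moments divided by the appropriate power of the second central moment, without bias correction); the function names and sample values are hypothetical and are not taken from the original note.

```python
def central_moment(sample, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** k for x in sample) / len(sample)


def skewness(sample):
    """Third central moment scaled by variance ** 1.5 (measures asymmetry)."""
    return central_moment(sample, 3) / central_moment(sample, 2) ** 1.5


def kurtosis(sample):
    """Fourth central moment scaled by variance ** 2 (measures peakedness)."""
    return central_moment(sample, 4) / central_moment(sample, 2) ** 2


# Hypothetical samples, used only for illustration
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 10]

symmetric_skew = skewness(symmetric)   # zero for a symmetric sample
skewed_skew = skewness(right_skewed)   # positive: long tail to the right
symmetric_kurt = kurtosis(symmetric)
```

Under this convention a normal distribution has kurtosis 3, so some texts report "excess kurtosis" (the value above minus 3) instead.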

Essentials of Data Science – Probability and Statistical Inference – Moments and Variance

In the previous note on Probability and Statistical Inference, we started a new topic: characterizing the probability distribution of a random variable to extract the statistical information it contains. One such statistical tool is the expectation of a random variable, that is, the expectation of its probability distribution. We have seen …

Automatic Language Identification in Texts – Polyglot

In the earlier note on langid in this series on automatic language identification, we introduced how to detect language using the langid tool, which uses a naive Bayes classifier with a multinomial event model over a mixture of character n-grams and is trained on 97 languages. It provides additional tools for model building, training, tokenization, etc., that are helpful …

Automatic Language Identification in Texts – LangId

In the earlier note on sparknlp in this series on automatic language identification, we introduced how to detect language using the sparknlp library, which uses pre-trained deep learning models built with CNN architectures in TensorFlow/Keras. Currently, they have published pre-trained models that can detect 375 languages, which is significantly more than any other open-source library. Introduction: In this …

Automatic Language Identification in Texts – Sparknlp

In the earlier note on langdetect in this series on automatic language identification, we introduced how to detect language using the langdetect library, which uses a naive Bayes classifier with character n-grams. In this note, we introduce another language identification library, which is part of the sparknlp package. They designed and developed deep learning models …

Automatic Language Identification in Texts – Langdetect

In this note series on automatic language identification, we introduced how to detect language using the gcld3 library. It is designed to run in the Chrome browser, is written in C++, is based on a neural network model, and supports over 100 languages/scripts. In this note, we introduce another language identification library called LangDetect. …

Automatic Language Identification in Texts – GCLD3

In the previous note on automatic language identification, we introduced how to detect language using fasttext. Fasttext is a library created by Facebook’s AI research lab for efficient learning of text representations and text classification. In this note, we introduce another language identification library called Google Compact Language Detector v3 (GCLD3). GCLD3 is designed to …

Automatic Language Identification in Texts – Fasttext

Language detection is vital in Natural Language Processing (NLP), as many NLP tasks are language-dependent. Finding a language detector that supports most natural languages, short texts, and multilingual texts is difficult. However, the Fasttext library performs well compared to other automatic language identification libraries such as gcld3, langdetect, langid, nltk_textcat, …
