In the early 1970s, historical data were analyzed by humans, and the data were stored and managed in operational or hierarchical databases. As technology developed, data storage and processing moved from hierarchical to relational databases, which gave significant flexibility for storing and managing structured data.
Evolution of storage, processing, and data analysis technology
Technology evolved further, and from the early 1980s onward the data warehouse concept became prominent. Data storage and processing were managed via data warehouses, and the stored data were used to analyze business transactions.
Around 2000, stream computing (the processing of data in motion) emerged, and around 2010 a major transformation took place: real-time analytic processing was adopted to improve business response. This marked the beginning of the era of Big Data technology development.
Operational database → Data warehousing → Stream Computing → Real-Time Analytic Processing
- Online Transaction Processing (OLTP) via DBMSs
- Online Analytical Processing (OLAP) via Data Warehousing
- Online Real-Time Analytics Processing (RTAP) (Big Data Architecture & Technology)
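The difference between these three processing styles can be illustrated with a minimal sketch. It uses an in-memory SQLite database to stand in for the DBMS and a plain Python loop to stand in for a stream engine; the `orders` table, the `sliding_average` helper, and all values are hypothetical and only serve the illustration.

```python
import sqlite3
from collections import deque

# Hypothetical "orders" table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP: small, short-lived transactions that insert or update individual rows.
with conn:
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("north", 120.0), ("south", 80.0), ("north", 45.5), ("south", 60.0)],
    )

# OLAP: a read-mostly aggregate query over the accumulated history.
olap_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall()
)


# RTAP: handle each event as it arrives, keeping only a small sliding window.
def sliding_average(events, window_size=3):
    window = deque(maxlen=window_size)
    averages = []
    for amount in events:
        window.append(amount)
        averages.append(sum(window) / len(window))
    return averages


stream_averages = sliding_average([120.0, 80.0, 45.5, 60.0])
```

The OLAP query needs the full history to be stored before it can answer, whereas the sliding-window average produces a result after every event and never holds more than `window_size` values.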
In the old model, a few companies generated data and everyone else consumed it. In the new model, all of us generate data and all of us consume it.
What is driving Big Data
Earlier, the goal was to derive business intelligence from data using descriptive and prescriptive analysis, and the complexity was moderate. Nowadays, alongside business intelligence, predictive analytics is used to predict future events. This makes systems more complex and requires distributed, high-performance computing power and storage.
- Business Intelligence: ad-hoc querying and reporting, data mining techniques, structured data from typical sources, and small to mid-size datasets.
- Predictive Analytics and Data Mining: optimization and predictive analytics, complex statistical analysis, all types of data from many sources, massive datasets, and a stronger real-time requirement.
Comparison of complexity vs. Business value → Business Intelligence to Predictive Analytics and Data Mining.
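The step from business intelligence to predictive analytics can be sketched on a toy dataset. This is a minimal illustration, not a recommended forecasting method: the monthly sales figures are made up, and an ordinary least-squares line fit (the hypothetical `fit_line` helper) stands in for the far more complex statistical models used in practice.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative values only).
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]

# Descriptive (business intelligence): summarize what has already happened.
summary = {"total": sum(sales), "average": mean(sales), "peak": max(sales)}


# Predictive: fit y = slope * x + intercept by ordinary least squares,
# then extrapolate to a month that has not happened yet.
def fit_line(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(months, sales)
forecast_month_7 = slope * 7 + intercept
```

The descriptive summary only reports on stored data, while the forecast makes a claim about data that does not exist yet. With many sources and massive datasets, fitting and refreshing such models is what drives the need for distributed compute.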
Big Data Analytics
Big data is more real-time in nature than traditional data warehouse (DW) applications, and traditional data warehouse architectures are not well suited to big data applications. Shared-nothing, massively parallel processing (MPP), scale-out architectures are a better fit.
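The shared-nothing idea can be sketched in a few lines. This is a single-process simulation under stated assumptions, not a real MPP engine: each "node" is just a Python list, and the helper names (`partition`, `local_aggregate`, `merge`) and the click events are hypothetical.

```python
import zlib
from collections import Counter


def partition(records, num_nodes):
    """Assign each record to exactly one node by hashing its key."""
    nodes = [[] for _ in range(num_nodes)]
    for key, value in records:
        # crc32 is deterministic across runs, unlike the built-in hash() for str.
        nodes[zlib.crc32(key.encode("utf-8")) % num_nodes].append((key, value))
    return nodes


def local_aggregate(node_records):
    """Work done by one node using only its own data (no shared state)."""
    totals = Counter()
    for key, value in node_records:
        totals[key] += value
    return totals


def merge(partials):
    """Coordinator step: combine the per-node partial results."""
    result = Counter()
    for partial in partials:
        result.update(partial)
    return dict(result)


# Hypothetical click events as (page, count) pairs; values are illustrative only.
events = [("home", 3), ("cart", 1), ("home", 2), ("search", 5), ("cart", 4)]
nodes = partition(events, num_nodes=3)
totals = merge(local_aggregate(n) for n in nodes)
```

Because every key lives on exactly one node, each `local_aggregate` call could run on a separate machine without coordination, and scaling out amounts to raising `num_nodes` and repartitioning.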
Big Data Technology
The technology stack starts with data processing and storage, continues with data cleaning and analytics, and ends with extracting deep insight from the data to serve business objectives.
- Technology for processing and storing data: Hadoop, Vertica, MapReduce, Esper, Kdb, Greenplum, ETL, ECL, Teradata, Netezza, etc.
- Technology for Big Data analytics: SPSS, AMPL, SAS, MATLAB, R, Python, Hive, etc.
- Technology for extracting and predicting deep insight from the data: Artificial Intelligence, including machine learning, deep learning, sentiment analysis, social network analysis, etc.
The business objectives include mass customization of services, quicker response to market trends, real-time cost optimization, faster and more accurate decision making, better and more holistic R&D, and autonomic supply chain management.
References
- Big Data Computing, By Prof. Rajiv Misra, IIT Patna.