Tuesday, 22 December 2015

What Do We Call Big Data Analysis?

The original and much-cited definition from Gartner mentions the “three Vs”: Volume, Velocity and Variety. IBM later added a fourth V, “Veracity”, and other Vs followed (Value, Variability, etc.).

Volume - Huge and continuously increasing size of the collected and analyzed data. The magnitude of Big Data is much larger than that of data managed in traditional storage systems. People talk about terabytes and petabytes rather than gigabytes. What is considered big today may not be big a couple of years down the line.

Velocity - Speed at which the data is generated and fed into the analyzing system. This forces algorithms to process data and produce results in limited time and with limited computing resources.

Variety - Heterogeneous, complex data representations: structured as well as semi-structured and unstructured data repositories.

Veracity - Quality of the data, its trustworthiness, and the pre-processing it requires.

Value - Big Data Analysis seeks to economically extract value from very large volumes of a wide variety of data. It aims to provide novel insights into data and application problems, and to create new economic value that supports better decision making.
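
To make the Velocity constraint above more concrete, here is a minimal sketch of a single-pass computation that keeps only a constant amount of state no matter how many values arrive. It uses Welford's method for a running mean and variance, chosen here purely as an illustration. The names RunningStats and sensor_readings, and the sample values, are hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running count, mean and variance in constant memory (Welford's method)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        # Fold one new value into the summary without storing the value itself.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; returns 0.0 when fewer than two values were seen.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def sensor_readings():
    """Hypothetical stand-in for a live feed; yields one value at a time."""
    for value in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
        yield value


stats = RunningStats()
for reading in sensor_readings():
    stats.update(reading)

print(stats.count, stats.mean, stats.variance)
```

Each value is seen once and then discarded, so memory use stays the same whether the feed delivers eight readings or eight billion. A batch approach that first loads every value into a list would not have this property.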

Big Data brings about a multitude of new issues, some already known and some still to be discovered. Big Data analysis is bringing a new dynamism to the fields of Data Mining and Machine Learning.

Big Data analysis also makes much of the research done in the past obsolete, since previously designed algorithms may not scale to the amount of data now typically processed, or may not address the new problems that Big Data analysis raises. It requires a different set of computing skills from those used in traditional research.

