History and emergence of Big Data terminology
One of the first publications to use the term appeared in July 2000 and came from Francis Diebold of the University of Pennsylvania. For the first time, Big Data was related to the modelling of information: Diebold spoke of the "Big Data" phenomenon and already described it as an opportunity to access "quality relevant data"[1]. Big Data has since gained greatly in popularity; from 2009 onwards it started to show up as a marketing term in many press releases and stories. At the same time the Hadoop technology emerged: from a technological perspective it was a new framework for the storage and large-scale processing of data sets, and it accelerated the growth of the business products of companies like Facebook, Yahoo and Twitter (a sketch of its programming model is given below). In 2010 the trend took hold, mainly thanks to IBM and Oracle, which held their biggest Information Management conferences and began to present Big Data as a product asset.
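To make concrete what "storage and large-scale processing of data sets" means in Hadoop's case: computation is expressed in the MapReduce model, where a map phase emits key/value pairs and a reduce phase aggregates them, and the framework distributes both phases across a cluster. Below is a minimal sketch of the classic word-count job against the Hadoop MapReduce API; the class names and command-line input/output paths follow the standard tutorial example rather than anything in the sources cited here.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: for every line of input, emit a (word, 1) pair per token.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each word across the cluster.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // pre-aggregates counts on each mapper node
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The point is less the word counting than the division of labour: the same few dozen lines run unchanged on one machine or on thousands of nodes, because the framework, not the programmer, handles data placement, scheduling and fault tolerance.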
However, Big Data is poorly defined according to part of the community of scientists who have worked on the topic: some of them saw it as an opportunity, others as just a fad (Abiteboul 2012). Moreover, there is simply no single unified definition. One of the most commonly used definitions in the field is Gartner's, which defines Big Data by the well-known three Vs: Volume, Velocity and Variety (Gartner 2012). Two aspects of this definition are noteworthy. First, Big Data is no longer considered merely a matter of storage capacity, since Variety introduces different data types, unstructured versus structured for example. Secondly, Velocity qualifies the speed at which data is created, collected and analyzed. The company IBM added a further dimension to address the uncertainty of data: Veracity (Schroeck et al., 2012), which refers to the reliability of a given data type. The last V stands for Value, introduced to qualify pertinent and useful usage scenarios for Big Data, for example business scenarios for customer sales such as enhancing the 360° view of customers.
Most often, Big Data is defined by the volume of data. In "Big Data: The next frontier for innovation, competition, and productivity", a white paper about the business opportunity written by McKinsey, the scientist who leads research on global economic and technology trends describes the opportunity rather as a capability (Manyika 2011). There is a growing awareness across companies that Big Data addresses more than just the volume of data (Schroeck 2012). Nevertheless, each IT vendor has developed its own definition[2]; Oracle, for example, contends that Big Data is the derivation of value from relational-database-driven business decision making.
What is Big Data – Volume, Variety, Velocity, Value and Veracity[3]
Oracle has long been a leader in information management and analytics for structured, mostly enterprise transaction data. With the introduction of its Big Data solutions, it demonstrates product vision and a commitment to the growing importance, and potential value to its customers, of incorporating, relating and analyzing unstructured data for new insights.
Intel, for its part, has concretely formalized the link between Big Data and organizations "generating a median of 300 terabytes (TB) of data weekly", especially since Intel's communications and product offers made it one of the first partners to start a company project on a Big Data strategy. Historically, it is natural for a hardware manufacturer like Intel, Xerox or VMware… to defend this market value; otherwise clients will move to virtualization-based cloud technology and to specialists like Amazon, Google and Microsoft.
Microsoft, for its part, provides a notably succinct definition: "Big Data is the term increasingly used to describe the process of applying serious computing power – the latest in machine learning and artificial intelligence – to seriously massive and often highly complex sets of information"[4]. Moreover, Microsoft continues to accelerate the integration of a strategy based on mobile and cloud. On the topic of Big Data, it introduces AI breakthroughs by using terms like machine learning and artificial intelligence in its products, its communications and its definition of what Big Data is.
Version 3.0 of the Big Data Landscape, from Matt Turck, now at FirstMark
For each IT vendor the discussion is oriented towards different topics that match its product solutions, and every definition introduces new concepts and new IT technologies. We will detail the different expectations and opportunities in marketing that arise from using this technology.
It is also very important to take the new technological pure-player challengers into consideration. First Google and Amazon, and now Facebook, create and master data from the Web: online searches, posts and customer behavior. They are platforms that capture and aggregate consumer data and provide services and data to marketing and IT departments. They are new competitors, and partners, for the established vendors, and these new companies are redefining marketing, especially in industries like the advertising market and e-business strategies.
The Lines Between Software and Hardware Continue to Blur[5]
[1] “Big Data” Dynamic Factor Models for Macroeconomic Measurement and Forecasting [ http://www.ssc.upenn.edu/~fdiebold/papers/paper40/temp-wc.PDF ]
[2] Undefined by Data: A Survey of Big Data Definitions – Jonathan Stuart Ward and Adam Barker, School of Computer Science, University of St Andrews, UK
[3] What is Big Data [ http://www.datatechnocrats.com/tag/big-data/ ]
[4] The Big Bang: How the Big Data Explosion Is Changing the World – Feb. 2013 [ http://www.microsoft.com/en-us/news/features/2013/feb13/02-11bigdata.aspx ]
[5] The Lines Between Software and Hardware Continue to Blur – The Wall Street Journal – Dec. 2012 [ http://online.wsj.com/news/articles/SB10001424127887324677204578188073738910956 ]