There is a lot of Big Data flying around these days, and it is being produced at astronomical rates. By one widely cited estimate, 90% of the data in the world today was created in the last two years. The term “big data” refers to data sets so large that they cannot be processed using conventional methods. The size at which data counts as Big Data keeps shifting, and new tools are continuously being developed to handle it. Big Data is changing our world and shows no sign of being a passing fad. To make sense of this overwhelming amount of data, it is often broken down using five V's: Velocity, Volume, Value, Variety, and Veracity.
First, let’s talk about velocity. Velocity refers to the speed at which vast amounts of data are generated, collected, and analyzed. Every day, the number of emails, Twitter messages, photos, video clips, etc. increases at lightning speed around the world. Not only must this data be analyzed, but transmission and access must also remain near-instantaneous to allow for real-time access to websites, credit card verification, and instant messaging. Big data technology now allows us to analyze data while it is being generated, without first putting it into a database.
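To make the idea of analyzing data while it is being generated a little more concrete, here is a minimal Python sketch. It keeps a running count, mean, and variance over a stream using Welford's method, so each value is looked at once and never stored. The `transaction_stream` generator and its payment amounts are hypothetical stand-ins for a real event feed; this is an illustration of the general technique, not how any particular big data product works.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running count, mean, and variance over a stream (Welford's method)."""
    count: int = 0
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new value into the statistics without storing it."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; reported as 0.0 until two values have arrived."""
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


def transaction_stream():
    """Hypothetical event source: yields one payment amount at a time."""
    for amount in (12.50, 8.00, 23.75, 5.25, 10.50):
        yield amount


stats = RunningStats()
for amount in transaction_stream():
    stats.update(amount)  # each value is consumed once, then discarded

print(f"events={stats.count} mean={stats.mean:.2f} variance={stats.variance:.2f}")
```

The memory used stays constant no matter how many events arrive, which is the property that matters when data is produced faster than it can be stored.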
Volume refers to the incredible amounts of data generated each second from social media, cell phones, cars, credit cards, M2M sensors, photographs, video, etc. These amounts have become so large that we can no longer store and analyze the data using traditional database technology. We now use distributed systems, where parts of the data are stored in different locations and brought together by software. On Facebook alone, 10 billion messages are sent, the “like” button is pressed 4.5 billion times, and over 350 million new pictures are uploaded every day. Collecting and analyzing this data is clearly an engineering challenge of vast proportions.
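As a rough illustration of data being stored in different locations and brought together by software, the Python sketch below splits events across a few shards by hashing the user name, counts actions on each shard separately, and then merges the partial counts. The shard count, the sample events, and all function names are assumptions made for this example; real distributed systems add replication, networking, and failure handling on top of this basic idea.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed number of storage locations


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash, so a key always lands in the same place."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(events, num_shards: int = NUM_SHARDS):
    """Split (user, action) events across shards by user."""
    shards = [[] for _ in range(num_shards)]
    for user, action in events:
        shards[shard_for(user, num_shards)].append((user, action))
    return shards


def count_actions(shard):
    """Work done locally on one shard: count each action type."""
    return Counter(action for _, action in shard)


def merge(partial_counts):
    """Bring the per-shard results back together."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical sample data standing in for a day's activity.
events = [
    ("alice", "like"),
    ("bob", "message"),
    ("carol", "like"),
    ("alice", "photo"),
    ("bob", "like"),
]

shards = partition(events)
totals = merge(count_actions(shard) for shard in shards)
print(dict(totals))
```

Because each shard is counted independently, the per-shard step could run on separate machines, and only the small summaries need to travel over the network.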
When we talk about value, we’re referring to the worth of the data being extracted. Having endless amounts of data is one thing, but unless it can be turned into value it is useless. While there is a clear link between data and insights, this does not always mean there is value in Big Data. The most important part of embarking on a big data initiative is to understand the costs and benefits of collecting and analyzing the data, so that the data gathered can ultimately be monetized.
Variety refers to the different types of data we can now use. Data today looks very different from data in the past. We no longer have only structured data (name, phone number, address, financials, etc.) that fits neatly into a data table. Much of today’s data is unstructured. In fact, 80% of all the world’s data fits into this category, including photos, video sequences, social media updates, etc. New big data technology now allows structured and unstructured data to be harvested, stored, and used together.
Last, but certainly not least, there is veracity. Veracity is the quality or trustworthiness of the data. Just how accurate is all this data? For example, think about all the Twitter posts with hashtags, abbreviations, typos, etc., and the reliability and accuracy of all that content. Gathering loads and loads of data is of no use if the data cannot be trusted. Another good example relates to GPS data. Often the GPS position will “drift” off course as you move through an urban area, because satellite signals are lost or distorted as they bounce off tall buildings and other structures. When this happens, location data has to be fused with another data source, such as road data or data from an accelerometer, to provide an accurate position.
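The GPS example can be sketched in a few lines of Python. One simple way to fuse two noisy estimates of the same position is inverse-variance weighting, where the more trustworthy source gets the larger weight. This is only one possible approach, shown in one dimension, and the positions and variances below are made-up illustrative numbers, not measurements from any real device.

```python
def fuse(estimate_a: float, variance_a: float,
         estimate_b: float, variance_b: float) -> tuple[float, float]:
    """Combine two estimates of the same quantity by inverse-variance weighting.

    Returns the fused estimate and its variance. The estimate with the
    smaller variance (the more trusted one) pulls the result toward itself.
    """
    if variance_a <= 0 or variance_b <= 0:
        raise ValueError("variances must be positive")
    weight_a = variance_b / (variance_a + variance_b)
    fused = weight_a * estimate_a + (1.0 - weight_a) * estimate_b
    fused_variance = (variance_a * variance_b) / (variance_a + variance_b)
    return fused, fused_variance


# Hypothetical readings along a street, in meters from a reference point.
gps_position, gps_variance = 105.0, 25.0    # GPS drifting between tall buildings
accel_position, accel_variance = 100.0, 4.0  # position inferred from an accelerometer

fused_position, fused_variance = fuse(
    gps_position, gps_variance, accel_position, accel_variance
)
print(f"fused position: {fused_position:.2f} m, variance: {fused_variance:.2f}")
```

With these numbers the fused position lands much closer to the accelerometer-based estimate, because the drifting GPS reading was given the larger variance. The fused variance is also smaller than either input, which is the payoff of combining sources rather than picking one.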
Ignoring Big Data won’t make it go away, and while ignoring it may not immediately kill your business, you shouldn’t do so for very long. The results of Big Data can generally be measured directly, making it easy to determine a return on investment. Big Data is a tool definitely worth looking into.
To find out more about big data you can read some of our previous Big Data posts.