ABSTRACT

In today’s technology-driven world, data is being produced at an astronomical rate: more than 85% of the data in existence today was created in the past two years, and experts project a 4,300% increase in annual data production by 2020 [1]. This staggering volume of data, generated every second and awaiting processing and analysis, has given rise to the domain of big data. Defined as data whose scale and complexity exceed the capabilities of conventional processing methods, big data has no fixed size threshold. Moreover, driven by the fear of being left behind, many businesses are becoming data-rich yet insight-poor: they store data without knowing how to extract meaningful information from it. Such stored data also has a finite life span, losing its value as it becomes outdated. As a result, novel tools are being deployed to extract valuable information from this ever-growing ocean of collected data.