The Rhythm of Data

It is obvious that we are inundated with data. The one near certainty is that this inundation will continue for the foreseeable future and well beyond it.

It is well recognized that the world already holds a massive volume of data and continues to generate more at an immense rate. For example, estimates suggest that 44 zettabytes of data existed at the beginning of 2020, and that in 2018 data were being created at a rate of 2.5 quintillion bytes (2.5 exabytes) per day (see https://seedscientific.com/how-much-data-is-created-every-day/).

Further, these data appear in many forms, including numerical, text, image, audio, and video data. As a result, in addition to storing and accessing these data, we need methods that can capture the features unique to each form. For example, a transcript of an audio interview, which is a textual representation of the recording, fails to capture features such as pitch and tone that can substantially change the meaning and sentiment being conveyed.

By the rhythm of data, I mean issues that relate to the volume, rate of increase, and diversity of today’s data. In this blog series, I intend to document the challenges that current data present, discuss potential solutions, and share my own thoughts. As I learn and document what I learn, I hope it will benefit others who follow.

