As with many expressions nowadays, "big data" often is tossed around but rarely understood. How many safety leaders fully grasp what big data is or how it is relevant in the safety industry?

Big data has been defined as "an accumulation of data that is too large and complex for processing by traditional database management tools." Though this definition may seem rather vague, there are a few key components worth delving into in the pursuit of understanding the concept.

First, let's take a look at the phrase "an accumulation of data that is too large and complex…" Again, do not be alarmed by the broadness of this portion of an already broad definition.

In big data, "large and complex" actually is quite intuitive. For example, an Excel file with 10 rows and 2 columns would not be considered "large or complex" by most professionals today. However, 30 years ago, before the average office employee was at least somewhat familiar with spreadsheet tools, a 10 x 2 sheet might have been a source of panic. This illustrates the first curious point about big data: It's all relative.

The second portion of the definition drives home this theory of big data relativity: "…for processing by traditional database management tools." Today, the average computer has approximately 8 GB of RAM. This is more than adequate to handle a spreadsheet the size of the example above.

In fact, today's spreadsheet software can handle over 1 million rows and 16,000 columns. Although most modern laptops will slow to a frustrating crawl if you attempt to open a workbook of that size, the software theoretically does have the capability, and is therefore technically not inadequate.
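To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes the worksheet limits of recent Excel versions (1,048,576 rows by 16,384 columns) and, purely for illustration, that every cell holds one 8-byte number with no overhead. Real workbooks store data differently, so treat the result as an order-of-magnitude estimate only.

```python
# Back-of-the-envelope estimate. The per-cell size is an illustrative
# assumption, not how any particular spreadsheet program stores data.
ROWS = 1_048_576           # worksheet row limit in recent Excel versions
COLUMNS = 16_384           # worksheet column limit in recent Excel versions
BYTES_PER_CELL = 8         # assumption: one 64-bit number per cell
RAM_BYTES = 8 * 1024**3    # the 8 GB of RAM cited in the article


def sheet_bytes(rows, columns, bytes_per_cell=BYTES_PER_CELL):
    """Return the raw size in bytes of a fully populated sheet."""
    return rows * columns * bytes_per_cell


small_sheet = sheet_bytes(10, 2)         # the 10 x 2 example
full_sheet = sheet_bytes(ROWS, COLUMNS)  # every cell filled

print(f"10 x 2 sheet: {small_sheet} bytes")
print(f"Full sheet:   {full_sheet / 1024**3:.0f} GiB")
print(f"Full sheet is {full_sheet / RAM_BYTES:.0f} times the available RAM")
```

Under these assumptions, the small sheet is trivial, while a completely filled sheet would need many times the memory of a typical laptop.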

Again, the inadequacy of the applications, and consequently the second half of the definition of big data, is in the eye of the beholder.

So, we see that relativity is an important factor in determining what big data is. But relativity is not a clear and concise definition, and if you're a mathlete and student of logic, you're probably begging for something more precise. The research company Gartner popularized another way to describe big data: "The 3 Vs." They are as follows:

Volume: Why Is Big Data Relevant?

Thanks to Gartner, we finally can delve into how big data is relevant in workplace safety, starting with volume. This is fairly self-explanatory. For data to be classified as big data, there needs to be lots and lots of it.

One of the best examples in the safety world is the number of inspections, and the more the merrier. As safety professionals gravitate more toward behavior-based safety (BBS) inspections versus inspections based on pure compliance, the number of "items" observed on worksites is increasing. In BBS, we no longer are just making sure that the guard on the saw is up to code, but also that the worker using that particular saw is wearing the correct PPE and that he or she is operating that machine correctly. By observing more and more activity on job sites, more and more data is generated.

Assuming that safety professionals are collecting and reporting on that data rather than just filing it in a cabinet somewhere, more data makes for more revealing reporting. Additionally, a best practice in BBS is to have a more diverse set of workers perform these inspections. That is, where once only safety professionals observed the workforce, anyone now can observe safe or at-risk behavior among colleagues. Today, more areas of industry are observed by more people coming from more backgrounds than in days past. As a consequence, the volume of data collected surpasses that of just "data" and graduates into "big data."

Variety: Methods of Data Collection

The next "V" is "variety." Variety refers to the different methods by which data is collected. Not too long ago, standard practice for safety inspections was to rely on good, old-fashioned pencil and paper to complete fixed checklists. In fact, that likely remains standard practice for many of you reading this.

Thanks to advancements in technology, however, we now can perform inspections on smartphones and tablets. What used to be primarily analogue now is moving steadily into the digital world.

Due to the different types of mobile devices at our disposal, the "variety" of our data has increased dramatically. Sensor technology, such as that used on production lines and forklifts, and the Internet also have changed the game of data collection. Think of these sources as streams of data that take on a new "form," departing from the paper-based checklists of yesteryear. Images and video are other formats of information we collect on a daily basis, all geared toward understanding where the risk is on our job sites.
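For readers who like to see the idea in concrete form, the sketch below shows one way such varied data might sit side by side in a single collection. Every class name, field, and sample value here is hypothetical and invented for illustration; it is not drawn from any particular safety software.

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    """One answer from a digital inspection checklist (hypothetical)."""
    question: str
    safe: bool


@dataclass
class SensorReading:
    """One value from an equipment sensor stream (hypothetical)."""
    sensor_id: str
    value: float
    unit: str


@dataclass
class MediaFile:
    """A photo or video captured on site (hypothetical)."""
    path: str
    kind: str  # "image" or "video"


def count_by_source(observations):
    """Count how many observations came from each kind of source."""
    counts = {}
    for obs in observations:
        name = type(obs).__name__
        counts[name] = counts.get(name, 0) + 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    ChecklistItem("Saw guard in place?", True),
    ChecklistItem("Correct PPE worn?", False),
    SensorReading("forklift-01", 7.5, "km/h"),
    MediaFile("site_photo.jpg", "image"),
]

print(count_by_source(sample))
```

The point of the sketch is simply that checklist answers, sensor values, and media files have different shapes, so any reporting tool has to account for each form before the data can be analyzed together.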