In an increasingly digital world, the amount of data a small business must parse gets larger by the year. Learn what big data means for your SMB.
The World Wide Web has only been around for about three decades, but in that relatively short time, it's become one of the most important tools at our collective disposal. As a small business owner, you can use it to collect data that helps you make informed business decisions, run predictive analytics for future sales, and enhance the customer experience.
All of those functions are the result of big data. By learning how to digest and use it, your small business can turn valuable insight into action.
What is big data?
At its core, big data is what it sounds like. Thanks to advances in technology, we can collect and understand massive and complex data sets that stream in at an incredible rate. Since these large data sets can come from a wide range of sources at a volume that humans can't comprehend, we rely on advanced data processing software to make that data usable.
Sites like Internet Live Stats make it easier to visualize big data and the speed at which an enormous amount of information flows through the internet. For instance, ILS estimates that in a single second, 100.5TB of internet traffic is transferred, 85,836 Google searches are made and 9,139 tweets are sent.
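To put those per-second figures in perspective, they can be scaled up to a full day. The short Python sketch below does that arithmetic. It assumes the rates stay constant all day, which is a simplification, and the function name is purely illustrative.

```python
# Per-second figures quoted above from Internet Live Stats.
PER_SECOND = {
    "internet traffic (TB)": 100.5,
    "Google searches": 85_836,
    "tweets": 9_139,
}

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_day(rate_per_second):
    """Scale a per-second rate to one day, assuming the rate never changes."""
    return rate_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    for name, rate in PER_SECOND.items():
        print(f"{name}: {per_day(rate):,.0f} per day")
```

At those rates, the Google search figure alone works out to roughly 7.4 billion searches a day.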
Big data comes from more sources than just the internet, though. Your car's onboard computer collects thousands of data points about your driving habits that the manufacturer can use to determine future changes to their cars, while insurance providers can use that same data to adjust your rates.
"Modern big data tools allow us to quickly analyze the outcomes of the past and the state of the present to decide what action would be the most effective in a particular situation," said Ivan Kot, senior manager at Itransition.
Through the use of such a tool, Kot said, the kind of data that flows through an external source (like the internet) or an internal source (like in-house call centers and website logs) can help small businesses predict outcomes, prevent fraud and drive innovation.
How does big data work?
It may help to understand big data in terms of commercial fishing. If you're trying to run a business by being the only fisherman standing on the side of a stream, you're not going to catch a lot of fish. However, if you have a fleet of boats, each with large traps and wide nets, you will get plenty of fish of various species. Big data software is like that fleet of boats, and the fish are all the different types of data that we generate every day.
Once collected, the data is analyzed by businesses using big data techniques. This analysis allows a data scientist to identify a multitude of ways a company can be more efficient and increase profits. Big data serves more than just commercial needs – the medical field also uses such data to better predict the spread of disease.
"Businesses use big data to get insights on a number of things, including customer patterns and behaviors – most commonly, purchasing behaviors," said Jack Zmudzinski, senior associate at Future Processing. "The reason that big data is so vital for businesses is that it can help to identify new growth opportunities and even new industries through examination of customer information."
A data scientist can use big data to "provide context via queries to identify insights and results from the data. Automation and workflow tools would then automate the actions based on the data," according to James Ford, who holds a Ph.D. in data science and is the co-founder of AutoBead.
"Traditionally, the types of technology used by those investing in big data initiatives included database types such as SQL or NoSQL, which were connected using an enterprise service bus (database and endpoint integrations), which standardized the data and allowed it to work together," Ford said. "Large-scale data processing solutions such as Apache Hadoop or Databricks enable large-scale data processing and analysis."
Thanks to the advancement of cloud computing, Ford said, database software like Microsoft Azure's Cosmos DB can house multiple database types in a single database. Because of that, teams "no longer need to invest in expensive and complicated integration systems, as all data exists in one location, separated by security policies and logic rather than APIs and distance."
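The kind of query Ford describes can be tried on a very small scale. The sketch below uses Python's built-in sqlite3 module to ask a SQL database which product categories bring in the most revenue. The purchase records, table name and column names are all made up for illustration, and a real big data system would run similar queries across far larger data sets and different tooling.

```python
import sqlite3

# Hypothetical purchase records: (customer, product category, amount in dollars).
PURCHASES = [
    ("alice", "coffee", 4.50),
    ("alice", "coffee", 4.50),
    ("alice", "pastry", 3.00),
    ("bob", "coffee", 4.50),
    ("bob", "tea", 3.50),
    ("carol", "pastry", 3.00),
]


def revenue_by_category(rows):
    """Return (category, order_count, revenue) tuples, highest revenue first."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute(
            "CREATE TABLE purchases (customer TEXT, category TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
        return conn.execute(
            """
            SELECT category, COUNT(*) AS orders, SUM(amount) AS revenue
            FROM purchases
            GROUP BY category
            ORDER BY revenue DESC, category ASC
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for category, orders, revenue in revenue_by_category(PURCHASES):
        print(f"{category}: {orders} orders, ${revenue:.2f}")
```

An automation or workflow tool could then act on a result like this, for example by reordering stock for the best-selling category.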
The 4 V's of big data
For data scientists, the concept of big data can be broken down into what they call the "four V's." Though some schools of thought say there could be as many as 10 V's, here are the top four qualifiers that help explain when a data stream becomes a big data stream.
Volume
Thanks to the massive amounts of data generated daily, big data takes up a large amount of bandwidth and storage. Enormous quantities of data traverse the internet every second, especially with the widespread use of broadband. In fact, according to a survey by IBM, an estimated 40 zettabytes of data will be created by 2020, roughly 300 times the amount created in 2005. Such vast amounts require big data technology that can handle large data sets.
Velocity
Data flows through the internet at such a speed that if you were to try to parse it on your own, it would be akin to trying to drink from the world's largest and most powerful water hose. The speed of that flow rises sharply with the number of connections people have with one another, since each connection means more text messages, social media likes and business agreements. The speed at which incoming data needs to be processed is a hallmark of big data.
Variety
Data can be collected from many different sources, such as various social networks, business and consumer transactions, and the proliferation of smart devices that collect data from (often unwitting) users. Similarly, that data can come in different file formats and structures, from stringently categorized database information to real-time file transfers and communications.
Veracity
Inaccurate data is useless data. Furthermore, inaccurate data costs the U.S. economy approximately $3.1 trillion each year, according to the IBM survey. Many business leaders consider big data a gamble, with 1 in 3 respondents saying they "don't trust the information" big data provides. Nevertheless, big data technology tries to mitigate that problem as much as possible.
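The problem of data arriving in different formats and structures can be seen in miniature. In the Python sketch below, the same kind of sales record arrives once as a spreadsheet-style CSV export and once as a JSON event with different field names, and both are converted into one common shape before analysis. The sample data and field names are hypothetical, and real pipelines handle many more formats.

```python
import csv
import io
import json

# Hypothetical inputs: the same kind of sale arriving in two formats.
CSV_EXPORT = "customer,amount\nalice,4.50\nbob,3.50\n"
JSON_EVENTS = '[{"user": "carol", "total": 3.0}]'


def from_csv(text):
    """Read rows from a CSV export into the common record shape."""
    return [
        {"customer": row["customer"], "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_json(text):
    """Map differently named JSON fields onto the same record shape."""
    return [
        {"customer": event["user"], "amount": float(event["total"])}
        for event in json.loads(text)
    ]


def combine(csv_text, json_text):
    """Merge both sources into one list of uniform records."""
    return from_csv(csv_text) + from_json(json_text)


if __name__ == "__main__":
    for record in combine(CSV_EXPORT, JSON_EVENTS):
        print(record)
```

Once every source is mapped to the same shape, a single analysis step can work across all of them.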
Examples of big data
Big data may seem like a nebulous concept that's hard to visualize, but it's used so widely in today's highly connected world that some examples immediately come to mind.
Netflix
Netflix gathers billions of data points per day. While the most obvious data point would be what each person watches, Ford said the streaming giant uses big data in more focused ways.
"It was recently estimated that Netflix saves $1 billion each year on retention due to its effective use of the data available to them," he said. "[Netflix can determine] how many minutes a person watched before they stopped. Did they watch more than one episode? What type of content is someone most likely to binge? All these factors drive future production decisions, as well as personalized in-app experiences for users."
New York Stock Exchange
Big data is also a huge part of the world economy. There's no greater example of this fact than the New York Stock Exchange, which uses some of the most advanced computing techniques to handle the more than 1.4 billion shares traded each day. That amount of transactional data requires the type of big data solution that can receive, parse, and then transmit the vast data volume that goes in and out of Wall Street in a short time.
Social media
On a more personal note, your social media pages are also part of big data. Though your Twitter profile and Facebook feed could be seen as single data points, the more granular data, covering items such as your likes, posts, photos and personal details, is all quantifiable information that big data tools can use to understand what you're likely to buy, what your hobbies are, and even whom you're likely to vote for in the coming election.
How is big data stored and regulated?
Given how "big" big data is, the storage facility of such information must be equally massive, right? Well, it depends on how much money and space your business has available to it. Some of the largest data centers in the world span millions of square feet and house billions of dollars in server equipment. For your small business, though, a server rack with terabytes of storage could be enough.
While you will likely find many companies relying on physical solutions to house their file systems, such as a large data warehouse or large-scale server, other companies have turned to cloud-based storage solutions, like the ones hosted by Google and Amazon Web Services. In both instances, the data can be stored for as long as the company has space for it.
As for regulation of big data, the federal government in the U.S. has taken a largely hands-off approach to the matter. Instead, existing privacy laws tend to police big data and the corporations seeking to participate in it. Privacy laws in America usually focus on specific industries that deal in sensitive information. Financial institutions that use nonpublic personal information, for example, must conform to the Gramm-Leach-Bliley Act (GLBA). Similarly, healthcare providers that use big data must ensure that the data is secured in compliance with the Health Insurance Portability and Accountability Act (HIPAA).
Jacqueline Klosek, senior counsel at Goodwin Procter LLP, said in a post for Taylor Wessing that companies often alter the data to remove any sensitive identifying information. That step is usually taken before data scientists analyze the data or before it's sent to a third party.
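As a rough illustration of that step, the Python sketch below drops identifying fields from each record before computing an aggregate. The field names and sample values are hypothetical, and this is only a sketch of the idea: removing obvious identifiers does not by itself guarantee that data is truly anonymous or that any legal requirement is met.

```python
# Hypothetical field names for information that identifies a person.
PERSONAL_IDENTIFIERS = {"name", "address", "account_number"}


def strip_identifiers(record):
    """Return a copy of the record without the listed identifier fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in PERSONAL_IDENTIFIERS
    }


def average_balance(records):
    """Compute an aggregate using only the de-identified copies."""
    cleaned = [strip_identifiers(record) for record in records]
    return sum(record["balance"] for record in cleaned) / len(cleaned)


if __name__ == "__main__":
    customers = [
        {"name": "A. Example", "account_number": "0001", "balance": 100.0},
        {"name": "B. Example", "account_number": "0002", "balance": 200.0},
    ]
    print([strip_identifiers(c) for c in customers])
    print(average_balance(customers))
```

The original records are left untouched, so the cleaned copies are the only version handed to analysts or third parties.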
"Under the GLBA, the definition of 'personally identifiable financial information' specifically excludes: 'information that does not identify a consumer, such as aggregate information or blind data that does not contain personal identifiers such as account numbers, names, or addresses,'" Klosek wrote. "Exceptions to privacy requirements for de-identified data also exist under HIPAA. Companies using data that is strictly anonymized will still need to ensure that their conduct complies with their own privacy policies and contractual obligations, and, of course, will need to ensure that the data at issue is truly anonymous."
As big data gets larger in scope, it's only a matter of time before legislation reins in the uses of private data. At the state level, some parts of the country have already begun taking action.