Brought to you by Longitudes
Data has become central to how we run our businesses today. In fact, the global market intelligence firm International Data Corporation (IDC) projects spending on data and analytics to reach $274.3 billion by 2022.
However, unwise spending accounts for much of that money. Gartner analyst Nick Heudecker estimates that up to 85 percent of big data projects fail.
A big part of the problem is that numbers that show up on a computer screen take on a special air of authority. Once data has been pulled in through massive databases and analyzed through complex analytics software, we rarely ask where it came from, how it’s been modified or whether it’s fit for the intended purpose.
The truth is that, to get useful answers from data, we can’t just take it at face value. We need to learn how to ask thoughtful questions.
In particular, we need to know how the data was sourced, which models were used to analyze it and what was left out. Most of all, we need to go beyond using data simply to improve operations and leverage it to imagine new possibilities.
We can start by asking:
Data is the plural of anecdote. We record and store real-world events such as transactions, diagnostics and other relevant information in massive server farms. Yet few bother to ask where the data came from, and unfortunately, the quality and care with which data is gathered can vary widely.
In fact, a Gartner study recently found that firms lose an average of $15 million per year due to poor data quality. Data is often subject to human error, such as when poorly paid and unmotivated retail clerks perform inventory checks.
However, even when data collection is automated, there are significant sources of error, such as intermittent power outages in cellphone towers or mistakes in the clearing process for financial transactions.
Data of poor quality or used in the wrong context can be worse than no data at all. In fact, one study found that 65 percent of a retailer’s inventory data was inaccurate.
Another concern, which has become increasingly important since the EU passed stringent GDPR data standards, is whether proper consent accompanies data collection.
So don’t just assume the data you have is accurate and of good quality. You have to ask about sourcing and maintenance. Increasingly, we need to audit our data dealings with as much care as we do our financial transactions.
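As a rough illustration of what a routine data audit might look like, the sketch below counts three kinds of problems mentioned above: incomplete records, duplicate entries and records collected without consent. The field names (`id`, `sku`, `count`, `consent`) and the sample records are hypothetical and stand in for whatever a real dataset contains.

```python
def audit_records(records, required_fields):
    """Count records with missing fields, repeated ids, or no recorded consent."""
    seen_ids = set()
    report = {"missing_field": 0, "duplicate_id": 0, "no_consent": 0}
    for rec in records:
        # A field counts as missing if it is absent, None, or an empty string.
        if any(rec.get(field) in (None, "") for field in required_fields):
            report["missing_field"] += 1
        # An id that has already been seen is flagged as a duplicate.
        if rec.get("id") in seen_ids:
            report["duplicate_id"] += 1
        seen_ids.add(rec.get("id"))
        # Consent must be explicitly recorded as True.
        if not rec.get("consent", False):
            report["no_consent"] += 1
    return report


# Hypothetical inventory records, for illustration only.
records = [
    {"id": 1, "sku": "A-100", "count": 12, "consent": True},
    {"id": 2, "sku": "", "count": 7, "consent": True},
    {"id": 2, "sku": "B-200", "count": None, "consent": False},
]

report = audit_records(records, required_fields=["sku", "count"])
print(report)
```

A check like this does not prove the data is accurate. It only surfaces the obvious gaps that should prompt questions about sourcing and maintenance.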
Even if data is accurate and well maintained, the quality of analytic models can vary widely.
Often models are pulled from open-source platforms such as GitHub and repurposed for a particular task. Before long, everybody forgets where a model came from or how it evaluates a particular dataset.
Lapses like these are more common than you’d think and can cause serious damage. As models become more sophisticated and incorporate more sources, we’re also increasingly seeing bigger problems with model training.
One of the most common errors is overfitting, which basically means the more variables you use to create a model, the harder it gets to make it generally valid. In some cases, excess data can result in data leakage in which training data mixes with testing data.
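One simple form of the data leakage described above can be checked directly: look for records that appear in both the training set and the testing set. The sketch below is a minimal example under that assumption. The function name `find_leaked_rows` and the sample records are hypothetical, and real leakage can be subtler than exact duplicates.

```python
def find_leaked_rows(train_rows, test_rows):
    """Return the test rows that also appear verbatim in the training rows."""
    train_set = set(train_rows)
    return [row for row in test_rows if row in train_set]


# Hypothetical records of the form (customer_id, monthly_spend).
train = [(1, 120.0), (2, 80.5), (3, 45.0)]
test = [(3, 45.0), (4, 60.0)]

leaked = find_leaked_rows(train, test)
print(f"{len(leaked)} of {len(test)} test rows also appear in training data")
```

Here one of the two test rows is also in the training data, so any accuracy measured on that test set would look better than the model deserves.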
These types of errors can plague even the most sophisticated firms. As we do with data, we need to constantly ask difficult questions of our models.
Are they suited to the purpose we’re using them for? Are they taking the right factors into account? Does the output truly reflect what’s going on in the real world?
Data models, just like humans, tend to base judgments on the information most available.
Sometimes, the data you don’t have can affect your decision making as much as the data you do have. We commonly associate this type of availability bias with human decisions, but often human designers pass it on to automated systems.
For instance, in the financial industry, those who have extensive credit histories can access credit much more easily than those who don’t. The latter, often referred to as “thin-file” clients, can find it difficult to buy a car, rent an apartment or get a credit card.
Yet a thin file doesn’t necessarily indicate a poor credit risk. Firms often end up turning away potentially profitable customers simply because they lack data on them.
Experian recently began to address this problem with its Boost program, which allows consumers to raise their scores by giving them credit for things like regular telecom and utility payments. To date, millions have signed up.
So it’s important to ask hard questions about what your data model might be missing. If you are managing what you measure, you need to ensure what you are measuring reflects the real world, not just the data that’s easiest to collect.
During the past decade, we’ve learned how data can help us run our businesses more efficiently. Using data intelligently allows us to automate processes, predict when our machines need maintenance and serve our customers better.
Data can also become an important part of the product itself. To take one famous example, Netflix has long used smart data analytics to create better programming for less money. Yet where it gets really exciting is when you can use data to completely re-imagine your business.
Experian has been able to leverage the cloud to shift from delivering only processed data in the form of credit reports to a service that offers its customers real-time access to the more granular data that the reports are based on. That may seem like a subtle shift, but it’s become one of the fastest-growing parts of Experian’s business.
We often call data the new oil, but it’s far more valuable. We need to start treating data as more than a passive asset class.
If used wisely, it can offer a true competitive edge and take a business in completely new directions. To achieve that, however, you can’t start by merely looking for answers. You have to learn how to ask new questions.
Republished with permission, this article first appeared on Harvard Business Review.