Data Integrity: Methods to Prevent Data Concerns

Knowing the differences between data integrity, data quality, and data trust is essential to address concerns quickly and even prevent problems.

June 26, 2019 by Michael Feske

Data is used by businesses to make a variety of decisions, from staffing to optimizing processes. Questioning the integrity of the data can set off a chain of events that consumes both time and money. It's essential to have policies in place to ensure data accuracy and provide transparency for others to trust the data as well. Many data concerns are more about uncertainty than real issues. Knowing the differences between data integrity, data quality, and data trust is essential to address concerns quickly and even prevent problems.

What you need to know to address concerns and prevent problems

Data integrity refers to the overall completeness, accuracy, and consistency of data. Is the data correct? Can it be replicated and provide the same results each time? These are all questions to be addressed to ensure the integrity of your data.

Data quality refers to a clear understanding of the meaning, context, and intent of the data. Characteristics of data quality can consist of the following: accessibility, timeliness, frequency, granularity, definition, relevancy, and comprehensiveness.

Data trust refers to the ability of the recipient to accept the data from the person or group delivering it. The person who delivers the data should be a subject matter expert with the expertise to provide answers and provide confidence in the data.

Each of these three items is unique and requires different preventative steps and resolution paths. With a designated protocol for each, you can prevent issues and allow your team to focus on value-add actions.

Impact

"Fortune 1000 enterprises will lose more money in operational inefficiency due to data quality issues than they will spend on data warehouse and customer relationship management (CRM) initiatives." – Gartner

If you're relying on data to make decisions, then you need to ensure those decisions are based on accurate information. There are several different types of impacts related to data issues:

  • Financial
    • Increased operating costs, decreased revenue, penalties/fines
  • Confidence
    • Decreased customer and employee satisfaction, decreased trust within the organization
  • Productivity
    • Increased workloads, decreased efficiency, decreased quality of work

Setting up preventative action plans for data issues is a necessary investment for you and your company.

Data Integrity

Typically, almost all issues related to data accuracy get lumped under this category. However, there are specific characteristics tied to data integrity, and they can have the most significant impact on a business. That's why it is crucial to have plans in place to prevent these issues from happening in the first place.

There are four simple actions to maintain data integrity for your business:

  1. Use Source Data
  2. Automation
  3. Document
  4. Audit

Using source data is first on the list for a reason: it is the most important. Limited access to the source hinders validation and slows the resolution process when there is a question or concern. If a data output is provided by a third party, repackaged, and then delivered to the customer, each handoff introduces more opportunities for mistakes. The situation is similar to the "telephone game," where the story gets passed from person to person, and pieces get added and removed. Getting the data directly from the source allows it to be accessed whenever needed by whoever needs it. Moving to rely on data directly from the source takes time and money but will provide a vital level of control.

Automation reduces the risk of human error and increases efficiency. Avoid processes with extensive manual manipulation of the data. A simple mistake during a copy and paste can create significant and unnecessary issues. Try to include any calculations, grouping, sorting, etc. in the output of the data so it doesn't require extra steps later. The more steps there are, the more chances there are for mistakes. Automation also allows the results to be reproduced. If the results cannot be reproduced, then you should seek out a new solution.
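As a minimal sketch of this idea, the Python below performs the grouping, totaling, and sorting in one automated step and then checks that a rerun produces the same result. The field names (region, amount), the sample rows, and the helper names build_report and fingerprint are hypothetical and chosen only for illustration.

```python
import hashlib
import json
from collections import defaultdict


def build_report(rows):
    """Group, total, and sort in one automated step (no manual edits)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    # Sort by region so the output order never depends on the input order.
    return [{"region": r, "total": round(t, 2)} for r, t in sorted(totals.items())]


def fingerprint(report):
    """Stable hash of the report, used to confirm that a rerun matches."""
    payload = json.dumps(report, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical sample rows.
rows = [
    {"region": "West", "amount": 120.0},
    {"region": "East", "amount": 75.5},
    {"region": "West", "amount": 30.0},
]

first = build_report(rows)
second = build_report(list(reversed(rows)))  # same data, different order

print(first)
print("reproducible:", fingerprint(first) == fingerprint(second))
```

Comparing a hash rather than eyeballing two outputs is one possible design choice; it keeps the reproducibility check itself free of manual steps.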

Creating documentation for the data provides a level of visibility to spot potential issues. A few items to be documented include data source, criteria, calculations, exclusions, assumptions, risks, and processes for generating the data. The more information captured and documented, the better. This will allow for a straightforward review of the process and results to make changes if necessary. Having this documentation in place will enable others to understand the data easily.

Performing regular audits of critical data is key to ensuring the accuracy of the data. Comparing the results from multiple data sources will allow for the triangulation of potential issues. Scheduling these audits distributes the workload evenly based on the capacity of the team and enables them to catch problems proactively. Communicating the audit plans to the consumers of the data helps reassure them that the data is accurate.
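One way such a comparison might look in practice is sketched below in Python. It assumes two sources that report the same daily totals; the source names, dates, values, and the reconcile helper are all hypothetical.

```python
def reconcile(source_a, source_b, tolerance=0.01):
    """Return keys whose values differ between two sources or are missing from one."""
    mismatches = {}
    for key in sorted(set(source_a) | set(source_b)):
        a = source_a.get(key)
        b = source_b.get(key)
        # A missing value or a difference beyond the tolerance needs review.
        if a is None or b is None or abs(a - b) > tolerance:
            mismatches[key] = (a, b)
    return mismatches


# Hypothetical daily totals from two independent sources.
warehouse_totals = {"2019-06-24": 1520.00, "2019-06-25": 1498.50, "2019-06-26": 1610.25}
billing_totals = {"2019-06-24": 1520.00, "2019-06-25": 1489.50}

for day, (a, b) in reconcile(warehouse_totals, billing_totals).items():
    print(f"{day}: warehouse={a} billing={b}")
```

The tolerance parameter is an assumption that allows for small rounding differences between systems; a stricter audit could set it to zero.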

Data Quality

This category can be the trickiest one to get a handle on because it relates more to perception than to fact. Understanding data quality characteristics can help put processes in place to mitigate negative perceptions of the data.

Actions:

  1. Identify Stakeholders
  2. Provide Required Data
  3. Set Expectations
  4. Deliver Consistently

The first step is to identify stakeholders and recognize what data is relevant to them. A CEO and a supervisor will understand financial data at different levels, and that same CEO might not care to digest individual employee performance data. Therefore, you want to ensure the appropriate people receive the relevant data. By identifying who should get what, you will avoid questions from people who were not the intended audience or who do not fully understand the data.

You should provide only the required data: no more, no less. Giving someone too much data can be just as bad as giving no data. Thousands of lines of data or multitudes of reports are overwhelming. Finding what you need is like hunting for a needle in a haystack, and the critical piece of information might be missed. Your data should reflect the priorities at hand: some recipients need a large data dump, while others want only a summary. This step is closely tied to the first one. Knowing your audience, what they are looking for, and the purpose of the data will help ensure you are delivering what's expected.

When providing the data, it is important to set expectations by explaining what the data is and what it is not. Recipients need a clear understanding of what's presented. There should already be documentation around the sources and calculations, but it needs to be simplified for easy consumption and understanding. If a person doesn't understand the data, they might question it or draw incorrect conclusions. A clear summary and instructions can highlight the essential elements of the data and reduce the number of questions.

One of the easiest ways to prevent questions about the quality of the data is to deliver it consistently. It will raise flags if the data is typically provided at 8:00 am but was delivered at 9:30 am yesterday and 10:00 am today. Use the automation that generates the data to handle the delivery as well. Consistency will build a level of trust between providers and recipients of the data.
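A delivery schedule like this can also be monitored automatically. The Python sketch below flags deliveries that arrive later than an expected time plus a grace period; the 8:00 am target comes from the example above, while the 15-minute grace period, the timestamps, and the late_deliveries helper are assumptions made for illustration.

```python
from datetime import datetime, time, timedelta


def late_deliveries(delivered_at, expected=time(8, 0), grace=timedelta(minutes=15)):
    """Return (timestamp, delay) pairs for deliveries later than expected + grace."""
    late = []
    for stamp in delivered_at:
        due = datetime.combine(stamp.date(), expected)
        delay = stamp - due
        if delay > grace:
            late.append((stamp, delay))
    return late


# Hypothetical delivery log: one on-time delivery followed by two late ones.
log = [
    datetime(2019, 6, 24, 8, 2),
    datetime(2019, 6, 25, 9, 30),
    datetime(2019, 6, 26, 10, 0),
]

for stamp, delay in late_deliveries(log):
    print(f"{stamp:%Y-%m-%d} delivered {delay} late")
```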

Data Trust

Once someone loses trust in the accuracy of the data, it can be hard to regain it. This is why it is essential to have plans in place to build and keep that trust. Investing in people is as important as investing in technology because having the right people in place can save time and money in the long run.

Actions:

  1. Skilled Employees
  2. Communication
  3. Investigation
  4. Plan for Issues

Hiring skilled employees to handle your data is an essential and preventative investment. The employees who manage and deliver the data must have the skills to review it and identify issues. Staff assignments should reflect the importance of the data, with the most skilled staff responsible for the data critical to business decisions. While cost savings is a KPI for most businesses, the cost of data errors is usually much higher than the cost of skilled staff.

Another critical piece of data trust is communication: the ability to share the data confidently and concisely. The person providing the data should be prepared to explain it and answer questions about it. Successful communication requires excellent listening skills and the ability to understand the technical level of the audience. Identifying the audience's level of expertise will help you craft the appropriate message and respond to questions accordingly.

The provider should investigate the data thoroughly and early, before it is shared with others. Review the data and dig into anything that looks out of place. Anomalies can help point out possible issues. If numbers drastically increase or decrease, find and understand the cause. Including the data delivery process in the review will catch most errors, and problems are easier to fix sooner rather than later.
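A simple automated check can point the reviewer at drastic increases or decreases. The Python sketch below flags period-over-period changes above a threshold; the 50% threshold, the sample values, and the flag_swings helper are hypothetical, and a flagged value is only a prompt to investigate, not proof of an error.

```python
def flag_swings(values, threshold=0.5):
    """Return (index, fractional change) for changes larger than the threshold."""
    flagged = []
    for i in range(1, len(values)):
        prev, curr = values[i - 1], values[i]
        if prev == 0:
            continue  # Percent change is undefined when the prior value is zero.
        change = (curr - prev) / prev
        if abs(change) > threshold:
            flagged.append((i, round(change, 2)))
    return flagged


# Hypothetical daily order counts with one spike and one drop.
daily_orders = [100, 104, 98, 210, 205, 90]

for index, change in flag_swings(daily_orders):
    print(f"day {index}: {change:+.0%} versus the previous day")
```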

Mistakes will happen, so there should be a documented plan in place for how to respond to issues. Having this plan in place and easily accessible to staff will prevent critical steps from being overlooked. The first step is alerting the appropriate people to the problematic data to mitigate its impact. This group of stakeholders should be identified early and updated regularly. Next, perform a root cause analysis, utilizing applicable portions of the audit processes established for data integrity. Once the cause of the data issue is pinpointed, all resolution actions need to be documented. An action strategy should follow, laying out details on how the problem will be prevented from happening again. Finally, these resolution steps and preventative actions should be summarized in a second notification sent to the appropriate stakeholders. Being transparent and proactive demonstrates integrity and accountability, and it commits the team to new measures that prevent issues from recurring.

Conclusion

According to the International Institute for Analytics, businesses that use data will see $430 billion in productivity benefits over their competition by 2020. This means that as we continue to move forward, companies will become more reliant on data-based decisions, and data integrity will become more critical.

Benjamin Franklin said, "By failing to prepare, you are preparing to fail." Understanding the differences between data integrity, data quality, and data trust will allow you to implement processes to address the different concerns. These processes reduce the number of costly, time-consuming issues and increase trust in the data. With consistent data integrity, your business team has more time to focus on providing impactful data.

 

*Michael Feske is an Analytics Manager at Stefanini
