The Importance of Quality Control: How Good Is Your Data?
Most manufacturers would never think of eliminating the quality control function from their production processes. Without quality control, the number of defective products that must be reworked, scrapped or returned would dramatically increase. Almost all consulting/service organizations monitor the quality of the services they deliver to uphold their reputations, ensure satisfied customers and generate repeat business. After all, who would keep a product that falls apart the day after it is purchased or fly on an airplane that does not conduct preflight checks?
Yet ensuring data quality within operational and business intelligence applications is a discipline that many organizations overlook. Often, the importance of data quality becomes clear only after you discover a major problem that quality control of your data could have prevented.
Once this occurs, most executives quickly realize that almost all of the organization's operational and analytical business processes rely on a solid, high-quality data foundation.
Data quality involves ensuring the accuracy, timeliness, completeness and consistency of data used by an organization while also making sure that everyone who uses the data has a common understanding of what the data represents. For example, does product sales data include or exclude internal sales? Is it measured in units or dollars—or, perhaps, euros? The scope of a data quality initiative is not limited to the data generated by an organization's own operations; it must include data obtained from external sources.
While data quality may once have been considered a nice-to-have initiative, organizations now realize that it is an absolute necessity, especially for mission-critical applications or those that must meet governmental reporting and disclosure requirements. In fact, in applications involving homeland security, data quality quickly escalates into a "bet your border" issue.
Accuracy and compliance
A prime example is tax regulatory compliance, which is often difficult to administer. In the telecommunications industry, providers are responsible for knowing the appropriate tax jurisdictions for each account and face substantial fines if the proper taxes are not collected. Conversely, municipalities looking to collect their fair share of tax revenue need to ensure that neighboring municipalities are not incorrectly collecting their revenue. Data quality software provides for accuracy in the tax jurisdiction assignment process, thereby helping telecommunications companies remain within the law and helping ensure that customers and local municipalities are treated fairly in terms of taxation and revenue apportionment.
Poor data quality can negatively influence how a company is perceived in the marketplace. A customer's first impression should be one of quality, and that impression suffers when bad data results in misaddressed mail, incorrect invoices or erroneous shipments. For instance, a healthcare organization that acquired several smaller healthcare organizations discovered that one newly acquired company was quite careless in its data entry procedures. Social Security numbers, used as the patient identifier, were routinely entered in a free-form patient name field. The placement was seemingly random. The Social Security number sometimes appeared at the beginning of the name field; sometimes between first, middle or last names; and sometimes even within one of the names. Only after the company deployed data quality software capable of recognizing the pattern for a Social Security number was it finally able to merge the patient records of its newly acquired company with the patient records of its other companies.
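To make the pattern-recognition idea concrete, here is a minimal Python sketch of how a Social Security number might be pulled out of a free-form name field. It is an illustration only, not the software the healthcare organization used. The function name split_name_and_ssn, the accepted formats (nine digits, optionally separated by hyphens or spaces) and the rule for rejoining a name that was split by the number are all assumptions.

```python
import re

# Assumed formats: 123-45-6789, 123 45 6789, or 123456789.
# The lookarounds keep the pattern from matching inside a longer digit run.
SSN_PATTERN = re.compile(r"(?<!\d)(\d{3})[- ]?(\d{2})[- ]?(\d{4})(?!\d)")


def split_name_and_ssn(raw_field):
    """Return (cleaned_name, ssn); ssn is None when no pattern is found."""
    match = SSN_PATTERN.search(raw_field)
    if match is None:
        return " ".join(raw_field.split()), None

    ssn = "-".join(match.groups())
    before = raw_field[:match.start()]
    after = raw_field[match.end():]

    # Assumption: digits with letters on both sides were typed inside a
    # single name, so the two halves are rejoined without a space.
    embedded = before[-1:].isalpha() and after[:1].isalpha()
    name = before + ("" if embedded else " ") + after

    # Collapse any leftover runs of whitespace.
    return " ".join(name.split()), ssn
```

Called on a value such as "John 123456789 Q Smith", the function returns the name and the number as separate values, so the number can then serve as a matching key. A production tool would also need to handle ambiguous cases, such as a number typed between two names with no spaces, which this sketch would wrongly join into one name.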
On a practical scale, the inability to eliminate redundant name and address records results in unnecessary postage costs. According to a report published by The Data Warehousing Institute (TDWI) in 2002, "Poor quality customer data costs U.S. business a staggering $611 billion a year in postage, printing, and staff overhead." TDWI cited several examples, including a telecommunications company that lost $8 million a month because data entry errors had incorrectly coded accounts and it could not send out bills. The true cost is undoubtedly much higher, as the $611 billion figure was limited to customer name and address data and did not include secondary effects such as alienating and losing customers.
Furthermore, people who receive duplicate mailings can become frustrated and question the company's overall operating efficiency. But consider what happens when a single customer appears in your company's database multiple times, each time with a different value for the customer identifier. In that case, your company would be unable to determine the true volume of this customer's purchases. You could even be placed in the embarrassing situation of attempting to sell the customer an item that the customer has already purchased from you.
Therefore, the importance of an ongoing data quality process cannot be overstated. Operational efficiency, print and postage costs, brand image, compliance and security are all elements that benefit from good data quality and management.
If you're still not convinced that forward-thinking organizations should include data quality as part of their everyday operations, consider one online marketer's experience.
Thinking they had 18 million users, based on the number of people registered to access content and place orders, company executives were about to make a major investment in a high-end operational CRM system when they discovered that many of the user names were cartoon characters or well-known politicians. One of the most common was a certain "George Bush" of 1600 Pennsylvania Ave.
After they eliminated the obvious fictitious entries, they found that the number of valid entries had dwindled to approximately 6.5 million. Using de-duplication software to eliminate multiple entries for valid users who simply registered again when they forgot their passwords, the company further reduced the number of unique users to fewer than 2.5 million.
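The two cleanup passes described above, dropping obviously fictitious names and then collapsing repeat registrations, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the de-duplication software the company used. The record layout (dictionaries with "name" and "email" keys), the tiny FICTITIOUS_NAMES list and the choice to treat a matching name and email as the same user are all hypothetical.

```python
# Hypothetical blocklist; a real one would be far larger and curated,
# and would risk discarding genuine users who share a famous name.
FICTITIOUS_NAMES = {"mickey mouse", "bugs bunny", "george bush"}


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def clean_registrations(records):
    """Drop fictitious entries, then keep the first record per (name, email).

    records: iterable of dicts with 'name' and 'email' keys (assumed schema).
    """
    seen = set()
    unique = []
    for rec in records:
        name = normalize(rec["name"])
        if name in FICTITIOUS_NAMES:
            continue  # pass 1: obvious fictitious entry
        key = (name, normalize(rec["email"]))
        if key in seen:
            continue  # pass 2: repeat registration by the same user
        seen.add(key)
        unique.append(rec)
    return unique
```

Exact matching on a normalized key catches only the simplest duplicates. Commercial tools typically add fuzzy matching on names and addresses, which is beyond the scope of this sketch.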
The company's existing CRM software could easily handle a volume that size and still provide sufficient headroom for reasonable future growth. The company postponed its purchase of the new and expensive software indefinitely.