
Today’s Linkedin Discussion Thread: Enterprise Data Quality April 28, 2009

Posted by Peter Benza in Data Analysis, Data Elements, Data Governance, Data Optimization, Data Processes, Data Profiling, Data Quality, Data Sources, Data Standardization, Data Synchronization, Data Tools, Data Verification.

Here is the most recent question I added to my LinkedIn discussion group, Enterprise Data Quality.

QUESTION: What master data or existing “traditional” data management processes (or differentiators) have you found useful across the enterprise for data quality?

MY INSIGHTS: Recently, I was able to demonstrate (and quantify) the impact of using an NCOA (National Change of Address) updated address on match/merge accuracy when two or more customer “names and addresses” from three disparate source systems were present. This test approach warrants consideration, especially when the volume of customer records at big companies today runs to “hundreds” of millions. It is ideal to apply the test to the entire file, not just a sample set. But we all know that today it is about money, time, value, resources, etc.

For testing purposes, I advised replacing all individual customer address attributes (where information was available) with NCOA updated addresses and then loading and processing the records through the “customer hub” technology. If you are not testing a piece of technology, then constructing your own match key, or visually checking sample sets of customer records before and after, is an alternative. Either way, inventory the matches and non-matches from the two runs: once with addresses as-is and once with addresses that leverage the NCOA information.
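The two-run comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `match_key` normalization, the `inventory` helper, and the sample records are all hypothetical, and a real customer hub would use far richer matching than an exact key on name plus address.

```python
from collections import defaultdict

def match_key(name, address):
    """Crude deterministic match key (an assumption for this sketch):
    uppercase both fields and keep only letters and digits."""
    def normalize(text):
        return "".join(ch for ch in text.upper() if ch.isalnum())
    return normalize(name) + "|" + normalize(address)

def inventory(records, ncoa=None):
    """Group records by match key and count matches vs. non-matches.

    records: list of (record_id, name, address) tuples
    ncoa:    optional dict of record_id -> updated address
    """
    ncoa = ncoa or {}
    groups = defaultdict(list)
    for record_id, name, address in records:
        # Replace the address only where an NCOA update is available.
        address = ncoa.get(record_id, address)
        groups[match_key(name, address)].append(record_id)
    matched = sum(len(ids) for ids in groups.values() if len(ids) > 1)
    return {"matched": matched, "unmatched": len(records) - matched}

# Hypothetical records drawn from three source systems (A, B, C).
records = [
    ("A1", "Pat Smith", "12 Oak St"),
    ("B7", "Pat Smith", "98 Elm Ave"),
    ("C3", "Pat Smith", "98 Elm Ave"),
    ("A2", "Lee Jones", "5 Main St"),
]
ncoa_updates = {"A1": "98 Elm Ave"}  # hypothetical NCOA result

as_is = inventory(records)                    # run 1: addresses as-is
with_ncoa = inventory(records, ncoa_updates)  # run 2: NCOA applied

print("as-is:    ", as_is)
print("with NCOA:", with_ncoa)
```

Comparing the two dictionaries gives the inventory of matches and non-matches for each run; the difference in the matched counts is the effect attributable to the NCOA update alone.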

My goal was to establish a business process focused on “pre-processing customer records” using a reliable third-party source (in this case NCOA) instead of becoming completely dependent on a current or future piece of technology that may offer the same results, especially when the methodology (the matching algorithms) is probabilistic. This approach reduces your dependency as well, and lets you focus on the “lift” the technology may offer if you are comparing two or more products.

Inside a deterministic matching utility (or off-the-shelf solution), by contrast, adding extra columns of data to the end of your input file to store the NCOA addresses will accomplish the same result. For test purposes, though, the easier way may be to replace addresses where an NCOA record is available.
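As a rough sketch of the extra-column option, the Python below appends a trailing field to each input row and lets a deterministic matcher prefer it when populated. The field names (`id`, `name`, `address`, `ncoa_address`) and the helper functions are assumptions made for illustration, not the layout of any particular product.

```python
import csv
import io

FIELDS = ["id", "name", "address", "ncoa_address"]  # hypothetical layout

def append_ncoa_column(rows, ncoa):
    """Return copies of the rows with a trailing 'ncoa_address' field.
    The field is left empty when no NCOA record exists for the row."""
    return [dict(row, ncoa_address=ncoa.get(row["id"], "")) for row in rows]

def best_address(row):
    """Address a deterministic matcher would use: NCOA if present, else original."""
    return row["ncoa_address"] or row["address"]

def to_csv(rows):
    """Serialize the extended rows, with the NCOA column at the end of each line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical input rows and NCOA lookup.
rows = [
    {"id": "A1", "name": "Pat Smith", "address": "12 Oak St"},
    {"id": "B7", "name": "Pat Smith", "address": "98 Elm Ave"},
]
extended = append_ncoa_column(rows, {"A1": "98 Elm Ave"})
print(to_csv(extended))
```

Keeping the original address alongside the NCOA address preserves the as-is value for audit or compliance purposes, at the cost of a wider input file.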

Remember, depending on the volume of records your client is dealing with, a pre-process (business process) may be preferable to loading all the customer names and addresses into the third-party customer hub technology and processing them there. Caution: this all depends on how the business is required (e.g. for compliance) to store information from cradle to grave. The rule of thumb for the MDM customer hub is to store the “best/master” record (the single customer view), with the exception of users with extended search requirements. The data warehouse (vs. MDM solutions) then becomes the next challenge: what to keep where, and how much. But that is another discussion.

The improvement realized from using the updated customer address was substantial (over 10% on average) across all the sources factored into the analysis. This means several tens of millions of customer records will match/merge more effectively (and efficiently), followed by the incremental lift from what the “customer hub” technology enables using its proprietary tools and techniques. This becomes the real differentiator!

Dots On A Map Improve Data Quality April 18, 2009

Posted by Peter Benza in Data Accuracy, Data Hygiene, Data Integrity, Data Management, Data Mining, Data Profiling, Data Quality, Data Standardization, Data Stewardship, Data Types, Data Visualization, Linkedin.

I originally prepared this presentation back in 2005, but it is probably even more applicable in 2009 given the impact a GIS tool can have on visualizing data quality: customer addresses on a map! The next time you conduct a customer “data” assessment, try this!

Data Quality and Master Data Initiatives March 31, 2009

Posted by Peter Benza in Data Accuracy, Data Integration, Data Integrity, Data Profiling, Data Quality, Data Sources.

Initiatives related to master data continue to be on the radar of major corporations, especially as they relate to data quality and other mission-critical business processes across the enterprise that impact or rely on data being complete, accurate, and up-to-date.

What other MDM initiatives (besides Data Quality) are also paramount when centralizing master data for single customer view purposes?

Let’s start a list:

1.) Data Profiling

2.) Data Integration

3.) Match Accuracy

4.) MDM Tools

5.) ???

Do you use Linkedin ? April 23, 2008

Posted by Peter Benza in Data Governance, Data Hygiene, Data Management, Data Profiling, Data Quality, Data Tools.

If you are interested in Enterprise Data Quality and want to network with other people who have similar professional interests or skills, click on the link below and submit your name for review. A LinkedIn account is required to join this network group.



BusinessObjects data quality XI January 17, 2008

Posted by Peter Benza in Data Accuracy, Data Analysis, Data Architecture, Data Assessment, Data Consolidation, Data Hygiene, Data Integrity, Data Profiling, Data Quality, Data References, Data Strategy, Data Templates, Data Tools.

Standardize, Identify Duplicates, Correct, Improve Match, Append, Consolidate, and more.


What types of common data problems are found in your master data? January 13, 2008

Posted by Peter Benza in Data Analysis, Data Assessment, Data Governance, Data Hygiene, Data Metrics, Data Profiling, Data Quality.

Master data exists across your entire enterprise. Companies today are assessing the best way to consolidate all their information assets (data sources) into a “single customer view”.

What types of data problems exist in your organization today, or do you expect in the future, with the move towards managing data at the enterprise level?

[Be first to answer this question]