Data quality is a common issue faced by organisations, and it can be expensive to rectify. Some organisations spend tens or hundreds of thousands of dollars per annum on attempts to analyse and resolve data quality issues. Sometimes these efforts are dedicated data clean-up projects with a budget and long-term objectives.
Increasingly, as IT budgets have come under tighter constraints, data quality projects are ad-hoc “fire-fighting” efforts, triggered when the developers of new IT systems stumble upon unsuspected data quality problems. Initiating a data assessment enables organisations to understand their data and build effective data quality strategies to support their business.
Effective data quality strategies help to alleviate many of the problems associated with poor data quality by identifying the underlying issues, correcting them and, ideally, preventing their recurrence. They also provide a foundation for reliable data use across the organisation.
The Focus Data Quality Assessment (DQA) Service
The DQA is focussed on data analysis and auditing using IBM Information Analyzer. This offering is designed to facilitate the automated analysis of raw source system data and to provide information that answers key questions such as:
- Which data will not migrate?
- Which data will migrate but will violate key business rules?
- Which records will require a process change?
Why embark on a Data Quality Assessment Project?
The Benefits
Benefits of the Data Quality Assessment are:
- Reduces time needed to analyse source systems by over 90% compared to traditional methods.
- Aids in understanding record construction, domain sets and logic of source data.
- Establishes the strengths and weaknesses of data sources in scope.
- Aids decisions about which alternative data sources are available.
- Establishes early in the project whether the target design is within the capabilities of the source data.
- Identifies critical data issues in source data that must be rectified.
- Helps set realistic expectations about the outcome of the actual data warehouse or mart that will be constructed from the source data.
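As a rough illustration of the kind of automated analysis described above, the following minimal Python sketch profiles a column's domain set and sorts records into those that will not migrate, those that will migrate but violate a business rule, and those that are clean. It does not use or represent IBM Information Analyzer; the sample rows, the column names, the target constraints and the business rule are all hypothetical assumptions chosen for the example.

```python
from collections import Counter

# Hypothetical sample rows standing in for raw source-system data.
ROWS = [
    {"customer_id": "1001", "country": "AU", "credit_limit": "5000"},
    {"customer_id": "1002", "country": "au", "credit_limit": "-200"},
    {"customer_id": "",     "country": "NZ", "credit_limit": "1500"},
    {"customer_id": "1004", "country": "",   "credit_limit": "abc"},
]


def profile_column(rows, column):
    """Summarise one column: blank count and the distinct values seen."""
    values = [row.get(column, "") for row in rows]
    non_blank = [v for v in values if v != ""]
    return {
        "blank": len(values) - len(non_blank),
        "domain": Counter(non_blank),  # value -> number of occurrences
    }


def will_not_migrate(row):
    """Assumed target constraints: key is mandatory, limit must be an integer."""
    if row["customer_id"] == "":
        return True
    try:
        int(row["credit_limit"])
    except ValueError:
        return True
    return False


def violates_business_rule(row):
    """Assumed business rule: a credit limit is never negative."""
    return int(row["credit_limit"]) < 0


def triage(rows):
    """Sort rows into buckets matching the assessment questions."""
    buckets = {"will_not_migrate": [], "violates_rule": [], "clean": []}
    for row in rows:
        if will_not_migrate(row):
            buckets["will_not_migrate"].append(row)
        elif violates_business_rule(row):
            # Only reached for rows that pass the migration checks,
            # so the integer conversion above is safe.
            buckets["violates_rule"].append(row)
        else:
            buckets["clean"].append(row)
    return buckets


if __name__ == "__main__":
    print(profile_column(ROWS, "country"))
    for name, rows in triage(ROWS).items():
        print(name, len(rows))
```

In this sketch the domain profile of the country column surfaces inconsistent values such as "AU" and "au", which is the sort of finding that informs whether a process change is needed upstream. A real assessment would derive its constraints and rules from the target design and the business, not from hard-coded functions.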