Editor’s note: Wayne Yaddow is an independent consultant with over 20 years’ experience leading data migration/integration/ETL testing projects at organizations including J.P. Morgan Chase, Credit Suisse, Standard and Poor’s, AIG, Oppenheimer Funds, IBM, and Achieve3000. Additionally, Wayne has taught IIST (International Institute of Software Testing) courses on data warehouse, ETL, and data integration testing. He continues to lead numerous ETL testing and coaching projects on a consulting basis. You can contact him at email@example.com.
This blog, which explains how a robust and flexible solution for data validation and reconciliation supports decision integrity, is the third article in a 3-part blog series. [Read Part 1 and Part 2]
All data information systems must be designed to allow transparent and straightforward data reconciliation during and after workflow processing. Even when precise, regular reconciliation is not an immediate need, a data warehouse must be designed and built to provide a clear view of inconsistent and unreconciled data. And even when BI report objects do not require high levels of accuracy, end users must know what percentage or amount of the data is inaccurate, and the reason for the inaccuracy.
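To make the point concrete, the paragraph above can be sketched as a minimal reconciliation check that compares a source extract against a warehouse load and reports the share of unreconciled data to the end user. The record structure and field names here are hypothetical, not from any particular system.

```python
# Minimal sketch of a reconciliation check: compare row counts and a numeric
# control total between a source extract and a warehouse load, and report the
# share of unreconciled records. Field names are illustrative assumptions.

def reconcile(source_rows, warehouse_rows, amount_field="amount"):
    """Return a small reconciliation summary for two lists of dict records."""
    src_count, wh_count = len(source_rows), len(warehouse_rows)
    src_total = sum(r[amount_field] for r in source_rows)
    wh_total = sum(r[amount_field] for r in warehouse_rows)
    unreconciled = abs(src_count - wh_count)
    return {
        "source_count": src_count,
        "warehouse_count": wh_count,
        "count_match": src_count == wh_count,
        "total_match": src_total == wh_total,
        # The share of data the end user cannot rely on, per the point above.
        "unreconciled_pct": 100.0 * unreconciled / src_count if src_count else 0.0,
    }

source = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}, {"id": 3, "amount": 75}]
warehouse = [{"id": 1, "amount": 100}, {"id": 2, "amount": 250}]
summary = reconcile(source, warehouse)
print(summary["count_match"], summary["total_match"], round(summary["unreconciled_pct"], 1))
# → False False 33.3
```

In practice the same idea is applied with SQL row counts and control totals per table or per load batch, but the shape of the check is the same.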
An important key to successful reconciliation is a data dictionary that is defined and agreed upon with the business community. If the definitions of major business terms (such as Customer or Product) in the data sources differ from the definitions used by the data warehouse, successful data reconciliation will be very problematic. A well-maintained data dictionary is therefore an important aid.
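One way to picture this is to treat the agreed data dictionary as data itself and check each source's term definitions against it before attempting reconciliation. The terms and definitions below are purely illustrative assumptions.

```python
# Hypothetical sketch: represent agreed business-term definitions as a data
# dictionary, then flag any source-system terms that conflict with it.
# All term names and definitions are illustrative.

agreed_dictionary = {
    "Customer": "An individual or organization with at least one active account",
    "Product": "A sellable item identified by a unique SKU",
}

source_dictionary = {
    "Customer": "Any party that has ever opened an account",  # differs from agreed
    "Product": "A sellable item identified by a unique SKU",
}

def find_definition_conflicts(agreed, source):
    """Return terms whose source definition differs from the agreed one."""
    return sorted(
        term for term, definition in source.items()
        if agreed.get(term) not in (None, definition)
    )

conflicts = find_definition_conflicts(agreed_dictionary, source_dictionary)
print(conflicts)  # → ['Customer']
```

Each conflicting term is a reconciliation problem waiting to happen and should be resolved with the business community before the warehouse mapping is finalized.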
The original data extracted from the sources will often be required for reconciliation. There may be situations in which the original data in the source system is no longer available at the time reconciliation is required. In that case, the original data extract preserved in the data warehouse (in a database or file system) becomes a critical element of the reconciliation process.
The need to capture the details of data changes (updates and deletes), and the ability to report on those changes (or at least gain easy access to the history of changes), can be underestimated by both business users and data warehouse developers. If data changes are not captured properly, reconciliation of the data becomes very problematic.
Regularly used reconciliation routines often need ad hoc reporting functionality to analyze data changes. For large data warehouses with frequently changing data, it is essential that the data model allow efficient, easy access to all historical changes.
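The change-capture requirement above can be sketched as an append-only history log from which any past state can be replayed. Real warehouses would use SCD-style history tables or CDC tooling; the field names and operations here are hypothetical.

```python
# Minimal sketch of capturing updates and deletes in an append-only history
# log, so reconciliation can reproduce any state. Names are illustrative.

from datetime import datetime, timezone

history = []  # append-only log of changes, never updated in place

def record_change(key, operation, values=None):
    """Append an insert/update/delete event for a business key."""
    history.append({
        "key": key,
        "op": operation,            # "insert", "update", or "delete"
        "values": values,
        "ts": datetime.now(timezone.utc),
    })

def current_state():
    """Replay the log to rebuild the latest state of each key."""
    state = {}
    for change in history:
        if change["op"] == "delete":
            state.pop(change["key"], None)
        else:
            state[change["key"]] = change["values"]
    return state

record_change("cust-1", "insert", {"status": "active"})
record_change("cust-1", "update", {"status": "closed"})
record_change("cust-2", "insert", {"status": "active"})
record_change("cust-2", "delete")
print(current_state())  # only cust-1 remains, with its latest values
```

Because nothing is ever overwritten, the same log supports both the ad hoc change reporting and the historical-access requirement described above.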
The design of data exception handling that works efficiently for the data reconciliation process is probably the most complex part of the overall reconciliation architecture. The design depends on a number of factors related to the nature of the data exceptions (referential integrity, data structure, specific constraints, etc.).
The most important requirements for an exception handling design suited to the data reconciliation process are described below.
Tracing data elements from reports back to their sources is usually a non-trivial task if a data warehouse integrates similar data from multiple sources. The task becomes very complex if the data passes through multiple transformations or resides in a heterogeneous data warehouse environment. Efficient data lineage is often a mandatory requirement for compliance or regulatory reporting.
Efficient data reconciliation requires the ability to identify the source of the data at any point in the data flow.
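The lineage requirement can be sketched by carrying a lineage tag on every row through extraction and each transformation step, so the origin of any value is recoverable at any point in the flow. The source names, step names, and helper functions are illustrative assumptions, not a particular tool's API.

```python
# Hypothetical sketch: tag each row with its originating system at extraction
# time and append each transformation step, so lineage is queryable anywhere
# in the data flow. All names are illustrative.

def extract(source_name, rows):
    """Tag every extracted row with its originating system."""
    return [dict(row, _lineage=[source_name]) for row in rows]

def transform(rows, step_name, fn):
    """Apply a transformation and append the step to each row's lineage."""
    out = []
    for row in rows:
        new_row = fn(dict(row))
        new_row["_lineage"] = row["_lineage"] + [step_name]
        out.append(new_row)
    return out

crm = extract("crm", [{"id": 1, "amount": "100"}])
staged = transform(crm, "cast_amount", lambda r: dict(r, amount=int(r["amount"])))
print(staged[0]["_lineage"])  # → ['crm', 'cast_amount']
```

When a report value looks wrong, the `_lineage` trail answers the first reconciliation question immediately: which source supplied it and which transformations touched it.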
Complex validation and data integration rules implemented in the data transformations tend to be forgotten or misunderstood by business users.
It is important that the data integration, validation, and exception rules be easily accessible to business users (e.g., through mapping documents, in-line report comments, or online documentation).
Data warehouse processes may be interrupted by technical failures. A common situation is a lengthy or complex data transformation routine that is interrupted, leaving the data only partially loaded and transformed.
In this case, a reliable data reconciliation method is required to determine the point from which to restart the process and which particular data must be reloaded or removed.
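Finding the restart point can be sketched as a batch-level reconciliation: each load batch records the row count expected at extraction time, and the first batch whose loaded count disagrees is where the reload must begin. The batch structure and names below are assumed for illustration.

```python
# Minimal sketch, under assumed batch-load semantics: reconcile each batch's
# loaded row count against the count recorded at extraction. The first batch
# that fails the check marks the restart point; it and all later batches must
# be reloaded. Batch identifiers and counts are illustrative.

def find_restart_batch(batches):
    """batches: ordered list of {"batch_id", "expected_rows", "loaded_rows"}."""
    for batch in batches:
        if batch["loaded_rows"] != batch["expected_rows"]:
            return batch["batch_id"]  # restart (and reload) from here
    return None  # all batches reconciled; no restart needed

batches = [
    {"batch_id": "2024-01-01", "expected_rows": 1000, "loaded_rows": 1000},
    {"batch_id": "2024-01-02", "expected_rows": 1200, "loaded_rows": 1200},
    {"batch_id": "2024-01-03", "expected_rows": 900,  "loaded_rows": 412},  # interrupted
    {"batch_id": "2024-01-04", "expected_rows": 800,  "loaded_rows": 0},
]
print(find_restart_batch(batches))  # → 2024-01-03
```

This is why recording expected counts (or control totals) at extraction time is a design requirement, not an afterthought: without them, there is nothing to reconcile the partial load against.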
Automation of data reconciliation (particularly in production) needs to be reviewed thoroughly during the design process. For large, complex, and heterogeneous data warehouses, it may be beneficial to make the reconciliation process an integral part of the overall data flow, starting with data extraction and loading and finishing with end-user reports and analytics.
Considerations for selecting data reconciliation tools:
Reconciliation without effective processes and tools is an arduous, time-consuming effort in which staff, working manually or with multiple tools, investigate numerous files, tables, and spreadsheets to analyze each data merge and transformation. This approach is inefficient, error-prone, and unsustainable.
By leveraging the capabilities of automation tools, teams can focus on data reconciliation results rather than on the process itself. This lets them shift their effort to areas that require their professional and business expertise, typically analyzing and resolving exceptions.
Automated data reconciliation tools empower teams to fully eliminate repetitive, time-intensive, and error-prone manual tasks.
Large-scale technology disasters are rarely the consequence of a single factor; more often, they result from a set of flaws in software development and maintenance processes. Data reconciliation controls serve as an independent, additional protection mechanism for complex systems and should therefore be considered a necessary part of production infrastructure.
To perform reconciliation tests, data reconciliation tools must be available. The Quality Assurance team should strongly consider implementing test tools capable of running passive data consistency checks.
These tools should be implemented with the possibility in mind that they will also be used in production environments.
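The idea of a "passive" check is that it only reads and reports, never modifying the data, so the identical check can run in QA and later in production. A minimal sketch, with illustrative rule names and record fields:

```python
# Hypothetical sketch of a passive consistency check: read-only rules that
# return violations without touching the data, usable unchanged in QA and
# production. Rule names and record fields are illustrative assumptions.

def passive_checks(rows):
    """Run read-only consistency rules and return any violations found."""
    violations = []
    seen_keys = set()
    for row in rows:
        if row["id"] in seen_keys:
            violations.append(("duplicate_key", row["id"]))
        seen_keys.add(row["id"])
        if row["amount"] < 0:
            violations.append(("negative_amount", row["id"]))
        if row.get("customer_id") is None:
            violations.append(("missing_reference", row["id"]))
    return violations

rows = [
    {"id": 1, "amount": 50, "customer_id": "c1"},
    {"id": 1, "amount": -5, "customer_id": None},  # trips three rules at once
]
print(passive_checks(rows))
```

Because the check has no side effects, promoting it from the QA environment to production is a deployment decision rather than a redesign, which is exactly the opportunity the paragraph above describes.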
Reference: Rachid Mousine, "Design Essentials for Reconciliation Data Warehouse," Formation Data Pty Ltd.