Data integrity testing

BI/Data Warehouse Testing Challenges: Choosing Qualified Testers and Getting Started

Editor’s note: Wayne Yaddow is an independent consultant with over 20 years’ experience leading data migration/integration/ETL testing projects at organizations including J.P. Morgan Chase, Credit Suisse, Standard and Poor’s, AIG, Oppenheimer Funds, IBM, and Achieve3000.  Additionally, Wayne has taught  IIST (International Institute of Software Testing) courses on data warehouse, ETL, and data integration testing. He continues to lead numerous ETL testing and coaching projects on a consulting basis. You can contact him at

This is blog 3 out of 3 in a series on the top BI/Data Warehouse Testing challenges. The first blog focused on Effective Test Planning for BI/Data Warehouse Testing, and the second focused on “Identifying the Precise Focus of ETL Testing“.

The impulse to cut costs is often strong, especially in the final delivery phase. A common mistake is to delegate testing responsibilities to resources with limited business and data testing experience.

The data warehouse testing lead and other hands-on testers are expected to demonstrate extensive experience in their ability to design, plan, and execute database and data warehouse testing strategies and tactics to ensure data warehouse quality throughout all stages of the ETL life cycle.

Recent years have seen a trend toward business analysts, ETL developers, and even business users planning and conducting data warehouse testing. This may be risky. Among the required skills for data warehouse testers are the following:

  • A firm understanding of data warehouse and database concepts
  • The ability to develop strategies, test plans, and test cases specific to data warehousing and the enterprise’s business
  • Advanced skill with SQL queries and stored procedures
  • In-depth understanding of the organization’s ETL tool
  • An understanding of project data and metadata (data sources, data tables, data dictionary, business terminology)
  • Experienced at data profiling with associated methods and tools
  • The ability to create effective ETL test cases and scenarios based on ETL database loading technology and business requirements
  • Understanding of data models, data mapping documents, ETL design, and ETL coding
  • The ability to communicate effectively with data warehouse designers and developers
  • Experience with multiple DB systems, such as Oracle, SQL Server, or DB2, Postgresql
  • Troubleshooting of ETL (e.g., Informatica/DataStage) sessions and workflows
  • Skills for deployment of DB code to databases
  • Scripting (Unix/Linux, etc.)
  • Use of Microsoft Excel and Access for data analysis
  • Implementation of automated testing for ETL processes
  • Defect management and tools

Recommendations for Getting Started with ETL Test Automation

Evaluate the current state of testing in your organization. Not all ETL testing requires automation. Assess the situations mentioned above to determine what type of automation would benefit the testing process and how much is needed. Evaluate testing requirements and identify inefficiencies that may be fixed with automated testing. QA teams that spend a lot of time on regression testing will benefit the most.

Create a business case for automated testing. Automated testing generally comes at a high price. In order to convey the value to the business, the IT team must first make the case.

Evaluate the options. After evaluating the current state and requirements within the IT department, investigate  which available tools fit the organization’s testing processes and environments. Options may include vendor, open source, in-house, or a combination of the aforementioned tools.  Automated ETL testing tools can significantly reduce the amount of time spent testing code in comparison to traditional manual methods. Other benefits include the creation of reusable code and a reduction in costs associated with personnel and rework. Get educated on automated testing and available tools to decide if it’s right for the QA team.

[Read Wayne’s paper on the Top 5 BI/Data Warehouse Testing Challenges]