Any company that has an ETL process knows how important it is to validate every step from the source to the destination system to ensure data has been processed, transformed and loaded correctly. For instance, if the data is not transformed correctly, every system that depends on that data can fail, causing errors and a bad experience for the end user. Many companies fail to test the ETL process thoroughly and wish they had when these errors happen.
So what is ETL? ETL stands for extract, transform, load. These are the three steps in a data integration strategy in which you gather data from multiple systems and "transform" it into a format that other systems can use.
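As a rough illustration of the three steps, here is a minimal in-memory sketch in JavaScript. The sample data, the function names (`extract`, `transform`, `load`) and the array standing in for a destination table are all assumptions made for illustration; a real pipeline would read from and write to actual systems.

```javascript
// Hypothetical source data: raw CSV text standing in for a source system.
const sourceCsv = 'name,score\nalice,90\nbob,70\n';

// Extract: parse the raw CSV text into row objects (simple CSV, no quoted fields).
function extract(csvText) {
  const [headerLine, ...lines] = csvText.trim().split('\n');
  const headers = headerLine.split(',');
  return lines.map((line) => {
    const values = line.split(',');
    return Object.fromEntries(headers.map((h, i) => [h, values[i]]));
  });
}

// Transform: convert the rows into the shape the destination expects.
function transform(rows) {
  return rows.map((row) => ({
    name: row.name.toUpperCase(),
    score: Number(row.score),
  }));
}

// Load: write the records to the destination and report how many are stored.
function load(records, destination) {
  destination.push(...records);
  return destination.length;
}

// Hypothetical destination: an array standing in for a database table.
const destination = [];
const loadedCount = load(transform(extract(sourceCsv)), destination);
console.log(loadedCount, destination);
```

Keeping each step as its own function makes it easier to validate them separately, which is what ETL testing relies on.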
ETL testing is the process of validating that data is loaded from the source to the destination system(s) and is transformed correctly along the way.
Below are three example automated tests, written using Mocha, that show different aspects of ETL testing:
- The first extracts data from a CSV file, validates the data in the file, runs a simple ETL job that transforms the data by calculating statistics and writes the results to another CSV file, and then verifies that the output data is correct
- The second is similar to the first but uses JSON
- The third extracts data from a source database and outputs it to another database table
Does your company have an ETL process that doesn't have automated tests around it? Do you experience failures, errors or crashes because of it? Add validation testing today! Need help? The Testery testing services team will gladly guide you through adding validation around your ETL process. Contact us today!