Skip to main content

Beginners Guide to ETL Testing

Beginners Guide to ETL Testing


What is ETL Testing?

These days more and more systems are moving from legacy to new technology, and ETL is one of the common methods used to help this transformation. It can consolidate the scattered data for any organization while working with different data format and sources. In this article, we will talk about the basic concept of ETL and how it has been tested.

Why we need ETL?

ETL stands for extract, transform, and load. In many organizations, the setup of the IT deferments happened long time back. So, in most of the cases the way each deferment handles the data is different. For example, in a retail organization you many have different departments such as sales, marketing, logistics, etc. Each of them is handling the customer information but way they store that data could be quite different. The sales deferment store it by name whereas marketing deferment has it in the number format. ETL can take all this data from different sources and transform it into a uniform presentation, such as for storing in a database or data warehouse.

Another challenge of the old IT infrastructure is the different data format used by different departments. So, it might happen for HRD department somebody is using SAP whereas the sales deferment you have Oracle Apps. So, to take a business-critical decision it is difficult for the higher management to get data from different platforms and consolidate them. ETL can do that job easily. It will take data from different sources and transform it to a uniform format and store it into DB tables. From the DB, you can generate the required reports.

How to do ETL Testing?

From the above points, clearly accuracy of the data after transformation is critical in ETL testing because if the data is not accurate, then the business decision will be wrong.
So, following test types are commonly used in ETL testing:

No Data Losses: In this testing type, we first determine an N number of entities in the source system. Say the total number of employee is one such entity. If I have 2000 employees in the source system, then after ETL transformation and data store I should still get 2000employees in the destination database. Only exception is when we have some business rule applied to the transformation. An example could be in the new system the business does not want to keep data for people who no longer works for the organization. So, in that situation the destination system will discard the record of those employee and store number less than 2000.

This testing is done by mainly querying the source and designation database. Depending on the complexity of the system these SQL queries might be very complex and beyond the skill of a tester. However, running the query and checking the result is still the responsibility of the QA team.
Validation of Transformation rules: This is the second big validation point. In ETL testing, you will get the requirements in terms of transformation rules. One example could be the data format in the source was yyyy/dd/mm where as in the destination database this is mm/dd/yy.So you need to check them by taking some sample data. Here you apply all the good conventional test theories like boundary value analysis, equivalent partition, etc. All the database field level and record level data integrity must be tested here.

So, you need to select the test data in such a way that at the end all the fields are touched into the destination database after transformation.

Business processes testing: Once the transformation is done these data feeds will go to many consumer systems (sales, marketing, HRD deferment applications). You need to ensure by testing from those systems that they can receive the data in the preformed format. All the critical end to end business flows must be tested from the GUI. 

Comments

Popular posts from this blog

How to Setup and Configure JMeter in Windows

JMeter is one of most popular performance testing tool and since it is open-source, therefore it is often a start point for beginners in performance testing. In this blog, I will share a step-by-step guide that can be used to install JMeter in Windows environment. There will also be some configuration and script designing tips which you will find useful while working on real-time JMeter projects. 1- Setup and Verify Java Apache JMeter is purely a Java based tool; therefore first of all we need to make sure that Java is properly installed and configured in our system. If you are already using any java based applications, then Java might already be installed on your system. To check if Java is already installed open command prompt and run command java -version. If Java is successfully installed and configured then the command will return valid java version number (java version “x.x.x.x”) as shown in image below: If Java is not installed or configured, then you may get a message like ...

Steps to configure Selenium WebDriver environment

Steps to Download Java Navigate to https://www.java.com/ Click on Free Java Downloads button. Click on the Agree and Start Free Down Button in the page as shown below. Steps to installing Java. After Java is installed, the next step is to set the environment variables. Steps to setup Environment Variables: Navigate to the path where Java is installed and go to the path until as below C:\Program Files\Java\jdk1.8.0_65\bin Copy the above path and Right-click on MyComputer -> Properties. Click on Advanced System Settings and click on “Environment Variables”. In the System Variables, paste the above-copied path and paste in the Path field. Click OK and Click OK Note: To verify if Path is set correctly, Go to Command Prompt by typing cmd in the Run and then in the Command Prompt type javac and should return list available as below: Download Selenium WebDriver based on language in which you want to use for Automation. Say you can use Java, C#, Ruby, Python, etc. The m...

What are some of the most common challenges faced by software QA team?

Here we go with the top challenges: 1) Testing the complete application: Is it possible? I think impossible. There are millions of test combinations. It’s not possible to test each and every combination both in manual as well as in automation testing. If you try all these combinations you will never ship the product 2) Misunderstanding of company processes: Some times you just don’t pay proper attention what the company-defined processes are and these are for what purposes. There are some myths in testers that they should only go with company processes even these processes are not applicable for their current testing scenario. This results in incomplete and inappropriate application testing. 3) Relationship with developers: Big challenge. Requires very skilled tester to handle this relation positively and even by completing the work in testers way. There are simply hundreds of excuses developers or testers can make when they are not agree with some points. For this tester...