Skip to main content

Beginners Guide to ETL Testing

Beginners Guide to ETL Testing


What is ETL Testing?

These days more and more systems are moving from legacy to new technology, and ETL is one of the common methods used to help this transformation. It can consolidate the scattered data for any organization while working with different data format and sources. In this article, we will talk about the basic concept of ETL and how it has been tested.

Why we need ETL?

ETL stands for extract, transform, and load. In many organizations, the setup of the IT deferments happened long time back. So, in most of the cases the way each deferment handles the data is different. For example, in a retail organization you many have different departments such as sales, marketing, logistics, etc. Each of them is handling the customer information but way they store that data could be quite different. The sales deferment store it by name whereas marketing deferment has it in the number format. ETL can take all this data from different sources and transform it into a uniform presentation, such as for storing in a database or data warehouse.

Another challenge of the old IT infrastructure is the different data format used by different departments. So, it might happen for HRD department somebody is using SAP whereas the sales deferment you have Oracle Apps. So, to take a business-critical decision it is difficult for the higher management to get data from different platforms and consolidate them. ETL can do that job easily. It will take data from different sources and transform it to a uniform format and store it into DB tables. From the DB, you can generate the required reports.

How to do ETL Testing?

From the above points, clearly accuracy of the data after transformation is critical in ETL testing because if the data is not accurate, then the business decision will be wrong.
So, following test types are commonly used in ETL testing:

No Data Losses: In this testing type, we first determine an N number of entities in the source system. Say the total number of employee is one such entity. If I have 2000 employees in the source system, then after ETL transformation and data store I should still get 2000employees in the destination database. Only exception is when we have some business rule applied to the transformation. An example could be in the new system the business does not want to keep data for people who no longer works for the organization. So, in that situation the destination system will discard the record of those employee and store number less than 2000.

This testing is done by mainly querying the source and designation database. Depending on the complexity of the system these SQL queries might be very complex and beyond the skill of a tester. However, running the query and checking the result is still the responsibility of the QA team.
Validation of Transformation rules: This is the second big validation point. In ETL testing, you will get the requirements in terms of transformation rules. One example could be the data format in the source was yyyy/dd/mm where as in the destination database this is mm/dd/yy.So you need to check them by taking some sample data. Here you apply all the good conventional test theories like boundary value analysis, equivalent partition, etc. All the database field level and record level data integrity must be tested here.

So, you need to select the test data in such a way that at the end all the fields are touched into the destination database after transformation.

Business processes testing: Once the transformation is done these data feeds will go to many consumer systems (sales, marketing, HRD deferment applications). You need to ensure by testing from those systems that they can receive the data in the preformed format. All the critical end to end business flows must be tested from the GUI. 

Comments

Popular posts from this blog

How to Setup and Configure JMeter in Windows

JMeter is one of most popular performance testing tool and since it is open-source, therefore it is often a start point for beginners in performance testing. In this blog, I will share a step-by-step guide that can be used to install JMeter in Windows environment. There will also be some configuration and script designing tips which you will find useful while working on real-time JMeter projects. 1- Setup and Verify Java Apache JMeter is purely a Java based tool; therefore first of all we need to make sure that Java is properly installed and configured in our system. If you are already using any java based applications, then Java might already be installed on your system. To check if Java is already installed open command prompt and run command java -version. If Java is successfully installed and configured then the command will return valid java version number (java version “x.x.x.x”) as shown in image below: If Java is not installed or configured, then you may get a message like ...

A QA expert’s guide to 11 popular software testing tools

  Which tools should you use for your quality assurance (QA)? Here are 11 popular software testing tools we recommend. 1. Selenium A testing framework for web applications on platforms such as Linux, Mac, and Windows,  Selenium  is also used to test browser compatibility. Developers can test in a number of languages, such as Java, Groovy, PHP, Ruby, C#, Python, and Perl. Selenium is part of most major browsers, including Firefox and Chrome. 2. Jenkins Jenkins  is a  continuous integration  (CI) tool for testing codebase changes in real-time. Jenkins lets you detect and resolve codebase problems quickly and automate build testing. An open-source program, it was  forked  from the Hudson CI tool. Jenkins runs on servers in a container and supports version control tools such as Perforce, Git, and Subversion. 3. New Relic New Relic  is a SaaS product that allows you to monitor mobile and web apps in real-time. It offers deep analytics that can giv...

Testers are not the "Quality" officers in a Agile teams | Supreme Agile.

What makes a team really "Agile"? To me, an agile team is one that follows the simple values and principles of the Agile Manifesto but knows how to take them a step further. Although the Agile Manifesto created the foundations of all it’s frameworks (Scrum, EP, Lean Etc.), it’s still not a religion that the team should follow blindly. An Agile team should have the power to learn new things. They should combine artifacts that work great in other frameworks, innovate new development techniques and find the balance between an Agile Spirit and what is more practical that works for them. There is one thing an Agile team must follow and that is the simple mission that the customer is the most important stakeholder. Therefore the team should always focus on making on delivering the best possible product that meets the customer’s requirements. From my experience, if a team focuses on this single mission.  It will formulate a team that will deliver the best quality ...