Automated the testing of streaming data pipelines in production.
Enabled on-demand test execution and reduced our regression cycle by 70% by integrating the tests with CI.
About Ultimate Kronos Group
Ultimate Kronos Group (UKG) is an American multinational technology company with dual headquarters in Lowell, Massachusetts, and Weston, Florida. It provides workforce management and human resource management services. As a leading global provider of HCM, payroll, HR service delivery, and workforce management solutions, UKG’s award-winning Pro, Dimensions, and Ready solutions help tens of thousands of organisations across geographies and in every industry drive better business outcomes, improve HR effectiveness, streamline the payroll process, and help make work a better and more connected experience for everyone.
- A plug-and-play test automation framework that enables UKG to accelerate developers’ testing and expand the test coverage by 230%.
- Reduced regression time by 70% and enabled on-demand test execution.
- Minimised dependency on upstream source teams through dynamic event generation.
To achieve this, the team had to overcome several challenges:
- Lack of Test Data
As we did not have access to the upstream system, we tested with mocked events supplied by it, but those limited events were not enough to ensure the quality of the production data pipelines.
- Limited Test Coverage
Because we were testing with mocked events, our test coverage was also limited; we were only performing functional testing with a small set of positive and negative scenarios.
- Data Source Out of Our Control
The upstream source systems were responsible for sharing the mocked events. Since we had multiple upstream source systems, communication and collaboration were difficult. For example, it was hard to coordinate with each team for small updates to the mocked events. This resulted in brittle test cases and failures whenever an upstream system made a change.
- Test Data Mixed with Production Data
This was another major challenge. We had only one GCP instance for both the development and testing teams, and both used the same service account to access the same GCP resources. Because the pipeline under test was a streaming pipeline, it pulled real-time events alongside the mocked events from the upstream systems, and filtering the mocked data out of the real-time development data was sometimes cumbersome.
- Higher Regression Time
We relied mostly on manual functional testing of the pipeline, which meant a long regression cycle even after a small change in the pipeline's behavior.
Introduced Test Automation in the SDLC
Manual testing of the pipelines helped us understand their behavior and identify and document the scenarios we needed to cover. At this point, we had a clear understanding of what our pipeline was expected to do and which scenarios mattered, so we decided to introduce test automation into the SDLC.
A Hybrid framework with Pact.io
With contract tests, we tested each integration point by checking each application in isolation, ensuring that the messages the upstream system sends, or the downstream system receives, conform to a shared understanding documented in a “contract.” This contract is our source of truth for validating the event schema received from upstream sources. In this way, we minimised our dependency on the source system.
This is how it works. As consumers of the events from the upstream system, we wrote consumer-driven contract tests that generated a pact (contract) describing what we expected from the source system, and shared it with the provider (source) team. The source team then structured the raw events according to the contract we shared.
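The idea of checking an event against a consumer's expectations can be sketched in plain Java. This is a simplified stand-in for illustration only, not Pact's API and not the framework's actual code: the `EventContract` class and the field names (`eventId`, `employeeId`, `timestamp`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified contract: required field name -> expected Java type. */
final class EventContract {
    private final Map<String, Class<?>> requiredFields = new LinkedHashMap<>();

    /** Declares that the consumer expects this field with this type. */
    EventContract require(String field, Class<?> type) {
        requiredFields.put(field, type);
        return this;
    }

    /** Returns readable violations; an empty list means the event conforms. */
    List<String> violations(Map<String, Object> event) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Class<?>> rule : requiredFields.entrySet()) {
            Object value = event.get(rule.getKey());
            if (value == null) {
                problems.add("missing field: " + rule.getKey());
            } else if (!rule.getValue().isInstance(value)) {
                problems.add("wrong type for " + rule.getKey()
                        + ": expected " + rule.getValue().getSimpleName()
                        + ", got " + value.getClass().getSimpleName());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumed field names, chosen only for the example.
        EventContract contract = new EventContract()
                .require("eventId", String.class)
                .require("employeeId", String.class)
                .require("timestamp", Long.class);

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("eventId", "evt-1");
        event.put("employeeId", 42); // wrong type on purpose; timestamp is absent

        System.out.println(contract.violations(event));
    }
}
```

In a real setup, Pact generates the contract file from the consumer tests and the provider verifies against it; the sketch only shows the kind of check that the shared contract makes possible.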
Test Automation Framework components
We developed a hybrid framework to automate the testing effort. Its components are described below:
- Extent Reports: A Java library used to generate customised test execution reports.
- Custom Utilities: A Java class containing methods for different actions, such as verifying data in a BigQuery data lake.
- Unit Testing Framework JUnit 5: We used JUnit 5 as the test runner; JUnit assertions let us compare expected and actual behavior after test execution.
- Contract Testing Framework: We introduced contract tests to remove our dependency on the upstream team and developed extended integration tests with Pact.io.
- Consumer Driven Contract Tests: As we relied heavily on raw events from the source team, these tests helped us ensure that the source events were compatible with our expectations. This minimised our dependency on the source team.
- Pact Broker: A centralised application for sharing contracts and verification results with the data source team.
- CI Integration with Concourse CI: We used Concourse CI as our centralised CI/CD platform and integrated our tests with it for on-demand test execution and faster feedback.
In the end, we designed a plug-and-play, reusable automation framework for testing streaming ingestion data pipelines, which gave us faster feedback and helped us find roadblocks in production.
- This framework enabled us to generate our own dynamic events by restructuring the sample events.
- Increased our test coverage and enabled us to test more negative scenarios by manipulating the shared sample events.
- Enabled on-demand test execution and reduced our regression cycle by 70% by integrating the tests with CI.
- Helped the development team accelerate developers' testing.
- Reduced delivery cycle time.
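Generating dynamic events by restructuring a shared sample might look roughly like the following. This is a minimal sketch under stated assumptions, not the framework's actual implementation: events are modelled as plain maps, and the `EventVariants` class and the field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that derives test events from one shared sample event. */
final class EventVariants {

    /** Returns a copy of the sample with one field removed; the sample is untouched. */
    static Map<String, Object> withoutField(Map<String, Object> sample, String field) {
        Map<String, Object> copy = new LinkedHashMap<>(sample);
        copy.remove(field);
        return copy;
    }

    /** Returns a copy of the sample with one field overwritten; the sample is untouched. */
    static Map<String, Object> withValue(Map<String, Object> sample, String field, Object value) {
        Map<String, Object> copy = new LinkedHashMap<>(sample);
        copy.put(field, value);
        return copy;
    }

    /** Builds one negative variant per field, each missing exactly that field. */
    static List<Map<String, Object>> missingFieldVariants(Map<String, Object> sample) {
        List<Map<String, Object>> variants = new ArrayList<>();
        for (String field : sample.keySet()) {
            variants.add(withoutField(sample, field));
        }
        return variants;
    }

    public static void main(String[] args) {
        // Assumed sample event shape, chosen only for the example.
        Map<String, Object> sample = new LinkedHashMap<>();
        sample.put("eventId", "evt-1");
        sample.put("employeeId", "emp-7");
        sample.put("timestamp", 1700000000000L);

        // Negative scenarios: each variant drops one field.
        missingFieldVariants(sample).forEach(System.out::println);
        // Negative scenario: wrong type for an existing field.
        System.out.println(withValue(sample, "timestamp", "not-a-number"));
    }
}
```

Deriving variants from a known-good sample keeps the positive case aligned with what the source team shares, while the negative cases can grow without further requests to that team.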
“OMNI is an analytical platform which interacts with multiple upstream systems, so we have lots of dependency on them for test data. The automation framework helps us to minimise our dependency on the source system to accelerate the development and on demand test execution.”
Senior Engineering Manager