“Quality is never an accident; it is always the result of intelligent effort.” – John Ruskin

Most engineers have heard of Murphy’s Law, which simply states, “Anything that can go wrong, will go wrong.” Years of engineering experience have taught us the most important thing about this law: it is ACTUALLY TRUE, particularly when it comes to software development. Your own experience is probably all the evidence you need.

As a software company, it is therefore essential to institute a strong philosophy and culture within the organization that software quality is the number one priority above and beyond anything else – that when it comes to quality of software products, everyone must strive for the “best” and not merely for the “good enough.” Starting with the very first day of any engineering effort at Cirrus Data, we put an extraordinary amount of focus on ensuring that software quality is paramount.

Our Data Migration Server (DMS) has been responsible for helping our partners around the world migrate many petabytes of data across storage arrays from data center to data center. Our DCS has also been installed in many locations across the globe with great success. The overwhelmingly positive accolades our products have received from customers and partners would not have happened were it not for the careful scrutiny of our great corps of QA engineers.

Nevertheless, we didn’t stop there. The reality is that there are never “enough” resources to perform “enough” tests. The key, therefore, is the efficiency of the quality assurance process.

One straightforward way to increase product reliability is to increase the test mileage on all supported features by running tests around the clock. So should we start mandating that our QA engineers clock 12-hour shifts? Of course not! At Cirrus Data, we firmly believe hard work is totally overrated. Why burn out attentive and conscientious QA employees when we can let machines do what they do best? This is where test automation comes in.

Test automation is not a new concept. Some form of automatic testing exists in any QA operation. But as Rod Michael, an executive from a large industrial automation company, puts it: “If you automate a mess, you get an automated mess.” A meaningful, effective automated test system is created through intelligent design. It’s not something that just happens.

For unit testing, there are many frameworks that developers generally employ to ensure that their code does not break. JUnit for Java and the Karma test runner for JavaScript are examples that most organized development teams should already be using. For larger-scale integration tests, scripts are usually created to simplify tasks that would be too tedious for humans to execute.

Creating automated tests this way is often very time consuming and not cost-effective, especially in environments that require many components and environmental configurations to be observed and coordinated. Therefore, a more systematic and comprehensive architecture is desirable, which brings us back to the “Engineering” part of “Software Engineering.”

Before diving right into a fully automated 24-7 testing lab, we focused our R&D effort on the new architecture, just as we do when developing any new product. We were determined to design and put together a robust and efficient test automation architecture that could effectively facilitate our quality assurance effort in enterprise storage.

From the results of the research, and an extensive study of our own QA operations with existing automation efforts, we concluded that a number of things are crucial to maximizing a product’s reliability via effective automated testing:

  • Centralization
    • The automation effort must be coordinated not only among QA engineers and developers, but also among multiple national or global facilities in the organization.
    • Test scheduling, executing, and reporting can be coordinated from a central place, and managed in a multi-user environment.
  • Customization
    • A test case platform that lets tests be customized at runtime via a web-based GUI makes it possible to develop “generic” test cases instead of machine-specific “scripts.” Develop Once, Use Everywhere (DOUE).
  • Multi-Component, Cross-Platform
    • The “DOUE” principle applies not only to different tests, but also to different platforms and components within a test. This implies an agent-system that can be installed on different OS platforms.
  • Cost-Effective
    • The fundamental motivation is that the effort spent on designing, developing, and implementing the infrastructure will be substantially less than that spent on performing manual tests.
  • Automation Process Development
    • Normalize test script writing to encapsulate most non-test-related logic within the framework (such as the logic to report the status and progress of tests). This allows QA engineers to quickly put together new test cases that focus ONLY on the test logic.
    • Re-usable test scripts allow new test cases to be created more efficiently.
  • Test Result Reporting and Analysis
    • Uniform reporting allows test results to be quickly and accurately analyzed.
  • Test Progress / Result Monitoring
    • A user-friendly, web-based GUI facilitates test plan configuration, as well as management and customization of test cases. It makes the operation workflow more efficient, as developers and QA engineers can each focus on their own tasks in the quality-driven paradigm.
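
To make the Customization and Automation Process Development points a little more concrete, here is a minimal Python sketch of the idea. It is not CDS-TAF code, and it makes no claim about how CDS-TAF is built. Every name in it (TestCase, TestResult, CopyIntegrityTest, run_plan, and the "source" and "copy" parameters) is a hypothetical stand-in chosen for illustration. The base class owns the status reporting, the test case holds only test logic, and the same generic test is reused with different runtime parameters.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class TestResult:
    """Uniform result record shared by every test (hypothetical schema)."""
    name: str
    status: str       # "PASS", "FAIL", or "ERROR"
    detail: str = ""


class TestCase:
    """Hypothetical base class: it handles reporting, subclasses hold test logic."""

    def __init__(self, params: Dict[str, Any]):
        # Runtime parameters, e.g. values a user might enter in a GUI.
        self.params = params

    def test(self) -> None:
        raise NotImplementedError

    def run(self) -> TestResult:
        name = type(self).__name__
        try:
            self.test()
        except AssertionError as exc:
            # A failed check in the test logic.
            return TestResult(name, "FAIL", str(exc))
        except Exception as exc:
            # Anything else, such as a missing parameter.
            return TestResult(name, "ERROR", repr(exc))
        return TestResult(name, "PASS")


class CopyIntegrityTest(TestCase):
    """Generic test: what to copy, and how, comes from the parameters."""

    def test(self) -> None:
        source = self.params["source"]
        target = self.params["copy"](source)
        assert target == source, "copied data differs from source"


def run_plan(cases: List[TestCase]) -> List[TestResult]:
    """Run every case and collect results in one uniform format."""
    return [case.run() for case in cases]


# One test case, reused with two different runtime configurations.
plan = [
    CopyIntegrityTest({"source": b"abc", "copy": bytes}),
    CopyIntegrityTest({"source": b"abc", "copy": lambda data: data[:-1]}),
]
for result in run_plan(plan):
    print(result.status, result.name, result.detail)
```

Because run() is written once in the base class, a new test case in this sketch only needs a test() method, and every result comes back in the same shape for reporting.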

Using this blueprint, we have designed our next-generation testing automation platform – the Cirrus Data Solutions Test Automation Framework, CDS-TAF.

In the next part of this article, we will walk through the detailed design and implementation process of the CDS-TAF and demonstrate how the platform further enhanced the quality of our product portfolio.

Stay Tuned.