Financial institutions are some of the most highly regulated enterprises in the world, especially with respect to data storage. Given the combination of legislation enacted by governments around the globe concerning consumer data privacy, the collection and safeguarding of private financial information, and secure storage requirements, the challenges of data storage and migration for financial institutions have become monumental. Add to this “perfect storm” the complexity created by the 24/7/365 nature of the global banking transaction system, and it’s easy to see why a global financial services enterprise with operations in three dozen countries and over 70 million customers was looking for a block data migration solution that could remotely migrate a massive amount of data across eight data centers from legacy storage to its recently acquired flash storage arrays, all with no impact on the underlying production applications.
Cirrus Data and the market-leading flash Storage OEM featured in this case study have had a successful partnership for the better part of the last decade, built on hundreds of successful block data migration projects delivered for global enterprises and for companies large and small.
The customer in this case study faced several unique challenges from operating in a highly regulated environment, and these were exacerbated by the scope and scale of the migration project. As a high-profile member of the global financial network, the customer had zero tolerance for any risk of data loss or corruption, or for any type of business interruption. It was apparent from the outset that this project would be jointly managed by the customer, the Storage OEM, and Cirrus Data, supplemented by a team of highly skilled Cirrus migration remote delivery SMEs who were needed to meet the project timeline.
To maintain operations and ensure data integrity, the project was subject to numerous “change freezes,” which consumed a significant amount of valuable migration time over the course of the project. The change freezes typically included normal end-of-month and end-of-quarter freezes, freezes for major system and security upgrades performed elsewhere, and even unscheduled and sometimes urgent freezes prompted by various global conflicts and the sanctions requirements governments imposed on the banking system itself. Although change freezes were in effect for more than half of the available migration days each year, Cirrus Data technology fortunately allowed the project team to remotely access the physical data centers hosting the company’s storage systems even while freezes were active. To complicate matters further, a global pandemic was raging throughout the entire migration timeline, so some delays were inevitable.
Because the Cirrus Data CMO technology could remotely access the storage arrays hosted at the various data centers, all parties agreed, after significant planning and approval, to deploy 32 pairs of CMO appliances across numerous network fabrics in eight data center locations, all hosting the end-of-life storage systems. The number of appliance pairs deployed at each site depended on the amount of data to be migrated and on bandwidth considerations. All of them were ready to be inserted into the SAN fabric as soon as a change window became available, thereby enabling “Migration as a Service” on demand.
After the initial deployment of the CMO appliances, migrations could proceed even with change freezes in effect. This was permitted because the CMO solution uses Intelligent Quality of Service (iQoS), which continually monitors I/O activity in real time during the migration and automatically adjusts the migration rate to maintain a constant quality-of-service level while still moving as much data as possible to the new storage, using 100% of the spare bandwidth when it is available.
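The idea behind this kind of adaptive throttling can be illustrated with a minimal sketch. This is not the iQoS implementation, whose internals are not described here; it only assumes that a controller can observe link capacity and current production I/O, and it gives the migration whatever bandwidth is left over. The function name, parameters, and units are all hypothetical.

```python
def next_migration_rate(link_capacity_mbps: float,
                        production_io_mbps: float,
                        reserve_mbps: float = 0.0) -> float:
    """Return the bandwidth the migration may use for the next interval.

    Hypothetical model: production I/O always takes priority, an optional
    reserve is held back as headroom, and the migration receives all of
    the remaining (spare) bandwidth, never a negative amount.
    """
    spare = link_capacity_mbps - production_io_mbps - reserve_mbps
    return max(0.0, spare)


# Illustrative samples of production I/O on a 1000 MB/s link (made-up values).
samples = [200.0, 950.0, 1100.0]
rates = [next_migration_rate(1000.0, io) for io in samples]
```

When production traffic is light, the migration runs close to link speed; when production traffic saturates the link, the migration rate drops to zero until bandwidth frees up again.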
Before the CMO appliances could even be deployed, however, there was another obstacle to overcome. The outgoing storage had a security requirement under which host groups had to be created and mapped to the storage targets before any access to the data was allowed. During insertion into the SAN, CMO uses “proxy” addresses that stand in for the hosts and connect downstream to the storage. Like a bouncer at a private club, the storage must recognize who is on its access list. It quickly became apparent that creating these “proxy” mapping access lists for thousands of hosts, each with its many associated paths, was going to be excruciatingly slow if performed through the vendor-supplied GUI. In fact, the migration experts on the customer team remarked that they believed this obstacle would significantly lengthen the project timeline, if not put the entire timeline in jeopardy.
That’s when Cirrus Data engineers studied the issue and, using the powerful Cirrus Data REST API, quickly developed, tested, and implemented an automation tool to solve this specific problem. The tool used the storage array’s API to add the necessary host groups, which satisfied the storage security requirements and allowed CMO to connect to the downstream arrays. Once CMO was connected to the storage arrays and hosts, migration administrators could create migration sessions in CMO. This one automation step helped save nearly 2,000 project hours and was just one of many regulatory or procedural hurdles that Cirrus Data and its partner overcame during the project.
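To give a sense of why scripting this step pays off, the sketch below generates one host-group request per host from an inventory, expanding every proxy-initiator and target-port pair into a path entry. It is an illustration only: the data model, field names, and the `cmo-proxy-` naming scheme are hypothetical, and no real Cirrus Data or storage-array API call is shown or implied.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Host:
    name: str
    proxy_wwpns: Tuple[str, ...]  # hypothetical proxy initiator addresses


def build_host_group_requests(hosts: List[Host],
                              target_ports: List[str]) -> List[Dict]:
    """Build one hypothetical host-group payload per host.

    Each payload lists the host's proxy initiators and every
    initiator/target-port combination that must be allowed.
    """
    requests = []
    for host in hosts:
        paths = [{"initiator": i, "target": t}
                 for i, t in product(host.proxy_wwpns, target_ports)]
        requests.append({
            "host_group": f"cmo-proxy-{host.name}",  # made-up naming scheme
            "initiators": list(host.proxy_wwpns),
            "paths": paths,
        })
    return requests


# Made-up inventory: two hosts, two proxy initiators each, two target ports.
inventory = [Host("app01", ("wwpn-a1", "wwpn-a2")),
             Host("db01", ("wwpn-b1", "wwpn-b2"))]
payloads = build_host_group_requests(inventory, ["tgt-0", "tgt-1"])
```

Even in this toy inventory, each host needs four path entries. With thousands of hosts, the number of entries grows well beyond what is practical to enter by hand in a GUI, which is the situation the automation tool addressed.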
Almost 4,000 insertions were completed successfully, spread over various “waves” as change windows became available. Once the obstacles described above were overcome, progress scaled up dramatically over a short period of time. As can be seen from Figure 1 (at left), insertions started slowly, with only 167 performed during the first four months of the project. Once the automation was deployed, the pace accelerated to almost 1,800 insertions over the next seven months (about 2,000 over 11 months in 2021), with nearly 2,000 more insertions performed at scale over the final six months of the project during 2022.
Once the Cirrus Data appliances were inserted into the SAN fabrics, the project team was able to migrate data at will. As can be seen from Figure 2 (at right), the rate of data migrated started slowly, with approximately 115 TB migrated during the first four months of the project. The rate then accelerated to 2.9 PB over the next six months, for a total of 3 PB migrated during 2021. Once at scale, the project team migrated nearly 9 PB more over the next six-month period, with a record 2.6 PB migrated in just one of those months!
Final host cutovers followed a similar pattern of acceleration, as can be seen in Figure 3 below. During the first four months of the project, while the obstacles described above were being resolved, the project team was able to perform only 148 cutovers. Once the obstacles were removed, the rate of host cutovers accelerated dramatically, with the project team performing 1,232 cutovers over the next six months for a total of 1,380 host cutovers during the first ten months of the project in 2021. Once at scale, the project team using CMO was able to perform over 2,300 host cutovers during the first six months of 2022, all without impacting the operational performance of the underlying applications.
This use case clearly demonstrates the value of deploying the CMO appliances as part of the permanent storage infrastructure, thereby enabling zero-impact data migration to be performed at scale and at will, non-disruptively and as a service.
In the end, the customer was thrilled with the ease of use of the CMO appliances and the speed at which they migrated data from the old storage arrays to the new arrays, all without operational impact.
Another point worthy of note is that when the time came to un-insert the CMO appliances and perform the final host cutovers to the new storage, the migration project team was further delayed by change freezes similar to those that had occurred during the insertion process. In fact, there were only 128 cutover days available during 2021, and even fewer, just 115, in 2022. With 243 available days in total, the team achieved an effective rate of roughly fifteen cutovers per available day. Despite all these challenges and the narrow windows of opportunity to migrate data, the project was a huge success thanks to Cirrus Data support, CMO, and the cooperation between the customer’s management and the joint Cirrus Data and OEM project team.
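The effective cutover rate can be reproduced from the figures reported above. The short calculation below uses only those numbers. Because the 2022 count is reported as “over 2,300,” it is treated here as a lower bound, so the result is a conservative estimate. The variable names are illustrative.

```python
# Host cutovers reported in the case study.
cutovers_2021 = 148 + 1232   # first four months + next six months = 1,380
cutovers_2022 = 2300         # reported as "over 2,300": treated as a lower bound

# Cutover days that were not blocked by change freezes.
available_days = 128 + 115   # 2021 + 2022 = 243

total_cutovers = cutovers_2021 + cutovers_2022
cutovers_per_available_day = total_cutovers / available_days

print(f"{total_cutovers} cutovers over {available_days} days "
      f"= {cutovers_per_available_day:.1f} per available day")
```

This works out to at least 3,680 cutovers over 243 available days, or about 15.1 per available day, consistent with the rate of roughly fifteen quoted above.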