9 cloud migration mistakes — and how to avoid them
The exponential increase in Amazon Web Services’ revenue in the past five years makes it clear that we are on the cusp of a generational transformation in how IT organizations provide application infrastructure.
Indeed, Gartner estimates that infrastructure-as-a-service (IaaS) revenue grew by nearly 43 percent in 2016 and said organizations saved “14 percent of their budgets as an outcome of public cloud adoption,” a ratio that is sure to rise in the coming years.
Many government IT organizations are at the forefront of the cloud conversion due to executive-level mandates, tight IT budgets and the increasing demand for online access to information and services.
Given the massive installed base of IT infrastructure and the billions spent every year on equipment, software and services in the public sector, shifting deployment models from on-premises to in-cloud won’t happen overnight. Instead, it will span generations of technology and applications. So government IT organizations have time to do it right and learn from the mistakes made by early, overly hasty adopters.
Many of the mistakes could happen with any major IT project, but some are a function of how cloud services are used. Most migration scenarios involve data movement, such as using a cloud service for backup and disaster recovery or redeploying an enterprise application that relies on legacy, on-premises data sources. In those situations, it can be challenging to ensure the completeness, integrity and security of data migration.
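One common way to check the completeness and integrity of a data migration is to compare checksums of the source files against the migrated copies. The following is a minimal sketch of that idea, not a prescribed method. The function names and the directory-based layout are assumptions made for illustration, and a real migration to object storage would build the target manifest from whatever listing the provider exposes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing, unexpected or altered on the target."""
    shared = set(source) & set(target)
    return {
        "missing": sorted(set(source) - set(target)),      # completeness
        "unexpected": sorted(set(target) - set(source)),
        "mismatched": sorted(n for n in shared if source[n] != target[n]),  # integrity
    }
```

A migration is complete and intact by this measure only when all three lists come back empty. Checksums say nothing about security, which still has to be verified separately.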
When moving applications to the cloud, IT leaders must decide how to maintain access to necessary data sources. If they opt for a hybrid architecture, with applications in the cloud and master data copies remaining on premises, it’s easy to make mistakes setting up network connections, security policies and replication settings.
For example, there are no hard rules for choosing which data to replicate, how often to replicate it, how long to retain it and how many replicas to keep. Administrators must also make decisions about whether to migrate all application layers to the cloud or just the user interface and business logic, leaving databases behind. Indeed, the IT team must decide whether the complexity of a hybrid design outweighs the expense of fully replicating all data to the cloud.
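Although there are no hard rules, the choices interact, and a rough calculation can show how quickly they add up. The sketch below is illustrative only: the class and field names are hypothetical, and it assumes that every retained copy is a full copy, so it gives an upper bound rather than what an incremental or deduplicated replication product would actually store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    dataset_gb: float       # size of the data selected for replication
    interval_hours: float   # how often a copy is taken
    retention_days: float   # how long each copy is kept
    replicas: int           # how many locations hold each copy

    def retained_copies(self) -> int:
        """Point-in-time copies kept at each location at any one time."""
        return int(self.retention_days * 24 // self.interval_hours)

    def worst_case_storage_gb(self) -> float:
        """Upper bound on storage, assuming every copy is a full copy."""
        return self.dataset_gb * self.retained_copies() * self.replicas


daily = ReplicationPolicy(dataset_gb=500, interval_hours=24, retention_days=30, replicas=2)
six_hourly = ReplicationPolicy(dataset_gb=500, interval_hours=6, retention_days=30, replicas=2)
```

Under these assumptions, the daily policy keeps 30 copies per location and could need up to 30,000 GB, while replicating the same 500 GB every six hours quadruples both figures.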
Other migration scenarios — such as replacing on-premises applications with software as a service — are simpler. However, they create problems of application governance, feature mismatch, policy compliance and client support, particularly for organizations with a large fleet of legacy devices that are long overdue for upgrades. Although cloud migrations shouldn’t mean modifications to security policy, they do change how that policy is implemented.
Loren Hudziak, Google’s senior solution architect, pointed out that “government and other regulated industries have no shortage of accreditation and certifications to consider, but to simply check the box of a cloud provider as meeting one of them isn’t enough.”
Regardless of the technical details, migrations are a major undertaking, and organizations face the pitfalls of any large project. Skimping on planning, product research and testing; starting with overly ambitious goals; and failing to manage internal politics and stakeholders are all recipes for failure.
Whether you’re a federal agency, local government IT department or a large enterprise, several categories of common migration mistakes span disciplines, including due diligence, technical features, security policy, deployment and testing, project management and implementation governance.
Here are some common traps to avoid:
1. Assuming that all cloud providers are roughly the same is a particularly dangerous mistake when migrating applications to IaaS. Although every cloud solution offers virtual machines and several types of raw storage, differences arise in feature details, billing models, and higher-level application and network services.
2. The flip side to stereotyping clouds is overly customizing an IaaS deployment and building a “snowflake” environment that cannot be templated to bootstrap future migration projects. This often happens when a single department runs the project and the application team creates custom management processes, security policies and service configurations that aren’t applicable to the broader organization.
3. Another common mistake is not using native cloud services and instead running your own services on generic virtual machine instances. Although this seems to contradict the customization argument above, keeping things too simple can forfeit some key advantages of the cloud, even with the best provider.
This is an easy mistake to make because it seems that the best way to avoid cloud vendor lock-in is to build applications that can quickly be moved to other clouds. However, the result is spending an inordinate amount of upfront time reinventing the wheel while simultaneously ignoring one of the chief advantages of cloud infrastructure: high-value services that can be instantly created and consumed as needed without deployment or management overhead.
Even when using generic services such as Amazon Web Services’ Elastic Compute Cloud, Elastic Block Store or Simple Storage Service, some level of redesign will be required if applications move to another IaaS platform.
4. Data integrity problems are often due to inadequate design and testing of data migration and replication systems and processes. Backup software and service providers are fond of touting surveys showing how few organizations have a disaster recovery plan and how many of those with a plan never test it. Despite vendors’ self-serving motives, it is true that IT organizations are as prone to procrastination as anyone. A commitment to documenting details and validating results is crucial when migrating critical data to the cloud.
5. A corollary is inadequate testing in general. Whether it’s application functionality, cloud administration and troubleshooting processes or security compliance, moving infrastructure to the cloud requires the same attention to implementation details as building a new data center.
6. Moving infrastructure to the cloud often leads to incomplete or inconsistent security policies that don’t adhere to established standards. All organizations have security requirements for user access and authorization, network traffic, system and application configuration, event logging and monitoring. Those policies don’t change and might in fact become more stringent in the cloud. As Hudziak noted, it’s imperative that organizations maintain a multilevel defense posture when migrating to the cloud.
7-8. Two common mistakes in building hybrid cloud environments are overlooking application dependencies on on-premises data and IT services, and overlooking network connectivity problems with virtual private network configurations, routing and remote network security policies.
9. It’s easy to forget that virtual cloud resources are supplied by physical servers running in actual data centers. Although spinning up a dozen virtual machines and a 1 terabyte SQL database in a matter of seconds seems like magic, the cloud cannot defy physics. When deploying latency-sensitive applications, a common mistake is forgetting that distance and geography still matter. Applications that move a lot of data or manage user interfaces will perform better if the cloud provider has data centers or zones near your facilities.
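The point about distance in item 9 can be made concrete with a back-of-the-envelope calculation. The sketch below estimates the lowest possible round-trip time over optical fiber for a given distance. The two-thirds velocity factor is a commonly used approximation, not a measured value, and the function names are illustrative. Real latency will be higher because of routing, queuing and indirect cable paths.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in a vacuum
FIBER_VELOCITY_FACTOR = 2 / 3          # assumption: light in fiber travels at about 2/3 c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber for a one-way distance."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR
    return 2 * distance_km / speed_in_fiber * 1000


def min_interaction_delay_ms(distance_km: float, sequential_round_trips: int) -> float:
    """Delay floor when one user action needs several back-to-back round trips."""
    return min_round_trip_ms(distance_km) * sequential_round_trips
```

For a data center 4,000 km away, the floor is about 40 milliseconds per round trip, so a screen that needs 20 sequential requests cannot respond in less than roughly 0.8 seconds no matter how fast the servers are.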
Big bang vs. incremental approach
Government IT departments face some unique challenges when migrating to the cloud because of regulatory, security and application requirements. Mistakes can arise when providers are not adequately evaluated for their fit with current application requirements and user needs. But aside from the technical hurdles, IT leaders should not underestimate the organizational challenges of a cloud migration.
“It really has to start at the top with buy-in from the CIO and senior leadership,” a General Services Administration spokesman said. GSA, which manages the Federal Risk and Authorization Management Program (FedRAMP) and offers a range of cloud services through its various acquisition vehicles, has also learned the need to start small and not jeopardize ongoing operations.
“Trying to transform an organization overnight — for example, asking them to move production to the cloud when previously it’s been off limits — can be an intimidating milestone for a traditional CIO organization,” the GSA official said. “So an effective approach creates a sandbox — contractually, technically and organizationally — to allow the knowledge to incubate and grow.”
Indeed, a bimodal IT structure can be an effective way to introduce cloud services at a government agency, often in concert with other innovative IT practices such as DevOps, agile development, and continuous integration and delivery processes.
Hudziak emphasized the importance of exploiting cloud services to do things in new ways and not just “lift and shift” an existing environment.
“Making the move to the cloud should be taken as an opportunity to revisit the organization’s functional and business requirements,” he said. “CIOs should ask: Have we been doing things the way we have because the technology has historically forced us into that pattern?”
GSA officials cited the example of NASA’s incremental approach to cloud migrations. NASA funded its first cloud contract with just $25,000 and then doubled it as IT managers learned and grew confident in the technology.
NASA officials realized that “a massive broker contract was inefficient and instead separated out the roles of consumption metering and billing from consulting or the assistance with integration and delivery,” the GSA spokesman said. “This led to increased transparency in tracking spending and contracting efficiency [because] the cloud accounts and operational services did not have to be transitioned concurrently with support contracts.”
FedRAMP is not a panacea
For agencies that must follow the Federal Cloud Computing Strategy, FedRAMP provides a centralized system to streamline the assessment, authorization and procurement of cloud services. However, it does not obviate the need for due diligence, including service-level agreements (SLAs).
For example, a September 2016 audit by the Government Publishing Office’s inspector general found that although GPO’s cloud service provider was FedRAMP-approved, important problems persisted, specifically:
- “GPO policy did not include cloud computing and/or hosted service definitions, principles, rules and guidelines.”
- “Personnel did not follow configuration management policy during the transition to the Amazon Web Services.”
- “Contract language did not address hosted services.”
“Lack of appropriate contract language for data ownership established an increased risk,” the IG’s report states. “Such a risk could have allowed the cloud provider with unnecessary access to federal data.”
The audit also notes that GPO did not incorporate new Amazon Web Services instances into its existing configuration management database and procedures, an oversight that illustrates an important point: Every cloud migration must include process and administration integration and not treat the cloud as a one-off environment.
Although cloud services are notorious for having vague or incomplete SLAs, it’s important to hold a vendor accountable by documenting meaningful performance measures that are driven by an IT organization’s key performance indicators. Federal agencies are expected to have project- and application-specific SLAs, as detailed by a Government Accountability Office report covering essential practices for cloud computing.
GAO specified 10 key practices for an SLA, among them “identifying the roles and responsibilities of major stakeholders, defining performance objectives and specifying security metrics. The key practices, if properly implemented, can help agencies ensure services are performed effectively, efficiently and securely.”
After examining 21 cloud service contracts at five agencies, GAO auditors found that only seven fulfilled all 10 of the guidelines.
Jamie Tischart, McAfee’s CTO for cloud and security as a service, detailed a set of questions cloud buyers should ask that echoed many of GAO’s guidelines. Essentially, he said being a wise cloud consumer requires developing a sophisticated understanding of a vendor’s operational and security policies and controls. That understanding comes from studying the available documentation and service agreements and asking the right questions when those resources fail to provide enough detail.
A holistic approach to innovation
With cloud migrations, government IT organizations should avoid reinventing the wheel. Exploiting the expertise of cloud pioneers and other resources can dramatically simplify migration planning. For federal agencies especially, there are now communities of interest around FedRAMP, the Trusted Internet Connections mandate and the Data Center Optimization Initiative.
Agencies should also draw on existing standards and guidelines. As the GPO IG’s audit points out, the federal CIO Council and the Chief Acquisition Officers Council have published best practices for acquiring IT services. Their joint report, “Creating Effective Cloud Computing Contracts for the Federal Government: Best Practices for Acquiring IT as a Service,” includes guidelines for selecting cloud services, writing contracts and SLAs, delineating responsibilities between providers and agencies, and establishing standards for security, privacy, e-discovery and Freedom of Information Act requests.
However, agencies at all levels of government should take care not to interpret government standards too liberally and extend them to situations and technologies they couldn’t anticipate.
“The government IT landscape is made up of policies and accreditations that keep it locked in the past,” Hudziak said. “Many of these standards were created years ago when things were very different — not just the technology, but entire development processes and languages, as well as how we leverage modern solutions like virtual machines and containerization.”
Standards and frameworks like FedRAMP are a step in the right direction, he added, “but for government to move to the cloud, the notion of innovation needs to be holistic, and regulations need to keep up with the technology developments.”
Again, leave the big bang to physics because starting too large is a common mistake. Cloud migration is a journey that should begin with small, relatively low-risk applications and data, and add more complex systems as agencies gain experience and refine governance processes.
Other tips for planning a cloud migration include:
1. Don’t underestimate the costs involved. As a Congressional Research Service report on the Federal Cloud Computing Strategy points out, “If a user needs to move resources such as data from its own local facilities to those of the cloud provider, there will be a cost for such migration. That cost will depend on several factors, such as the size of the resources being moved, the method by which they are moved and whether the resources will need to be modified. Such costs are also a consideration with respect to a potential move from one cloud provider to another.”
2. Analyze resource use to manage costs. Have a complete cost model that accurately reflects service use and that can be used to lower spending by opportunistically using discounts for reserved instances (for steady-state workloads) and spot instances (for batch or asynchronous workloads that aren’t time-critical).
3. Make sure you have adequate staffing for migration projects. Plan for training in cloud technologies, and don’t assume IT employees can quickly transfer existing skills and expertise to the cloud as their roles change from operations and administration to systems integration and capacity management.
4. Think about automation and orchestration upfront, before doing the first application migrations. Tools such as Amazon Web Services’ CloudFormation and OpsWorks, Microsoft’s Azure templates and third-party software such as Ansible, Chef, Puppet and Salt can streamline migration tasks, maintain consistency and reduce ongoing operations overhead.
5. Don’t put cloud deployments on autopilot. Instead, define cloud-specific management processes and associated tools and assign employees to actively monitor cloud deployments, resource use and application performance.
6. When using the cloud for disaster recovery or continuity of operations, ensure that plans rely on multiple cloud services or independent regions at the same provider. You do not want a primary and backup site taken out by the same infrastructure failure.
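The cost model described in the second tip can start as simple arithmetic. The sketch below sorts workloads into on-demand, reserved and spot pricing along the lines that tip suggests and estimates a monthly bill. Every name, rate and discount in it is a hypothetical placeholder rather than any provider's actual pricing, and it assumes reserved capacity is billed for every hour of the month whether or not it is used.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # average number of hours in a month


@dataclass(frozen=True)
class Workload:
    name: str
    instances: int
    hours_per_month: float  # hours each instance actually runs
    steady_state: bool      # runs around the clock
    interruptible: bool     # batch or asynchronous, not time-critical


def pricing_model(workload: Workload) -> str:
    """Pick a pricing model from the workload's characteristics."""
    if workload.interruptible:
        return "spot"
    if workload.steady_state:
        return "reserved"
    return "on_demand"


def monthly_cost(workload: Workload, on_demand_rate: float, discounts: dict[str, float]) -> float:
    """Estimate monthly cost from an hourly on-demand rate and fractional discounts."""
    model = pricing_model(workload)
    # Assumption: reserved capacity is paid for all month, used or not.
    hours = HOURS_PER_MONTH if model == "reserved" else workload.hours_per_month
    rate = on_demand_rate * (1 - discounts.get(model, 0.0))
    return workload.instances * hours * rate


# Hypothetical inputs for illustration only.
discounts = {"reserved": 0.40, "spot": 0.70}
web = Workload("web tier", instances=4, hours_per_month=730, steady_state=True, interruptible=False)
batch = Workload("nightly batch", instances=10, hours_per_month=100, steady_state=False, interruptible=True)
```

Running the same workloads through the model with the discounts set to zero gives the all-on-demand baseline, which makes the potential savings visible before any commitment is made.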
Kurt Marko is a technology consultant and writer based in Boise, Idaho.