Author: DiscoveryPlus Services Group

Why you need to do complete IT discovery before your cloud migration

There’s a mantra that I would repeat when training adult leaders for the Boy Scouts: “Fail to plan, plan to fail.” You’ll hear similar statements in training classes for project management and financial management. An aspect of planning that often gets less attention than it should is knowing what resources you have and what constraints you have to manage. In the infrastructure and operations world, we call this IT discovery.

Discovery is the process of identifying and documenting your infrastructure, its related components, and the relationships and dependencies in your operating environment.

Unless the application you’re migrating to the cloud is completely static, running the exact code, hardware, and operating system it had when it was deployed, with every feature and communication documented, you need to do discovery. Applications pick up features over time, especially if they’re monolithic, and data centers evolve.


Additionally, unless you’ve been diligent in documenting every change and how that change impacts the relationships and dependencies for your applications, you can expect problems or failure when you try to migrate them to the cloud without discovery.  

Over time, applications become more complex, amassing more features, more reporting, more compliance requirements, and more usability improvements. A common response to this growing complexity is to move to a microservice architecture, which breaks the application into independent component services.

While you can rearchitect your application to implement microservices, this takes time, money, and a lot of recoding. If you’re under time and cost constraints, you might not be able to support this type of project. How do you ensure you have a working application after the cloud migration? 

Complete discovery requires auto-discovery and human intelligence

Complete IT discovery

Complete IT discovery leverages both auto-discovery and human intelligence to understand what applications should and should not be migrated to the cloud. Adding an experienced discovery team to the effort ensures you capture data that cannot be discovered by a standard tool.


An auto-discovery tool can map relationships so that you know which servers are providing services to an application and regularly communicating with an application’s servers. The types of services being delivered to an application include database, web hosting, authentication and authorization, auditing, patching, and management. 

Your discovery tool should be able to provide a map showing the connections between your infrastructure and its components, including connections to the storage infrastructure and the ports used for communications. Use an agentless discovery tool; an agent-based tool can distort the application dependency map by recording each agent’s own reporting traffic to the data-collection server.

We’ve found that the best tools can map relationships both across data centers and in the cloud to find relationships that aren’t shown in the application’s documentation. An example of this is file transfers. Let’s say that when setting up your web applications, you set up an FTP server for your clients to send files to your staff. Later, you upgraded to secure FTP, and still later you moved to a secure file-transfer application or changed the file transfer to use HTTPS. You’d think that communications with the original FTP server would cease. But we’ve seen scenarios like this one where application servers were still communicating with the old FTP server, which, in theory, was retired several years earlier.


A good auto-discovery tool will show the communications on ports 20 and 21 (FTP) or port 22 (SFTP), the IP address of the FTP server, and the routing through the DMZ for your server. It will also reveal the ports used by your application to pull the data off the FTP server. We often see IP addresses showing active communications that aren’t in the documentation.
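To make the idea of "active communications that aren’t in the documentation" concrete, here is a minimal Python sketch that compares observed connections against a documented dependency list. Everything in it is an assumption for illustration: the `Flow` record, the IP addresses, and the sample data are hypothetical, and a real discovery tool would export far richer data in its own format.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """One observed or documented connection (hypothetical record layout)."""
    src: str
    dst: str
    port: int


# Hypothetical documented dependencies for one application server.
documented = {
    Flow("10.0.1.10", "10.0.2.20", 1521),  # database listener
    Flow("10.0.1.10", "10.0.3.30", 443),   # HTTPS file transfer
}

# Hypothetical connections exported from a discovery tool (duplicates are normal).
observed = [
    Flow("10.0.1.10", "10.0.2.20", 1521),
    Flow("10.0.1.10", "10.0.3.30", 443),
    Flow("10.0.1.10", "10.0.9.99", 21),
    Flow("10.0.1.10", "10.0.9.99", 21),
]

# Well-known file-transfer ports worth calling out in a report.
FILE_TRANSFER_PORTS = {20: "FTP data", 21: "FTP control", 22: "SSH/SFTP"}


def undocumented_flows(observed, documented):
    """Return the unique observed flows that the documentation does not list."""
    return sorted(set(observed) - set(documented))


def describe(flow):
    """Format a flow for a report, labelling known file-transfer ports."""
    label = FILE_TRANSFER_PORTS.get(flow.port, "unknown service")
    return f"{flow.src} -> {flow.dst}:{flow.port} ({label})"


surprises = undocumented_flows(observed, documented)
for flow in surprises:
    print(describe(flow))
```

In this made-up data set the only surprise is the application server still talking to the supposedly retired FTP host, which is exactly the kind of finding to raise with the application owner.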

In addition to learning the relationships among your servers, you also need to figure out what performance you’re getting. Using monitoring software, we chart throughput, latency, CPU, and RAM usage. I’ve seen web applications max out CPU and network bandwidth.

There was one scenario that involved a production Oracle database server. We were getting complaints about a website’s performance. The architecture for the web application was a load-balanced web tier running in a DMZ, supported by Oracle databases running on Solaris servers in a dedicated VLAN, accessed via a firewall. Each web application had a separate schema on the Oracle server and accessed the database instance via a service account. The web application handled authentication and authorization.

Investigation showed that instead of using built-in Oracle capabilities such as stored procedures to run a sequence of queries, the applications ran each query from the web server against the database and waited for a response before running the next query. The database server responded rapidly to each query, but the round trip through the firewall for every query added up to significant latency.
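The arithmetic behind this kind of latency problem is simple enough to sketch. The Python below is a rough model, not a measurement: the query count, execution time, and round-trip times are hypothetical numbers chosen only to show how per-query round trips dominate when queries run one at a time.

```python
def chatty_time_ms(queries: int, round_trip_ms: float, exec_ms: float) -> float:
    """Each query pays its own network round trip before the next one starts."""
    return queries * (round_trip_ms + exec_ms)


def batched_time_ms(queries: int, round_trip_ms: float, exec_ms: float) -> float:
    """All queries run on the database side after a single round trip."""
    return round_trip_ms + queries * exec_ms


# Hypothetical figures for illustration only.
QUERIES = 200            # queries needed to build one page
EXEC_MS = 0.5            # database execution time per query
FIREWALL_RTT_MS = 5.0    # round trip from web tier through the firewall
HIGHER_RTT_MS = 30.0     # round trip if the tiers were separated by a slower link

on_prem_chatty = chatty_time_ms(QUERIES, FIREWALL_RTT_MS, EXEC_MS)
on_prem_batched = batched_time_ms(QUERIES, FIREWALL_RTT_MS, EXEC_MS)
slow_link_chatty = chatty_time_ms(QUERIES, HIGHER_RTT_MS, EXEC_MS)

print(f"one query at a time, through the firewall: {on_prem_chatty} ms")
print(f"run as one batch on the database server:   {on_prem_batched} ms")
print(f"one query at a time, higher-latency link:  {slow_link_chatty} ms")
```

Under these assumed numbers the database does the same work in every case; only the number of round trips changes, which is why the fix has to address the query pattern or the network path rather than database capacity.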

We couldn’t get the developers to rewrite their application or to provide the DBAs the query so that they could set it up to run locally. To fix the problem, we moved the web application’s schema from the production server to a dedicated Oracle instance on a separate server, increasing licensing costs but resolving the performance complaints. This type of configuration makes this application a non-starter for the cloud. With the right tool, you’ll be able to identify issues like the one described above and have the data you need to make decisions about how to mitigate them.


Human intelligence

Complementing the data gathered from your discovery tool is human intelligence. Adding people to discovery provides context and analysis of the data gathered, but it should go a step further. The discovery team should review documentation with the application owner, the developers, security, and operations teams to ensure there are no surprises. You need to look at all the documentation, including client agreements. We’ve run into information location and protection requirements that explicitly forbid placing the data in the cloud or tightly limit which cloud service provider (CSP) can be used because the data has to remain in country.

Brexit is an example of the potential complexity related to data location. With the United Kingdom (UK) leaving the European Union (EU), how do data centers in the UK handle the data provided by their EU clients for EU citizens, and how do they comply with the EU’s General Data Protection Regulation (GDPR)?

Another possible complication relates to your disaster recovery (DR) requirements. If your policy mandates a hot or a warm DR site, you may find that replicating data to the failover site in the cloud is cost-prohibitive. The primary advantage of the cloud is the ability to scale up or scale down and pay for the compute you need, when you need it. Steady-state systems that don’t scale remove the cost savings. If you’re replicating data continuously from the source data center to the failover site, you’ll be paying data-transfer charges the whole time.
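A back-of-the-envelope estimate can show whether continuous replication is worth a closer look. The Python sketch below is only an illustration: the daily change volume, per-GB transfer price, instance count, and hourly rate are hypothetical placeholders, and which direction of transfer is billed varies by provider, so substitute the figures from your own monitoring data and your CSP’s price list.

```python
def monthly_replication_cost(changed_gb_per_day: float,
                             price_per_gb: float,
                             days: int = 30) -> float:
    """Estimated monthly charge for continuously replicating changed data."""
    transferred_gb = changed_gb_per_day * days
    return round(transferred_gb * price_per_gb, 2)


def monthly_standby_cost(instances: int,
                         hourly_rate: float,
                         hours: int = 730) -> float:
    """Estimated monthly charge for always-on (steady-state) standby servers."""
    return round(instances * hourly_rate * hours, 2)


# Hypothetical inputs for illustration only.
replication = monthly_replication_cost(changed_gb_per_day=500, price_per_gb=0.02)
standby = monthly_standby_cost(instances=4, hourly_rate=0.40)
total = round(replication + standby, 2)

print(f"replication: {replication}")
print(f"standby compute: {standby}")
print(f"estimated monthly DR cost: {total}")
```

Because a hot or warm site runs all the time, both terms are fixed monthly costs that never scale down, which is the point the paragraph above makes about steady-state systems.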

You could also discover that some of the systems supporting your application use RISC servers or an operating system that isn’t supported by your potential CSPs. Neither of these is a showstopper. But if you fail to plan how to manage the communications and fail to determine whether you can leave these systems in your data center or replace them with compatible equivalent systems, you are setting your migration up to fail.

When it comes to migrations, shortchanging discovery is the equivalent of planning to fail. Use an auto-discovery tool to identify all the relationships, use performance data to determine your system requirements, and validate both the application architecture and all contractual and legal constraints. Once you have this information you’re ready to start the cloud migration planning process.