When is a good time for a healthcare system to consider cloud infrastructure?
A growing client with a portfolio of 800+ applications discovered the answer to this question when they needed to better standardize and automate system management. The teams managing the applications, servers, and underlying components could not scale to handle the differences between systems, and staffing and budgets continued to be cut.
They needed a better way to consistently manage it all.
Various applications within the client’s portfolio were in mixed states of complexity and maturity. Because of this, they could not immediately be consolidated without impacting both the acute and ambulatory sites. Designing a streamlined, methodical transition was imperative to our success.
Applications were hosted in data centers and main distribution frames (MDFs) in various regional locations. Migrating these applications to cloud hosting, while technically feasible, had to be done in a way that would ensure that vendor support and management did not impact the clinical experience.
Multiple applications could not easily be moved to remote or cloud hosting, due to latency sensitivity and various other technical reasons. Migration to alternate management systems needed to avoid introducing downtime, wherever possible.
Finally, an application catalog existed but was last updated years ago. It did not have relevant application metadata and was about 50% current (at best).
Our Approach & Methodology
First and foremost, refreshing the application catalog was the priority.
We chose not to focus on standardized app catalog metadata. Instead, we included data collection that would help us build out a new strategy. Then, we could qualify and categorize applications based on technical, process, and business criticality/dependency.
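To make the categorization step concrete, here is a minimal sketch of how a refreshed catalog entry might be qualified. The field names, category labels, and rules are illustrative assumptions, not the criteria actually used in this engagement.

```python
from dataclasses import dataclass


@dataclass
class AppRecord:
    """One catalog entry. All fields are hypothetical examples."""
    name: str
    latency_sensitive: bool      # technical criterion
    vendor_supported: bool       # process criterion
    business_criticality: int    # 1 (low) to 3 (clinical-critical)


def categorize(app: AppRecord) -> str:
    """Return an illustrative management category for one application."""
    if app.latency_sensitive:
        # Latency-sensitive systems stay close to the users they serve.
        return "local-only"
    if app.business_criticality >= 3 or not app.vendor_supported:
        # Needs a manual review before any automated management.
        return "local-review"
    return "automation-candidate"


catalog = [
    AppRecord("imaging-viewer", True, True, 3),
    AppRecord("staff-directory", False, True, 1),
    AppRecord("legacy-billing", False, False, 2),
]

categories = {app.name: categorize(app) for app in catalog}
for name, category in categories.items():
    print(f"{name}: {category}")
```

Collecting only the fields that feed a decision like this keeps the catalog refresh focused on strategy rather than on exhaustive metadata.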
Multiple hosting locations would continue to be a reality for the foreseeable future, and cloud hosting was not (yet) a viable model. So, we implemented hyper-converged platforms to host applications at each regional location. This would ensure that physical distance and/or latency impact would not be a factor.
We identified more than 65% of the applications in the portfolio as ideal candidates for automated management, based on a number of criteria. In migrating production systems to the local, hyper-converged platforms, we were also able to create a snapshot-based model that would perform instant, local physical-to-virtual (P2V) or virtual-to-virtual (V2V) migration. It would also allow us to facilitate local DR failover validation, perform user acceptance testing (UAT) with the clinical staff, and then cut over to the converged platform.
We ran both the legacy system and the new, hyper-converged platform in parallel for 2+ weeks to confirm that any impact was mitigated. Only then did we decommission each legacy system.
Success & Value Realized
By performing the transitions in this way, we made a number of valuable discoveries.
- By hosting in hyper-converged platforms locally, we were able to realize a significantly reduced physical footprint. The server and storage footprint shrank by 90%+ in some cases. This returned real value in the form of floor space (which was set to be converted to usable office space).
- Environmental costs (cooling, electricity, etc.) fell, because the hyper-converged devices were more compact and efficient.
- Many hundreds of thousands of dollars in annual hardware maintenance expenses were effectively eliminated.
- Refreshing the application catalog with valuable metadata let us create automation scripts that tightly aligned with business needs. RTO/RPO models were very closely aligned with the needs of the clinical departments.
- Snapshot-based recovery and replication were now automation-based and push-button simple. Each app was snapshotted on a schedule set by its business need, then replicated. Recovery from any snapshot was just as simple.
- An entirely new Disaster Recovery (DR) model was created, with snapshot-based replication to cloud hosting. Virtual networking was also implemented, to allow for transparent IP and network-based redirection. That way, in any disaster scenario, diverting user activity to the cloud-hosted DR images did not require physically touching each and every endpoint device.
- Using the cloud as a DR platform allowed for a significantly reduced initial cost model for cloud hosting. It also allowed the client to begin testing the long-term feasibility of cloud hosting production systems on a case-by-case basis.
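As a rough illustration of snapshotting each app according to its business need, the sketch below picks a snapshot interval from an application's RPO. The list of allowed intervals and the function name are assumptions made for this example; they do not describe the scheduler used on the actual platform.

```python
# Hypothetical set of snapshot intervals the platform is assumed to offer.
ALLOWED_INTERVALS_MIN = [15, 60, 240, 1440]


def snapshot_interval(rpo_minutes: int) -> int:
    """Pick the longest allowed interval that still satisfies the RPO.

    With periodic snapshots, the worst-case data loss equals the interval,
    so the interval must not exceed the RPO.
    """
    candidates = [i for i in ALLOWED_INTERVALS_MIN if i <= rpo_minutes]
    if not candidates:
        raise ValueError(
            f"RPO of {rpo_minutes} min is shorter than any allowed interval"
        )
    return max(candidates)


# Example RPOs (minutes) per application; the values are illustrative only.
rpo_by_app = {"critical-clinical-app": 60, "reporting": 300, "archive": 2880}
schedule = {app: snapshot_interval(rpo) for app, rpo in rpo_by_app.items()}
print(schedule)
```

Choosing the longest interval that still meets the RPO keeps snapshot and replication volume down without weakening the recovery commitment.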
We also learned quite a few lessons in the process.
- A majority of applications were updated/maintained very inconsistently. By standardizing the automated management of systems, we were able to bring >85% of systems up to compliance with patching and updates, without any major staff movement.
- Automation management required a new skill set for staff. Those who had Unix backgrounds and/or scripting expertise (SQL, Perl/Python, etc.) quickly adapted to the new skill set and were rejuvenated in their work.
- Standardizing virtualization and automation enabled us to reduce manual and physical intervention by the workforce by >90%.
- Critical applications could be put on a significantly shorter snapshot window (one hour), to allow for near-immediate roll-back in any impact scenario. In many instances, this allowed recovery in less than one hour.
- Once we had a comprehensive overview of the entire server inventory (40K+), we found that there were only 12 common server builds. Further scrutiny allowed us to standardize on three server builds. Automating 40K+ individual servers would have been impossible. Automating 12 server builds would have been difficult, but feasible. Automating three server builds, however, became a simple success.
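The server-build finding comes down to grouping an inventory by a small set of defining attributes. The following is a minimal sketch of that idea; the inventory rows and the attributes chosen to define a "build" are hypothetical, and a real export would contain tens of thousands of rows.

```python
from collections import Counter

# Hypothetical inventory rows, standing in for a full server export.
inventory = [
    {"host": "srv-001", "os": "windows", "os_version": "2016", "role": "app"},
    {"host": "srv-002", "os": "windows", "os_version": "2016", "role": "app"},
    {"host": "srv-003", "os": "linux", "os_version": "7", "role": "db"},
    {"host": "srv-004", "os": "windows", "os_version": "2016", "role": "app"},
]


def build_fingerprint(server: dict) -> tuple:
    """Reduce a server to the attributes assumed to define its build."""
    return (server["os"], server["os_version"], server["role"])


# Count how many servers share each build fingerprint.
build_counts = Counter(build_fingerprint(s) for s in inventory)

# List builds from most to least common.
for build, count in build_counts.most_common():
    print(build, count)
```

Once the distinct builds are visible, automation can target a handful of build definitions instead of individual servers.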
| # | Deliverable | Description |
| --- | --- | --- |
| 1 | Formalized Digital Transformation Strategy | Full transformation strategy documented, including communication, presentations, vision, value, mission, objectives. |
| 2 | Comprehensive Current State Inventory & Maturity Matrix | Full current state inventory and assessment based on maturity (CMM) for each infrastructure & technology component, processes in place, and capabilities to manage transformation as it relates to migration to the Microsoft Cloud Solution Platform. |
| 3 | Comprehensive Current Technology Total Cost Model | Full cost model for operating current state, broken down into multiple “Cost per X” models and prepared for incorporation into the existing Apptio implementation. |
| 4 | Detailed Technical Business Case Matrix | Technology business cases to be used to transition to new platform technologies, incorporating optimization and cost-value analysis. |
| 5 | Detailed Business Case Matrix | Full inventory, by department, of each business process, with associated costs (IT) to operate, to be used for transformation opportunity analysis. |
| 6 | Target State Technology Model Utilizing the Microsoft Solution Stack | Full architectural design model, both logical and physical, for the future state target. |
| 7 | Target State IT Organizational Model | Full organizational model required to operationalize target state technology model post-transformation. |
| 8 | Target State Process Model | Updated process and ITSM services model for future state operational management. |
| 9 | Transformation Value Realization Model | Mapping of transformation implementation and realized value post-implementation, based on cost, efficiency, or elimination of redundancy. |
| 10 | Comprehensive Transformation Key Performance Metrics Matrix | Key performance metrics of current and target state. |
| 11 | Digital Transformation Program Framework | Program management framework to manage the overall Digital Transformation program, as well as multiple sub-programs. |
| 12 | Digital Transformation Program Governance | Committee-based model to manage the entirety of digital transformation, evaluate risks, make decisions, etc. |
| 13 | Prioritized Cloud Migration Roadmap for existing application portfolio | Roadmap outlining the effort and timeline for migrating each application to cloud hosting in the MS Azure platform. The roadmap will be designed in phases, and the entire application portfolio will be evaluated. Exclusions will be reviewed and vetted by governance committees. |
| 14 | Prioritized Digital Transformation Roadmap | Roadmap outlining each phase/activity, broken down into monthly increments with dependencies aligned, and risks and targets highlighted. |
| 15 | Migration Roadmap & Strategy for Office365 | Plan, roadmap and strategy to migrate all users from on-premises Exchange platform to Office365 pending scoping and inventory analysis. |
| 16 | Quantified and Empirically Supported Additional Investment Model | Any additional investments required to realize value/success for transformation will be quantified, empirically supported, and vetted by key leaders and SMEs. |
| 17 | Program / Solution Implementation Evaluation Model | The governance committee will review each implementation and determine whether it can/should be performed internally or bid out to a vendor. |
| 18 | Risk Management Framework | Framework to identify, track, quantify and manage risks associated with digital transformation. |
| 19 | Prioritized Risk Register | Comprehensive risk register with a complete inventory of all risks. |
| 20 | Complete Risk Response Plan | For each identified risk above a certain threshold, a risk response plan will be tracked and managed. |