Posted by: Jenny Laurello
Disaster planning, Disaster preparedness, health IT infrastructure, Infrastructure resiliency
We had a great live chat yesterday with Health IT Exchange expert John Donohue, Associate Chief Information Officer at the University of Pennsylvania Health System (UPHS). John fielded some great questions from users about resource allocation, infrastructure configuration at UPHS and how to plan for loss of internet services during a disaster. To learn more about how UPHS has tried to prepare itself to withstand a data center disruption of any kind, watch John’s webcast.
Check out the chat transcript below to see what advice John has to give to health IT pros who are in charge of disaster planning and infrastructure resiliency.
Chat transcript: April 25, 2012
10:48 Anne Steciw: Welcome to today’s live chat with John Donohue, Associate Chief Information Officer at the University of Pennsylvania Health System. We’ll begin momentarily.
10:49 Anne Steciw: Please feel free to start entering questions now!
10:52 Jenny Laurello: Good morning! Our live chat with John on Best Practices for Disaster Recovery and Infrastructure Resiliency will begin at 11 ET. Please feel free to enter your questions now and we will push live once the chat begins, as time allows.
11:00 Jenny Laurello: And we’re off! Welcome all to the #HITExchange Live Chat with John Donohue, Associate CIO of UPHS. Welcome, John, and thank you for joining us today!
11:00 John Donohue: Thank you Jenny, glad to be here!
11:01 Jenny Laurello: First question for you today: In 2009, UPenn decided to bring infrastructure services in-house after they had been outsourced for the prior decade. You cited in the webcast that the key drivers of this decision were rising costs, service level issues, and lack of flexibility in response to user demand. Can you speak to how you started the in-source vs. outsource assessment process, and the first steps you took after making this decision?
11:02 John Donohue: While I was not here when the assessment process took place, my understanding is that the Health System saw an opportunity to contain costs and improve service levels at the same time. In the years prior to this assessment, the Health System had insourced the applications side of things and secured good outcomes. The opportunity presented itself to pursue the same outcomes on the infrastructure side of things… I came onboard as we were executing the insource.
The first steps were “onboarding” the employees and looking for gaps in both skill sets and performance that needed to be addressed. For us, our priority was building out the infrastructure leadership team. We knew that leadership was going to be key to building out the new team and reaching high performance standards.
11:02 Comment From ErnieCIO: What is the difference between preparing for an internal disaster (something happening within the hospital) and an external disaster (e.g., a mass casualty incident a couple of miles away) when it comes to IT stuff? Or isn’t there any?
11:04 John Donohue: From an IS perspective, preparing for both is often the same. We do try to work on scenarios that are IT-specific (e.g., a major line gets cut, loss of access to a major site, etc.)… This allows us to be prepared for IS-type disasters. Additionally, we work with the clinical staff and hospital administrators to plan for external-type disasters.
11:05 Comment From Vishnu: How did you handle resource allocation (both in terms of funding and staff) after deciding to insource? How do you balance infrastructure resources with those needed for other ongoing enterprise-wide and live application projects? Top bits of advice as to how to prioritize?
11:06 John Donohue: From a day one perspective, our focus was on operational excellence. We knew that we needed to keep the lights on while we improved our depth and breadth of services. We made sure that we kept our best resources on operational activities and began to secure additional resources to handle project and discretionary activities. We were able to quickly secure some very talented folks that allowed us to rapidly build out our project capabilities… From a funding perspective, we were able to build out these additional capabilities and still save money from what was budgeted and being spent on the outsourcing agreement – this made things a little easier in terms of funding.
In terms of prioritizing, I think it is key to understand demand management. If you have a good sense of what the expectations are and what your actual capacity is, you can make good decisions on what you can commit to. Furthermore, I think you can selectively use partners and contractors to bridge any gaps while you are getting your permanent team onboard and ramped up. The last comment I would make is that we relied very heavily on our project management folks. They were instrumental in terms of making sure that we were not missing major milestones or missing key conflicts.
11:08 Comment From JimCIO: I’m not familiar with the UPENN configuration. Are your systems completely insourced or did you build a hybrid infrastructure utilizing both insource and outsourced systems?
11:11 John Donohue: Jim, we have what I would call a hybrid approach to infrastructure and applications. On the infrastructure front, we now implement and manage all of the core infrastructure that is local (i.e., on campus). However, we leverage our outsourcer’s data center, which is located in another state. We found that at this particular time, they can do a better job managing that aspect of our IS operation. We have the same setup from an applications perspective. We run some of our own key applications locally, but also outsource some as part of a SaaS service. We feel that this provides some level of additional resiliency – in terms of our key systems being spread out. In other words, one major outage would not bring down all of our systems.
11:11 Comment From Carlene Miyashiro: So much of our operations and clinical care is Internet dependent. I am worried about losing it in a disaster, e.g., a hurricane knocks out the ISP. Advice for creating plan B, C and D welcome.
11:14 John Donohue: Carlene, we are in the same situation…. many of our key operational and clinical systems are dependent upon the internet. We have worked hard in this area to make sure that we have full redundancy. So we actually have two points of presence from an internet perspective. They are located in buildings that are across the city from one another to provide some geographical diversity.
Again, the theme is that one disaster or problem would not knock us out. We have taken the same approach by diversifying our DMZ and other core services. I would recommend that you ensure that you can get to the internet from an enterprise perspective from fully redundant service points. It will cost some money to achieve, but it will be worth it the minute you have your first outage and your user community does not feel the pain.
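As a rough illustration of the two-points-of-presence idea, the following Python sketch picks the first internet path that passes a health check. The path names and the stubbed probe are hypothetical and are not a description of how UPHS implements failover; in practice the probe would be whatever reachability test your network team trusts.

```python
from typing import Callable, Iterable, Optional


def pick_internet_path(paths: Iterable[str], is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first path whose health probe passes, or None if all are down."""
    for path in paths:
        if is_up(path):
            return path
    return None


# Hypothetical path names; the probe is stubbed out for illustration only.
paths = ["pop_building_a", "pop_building_b"]
down = {"pop_building_a"}
print(pick_internet_path(paths, lambda p: p not in down))  # pop_building_b
```

The point of the sketch is only that the second path must be independent of the first (different building, ideally different carrier); otherwise one event takes out both and the function returns None.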
11:17 Jenny Laurello: Please continue submitting your questions. We will respond to as many as time allows.
11:17 Comment From Jerome Everson: You developed an infrastructure dashboard at UPenn as part of the in-sourcing assessment and prioritization process. How did you decide on (or why did you decide to focus on) the top criteria against which decisions would be made?
11:19 John Donohue: Jerome, Good question! We worked hard to get the Voice of the Customer. We found that, in some cases, things we thought were important were not as important to our customers. This meant getting out there and making sure that we understood the needs and desires of our customers – this also gave us a chance to set some realistic expectations.
Additionally we were very inclusive when it came to our own technical staff. We tried to make sure that we had all of our bases covered and had some consensus on the criteria. It made for some very good and very interesting discussions. For us, it was a very effective team building exercise with a fairly new team. Once we identified the criteria, it was fairly easy sailing.
11:20 Jenny Laurello: Where are you and the UPHS IT team focusing your efforts this year and beyond?
11:21 John Donohue: For us, the first year was around building out the team and getting the right people in the right roles. The second year was around making wise investments to shore up our infrastructure and eradicate enterprise single points of failure. Now we get to start to attack the fun stuff, how we can leverage technology as an enabler… how we can start to become more innovative with the technology. We are working much more closely now with the business to help create competitive advantages.
11:23 Jenny Laurello: What are some of the top lessons learned from your camp?
11:24 John Donohue: From my perspective, the top lesson learned was – be honest with yourself and get comfortable with the cold hard facts. Understand your weaknesses both in terms of staff and infrastructure capabilities. Once you accept that you have some red on the dashboard and some real work to do, you can dig in and get focused. Laser-like focus and a strong communication plan are key to ensuring that your team knows the priorities and is rowing in the same direction.
The second key lesson learned for us was the importance of leadership that understood where we were coming from and remained committed to seeing it through… We have a terrific CIO who was very cognizant of what it was going to take and knew that we would skin our knees along the way. At times, we were executing so much change so quickly that we would skin our knees. It did not happen often, but when it did, we recovered and kept moving forward. A supportive executive team is crucial. Getting them onboard early and often is very important.
11:25 Comment From AR: John, we are a small to midsize organization with one main site where our IT infrastructure is located. I am looking now to create a duplicate infrastructure at another one of our locations that is in another township, or possibly outsource the duplicate to an IaaS company. The idea being that if something happened to our main site I could fail over everything to the other and, if nothing is wrong, load balance our application services between the two. Do you have any recommendations that could help in my decision, or alternative solutions?
11:28 John Donohue: AR, sounds like you have a great opportunity in front of you. From my perspective, getting your resilient solution properly architected is the key. I think you can do the build yourself or outsource it. Some of that will depend on your access to capital versus operating dollars. In any case, if you have someone really sharp onboard that understands both applications and infrastructure – turn them loose on what it would look like. You can then have someone from the outside vet the design. Someone that has “been there, done that”.
Otherwise, I would look for a partner to help with the design and architecture. You want a partner that is going to be with you for a while so that they have some skin in the game. Getting the architecture right is crucial. From there, I think the key is building it out and testing it regularly…
11:29 Anne Steciw: After the chat’s over, check out our recent podcast with tips on how to create an effective disaster recovery plan https://searchhealthit.techtarget.com/podcast/Why-believability-matters-in-an-effective-disaster-recovery-plan
11:30 Comment From JimCIO: Given the advancements in cloud computing over the past several years, are you currently planning (or considering), moving any (or more) of your systems to that model?
11:31 Anne Steciw: We’ve also got a great tip about creating a HIPAA compliant backup in the cloud https://searchhealthit.techtarget.com/news/2240114690/Creating-a-HIPAA-compliant-backup-in-the-cloud
11:31 Anne Steciw: Great questions everyone, keep them coming!
11:32 John Donohue: We are looking very hard at cloud computing. We are being somewhat conservative due to the issues associated with PHI and privacy. We are likely to build out our own private cloud in the next several months. From there, I would like to look at leveraging some type of public cloud for the economic benefits. There are some interesting ways now with virtualization to leverage a hybrid model with both a public and a private cloud.
Once we are comfortable that we could deploy cloud technology in a very secure manner, we will move in that direction fairly aggressively. With our thoughts on cloud, virtualization and blade technology, we are forecasting a much smaller (and cheaper) data center footprint in the future.
11:33 Comment From Tenney Naumer: When the Joplin tornado hit the hospital, it completely destroyed their back up emergency power diesel generator. Fortunately, 3 weeks prior to the tornado, they had completed their conversion to electronic documents and had off site redundancy. Do you have suggestions for what sort of physical structural design might be more appropriate for preserving backup power to information systems?
11:37 John Donohue: Tenney, we are hoping not to see too many tornadoes in Philly, but… On a serious note, we plan for Tier 3 capability for all of our key clinical systems. So as we design our solutions we are always N+1. So from our perspective, we would require a backup generator so that we would be able to survive the loss of one. We really look at resiliency at all levels in the stack, starting with the data center and its environmental systems (power, cooling, etc.). We also try to make sure that our systems have some type of resiliency, and lastly we leverage backup systems and disk synchronization such that we don’t expose ourselves to key data losses…
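To make the N+1 idea concrete, here is a minimal Python sketch. The function name, the generator ratings and the load figure are hypothetical and are not UPHS numbers; the only rule it encodes is that the remaining units must still carry the required load after the largest single unit is lost.

```python
def survives_single_failure(unit_capacities_kw, required_load_kw):
    """Return True if the load is still covered after losing any one unit.

    Losing the largest unit is the worst case, so checking that case
    covers every other single-unit failure as well.
    """
    if not unit_capacities_kw:
        return False
    remaining = sum(unit_capacities_kw) - max(unit_capacities_kw)
    return remaining >= required_load_kw


# Hypothetical figures for illustration only.
load_kw = 400
print(survives_single_failure([500], load_kw))       # False
print(survives_single_failure([500, 500], load_kw))  # True
```

A single generator fails the check no matter how large it is, which is the Joplin scenario in the question: one unit, one failure, no power.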
11:37 Anne Steciw: Mercy’s hospital is being rebuilt to be 30% stronger than before the Joplin tornado http://www.prweb.com/releases/2012/4/prweb9379197.htm
11:37 Anne Steciw: One more great article about the Joplin disaster: https://searchhealthit.techtarget.com/tip/What-Joplin-teaches-hospitals-about-disaster-recovery-planning
11:38 Comment From Darren Branner: I have heard that to “practice” DR and see how policies and procedures work, some hospitals treat a new software implementation like a PACS or RIS or EHR like a disaster event and assemble teams to deal with it. We are thinking about such an exercise. From your experience, is this a good idea?
11:41 John Donohue: Darren, this sounds like a good idea. We have not yet taken that particular approach, but it seems like a good opportunity to vet out the policies and procedures as well as seeing how the staff responds…. I do think it is absolutely critical to find some way to test your disaster recovery capabilities. You really need to have some organizational rigor to make sure that your solutions work. I would recommend cutting formal projects for DR testing and “budgeting” the time that it will take for your staff to do it right.
11:41 Comment From Lloyd Reisman: How do you make a 5-year plan for DR with technology advancing so fast you can’t possibly know what will be on your network five years from now? Focus on infrastructure, mostly?
11:45 John Donohue: Lloyd, you have hit on a very critical item. Unless you have a crystal ball, we don’t know exactly what will be running on our systems and how much “capacity” they will require. For example, a year ago, we were planning for terabytes of storage; we are now talking petabytes of storage – a potential game changer. I think you are absolutely right, focus on the infrastructure. We work hard to abide by standards and build out our infrastructure with some level of agility… In other words, try not to lock yourself into some technology that would be a dead end or would be difficult to support or integrate when things change, because we know that they will. The more agility and flexibility that you build in, the more you will be able to protect those investments.
11:47 Jenny Laurello: What hospital staffers should the CIO interface with re: disaster planning? For example, a safety officer is typically the head disaster planner, and the compliance officer is responsible for HIPAA compliance in a disaster, etc.
11:49 John Donohue: We have a safety officer who is very effective in understanding the scope of the disaster, communicating the impact and rallying the right resources to address the disaster. It is key that the CIO and other IS executives work with the safety officer in advance of disasters (in terms of planning) and then during the disaster. We have had very effective “post mortem” type meetings after disaster events to make sure that we capture lessons learned and get better and stronger as a result of the experience.
Additionally I think it is key that the CIO have strong relationships with the other health system executives (from nursing to operational folks) so that when disasters strike, you can quickly respond. In summary, I don’t think you can go broad enough in terms of key hospital staffers in disaster planning.
11:51 Comment From Chad J: What are your thoughts on conducting mock disaster drills? If you conduct them, are the drills conducted primarily by the IT staff, or are other departments involved since practically all medical operations involve IT?
11:53 John Donohue: Chad, I am a strong supporter of conducting mock disaster drills. The challenge is that they are time consuming and they compete against other critical IS activities like installing new systems. I think you really need someone in your organization who “owns” this process and is measured on their discipline around conducting them and making improvements based on the feedback and results. I think it has to happen at both levels, some mock planning within IS for IS-type disasters and then also at the broader institutional level (my experience is that most hospitals are very good at conducting regular mock drills on the “clinical” level).
11:55 Jenny Laurello: What’s the best strategy to ensure seamless – or close as possible – EHR uptime in a disaster event?
11:57 John Donohue: From an infrastructure perspective it is all about resiliency. We have worked very hard to eliminate any and all enterprise single points of failure. Vendor diversity, location diversity, tiered data centers, etc…. We have looked at resiliency at every level in the stack from data centers all the way to the end user devices. Some very effective investments and being opportunistic with our legacy capabilities allowed us to get there fairly rapidly. I would also say that it is key to be closely aligned with the Applications leadership and your vendor(s). It needs to be a joint effort and the infrastructure is only one piece of the pie. You can get the infrastructure right and then fail in one of the other areas (i.e. applications).
The EHR is so critical that this is one area where I would recommend having very specific scenarios vetted out… Really walk through the different types of outages you could have and how your infrastructure and applications would respond. Look for the weak areas and then you can work on them to ensure that you are better prepared for when something happens.
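One way to start that walk-through is to write down what each critical service needs and which components can supply it, then fail each component in turn. The Python sketch below does this for a made-up inventory; the requirement and component names are hypothetical, and a real dependency map would be far larger and would come from your own infrastructure and applications teams.

```python
def single_points_of_failure(requirements):
    """Find components whose loss alone would break a requirement.

    `requirements` maps a requirement name to the set of interchangeable
    components that can satisfy it. Each component is "failed" in turn,
    and any requirement left with no provider is reported against it.
    """
    findings = {}
    components = set().union(*requirements.values())
    for failed in sorted(components):
        broken = sorted(
            name
            for name, providers in requirements.items()
            if not (providers - {failed})
        )
        if broken:
            findings[failed] = broken
    return findings


# Hypothetical inventory for an EHR service, for illustration only.
ehr_requirements = {
    "internet": {"pop_east", "pop_west"},
    "power": {"utility_feed", "generator_1"},
    "database": {"db_primary"},
}
print(single_points_of_failure(ehr_requirements))  # {'db_primary': ['database']}
```

In this example the internet and power requirements each have two providers, so no single loss breaks them, while the lone database server is flagged. This only covers single failures; correlated events, such as one storm taking out both points of presence, still need scenario-by-scenario review.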
11:58 Anne Steciw: Check out how this facility dealt with a disaster after a recent EHR implementation: https://searchhealthit.techtarget.com/video/EHR-implementation-held-after-Minn-bridge-collapse
11:59 Jenny Laurello: And that’s all he wrote, folks! Thank you everyone who participated for all of your great questions, and thank you, John, for all of your wonderful expertise!
12:00 Comment From Tenney Naumer: Yes, thank you!
12:00 Comment From ErnieCIO: thanks for such an informative session.
12:00 John Donohue: Thanks, I enjoyed the questions and learned a little myself today….
12:00 Comment From JimCIO: You’re welcome! Thank you!
12:01 Anne Steciw: Thanks everyone! We’re closing the chat now. Check back here for a transcript.