
The Cause And Cure of Downtime During Maintenance Of SAP HANA Database


Brett Barwick, Principal Software Engineer at SIOS Technology, discusses the most common cause of downtime during SAP HANA maintenance, which turns out to be the switchover of the database between cluster nodes. He also talks about the options database administrators have to minimize that downtime.

Guest: Brett Barwick (LinkedIn)
Company: SIOS Technology (Twitter)
Show: To The Point


Brett Barwick: If you drill down into what actually happens in the switchover process, there are a couple of ways that the switchover can happen on the database side. So the database … by the way, I’m talking about databases that are using what’s called HANA System Replication, which is SAP’s native mechanism for replicating data from one HANA database to another. So it turns out that when you want to switch over the database, part of the process is that you have to take your running secondary database and promote it to become the new primary, so clients can write to it.

Brett Barwick: And it turns out there are essentially two ways to do that. The first I call traditional system replication takeover, and this has been around as long as system replication has been around. The idea there is that, in order to make sure everything’s safe, before you promote the secondary database to primary, you need to completely stop the original primary database. Make sure nobody’s writing to it, make sure everything’s in sync, then you can promote the secondary database, then you can reregister the original primary as the new secondary, and then restart that database.

There’s also a newer version of HSR takeover, which is called takeover with handshake. SAP introduced it in the HANA 2.0 SPS 04 release, around April 2019. The big difference there is that instead of completely stopping the original primary database, their idea was that it’s good enough to just put it in a frozen, suspended state, which is a little bit faster. That allows you to promote the secondary database more quickly and get users back to accessing the database more quickly.
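
To make the difference between the two takeover types concrete, here is a minimal Python sketch that models each one as an ordered list of steps. It only illustrates the ordering described above and is not SAP tooling: the `Step` type, the node labels, and the function names are all hypothetical. The follow-up steps listed for takeover with handshake (stopping, reregistering, and restarting the suspended primary) are an assumption, since the interview only describes the suspend and promote phases.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    node: str                    # "old_primary" or "old_secondary" (hypothetical labels)
    action: str                  # human-readable description of the step
    before_clients_resume: bool  # True if the step must finish before clients can write again


def traditional_takeover() -> List[Step]:
    """Ordering described in the interview for a traditional HSR takeover."""
    return [
        Step("old_primary", "stop database completely", True),
        Step("old_secondary", "promote to primary", True),
        Step("old_primary", "reregister as new secondary", False),
        Step("old_primary", "restart database", False),
    ]


def handshake_takeover() -> List[Step]:
    """Takeover with handshake: the old primary is suspended instead of stopped."""
    return [
        Step("old_primary", "suspend database (frozen state)", True),
        Step("old_secondary", "promote to primary", True),
        # Assumption: the cleanup below is not described in the interview.
        Step("old_primary", "stop suspended database", False),
        Step("old_primary", "reregister as new secondary", False),
        Step("old_primary", "restart database", False),
    ]


def outage_steps(steps: List[Step]) -> List[str]:
    """Return only the steps that sit between 'clients stop' and 'clients resume'."""
    return [f"{s.node}: {s.action}" for s in steps if s.before_clients_resume]


if __name__ == "__main__":
    for name, plan in [("traditional", traditional_takeover()),
                       ("handshake", handshake_takeover())]:
        print(name)
        for line in outage_steps(plan):
            print("  ", line)
```

In this model both plans have the same number of client-blocking steps; the saving comes from the first step, where a suspend replaces a full stop of the original primary.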

So in terms of what’s going on under the covers, you have those two different takeover types. In terms of how you would actually perform the switchover or the takeover, there are a few ways. If you happen not to be using HA software, you could use one of SAP’s administrative tools, either a graphical tool like HANA Studio or HANA Cockpit, or, if you like the command line, the hdbnsutil utility. But since we’re talking about HA environments, I do want to emphasize that you want to make sure you’re using your HA software to do these switchovers. The basic idea is that you want to tell the HA software, hey, I want to move this resource from one node to the other, so that when it moves underneath the covers, the HA software isn’t surprised by that and doesn’t try to restart the database on the node where it just stopped. So again, if you’ve got HA software in place, make sure you’re using that to switch over the database.
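
The point about not surprising the HA software can be illustrated with a toy model. The sketch below is not any real cluster product's API: `ToyCluster` and its methods are hypothetical, and the monitoring logic is deliberately reduced to a single comparison between where the HA software expects the primary database to run and where it actually runs.

```python
class ToyCluster:
    """Toy model of HA software tracking where the primary database should run."""

    def __init__(self, nodes, active):
        self.nodes = list(nodes)
        self.expected_active = active  # where the HA software believes the primary is
        self.actual_active = active    # where the primary really is
        self.log = []

    def manual_takeover(self, target):
        """Admin promotes the database on `target` behind the HA software's back."""
        self.actual_active = target

    def switchover(self, target):
        """Admin asks the HA software to move the resource, so its expectation is updated."""
        self.expected_active = target
        self.actual_active = target
        self.log.append(f"moved database resource to {target}")

    def monitor(self):
        """Periodic health check: trigger recovery if reality differs from expectation."""
        if self.actual_active != self.expected_active:
            self.log.append(
                f"database unexpectedly down on {self.expected_active}: restarting it there"
            )
            return "recovery"
        return "ok"


if __name__ == "__main__":
    surprised = ToyCluster(["node1", "node2"], active="node1")
    surprised.manual_takeover("node2")
    print(surprised.monitor(), surprised.log)

    informed = ToyCluster(["node1", "node2"], active="node1")
    informed.switchover("node2")
    print(informed.monitor(), informed.log)
```

In the first case the monitor sees the database gone from the node where it expected it and starts a recovery there, which is exactly the unwanted restart described above. In the second case the expectation was updated as part of the move, so the health check passes.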