Linux and Graceful Shutdowns

Posted on May 6, 2014

Senior Program Manager, OSTC
IaaS virtual machines on Azure may be shut down in a number of ways: via the Azure Management Portal, Azure PowerShell cmdlets or CLI tools, or even by a user who is logged in interactively.  The Azure platform itself may also initiate a shutdown to perform platform maintenance.  The shutdown process for a Linux system running on bare metal on premises is well understood, but how does this all work in the cloud?

Initiating a graceful shutdown on Azure

The process to shut down a Linux virtual machine in Azure works much the same as it does on premises.  When a user is logged in and runs ‘/sbin/shutdown now’, the expectation is that the system will immediately begin stopping any running services and eventually power off.  The exact mechanism depends somewhat on the distribution and the init system in use (typically SysV, Upstart, or systemd), but the result is the same.

Now what happens when the shutdown is initiated by the portal or by the Azure platform?  In short, pretty much the same things happen.  In these cases Azure communicates with the host to initiate a graceful shutdown of the guest Linux system.  In Hyper-V and Azure environments the signal to perform a graceful shutdown comes from the hypervisor and is handled by the hv_utils driver, part of our Linux Integration Services that are included with the Linux kernel.  This feature is known as integrated shutdown.

After receiving this signal, the hv_utils driver initiates a graceful shutdown of the Linux guest and launches essentially the same mechanisms as if the user had run ‘/sbin/shutdown now’ manually.  When this happens, a message is written to the logs (typically /var/log/messages, or /var/log/syslog on Ubuntu systems) indicating that the Linux system was shut down by the hypervisor.

There is a small catch, however.  The Azure platform cannot wait forever after initiating a graceful shutdown.  The platform will wait 5 minutes for the virtual machine to shut down gracefully, and if it is still running after that time it will power off the machine.  This is important to note: ensure your VM can run all its scripts and shut down cleanly within the allotted time.
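For an application, the practical consequence is that it should notice the stop request promptly and then clean up.  Below is a minimal Python sketch of that pattern.  It assumes the init system stops the service by sending SIGTERM, which is the systemd default and common in SysV-style scripts.  The class name GracefulService and its cleanup callback are hypothetical and are not part of hv_utils or any Azure tooling.

```python
import signal
import threading


class GracefulService:
    """Toy long-running service that exits its loop when asked to stop."""

    def __init__(self, cleanup):
        self.cleanup = cleanup                    # hypothetical callback: flush, close, etc.
        self.stop_requested = threading.Event()
        self.cleaned_up = False

    def install_handlers(self):
        # signal.signal() must be called from the main thread.
        signal.signal(signal.SIGTERM, self._on_signal)

    def _on_signal(self, signum, frame):
        # Keep the handler tiny: only record that a stop was requested.
        self.stop_requested.set()

    def do_work(self):
        # Placeholder for one short unit of real work.
        pass

    def run(self, poll_interval=0.5):
        # wait() returns True as soon as the stop flag is set,
        # otherwise False once poll_interval seconds have elapsed.
        while not self.stop_requested.wait(poll_interval):
            self.do_work()
        self.cleanup()
        self.cleaned_up = True
```

A service would typically create the object, call install_handlers() once at startup, and then call run().  Keeping each unit of work short matters here, because the loop only checks for the stop request between units.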

Managing the shutdown process

A common requirement among Linux users on Azure is to ensure their application shuts down properly when Azure initiates a graceful shutdown of their VM.  Now that we understand that the same mechanisms are used for both manual and hypervisor-initiated shutdowns, we can simply use the existing Linux init system to ensure the application shuts down properly. In many cases the existing SysV, systemd or Upstart scripts will be sufficient to shut down the application properly.  However, in cases where they are not sufficient, or where additional processes need to run to “clean up” the application, there are a few things one can do:
  • Of course, the easiest method is to simply edit the application's init script and add the additional tasks.  This does have some downsides.  Besides the obvious typos and other errors that can occur when editing scripts, it may prevent the script from being upgraded, since these scripts are usually managed by an RPM or Deb package.  One advantage is that because integrated shutdown is a hypervisor feature, you can test all this locally on Hyper-V before deploying to Azure.
  • Another option is to create your own init script.  Because there are so many different init systems available for Linux systems, there are also many ways to do this.  Most systems have at least SysV compatibility alongside their native init systems, so the easiest approach would be to write a SysV-compatible script and ensure it runs from runlevel 0.
  • A slightly more elaborate approach may be this one, which I recently tested using our CentOS image in the Azure Gallery.  This script may be a good starting point for creating your own init script to perform needed pre-shutdown actions, but keep in mind that we still have a 5-minute window to cleanly shut down our VM before it is powered off.
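Whichever of these options you choose, the pre-shutdown actions have to fit inside the 5-minute window.  As one possible sketch, the stop action of a custom init script could call a small Python helper that runs each cleanup command with a timeout drawn from a shared time budget.  The function name run_pre_shutdown and the 240-second default are assumptions made for illustration, chosen only to leave some margin below 5 minutes.

```python
import subprocess
import time


def run_pre_shutdown(commands, budget_seconds=240.0):
    """Run cleanup commands in order without exceeding an overall time budget.

    `commands` is a list of argument lists (e.g. ["/path/to/tool", "--flag"]).
    Returns a list of (command, outcome) pairs, where outcome is one of
    "ok", "failed", "timeout", or "skipped".
    """
    deadline = time.monotonic() + budget_seconds
    results = []
    for cmd in commands:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            # Budget exhausted: do not start anything new.
            results.append((cmd, "skipped"))
            continue
        try:
            # subprocess.run kills the child and raises if the timeout expires.
            proc = subprocess.run(cmd, timeout=remaining)
            outcome = "ok" if proc.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            outcome = "timeout"
        except OSError:
            # Command missing or not executable.
            outcome = "failed"
        results.append((cmd, outcome))
    return results
```

Ordering the most important cleanup commands first is a sensible design choice here, since later commands are the ones that get skipped when time runs short.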
But just like Linux itself, there may be no one-size-fits-all approach for all distributions and workloads.  The main point is that while there are many ways to solve this, there is nothing unique we need to do on Hyper-V or Azure to cleanly shut down our VMs.  We can still use the standard procedures and mechanisms already available on our favorite Linux distribution, while also enjoying the benefits of hosting those systems in Azure.