Folks using the IoT Hub Device Provisioning Service to securely provision their devices are taking the opportunity to start using hardware security modules (HSMs) to store the keys on their devices. Hardware security modules protect cryptographic keys and operations. HSMs provide high levels of protection against key compromise by device software and firmware bugs, and usually provide good protection against hardware attacks. Hardware-based security can reduce the risk of device cloning, can improve supply-chain security, and can bootstrap secure and reliable device enrollment using the Device Provisioning Service. Some of you might be new to using HSMs and are wondering exactly how the Device Provisioning Service validates a device’s identity, especially when using TPMs, and why it’s so secure. This post describes the identity attestation process when using a TPM.
TPM stands for Trusted Platform Module and is a type of HSM. This blog post assumes you’re using a discrete, firmware, or integrated TPM. Software emulated TPMs are well-suited for prototyping or testing, but they do not provide the same level of security as discrete, firmware, or integrated TPMs do. Please don’t use software TPMs in production. Learn more about the types of TPMs.
This article applies only to devices using TPM 2.0 with HMAC key support and their endorsement keys. It does not apply to devices using X.509 certificates for authentication; check out this blog post to learn more about secure hardware with the Device Provisioning Service using X.509 certificates. TPM is an industry-wide ISO standard from the Trusted Computing Group, and you can read more about TPM in the complete TPM 2.0 spec or the ISO/IEC 11889 spec. This article also assumes you are familiar with public and private key pairs, and how they are used for encryption.
You don’t have to implement any of this if you’re using the Device Provisioning Service device SDKs – we handle all this mess so you don’t have to. Some folks who are new to TPMs want to have a better understanding of what’s going on with their security chip and why it’s so secure, and this post is for them.
TPMs use something called the endorsement key (EK) as the secure root of trust. The EK is unique to the TPM and changing it essentially changes the device into a new one. There's another type of key that TPMs have, called the storage root key (SRK). An SRK may be generated by the TPM's owner after it takes ownership of the TPM. Taking ownership of the TPM is the TPM-specific way of saying "someone sets a password on the HSM." If a TPM device is sold to a new owner, the new owner can take ownership of the TPM to generate a new SRK, which ensures the previous owner can't use the TPM. Because the SRK is unique to the owner of the TPM, the SRK can be used to seal data into the TPM itself for that owner. The SRK provides a sandbox for the owner to store their keys and provides access revocability if the device or TPM is sold. It's like moving into a new house: taking ownership is changing the locks on the doors and destroying all furniture left by the previous owners (SRK), but you can't change the address of the house (EK). It's not a perfect analogy, but you get the idea. Once a device has been set up and is ready to use, it will have both an EK and an SRK available.
One note on taking ownership of the TPM: Taking ownership of a TPM depends on a lot of things, including TPM manufacturer, the set of TPM tools being used, and the device OS. Follow the instructions relevant to your system to take ownership.
The Device Provisioning Service uses the public part of the EK (EK_pub) to identify and enroll devices. The device vendor can read the EK_pub during manufacture or final testing and upload the EK_pub to the provisioning service so that the device will be recognized when it connects to provision. Note that the Device Provisioning Service does not check the SRK or owner. “Clearing” the TPM erases customer data, but the EK (and other vendor data) is preserved, so the device will still be recognized by the Device Provisioning Service when it connects to provision.
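To make the EK versus SRK distinction concrete, here's a toy model in Python. It is only a sketch: `ToyTpm`, `enrollments`, and the random byte strings standing in for keys are all made up for illustration, and a real TPM holds asymmetric key pairs that never look like this.

```python
import secrets
from dataclasses import dataclass, field


def _new_key() -> bytes:
    # Stand-in for a real public key; purely illustrative.
    return secrets.token_bytes(32)


@dataclass
class ToyTpm:
    """Hypothetical model: the EK is fixed, the SRK belongs to the owner."""
    ek_pub: bytes = field(default_factory=_new_key)
    srk_pub: bytes = field(default_factory=_new_key)

    def clear(self) -> None:
        # A new owner taking ownership gets a new SRK; the EK is untouched.
        self.srk_pub = _new_key()


tpm = ToyTpm()
# Hypothetical enrollment list: the vendor uploaded EK_pub at manufacture.
enrollments = {"device-001": tpm.ek_pub}

old_srk = tpm.srk_pub
tpm.clear()

# The lookup only involves EK_pub, so the device is still recognized.
still_recognized = enrollments["device-001"] == tpm.ek_pub
# The previous owner's SRK no longer matches the one in the TPM.
srk_changed = old_srk != tpm.srk_pub
```

The point of the sketch is simply that the enrollment lookup never touches the SRK, which is why a change of owner doesn't require re-enrolling the device.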
When a device with a TPM first connects to the Device Provisioning Service, the service first checks the provided EK_pub against the EK_pub stored in the enrollment list. If the EK_pubs do not match, the device is not allowed to provision. If the EK_pubs do match, the service then requires the device to prove ownership of the private portion of the EK via a nonce challenge, which is a secure challenge used to prove identity. The Device Provisioning Service generates a nonce and then encrypts it with the public portion of the SRK (SRK_pub) and then the EK_pub, both of which are provided by the device during the initial registration call. The TPM always keeps the private portion of the EK secure, so only the TPM that holds the matching private keys can decrypt the nonce. This both prevents counterfeiting and ensures that SAS tokens are securely provisioned to authorized devices.
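The service-side check described above might look roughly like the following Python sketch. This is not the Device Provisioning Service's actual code: `ENROLLMENTS`, `start_attestation`, and the sample EK_pub bytes are hypothetical, and the encryption of the nonce is deliberately left out because it needs real asymmetric crypto.

```python
import hmac
import secrets
from typing import Optional

# Hypothetical enrollment list: registration ID -> EK_pub uploaded by the vendor.
ENROLLMENTS = {"device-001": bytes.fromhex("a1b2c3d4")}


def start_attestation(registration_id: str, ek_pub: bytes) -> Optional[bytes]:
    """Return a fresh nonce if the EK_pub matches the enrollment, else None."""
    enrolled = ENROLLMENTS.get(registration_id)
    if enrolled is None or not hmac.compare_digest(enrolled, ek_pub):
        # Unknown device or mismatched EK_pub: the device may not provision.
        return None
    # A real service would now encrypt this nonce so that only the TPM
    # holding the matching private keys can recover it (not shown here).
    return secrets.token_bytes(32)


nonce = start_attestation("device-001", bytes.fromhex("a1b2c3d4"))
rejected = start_attestation("device-001", b"some-other-ek")
```

Here `nonce` is 32 random bytes and `rejected` is `None`, because the second call presents an EK_pub that isn't in the enrollment list.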
Let’s walk through the attestation process in detail.
Step 1: When the device first connects to the Device Provisioning Service and requests to provision, it provides the service with its registration ID, an ID scope, and the EK_pub and SRK_pub from the TPM. The service passes the encrypted nonce back to the device and asks the device to decrypt the nonce, use it to sign a SAS token, and then connect again with that token to finish provisioning.
Step 2: The device takes the encrypted nonce and uses the private portions of the EK and SRK to decrypt it inside the TPM. The order of nonce encryption delegates trust from the EK, which is immutable, to the SRK, which can change if a new owner takes ownership of the TPM.
Step 3: The device can then sign a SAS token using the decrypted nonce and reestablish a connection to the Device Provisioning Service using the signed SAS token. With the nonce challenge completed, the service allows the device to provision.
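For the curious, here's a rough Python sketch of what signing a SAS token with the decrypted nonce involves. Treat it as an illustration under assumptions rather than what the SDKs do internally: `build_sas_token` is a made-up helper, the token layout follows the general Azure SAS token format (the `skn=registration` policy name is an assumption as far as this post goes), the ID scope and registration ID are placeholders, and the key here is just bytes in memory. On a real device the HMAC is computed inside the TPM and the key never leaves it.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def build_sas_token(id_scope: str, registration_id: str, key: bytes,
                    expiry: int) -> str:
    """Build a SAS token signed with HMAC-SHA256 over the resource and expiry."""
    resource = urllib.parse.quote(
        f"{id_scope}/registrations/{registration_id}", safe="")
    to_sign = f"{resource}\n{expiry}".encode("utf-8")
    # Software stand-in for the HMAC the TPM would perform with the nonce.
    digest = hmac.new(key, to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote(
        base64.b64encode(digest).decode("ascii"), safe="")
    return (f"SharedAccessSignature sr={resource}&sig={signature}"
            f"&se={expiry}&skn=registration")


# Placeholder values; the key stands in for the decrypted nonce.
example_key = bytes(32)
token = build_sas_token("0ne00000000", "device-001", example_key,
                        int(time.time()) + 3600)
```

Because only the TPM that decrypted the nonce can produce a valid signature, presenting this token is what proves the device holds the private keys.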
Now the device can connect to IoT Hub, and you can rest easy knowing that your devices’ keys are securely stored. Now that you know how the Device Provisioning Service securely verifies a device’s identity using TPM, you can get started with auto-provisioning and let the SDKs take care of the flow.
To sum things up with a limerick:
A fancy nonce challenge identifies
Devices aspiring to authorize
With ID precision
A TPM’s key storage exercise