Keylime’s durable attestation makes security auditable

by Marcio A. Silva, George Almási, James Bottomley, Lily Sturmann, Michael Peters | Apr 25, 2023 | Trust

Remote attestation answers the (slightly paraphrased) question: “It’s 3pm, do you know what the systems in your data center are doing?” Going with this premise, durable attestation answers a similar question about 3pm yesterday.

In this article we describe the concepts of remote attestation and durable attestation, as well as an implementation in the context of ongoing CNCF, IBM and Red Hat projects. We focus on the operational advantages of durable attestation—replay of past attestation events for purposes of auditing, as well as the added capability of “what if” attestation. We also describe some of the system security implications, including better transparency and fewer operational security requirements, both of which result in a more trustworthy system overall.

Note: Red Hat’s Emerging Technologies blog includes posts that discuss technologies that are under active development in upstream open source communities and at Red Hat. We believe in sharing early and often the things we’re working on, but we want to note that unless otherwise stated the technologies and how-tos shared here aren’t part of supported products, nor promised to be in the future.

Introduction: remote attestation

Remote attestation is a process that proves or ascertains certain properties of one or more devices (ranging from full servers to embedded sensors) in a data center to an outside observer. Attestation includes information on the booted kernel, installed packages, installed firmware, etc., things that evidently are of interest to any operator managing a collection of such devices.

In the context of this article, remote attestation is provided by Keylime, an open source remote attestation project sponsored by the Cloud Native Computing Foundation (CNCF) and supported by Red Hat. What makes Keylime remote attestation different from other intrusion detection systems (IDS) like AIDE or OSSEC, or endpoint management systems like Falcon or BigFix, is that the “proof” provided by attestation is considered complete in the mathematical sense, with premises and provable deductions. Keylime is packaged by Red Hat starting with Red Hat Enterprise Linux 9.1 and Fedora 36.

Keylime’s operation is centered around the concept of a hardware root of trust. The purpose of the root of trust is to provide a foundation during device (e.g. a server) startup. During startup, the device records a series of measurements. These measurements then can be used to prove facts about the booted device. The measurements themselves are arranged in a chain of trust, making the trustworthiness of a measurement depend on those preceding it, traceable back to the root of trust.
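The chain-of-trust idea can be illustrated with the "extend" operation that TPM measurement registers use: each new value is a hash of the previous value concatenated with the digest of the new measurement. The following is a minimal sketch, assuming a SHA-256 register that starts at all zeros; the component names are made up for illustration.

```python
import hashlib


def extend(register: bytes, measurement: bytes) -> bytes:
    """Return SHA-256(old register value || SHA-256(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(register + digest).digest()


def replay(measurements: list) -> bytes:
    """Fold a sequence of measurements into a single register value."""
    register = bytes(32)  # assumption: register starts at all zeros
    for measurement in measurements:
        register = extend(register, measurement)
    return register


# Hypothetical boot components, measured in order.
boot_chain = [b"firmware-image", b"bootloader-image", b"kernel-image"]
good = replay(boot_chain)

# Changing any earlier link changes the final value, even though the
# later measurements are identical.
tampered = replay([b"firmware-image", b"evil-bootloader", b"kernel-image"])
print(good.hex() != tampered.hex())
```

Because each value depends on everything that came before it, a single final register value vouches for the whole ordered sequence of measurements.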

The Trusted Platform Module (TPM) is an industry-standard cryptoprocessor that can be used to implement the root of trust. Keylime is capable of running attestation on TPM-based devices.

Facts about a device, even provable facts, are only useful insofar as they can be compared with a set of properties established to evaluate a device’s fitness for its purpose. In our context, these properties and the rules governing them are cumulatively called an attestation policy. An attestation policy includes hardware and software bills of materials (HBOM, SBOM). For example, the best way to unambiguously identify devices equipped with TPMs is to keep a record of each TPM’s Endorsement Key (EK); this would be an entry in the hardware bill of materials, together with the device’s make, amount of RAM, physical position in the data center, IP address(es), etc. The software bill of materials includes firmware, kernel, operating system, executables, libraries and configurations of the device(s) in question.

Attestation policy must be secured since attestation is, at most, as trustworthy as the policy it attests to. It is not within the purview of Keylime to ensure the security of attestation policies; this duty rests with the Keylime managers and operators. Typical implementations involve digital signatures of the attestation policy.

The outcome of Keylime attestation is a (provable) statement about whether the devices listed in the inventory do, or do not, satisfy the requirements set forth in the attestation policy. The proof itself is implemented by a Keylime component called the verifier.
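As a rough illustration of what "checking a device against an attestation policy" means, the sketch below compares one device's measurements with a toy policy. The `AttestationPolicy` structure, the field names and the digests are hypothetical and do not reflect Keylime's actual policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationPolicy:
    known_eks: frozenset   # hardware inventory: expected Endorsement Keys
    allowed_digests: dict  # software inventory: path -> acceptable digests


def render_verdict(policy, ek, measured):
    """Return (passed, reasons) for one device's measured files."""
    reasons = []
    if ek not in policy.known_eks:
        reasons.append(f"unknown EK {ek}")
    for path, digest in measured.items():
        allowed = policy.allowed_digests.get(path)
        if allowed is None:
            reasons.append(f"{path}: not in software inventory")
        elif digest not in allowed:
            reasons.append(f"{path}: unexpected digest {digest}")
    return (not reasons, reasons)


# Hypothetical inventory and measurements.
policy = AttestationPolicy(
    known_eks=frozenset({"ek-1"}),
    allowed_digests={"/usr/bin/ssh": {"aaa"}, "/boot/vmlinuz": {"bbb", "ccc"}},
)
ok, why = render_verdict(policy, "ek-1", {"/usr/bin/ssh": "aaa", "/boot/vmlinuz": "ccc"})
bad, why_bad = render_verdict(policy, "ek-1", {"/usr/bin/ssh": "zzz"})
```

The verdict is a pure function of the measurements and the policy, which is the property that later makes replaying an attestation possible.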

Durable attestation

As implemented by Keylime today, attestation is an instant verdict system. Attestations are recurring with a certain frequency, and any verdict issued by the verifier is invalidated by the next attestation. The process is unimportant until a device fails attestation. At that point, an intervention is necessary, since the device no longer holds the Keylime operator’s trust.

Durable attestation (DA) changes this. Instead of just an instant verdict, durable attestation also creates a permanent record. This record is created in such a way that the attestation can be replayed, making attestations reproducible (durable). A key anticipated use of durable attestations is to prove to a third party that, at a particular time, the system was in a known (and monitored) state.

Durable attestation policies

Systems change over time. In today’s world of cloud computing, they come in and out of existence as needed. It is also self-evident that attestation policies (HBOM and SBOM) change over time; we can think of them as a time series. Traditional attestation requires attestation policy to be trustworthy at the moment when attestation is performed (as mentioned before, this is typically achieved by having a responsible party digitally sign the attestation policy).

Durable attestation has a stricter requirement—since the goal is to make the original outcome of attestation reproducible, any attestation policy that is valid at the time of the original attestation must never be altered retroactively, even (and especially) by its authors. Thus, durable attestation policies are required to be both trustworthy and non-repudiable.

Enter sigstore, another CNCF project with significant contributions from Red Hat. Sigstore’s rekor—by its own description, a tamper-resistant ledger of SBOM metadata—is an exact match for the purpose of rendering records non-repudiable. Rekor safeguards a set of digital signatures associated with individual SBOM and inventory records, thereby making these both trustworthy (guaranteed by the signer) and non-repudiable.
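To give a feel for why a tamper-resistant ledger prevents retroactive edits, here is a deliberately simplified sketch of an append-only log in which every entry's hash covers the previous entry's hash. This is not rekor's API or data structure (rekor is built on a Merkle-tree transparency log); the `Ledger` class and the payloads are assumptions made purely for illustration.

```python
import hashlib
import json


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with the hash of its predecessor."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


class Ledger:
    """Toy append-only log; each entry is (payload, chained hash)."""

    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> str:
        prev = self.entries[-1][1] if self.entries else ""
        digest = entry_hash(prev, payload)
        self.entries.append((payload, digest))
        return digest

    def verify(self) -> bool:
        prev = ""
        for payload, digest in self.entries:
            if entry_hash(prev, payload) != digest:
                return False
            prev = digest
        return True


ledger = Ledger()
ledger.append({"policy_signature": "sig-v1", "signer": "ops-team"})
head = ledger.append({"policy_signature": "sig-v2", "signer": "ops-team"})
intact = ledger.verify()

# Retroactively altering the first entry breaks the chain.
ledger.entries[0] = ({"policy_signature": "forged", "signer": "ops-team"}, ledger.entries[0][1])
after_tampering = ledger.verify()
```

In a sketch like this, the latest hash (`head`) would also need to be published somewhere the log operator cannot rewrite; otherwise the whole chain could simply be recomputed.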

Durable registration records and the Keylime registrar

Part of the design of the TPM is that the key identifying the device (the Endorsement Key) is not suitable for signing attestation records (also known as “TPM quotes”). Keylime’s implementation uses a secondary derived key, called the Attestation Key (AK), for that purpose. Unlike the EK, the AK is not tied to the TPM device’s identity (for privacy reasons). In order to associate an attestation record with the generating TPM device, the EK→AK correspondence needs to be recorded. In Keylime, this is done by a separate, long-running component named the registrar. In the current implementation, trust in the registrar is implicit (there are no mechanisms in place to “prove” the Keylime registrar’s reliability).

Durable attestation once again has a stricter requirement: just like attestation policy, the EK→AK correspondence must be recorded in a time series and made non-repudiable. As with the attestation policy, this can be achieved using rekor. The only significant difference is that signing authority for the EK→AK correspondence is held by an automated process (the registrar).

Durable attestation does not require the Keylime registrar to be “more trustworthy” than during normal attestation, but a compromised registrar causes damage by compromising all attestation records that depend on a particular EK→AK registration event. In the case of such a compromise, a tamper-resistant ledger like rekor allows us to determine which attestation records were affected.
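Treating EK→AK registrations as a time series means that, for any past moment, one can ask which AK was in effect for a given EK, and which attestation records depend on a particular registration. A minimal sketch follows, assuming a hypothetical `Registration` record with integer timestamps; none of these names come from Keylime.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Registration:
    registered_at: int  # hypothetical timestamp of the registration event
    ek: str
    ak: str


def ak_in_effect(history: list, ek: str, when: int) -> Optional[str]:
    """Return the AK most recently registered for `ek` at or before `when`."""
    events = sorted((r for r in history if r.ek == ek), key=lambda r: r.registered_at)
    times = [r.registered_at for r in events]
    index = bisect_right(times, when)
    return events[index - 1].ak if index else None


def affected_timestamps(history: list, record_times: list, ek: str, bad_ak: str) -> list:
    """Attestation times that relied on a registration later found to be bad."""
    return [t for t in record_times if ak_in_effect(history, ek, t) == bad_ak]


history = [
    Registration(100, "ek-1", "ak-a"),
    Registration(200, "ek-1", "ak-b"),
    Registration(150, "ek-2", "ak-c"),
]
suspect = affected_timestamps(history, [90, 120, 180, 210, 250], "ek-1", "ak-b")
```

Here `suspect` lists only the attestation times that fall under the `ak-b` registration, which mirrors how a ledger of registration events narrows down the damage from a compromised registrar.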

Durable attestation records and the Keylime verifier

The entire purpose of durable attestation is to create a time series of attestation records, that is, artifacts that represent the state of the monitored devices over time, and which can be used to replay attestation.

In the process of normal attestation, these records are gathered and processed by the Keylime verifier, then discarded (i.e., overwritten) once the verdict is reached. Keylime invests implicit trust in the verifier by accepting its verdict. However, unlike with attestation policy and registration records, attestation records do not require any trust themselves, since the integrity of the root of trust (in our case TPM cryptoprocessors) guarantees that attestation records cannot be falsified.

One could be forgiven for thinking that durable attestation records must be made non-repudiable (like registration records). However, this is not necessarily true. The process of collecting and storing attestation records does not require any trust. Any attempt at altering attestation records—either by the verifier before storing them, or by third parties subsequently—can be detected, because the TPM quote is signed by the private portion of the AK, which only exists in the TPM device, and all other components of the attestation record are authenticated by the TPM quote.

The above argument leaves open the possibility of replay attacks (that is, undetectably altering the record by removing “inconvenient” attestation records and replacing them with copies of earlier records). However, there are two mechanisms in the TPM device that make this next to impossible. First, TPM quotes encode a random nonce provided by the verifier, so any attempted replay attack would be detected by finding recurring nonces. Second, TPM quotes also include a TPM timer. The timer is not necessarily accurate, but it is monotonically increasing, and a replayed record would necessarily carry a TPM timer value that is “out of sequence”.
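Both checks are simple to express over a stored sequence of quotes. The sketch below assumes each stored quote has already been reduced to a hypothetical `(nonce, tpm_clock)` pair, and it ignores real-world details such as TPM reset counters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    nonce: str      # random value supplied by the verifier
    tpm_clock: int  # timer value reported inside the quote


def find_replay_anomalies(quotes: list) -> list:
    """Flag reused nonces and timer values that do not strictly increase."""
    problems = []
    seen_nonces = set()
    highest_clock = None
    for index, quote in enumerate(quotes):
        if quote.nonce in seen_nonces:
            problems.append(f"record {index}: nonce {quote.nonce} reused")
        seen_nonces.add(quote.nonce)
        if highest_clock is not None and quote.tpm_clock <= highest_clock:
            problems.append(f"record {index}: timer {quote.tpm_clock} out of sequence")
        highest_clock = quote.tpm_clock if highest_clock is None else max(highest_clock, quote.tpm_clock)
    return problems


clean = [Quote("n1", 10), Quote("n2", 20), Quote("n3", 30)]
# The third record is a copy of the first, substituted for an inconvenient one.
replayed = [Quote("n1", 10), Quote("n2", 20), Quote("n1", 10), Quote("n4", 40)]

clean_report = find_replay_anomalies(clean)
replay_report = find_replay_anomalies(replayed)
```

The substituted record trips both checks at once, while the genuine record that follows it is not flagged.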

This makes a strong case that for the purposes of durable attestation, the Keylime verifier no longer needs to be trusted, which further minimizes risk. The (private) AK signs the TPM quote and therefore guarantees its integrity (and, by extension, the entire record’s integrity). The only missing link is the one between the key which uniquely identifies a device’s TPM hardware (the EK) and the AK. It is precisely to record this (temporary) association in a permanent, durable fashion that the aforementioned registrar will create an entry in rekor, as discussed above.

In fact, as a testament to the flexible implementation of this new feature in Keylime, the very same (verifier) code used for normal attestation can be reapplied, unmodified, to the time series of attestation records, in order to prove a device’s state at a given point in time.

Offline attestation

Offline attestation is the post-facto replay of attestation records with the explicit purpose of attesting system states from the past. An abbreviated form of the replay process runs as follows:

Assumptions
- Assume access to relevant durable registration and attestation records.
- Assume access to relevant attestation policy (hardware, software inventories).
- Furthermore, assume that signatures on relevant attestation policy and durable registration records have been recorded in a transparency log (e.g. rekor).
Authenticate attestation record(s)
- Authenticate registration records (check against transparency log, verify digital signatures).
- Authenticate the TPM quote enclosed within each attestation record (identify relevant public AK from registration record, verify the TPM quote’s signature with the public AK, guard against replay attacks by checking timer and nonce information).
- Authenticate remainder of attestation record using the TPM quote.
Verify attestation record against policy
- Retrieve and authenticate attestation policy (check against transparency log, verify digital signatures).
- Check attestation record against attestation policy and render a verdict.
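
The ordering of the steps above can be sketched as a single function. This is an illustration under heavy assumptions, not Keylime's offline tooling: the `AttestationRecord` layout is hypothetical, the quote is reduced to the register value it commits to, transparency-log and replay-attack checks are omitted, and `toy_verify` is a hash-based stand-in for real AK signature verification.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationRecord:
    ek: str            # identifies the TPM that produced the record
    quote: bytes       # simplified: the register value the quote commits to
    signature: bytes   # signature over the quote (toy stand-in here)
    event_log: tuple   # measurements that should replay to the quoted value


def replay_log(event_log) -> bytes:
    register = bytes(32)
    for measurement in event_log:
        register = hashlib.sha256(register + hashlib.sha256(measurement).digest()).digest()
    return register


def offline_verdict(record, registrations, good_values, verify_sig) -> str:
    ak = registrations.get(record.ek)  # authenticated EK -> AK registration
    if ak is None:
        return "reject: no registration for this EK"
    if not verify_sig(ak, record.quote, record.signature):
        return "reject: quote signature invalid"
    if replay_log(record.event_log) != record.quote:
        return "reject: event log does not match quote"
    return "pass" if record.quote in good_values else "fail"


def toy_verify(ak: str, quote: bytes, signature: bytes) -> bool:
    # Stand-in only: a real check would verify an asymmetric AK signature.
    return signature == hashlib.sha256(ak.encode() + quote).digest()


log = (b"firmware", b"kernel")
quote = replay_log(log)
record = AttestationRecord("ek-1", quote, hashlib.sha256(b"ak-1" + quote).digest(), log)
registrations = {"ek-1": "ak-1"}
print(offline_verdict(record, registrations, {quote}, toy_verify))
```

Authentication comes before policy evaluation, so a record that cannot be tied to a registered TPM never reaches the policy check.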

The basic guarantee of replaying a durable attestation record is that the verdict will be the same that was reached at the time the record was collected. This can be useful when examining a security incident in the past—attestation records contain information about systems that are no longer operational or may no longer even exist.

A simpler, more versatile trust model

Normal Keylime usage invests the deployed Keylime server components (verifier and registrar) with a high degree of trust. After all, they are trusted to render the correct verdict. A normal Keylime server is assumed to be running in a non-compromised environment.

With durable attestation, we no longer need to trust Keylime with rendering a verdict. Therefore Keylime no longer needs access to an up-to-date attestation policy. In effect, we are demoting Keylime to a mere data collection service, with the verdict being rendered on the collected data. Ideally, this would mean that Keylime can now operate in an untrusted environment. The only component falling short of this ideal is the Keylime registrar, which must still be trusted to record the correct EK→AK association.

This leads to a simpler trust model. Ideally, we need not trust anything but attestation policy records, and use these to evaluate the collected attestation records.

Note that Keylime is not even required to be involved in offline attestation at all. The records themselves follow standardized open formats (e.g. the TPM quote, the measured boot log, or the IMA log), at most slightly altered by data storage requirements (e.g. base64 encoding, JSON encapsulation, gRPC, etc.). A multiplicity of open source non-Keylime tools can be brought to bear for data processing.

Disassociating data collection from evaluation also gives us more options for sharing attestation information. It goes without saying that attestation records are extremely sensitive and sharing with third parties must be done carefully. However, sharing attestation records can be done with much finer granularity and better control than permitting access to the data center where the Keylime server is running.

Finally, note that offline attestation is not limited to replaying verdicts without alteration. With trivial alterations, in the future, it can also be used to provide answers to a variety of “what if” scenarios involving hypothetical attestation policies, e.g.:

- Would a system have passed attestation under a more restrictive attestation policy than was active at the time? If not, what percentage of the machines in the system would have passed?
- Given post-facto knowledge about a CVE, when was the first (and last) time a system was exposed to this CVE? What is the average time-to-patch of a system for CVEs?

Other scenarios, such as tenant-specific attestation policies on shared-use machinery, can also be devised, but are left to the imagination of the reader.
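
Questions of this kind reduce to simple queries once the attestation records are retained. The sketch below is illustrative only: the data layout (per-machine digest sets, and a history of `(timestamp, packages)` snapshots) and all package names, versions and digests are hypothetical.

```python
def pass_rate(fleet: dict, allowed: set) -> float:
    """Percentage of machines whose measured digests all appear in `allowed`."""
    if not fleet:
        return 0.0
    passed = sum(1 for digests in fleet.values() if digests <= allowed)
    return 100.0 * passed / len(fleet)


def exposure_window(history: list, package: str, vulnerable: set):
    """First and last attestation time at which a vulnerable version was seen."""
    hits = [t for t, packages in history if packages.get(package) in vulnerable]
    return (min(hits), max(hits)) if hits else None


# Hypothetical retained records for four machines.
fleet = {
    "node-a": {"d1", "d2"},
    "node-b": {"d1", "d3"},
    "node-c": {"d1"},
    "node-d": {"d2"},
}
original_rate = pass_rate(fleet, {"d1", "d2", "d3"})
stricter_rate = pass_rate(fleet, {"d1", "d2"})  # "what if" d3 had been disallowed?

# Hypothetical package history for one machine.
history = [
    (100, {"examplelib": "1.0"}),
    (200, {"examplelib": "1.0"}),
    (300, {"examplelib": "1.1"}),
]
window = exposure_window(history, "examplelib", {"1.0"})
```

Neither query touches the monitored machines; both run entirely against stored records, which is what makes such retrospective analysis possible.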

Prospective changes to Keylime

We should note that for all the discourse about reduced trust in Keylime, we have not removed the requirement that at least parts of Keylime operate in a trusted environment. The Keylime registrar continues to be vital to the trustworthiness of the collected attestation records.

With that being said, however, we foresee pervasive changes in the Keylime architecture. The most obvious consequence is that without a requirement to render an instant verdict, there is no actual need for a Keylime verifier. Data collection can be done by the attested systems doing their own data recording. Keylime also no longer needs to be fed accurate and timely attestation policy information.

The flip side of this simplicity is that the complexity of dealing with attestation policies becomes the offline attestation tool’s problem. The saving grace here is that there is no longer “a” verifier—evaluation of attestation records can be done by multiple tools on multiple schedules and produce multiple outcomes. Not least, this also allows the Keylime architecture better scaling and resilience. We expect Keylime offline attestation tooling to become more versatile and purpose-specific over time.

Conclusion

In this article we have presented the concept of durable attestation, essentially changing attestation from a point-in-time system (“only the last measurement matters”) to a time series (“all measurements are retained for posterity”). This outwardly simple change leads to several desirable consequences. Durable attestation allows retroactive auditing with Keylime, while at the same time reducing the degree of security necessary for the correct operation of Keylime. Durable attestation records also allow alternative interpretation (checking against more or less restrictive attestation policies than the original, as a way to answer “what if” questions).

An implementation of durable attestation in Keylime is underway at the time this article is published. In the short term, durable attestation will be an additional, experimental option for Keylime operators. But we envision a time when durable attestation will become more mainstream, and the default way to run Keylime securely in production, given its innate advantages.