NSS OCSP Brainstorming


This page is meant as a place to share ideas about the behaviour of OCSP requests and caching in NSS.

NSS' verification behaviour

NSS uses a global setting to have OCSP either enabled or disabled.

What happens when NSS attempts to use OCSP to verify a certificate?

NSS will:

  • open a connection to the OCSP server using HTTP
  • send a request, receive a response
  • verify the validity (signature, freshness) of a response

Only after all of the above succeeds will NSS make use of the received response.

In the past, if OCSP was enabled and NSS considered a certificate for OCSP checking, NSS strictly required a successful communication with a valid response, as explained above. If there was any failure, NSS treated the cert as invalid.

NSS continues to offer this strict behavior and uses it by default. The mode is called ocspMode_FailureIsVerificationFailure.

Starting with NSS 3.11.7 an application may globally set NSS' behavior to a relaxed mode called ocspMode_FailureIsNotAVerificationFailure.

In the relaxed mode, OCSP communication will be attempted, but its success is optional. Any failure during the OCSP protocol or the response verification will be treated as "no response available" and cert verification will be limited to the other checks.

In the relaxed mode, only a valid response that indicates a revoked certificate will cause NSS to reject a certificate. Such a response might be cached. Once information about a revoked certificate has been cached, NSS will reject the certificate. NSS might continue to ask an OCSP server about the current certificate status, but a failure to obtain a valid response would not override the previously obtained revocation information.
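The difference between the two modes can be sketched as a small decision function. This is only an illustrative model in Python, not NSS code: the names `FailureMode`, `OcspResult`, and `cert_passes_ocsp` are made up for this sketch, and the choice to let a new valid "good" response take precedence over cached revocation information is an assumption that the text above does not settle.

```python
from enum import Enum


class FailureMode(Enum):
    # Values mirror the NSS mode names quoted above; the class is hypothetical.
    STRICT = "ocspMode_FailureIsVerificationFailure"
    RELAXED = "ocspMode_FailureIsNotAVerificationFailure"


class OcspResult(Enum):
    GOOD = "good"        # valid response, certificate not revoked
    REVOKED = "revoked"  # valid response, certificate revoked
    FAILURE = "failure"  # no connection, bad signature, stale response, ...


def cert_passes_ocsp(mode, result, cached_revoked=False):
    """Return True if the OCSP step lets certificate verification continue."""
    if result is OcspResult.REVOKED:
        return False
    if result is OcspResult.GOOD:
        # Assumption: a new valid response replaces older cached information.
        return True
    # From here on the OCSP attempt failed.
    if cached_revoked:
        # A failure never overrides previously obtained revocation information.
        return False
    # Strict mode rejects on failure; relaxed mode treats it as "no response".
    return mode is FailureMode.RELAXED
```

In this model, a failed request passes in relaxed mode and fails in strict mode, while a cached revocation keeps the certificate rejected in both.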

NSS' internal OCSP Cache

Starting with NSS version 3.11.7 NSS will cache OCSP responses.

As of version 3.11.7, the cache will be cleared when the application switches any OCSP settings.

As of version 3.11.7, if the OCSP server sends information for multiple certificates, only the information received about the certificate of interest will be added to the cache. It has been proposed to optimize this in the future; however, it must be ensured that bulky responses will not push more important information out of the cache.

NSS uses a lower limit on retrying OCSP, which is set to 1 hour by default (as of version 3.11.7). In relaxed mode, once NSS has tried to obtain an OCSP response, it will not try to fetch one again until this period has ended. Even if no valid response could be obtained, NSS will remember the failure and not retry until the period has ended.

This lower limit is used differently in strict mode. If NSS has no information cached at all about a certificate, it will attempt to talk to the OCSP server each time verification of such a certificate is requested. However, once a response has been received, NSS will use the cached information and not talk to the OCSP server until the lower time boundary has passed.
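The retry rule described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the NSS implementation: `CacheEntry` and `should_contact_server` are hypothetical names, times are plain seconds, and the sketch covers only the lower limit, not the freshness check on cached responses.

```python
from dataclasses import dataclass

MIN_SECONDS_BETWEEN_FETCHES = 60 * 60  # 1 hour, the default named above


@dataclass
class CacheEntry:
    last_attempt: float        # time of the last fetch attempt, in seconds
    has_valid_response: bool   # False if that attempt produced nothing usable


def should_contact_server(strict, entry, now):
    """Decide whether the lower limit allows a new OCSP request."""
    if entry is None:
        # Nothing is known about this certificate: both modes try.
        return True
    if strict and not entry.has_valid_response:
        # Strict mode keeps trying while it has no cached information.
        return True
    # Otherwise wait until the lower limit has passed since the last attempt.
    return now - entry.last_attempt >= MIN_SECONDS_BETWEEN_FETCHES
```

For example, ten minutes after a failed attempt the function returns False in relaxed mode and True in strict mode.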

An OCSP response may contain optional next-update information. A CA or OCSP server can use this to communicate the earliest time at which more recent information might be available. In theory, a cached response may be considered fresh until the next-update time has been reached.

However, some CAs use next-update values of weeks or months. Because of that, NSS uses an upper boundary to decide whether a cached response is still fresh. As of NSS version 3.11.7 the upper boundary is 24 hours.

As is pointed out below, this behaviour might be problematic in strict mode. If a CA decides to produce new responses every 48 hours, they will set the cache control headers on the HTTP response containing the OCSP response appropriately. This will mean an intermediate cache is perfectly allowed to hold on to the response for 48 hours, and so checks will start failing after the upper boundary (24 hours) is reached. - Gerv

Once NSS considers a cached OCSP response to be no longer fresh, it will attempt to obtain a new response. In relaxed mode, NSS will ignore failures. In strict mode, however, NSS requires a new valid response and otherwise rejects the cert as invalid.
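The interaction between next-update and the upper boundary can be illustrated with a short sketch. It is a simplified model, not NSS code: the function names are invented here, times are seconds, and measuring the 24 hours from the moment the response was obtained (`fetched_at`) is an assumption.

```python
HOUR = 60 * 60
MAX_SECONDS_FRESH = 24 * HOUR  # upper boundary named above


def fresh_until(fetched_at, next_update=None):
    """Time up to which a cached response counts as fresh."""
    cap = fetched_at + MAX_SECONDS_FRESH
    if next_update is None:
        # Assumption: without next-update, only the upper boundary applies.
        return cap
    # A distant next-update is cut off at the upper boundary.
    return min(next_update, cap)


def is_fresh(now, fetched_at, next_update=None):
    return now < fresh_until(fetched_at, next_update)
```

With a next-update 48 hours ahead, `fresh_until(0, 48 * HOUR)` yields `24 * HOUR`, so a response from a CA on a 48-hour schedule stops being fresh in this model halfway through its intended lifetime.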

HTTP POST vs. HTTP GET

As of today NSS always uses HTTP POST when talking to OCSP servers. Using POST prevents caching of OCSP responses in proxy servers.

It has been proposed that NSS shall support HTTP GET. An application shall be able to instruct NSS to use GET by default.

It is unclear whether all deployed OCSP servers fully support HTTP GET. If there are doubts, NSS would have to fall back to POST on failures. This would cause additional traffic for those servers that do not support GET. Ideally NSS might remember (during a process lifetime) which servers fail to support GET and use POST for them for the remainder of the session. (Is this smartness really required? Opinions?)

This problem is due to some braindeadedness on the part of the person who defined the GET interface for OCSP. I know that the Red Hat server doesn't, and Tumbleweed does. We could do further investigation if necessary. Might it be possible for NSS to keep a blacklist of the Server identification strings of bad servers, and use POST only in those cases after the first request fails? - Gerv
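The "try GET, fall back to POST, and remember the failure" idea could look roughly like the following. This is a hypothetical Python sketch rather than a proposal for the NSS API: `OcspTransport`, `send_get`, and `send_post` are invented names, the two callables stand in for real HTTP code and return the response bytes or None on failure, and servers are keyed by URL instead of by a server identification string.

```python
class OcspTransport:
    """Prefer GET, fall back to POST, and remember servers where GET failed."""

    def __init__(self, send_get, send_post):
        self._send_get = send_get      # hypothetical callable(url, der) -> bytes or None
        self._send_post = send_post    # hypothetical callable(url, der) -> bytes or None
        self._post_only = set()        # kept only for the lifetime of this object

    def fetch(self, server_url, request_der):
        if server_url not in self._post_only:
            response = self._send_get(server_url, request_der)
            if response is not None:
                return response
            # GET failed: use POST for this server from now on.
            self._post_only.add(server_url)
        return self._send_post(server_url, request_der)
```

With a server that never answers GET, the first call costs one GET and one POST, and every later call costs a single POST, so the extra traffic is limited to one failed request per server and process.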

When using GET we open a new class of problems for the OCSP-client to OCSP-server communication, because responses can originate from a proxy server cache.

NSS has a requirement that a response is fresh. As of today this means, the timestamp contained in a signed OCSP response must be at least 24 hours old.

This sentence is unclear. Do you mean "must be less than 24 hours old"? - Gerv

A problem arises in the following scenario:

  • the OCSP server issues an OCSP response
  • the proxy server decides the response shall be valid for more than 24 hours
  • after 24 hours an NSS client receives the old response from a proxy cache and rejects the response as invalid, because the timestamp is not fresh

This is a small problem in relaxed mode. NSS will ignore the bad response. However, in the case of a revocation, it disables the CA's ability to push out a new response.

See above for my comments on this problem. The decision by a proxy server to hold onto the response will be based on the cache control headers sent by the CA, and so any disabling of their ability to do anything is their own fault. However, I think we should consider the approach of removing NSS's upper limit, or allowing it to be disabled. We should assume that the CAs are competent enough to set the nextUpdate dates and cache headers to match how often the data actually is updated - which means that there's no advantage in ignoring it and making requests more often. We just increase their traffic and annoy them. - Gerv

In strict mode the situation is worse. NSS will reject the old response and give up, even though the real OCSP server would have been available to provide a more recent answer.

How should this problem get solved?

Another topic: It has been proposed that CAs might produce bulk responses, such as a single response for a group of 50 certificates. It has been said that a proxy cache might then need to carry only a single response for this group. However, it seems this will not work, because the requests for each individual cert will look completely different, and therefore the proxy server will see different "keys" and will be unable to group those requests and responses.
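The cache-key argument can be made concrete with a small sketch. It assumes the GET form from the OCSP specification (RFC 2560, Appendix A), where the URL-encoded base64 of the DER request is appended to the responder URL, and a proxy that keys its cache on the full URL. The function name and the placeholder byte strings are invented for illustration; they are not real DER-encoded OCSP requests.

```python
import base64
from urllib.parse import quote


def ocsp_get_url(responder_url, request_der):
    """Build a GET URL: responder URL + "/" + url-encoded base64 of the request."""
    b64 = base64.b64encode(request_der).decode("ascii")
    return responder_url.rstrip("/") + "/" + quote(b64, safe="")


# Placeholder bytes standing in for two single-certificate requests.
url_a = ocsp_get_url("http://ocsp.example.test", b"request-for-cert-A")
url_b = ocsp_get_url("http://ocsp.example.test", b"request-for-cert-B")
```

Because `url_a` and `url_b` differ, a proxy keyed on the URL stores a separate entry for each certificate, even if the server answered both requests with the same bulk response.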