CA/Responding To An Incident: Difference between revisions

Remove draft status, and minor edits
(fix typos)
(Remove draft status, and minor edits)
Line 1: Line 1:
{{draft}}
The page gives guidance to CAs as to how Mozilla expects them to react to reported misissuances, and what the best practices are. For the purposes of this page, a "misissuance" is defined as any certificate issued in contravention of any applicable standard, process or document - so it could be RFC non-compliant, BR non-compliant, issued contrary to the CA's CP/CPS, or have some other flaw or problem. Researchers who find CA misissuances are welcome to include a link to this page in their report to the CA, reminding the CA that Mozilla has the following expectations.
 
The page gives guidance to CAs as to how Mozilla expects them to react to reported misissuances, and what the best practices are. For the purposes of this page, a "misissuance" is defined as any certificate issued in contravention of any applicable standard, process or document - so it could be RFC non-compliant, BR non-compliant, issued contrary to the CA's CP/CPS, or have some other flaw or problem.


While some forms of misissuance may be seen as less serious than others, opinions vary on which these are. Mozilla sees all misissuances as good opportunities for the CA to test that their incident response processes are working well, and so we expect a similar level of timeliness of response and quality of reporting for all incidents, whatever their adjudged severity.
While some forms of misissuance may be seen as less serious than others, opinions vary on which these are. Mozilla sees all misissuances as good opportunities for the CA to test that their incident response processes are working well, and so we expect a similar level of timeliness of response and quality of reporting for all incidents, whatever their adjudged severity.


We do not expect perfection from any CA; it is true that our confidence in a CA is in part affected by the number and severity of incidents, but it is also significantly affected by the speed and quality of incident response.
To be clear on the status of this document: this is a best practices document, not an official policy, and does not use normative language. Therefore, failure to follow one or more of the recommendations here is not by itself sanctionable. However, failure to do so without good reason may affect Mozilla's general opinion of the CA. Our confidence in a CA is in part affected by the number and severity of incidents, but it is also significantly affected by the speed and quality of incident response.  


= Immediate Actions =
= Immediate Actions =
Line 11: Line 9:
In almost all cases, you should immediately cease issuance from the affected part of your PKI until you have diagnosed the source of the problem.
In almost all cases, you should immediately cease issuance from the affected part of your PKI until you have diagnosed the source of the problem.


Once the problem is diagnosed, you can restart issuance even if a full fix is not rolled out, if you are able to put in place temporary or manual procedures to prevent the problem re-occurring. You should not restart issuance until you are confident that the problem will not re-occur.
Once the problem is diagnosed, you may restart issuance even if a full fix is not rolled out, if you are able to put in place temporary or manual procedures to prevent the problem re-occurring. You should not restart issuance until you are confident that the problem will not re-occur.


= Revocation =
= Revocation =


It is normal practice for CAs to revoke misissued certificates. But that leaves the question about when this should be done, particularly if it's not possible to contact the customer immediately, or if they are unable to replace their certificate quickly. Section 4.9.1.1 of the CA/Browser Forum’s Baseline Requirements states:
It is normal practice for CAs to revoke misissued certificates. But that leaves the question about '''when''' this should be done, particularly if it's not possible to contact the customer immediately, or if they are unable to replace their certificate quickly. Section 4.9.1.1 of the CA/Browser Forum’s Baseline Requirements currently states (version 1.4.9):


<blockquote>
<blockquote>
Line 27: Line 25:
This means that, in most cases of misissuance, the CA has an obligation under the BRs to revoke the certificates concerned within 24 hours.
This means that, in most cases of misissuance, the CA has an obligation under the BRs to revoke the certificates concerned within 24 hours.


However, it is not our intent to introduce additional problems by forcing the immediate revocation of certificates that are not BR compliant when they do not pose an urgent security concern. Therefore, we request that your CA perform careful analysis of the situation. If there is justification to not revoke the problematic certificates, then your report will need to explain those reasons and provide a timeline for when the bulk of the certificates will expire or be revoked/replaced.
However, it is not our intent to introduce additional problems by forcing the immediate revocation of certificates that are not BR-compliant when they do not pose an urgent security concern. Therefore, we request that your CA perform careful analysis of the situation. If there is justification to not revoke the problematic certificates, then your report will need to explain those reasons and provide a timeline for when the bulk of the certificates will expire or be revoked/replaced.


If your CA will not be revoking the certificates within 24 hours in accordance with the BRs, then that will need to be listed as a finding in your CA’s BR audit statement.
If your CA will not be revoking the certificates within 24 hours in accordance with the BRs, then that will need to be listed as a finding in your CA’s BR audit statement.
Line 43: Line 41:
* Scan your corpus of certificates to look for others with the same issue. It does not look good for a CA to claim they have revoked all affected certificates and resolved the issue, and then for a researcher to discover another set of certificates with the same or a similar problem.
* Scan your corpus of certificates to look for others with the same issue. It does not look good for a CA to claim they have revoked all affected certificates and resolved the issue, and then for a researcher to discover another set of certificates with the same or a similar problem.


* Examine whether there are potential related problems which you can also remediate at the same time. For example, if the problem was bad data in a particular field, consider improving the validation of all fields in the certificate prior to issuance. You should be proactively looking for ways to harden your issuance pipeline against further problems.
* Examine whether there are potential related problems which you can also remediate at the same time. For example, if the problem was bad data in a particular field, consider improving the validation of all fields in the certificate prior to issuance. You should be proactively looking for [https://crt.sh/linttbscert ways] to harden your issuance pipeline against further problems.


* If, as happens in a regrettably large number of cases, a problem report was sent to your CA but action was not taken within 24 hours, investigate what happened to that report and whether your report handling processes are adequate.
* If, as happens in a regrettably large number of cases, a problem report was sent to your CA but action was not taken within 24 hours, investigate what happened to that report and whether your report handling processes are adequate.
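
The corpus scan described in the first bullet above can be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes the certificates have already been parsed into simple records, and the record fields (<code>id</code>, <code>common_name</code>, <code>san_dns_names</code>) and the example check are hypothetical stand-ins for whatever flaw was actually reported.

```python
def cn_missing_from_san(cert):
    """Hypothetical example check: the commonName is set but is not
    among the certificate's SAN dNSName entries."""
    cn = cert.get("common_name")
    sans = {name.lower() for name in cert.get("san_dns_names", [])}
    return cn is not None and cn.lower() not in sans


def find_affected(corpus, has_problem):
    """Return the IDs of every certificate in the corpus that the check flags."""
    return [cert["id"] for cert in corpus if has_problem(cert)]


# Hypothetical, already-parsed certificate records.
corpus = [
    {"id": 1, "common_name": "example.com",
     "san_dns_names": ["example.com", "www.example.com"]},
    {"id": 2, "common_name": "internal.example",
     "san_dns_names": ["example.org"]},
    {"id": 3, "common_name": None,
     "san_dns_names": ["example.net"]},
]

affected = find_affected(corpus, cn_missing_from_san)
print(affected)
```

Keeping the check as a separate function makes it easy to run the same scan with checks for similar or related problems, not only the one that was reported.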
Line 59: Line 57:
# Confirmation that your CA has stopped issuing TLS/SSL certificates with the problem.
# Confirmation that your CA has stopped issuing TLS/SSL certificates with the problem.
# A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
# A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
# A complete list of the problematic certificates. The recommended way to handle this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
# The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
# Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
# Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
# List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied by a timeline of when your CA expects to accomplish these things.
# List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied by a timeline of when your CA expects to accomplish these things.
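
The per-problem summary requested in the report (number of certificates, plus the dates the first and last affected certificates were issued) could be produced along these lines. This is only a sketch under stated assumptions: the record layout, problem labels, IDs and dates are all hypothetical sample data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (problem label, crt.sh ID, issuance date).
records = [
    ("bad-san", 1001, date(2017, 3, 1)),
    ("bad-san", 1002, date(2017, 5, 20)),
    ("long-validity", 1003, date(2017, 4, 11)),
]


def summarize(records):
    """Group certificates by problem and report the count and the
    first and last issuance dates for each group."""
    issued_by_problem = defaultdict(list)
    for problem, _cert_id, issued in records:
        issued_by_problem[problem].append(issued)
    return {
        problem: {
            "count": len(dates),
            "first_issued": min(dates),
            "last_issued": max(dates),
        }
        for problem, dates in issued_by_problem.items()
    }


summary = summarize(records)
for problem, info in summary.items():
    print(problem, info["count"], info["first_issued"], info["last_issued"])
```

Grouping by problem first also gives the "one list per distinct problem" structure recommended for the list of fingerprints or crt.sh IDs.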
Line 65: Line 63:
= Keeping Us Informed =
= Keeping Us Informed =


Once the report is posted, you should provide regular updates giving your progress, and confirm when the remediation steps have been completed. Such updates should be posted to the m.d.s.p. thread, if there is one, and the Bugzilla bug, if there is one.
Once the report is posted, you should provide regular updates giving your progress, and confirm when the remediation steps have been completed. Such updates should be posted to the m.d.s.p. thread, if there is one, and the Bugzilla bug, if there is one. The bug will be closed when remediation is completed.


= Examples of Good Practice =
= Examples of Good Practice =