Security/DNS Over HTTPS
This article describes the various mechanics of Firefox's DNS over HTTPS (DoH) frontend. See Trusted Recursive Resolver for details on TRR implementation in necko.
- The DoH frontend and its sub-features are gated behind prefs that are set to true via Normandy Rollouts, which allows us to target specific regions and control population size and growth so we can manage risk.
- The pref `doh-rollout.enabled` serves as a blanket gate. Every mechanism described below depends on this pref being set to true.
- Individual mechanisms may be additionally gated behind their own prefs. This is indicated where relevant.
- We run various heuristics to determine whether the network is suitable for DoH.
- The heuristics are run at startup and upon network changes.
- DoH is enabled on the network if all heuristics pass, and disabled otherwise.
- Main article: Security/DNS Over HTTPS/Heuristics.
- If we detect that the user changed their DoH settings in about:preferences, we permanently turn off our heuristics and other mechanisms. The user-set values are obeyed.
- This also holds for prefs that were set before the client was enrolled in the rollout.
- In order to avoid interfering with enterprise-configured network behaviors, we disable our heuristics and other mechanisms if any policy is active on the client.
- This is true whether the policy is configured on the local machine or propagated by the network, e.g. via Group Policy.
- If a DNSOverHTTPS policy that turns on DoH is in effect, it is respected, and heuristics and other mechanisms remain disabled.
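The gating rules above can be summarised in a minimal sketch. This is an illustration only, not the frontend's actual code: the `ClientState` type, the function names, and the heuristic names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ClientState:
    rollout_enabled: bool   # value of the doh-rollout.enabled pref
    user_changed_doh: bool  # user changed DoH settings in about:preferences
    policy_active: bool     # any enterprise policy is active on the client


def should_run_heuristics(state: ClientState) -> bool:
    """Heuristics run only if the rollout is on and nothing overrides it."""
    if not state.rollout_enabled:
        return False
    if state.user_changed_doh or state.policy_active:
        return False
    return True


def rollout_enables_doh(state: ClientState, results: Dict[str, bool]) -> bool:
    """The rollout enables DoH only if every heuristic passes."""
    return should_run_heuristics(state) and all(results.values())
```

For example, `rollout_enables_doh(ClientState(True, False, False), {"canary": True, "parental_controls": False})` evaluates to `False`, because one heuristic failed. A `False` result only means the rollout itself does not turn DoH on; user-set values or a policy may still do so.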
Default Provider Selection
- Before running heuristics for the first time, we attempt to choose one of the available providers as the default for the profile.
- The chosen default is used whenever DoH is enabled, via the pref `doh-rollout.uri`.
- A network-provided endpoint, if detected, will take precedence over the default provider when on that network (see Provider Steering below).
- This feature is controlled by the pref `doh-rollout.trr-selection.enabled`.
- The choice of provider is made by using each available provider to look up several popular domains as well as random subdomains of `firefox-dns-perf-test.net`.
- Default provider selection is done in two phases: a dry-run followed by committing the result.
- By default, this feature is dry-run-only, and records the result in a pref `doh-rollout.trr-selection.dry-run-result`.
- Committing the result is enabled by another pref `doh-rollout.trr-selection.commit-result`. If this is true, then after the dry-run step, the `dry-run-result` will be copied into `doh-rollout.uri`.
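The two-phase flow can be sketched as follows. The pref names come from this article; everything else is an assumption for illustration. In particular, the article does not say how the lookup results are compared, so the "lowest mean lookup time" criterion in `pick_provider` is a stand-in, and prefs are modelled as a plain dictionary.

```python
from statistics import mean
from typing import Dict, List

ENABLED_PREF = "doh-rollout.trr-selection.enabled"
DRY_RUN_PREF = "doh-rollout.trr-selection.dry-run-result"
COMMIT_PREF = "doh-rollout.trr-selection.commit-result"
URI_PREF = "doh-rollout.uri"


def pick_provider(timings_ms: Dict[str, List[float]]) -> str:
    """Assumed criterion: the provider with the lowest mean lookup time."""
    return min(timings_ms, key=lambda uri: mean(timings_ms[uri]))


def select_default_provider(prefs: Dict[str, object],
                            timings_ms: Dict[str, List[float]]) -> None:
    if not prefs.get(ENABLED_PREF, False):
        return
    # Phase 1 (dry run): always record the chosen provider.
    prefs[DRY_RUN_PREF] = pick_provider(timings_ms)
    # Phase 2 (commit): copy the result only if committing is enabled.
    if prefs.get(COMMIT_PREF, False):
        prefs[URI_PREF] = prefs[DRY_RUN_PREF]
```

With committing disabled, only `doh-rollout.trr-selection.dry-run-result` is written and `doh-rollout.uri` is left untouched, which lets the selection be observed before it has any effect.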
Provider Steering
- Some providers supply their own DoH endpoints, which we want to use if indicated.
- This capability is discovered via the CNAME response when looking up the domain `doh.test`.
- Discovery is only attempted if all heuristics are passing on the network.
- A DoH endpoint discovered in this manner takes precedence over the automatically chosen default provider (see Default Provider Selection above).
- A provider (endpoint + expected CNAME for discovery) must be explicitly supported for this mechanism to work.
- Currently, Comcast is the only supported provider.
- This feature is controlled by the pref `doh-rollout.provider-steering.enabled`.
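A rough sketch of the discovery step, under stated assumptions: the provider table, its hostnames, and the injected `resolve_cname` callable are hypothetical placeholders (the real expected CNAME and endpoint values are not given in this article), and prefs are modelled as a dictionary.

```python
from typing import Callable, Dict, Optional

STEERING_PREF = "doh-rollout.provider-steering.enabled"
DISCOVERY_DOMAIN = "doh.test"

# Hypothetical table of supported providers: expected CNAME -> DoH endpoint.
SUPPORTED_PROVIDERS: Dict[str, str] = {
    "doh-discovery.isp.example": "https://doh.isp.example/dns-query",
}


def steered_endpoint(prefs: Dict[str, object],
                     heuristics_pass: bool,
                     resolve_cname: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the network-provided endpoint, or None if steering does not apply."""
    if not prefs.get(STEERING_PREF, False) or not heuristics_pass:
        return None
    cname = resolve_cname(DISCOVERY_DOMAIN)
    if cname is None:
        return None
    # Only explicitly supported providers are accepted.
    return SUPPORTED_PROVIDERS.get(cname.rstrip(".").lower())


def effective_uri(prefs: Dict[str, object], steered: Optional[str]) -> Optional[object]:
    """A steered endpoint takes precedence over the chosen default."""
    return steered if steered is not None else prefs.get("doh-rollout.uri")
```

Passing the resolver in as a parameter keeps the sketch testable without touching the network; an unknown CNAME simply falls back to the default provider.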
- When the client is first enrolled in the rollout, we show a doorhanger popup to let the user know that DoH is available.
- This doorhanger offers an option to opt-out, which results in permanently disabling heuristics and other mechanisms.
- The doorhanger is shown only if the rollout is "successful", i.e. the user did not already have custom DoH preferences or an active enterprise policy.
- The doorhanger is implemented as a CFR message, gated behind the relevant prefs.
- Interaction and functional data is collected in the form of two telemetry Events.
- A state event is sent when the DoHController's state changes, e.g. when DoH is enabled or disabled on the network, when a user-choice results in disabling heuristics, when a rollback is detected, etc.
- A heuristics event is sent whenever we run heuristics, containing the result of each heuristic as its payload, along with the trigger (e.g. startup, network change) and the provider steering status.
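As an illustration of the heuristics event described above, the payload could be assembled along these lines. The field names (`trigger`, `steered`), the `pass`/`fail` strings, and the heuristic names are hypothetical; the actual event schema is not specified in this article.

```python
from typing import Dict


def build_heuristics_event(results: Dict[str, bool],
                           trigger: str,
                           steered: bool) -> Dict[str, str]:
    """Flatten heuristic results plus context into a string-valued payload."""
    payload = {name: "pass" if ok else "fail" for name, ok in results.items()}
    payload["trigger"] = trigger  # e.g. "startup" or "network change"
    payload["steered"] = "true" if steered else "false"
    return payload
```

A call such as `build_heuristics_event({"canary": True}, "startup", False)` yields one entry per heuristic plus the trigger and the provider steering status.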
- We have several migrations that keep the rollout working for users who upgrade from older versions of Firefox.
- Two of the migrations operate on the format of stored state (local storage and prefs).
- During a dry-run-only test of Default Provider Selection, an underlying bug was triggered that caused clients to effectively DDoS NextDNS's endpoint. In the aftermath, a new endpoint was set up and we have a migration to convert occurrences of the old endpoint in stored URI values to the new one.
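The endpoint migration amounts to rewriting stored URI values. A minimal sketch, assuming prefs are a dictionary and using placeholder URLs (the real old and new endpoints are not listed in this article):

```python
from typing import Dict, Iterable

OLD_ENDPOINT = "https://old-endpoint.example/dns-query"  # hypothetical
NEW_ENDPOINT = "https://new-endpoint.example/dns-query"  # hypothetical


def migrate_endpoint(prefs: Dict[str, object], uri_prefs: Iterable[str]) -> int:
    """Replace the old endpoint in the given prefs; return how many changed."""
    migrated = 0
    for name in uri_prefs:
        if prefs.get(name) == OLD_ENDPOINT:
            prefs[name] = NEW_ENDPOINT
            migrated += 1
    return migrated
```

The function is idempotent: running it a second time finds nothing left to convert and returns 0.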
- TODO: list all involved prefs and semantics.