(Created page with '{{draft}} == WEP 106 - Backoff Specification == * Champions: [mailto:mconnor@mozilla.com Mike Connor] * Status: Draft * Type: ? * Created: 2009 Sep 1 * Reference Implementation…') |
| Line 18: | Line 18: | ||
* Error Handling | * Error Handling | ||
** Handle 503 + Retry-After as an explicit "stop syncing, retry after the time given by the server" | ** Handle 503 + Retry-After as an explicit "stop syncing, retry after the time given by the server" | ||
** Handle all other HTTP errors as an immediate backoff. | ** Handle all other HTTP 5xx errors as an immediate backoff. | ||
** Handle all other errors as network issues, retry once on a normal schedule, then back off. | ** Handle all other errors as network issues, retry once on a normal schedule, then back off. | ||
* Backoff intervals | * Backoff intervals | ||
** If we receive 503+Retry-After, we will retry after that time plus a random amount of fuzz (to ensure that clients don't bunch up at the end of a service downtime) | ** If we receive 503+Retry-After, we will retry after that time plus a random amount of fuzz (to ensure that clients don't bunch up at the end of a service downtime) | ||
** For all other issues, we will follow a progressive series of intervals, with a significant amount of entropy to guard against traffic spikes. | ** For all other issues, we will follow a progressive series of intervals, with a significant amount of entropy to guard against traffic spikes (existing impl). | ||
* UI presentation | |||
** Initial backoff phase (first few attempts) | |||
*** A friendly notification that makes clear we will retry automatically and that the problem is probably temporary. | |||
*** Except for the 503+Retry-After case, users will be allowed to retry manually ''once''; if that sync also fails due to server issues, the manual sync UI will be disabled. | |||
** Secondary backoff phase (after three retry attempts) | |||
*** No option to manually sync; at this point the server clearly has major issues. | |||
*** Stronger warning, so users are aware data is not syncing. (Key principle: if we can't propagate the user's data, we should inform them.) | |||
* Open questions | |||
** In high-load situations where the service is overloaded but not dead, we probably want a way to tell the server that we're already in backoff mode (e.g. retryAttempt=2), so that instead of an all-or-nothing backoff the server can selectively update the users who are furthest behind. The big question is how to ensure that a rejected client's next sync attempt is still delayed, rather than happening again in 5 minutes. | |||
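The error handling, backoff interval and manual-sync rules above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the reference implementation: every name (<code>Action</code>, <code>classify</code>, <code>retry_after_delay</code>, <code>progressive_delay</code>, <code>manual_sync_allowed</code>) is hypothetical, and the constants (a 5 minute base interval, doubling up to a one day cap, 25% fuzz) are placeholders, since the draft defers the actual interval series to the existing implementation.

```python
import random
from enum import Enum
from typing import Optional


class Action(Enum):
    WAIT_FOR_SERVER = "wait_for_server"          # 503 + Retry-After
    BACK_OFF = "back_off"                        # any other HTTP 5xx
    RETRY_THEN_BACK_OFF = "retry_then_back_off"  # everything else


# Placeholder values; the draft does not specify them.
BASE_INTERVAL = 300.0    # seconds, assumed normal sync schedule
MAX_INTERVAL = 86400.0   # seconds, assumed cap on the backoff interval
FUZZ_FRACTION = 0.25     # assumed upper bound on Retry-After fuzzing

_RNG = random.Random()


def classify(status: Optional[int], retry_after: Optional[float]) -> Action:
    """Map a failed request onto the three error classes in the draft.

    status is None when no HTTP response was received at all.
    """
    if status == 503 and retry_after is not None:
        return Action.WAIT_FOR_SERVER
    if status is not None and 500 <= status <= 599:
        return Action.BACK_OFF
    # Read literally, the draft treats all remaining errors as network issues.
    return Action.RETRY_THEN_BACK_OFF


def retry_after_delay(retry_after: float, rng: random.Random = _RNG) -> float:
    """Server-specified wait plus positive fuzz, so clients do not bunch up."""
    return retry_after + rng.uniform(0.0, FUZZ_FRACTION * retry_after)


def progressive_delay(attempt: int, rng: random.Random = _RNG) -> float:
    """Growing interval with entropy: uniform in [ceiling / 2, ceiling]."""
    ceiling = min(MAX_INTERVAL, BASE_INTERVAL * (2 ** attempt))
    return rng.uniform(ceiling / 2.0, ceiling)


def manual_sync_allowed(action: Action, retry_attempts: int,
                        manual_tries_used: int) -> bool:
    """One manual retry in the initial phase, none after three retry attempts."""
    if action is Action.WAIT_FOR_SERVER:
        return False
    if retry_attempts >= 3:  # secondary backoff phase
        return False
    return manual_tries_used < 1
```

For example, <code>classify(503, 120.0)</code> yields <code>Action.WAIT_FOR_SERVER</code>, and <code>retry_after_delay(120.0)</code> then returns a wait between 120 and 150 seconds. The fuzz is only ever added, never subtracted, so that a client does not retry earlier than the server asked.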
=== Pre-Requirements === | === Pre-Requirements === | ||