Gaia/System/Updates/GeckoGaia

From MozillaWiki

Gecko Updates

Overview

  • Will be updated via:
    • Full-System FOTA updates (Gonk+Gecko+Gaia) [or]
    • Atomically w/ Gaia (Gonk not included)
  • Frequency:
    • Updates will probably need to happen somewhere between the current Desktop (6 week) and Extended Support Release (42 week) intervals. Will depend also on input from carriers and OEMs.
    • Current proposal is to offer regular updates every 18 weeks. This frequency offers Mozilla and our partners the ability to update functionality on the device at a quicker pace than competitor OS stacks (iOS and Android), but at the same time not overwhelm our Carrier partners who may not be used to updating software so frequently.
  • Backups:
    • Requirement is to offer a back-up instance of Gecko so that we can fail over when necessary (for example, if we ship an updated Gecko version that introduces a bug).
    • Probably a Phase 2 item.
  • Cost: Users will not be charged any carrier network fees for any Gecko update, as agreed upon by Telefonica, who will be making these updates via a private APN.
  • User can view current version information from Settings > Device > Device Information.
  • Update process:
    • Download:
      • Files will be binary diff'd to minimize size.
      • If the user interrupts the download (turns on Airplane Mode, powers off the device, etc.) we can theoretically complete it at a later time.
      • Necko has the ability to download ranges, but this also needs server support. We certainly could require such support though. (as per Jonas)
    • Install
      • Time to apply will vary depending on size of update, internal disk speed, device hardware spec, etc
      • Will require restarting the b2g process (approx. 10 seconds), but not rebooting the device. A reboot _may_ be required as a fail-safe if /system is somehow left read-write after the updater has finished.
      • Battery life: We can detect current battery level, but not drain rate (which varies with battery age). It will also be difficult to estimate the amount of power required to complete an install. Therefore we should build healthy margins into any "minimum required battery" thresholds.
  • User prompts:
    • ...

Bugs

Questions

Open

  • Do we need to have a mechanism for pushing extra-critical updates?
  • Do we check for available device storage before downloading? If insufficient, how do we mitigate?
  • How much user agency do we provide over installs? Can they defer? For how long? What affordances do we make for out of date Gecko?
  • How can the user review the currently-installed version? From Settings?

Answered

  • What is the sequence of events (eg: prompt user to restart device, whereupon install process runs?)
    • (marshall_law)
    • For user prompting sequence / rules, see the requirements laid out by cjones in these bugs:
    • Currently, the plan is to automatically download any Gecko updates to /data/local/updates. IIRC we are working w/ carriers to make sure this isn't billable data.
    • After an update is downloaded, it is staged in /system/b2g/updated, and the b2g process is cleanly shut down (and then restarted by the system).
    • Soon after bootup of the new b2g process, the updater runs again, copying the staged updates into /system/b2g to make them live.
    • Note: The updater process will re-mount /system as read-write when it starts, and back to read-only when it exits. This will happen for both staging, and copying the staged files in place. In the event of a failure remounting /system as read-only (before exit), the device will be restarted, allowing the system to mount /system as read-only. See https://bugzilla.mozilla.org/show_bug.cgi?id=764683 for more details
    • (/marshall_law)
  • How often should we check for Gecko update? CLee's etherpad outline says every 18 weeks.
    • (sicking) This sounds too rare given that we ship Gecko updates every 6 weeks which always contain security updates. Usually we count on security researchers being able to reverse engineer those updates and create exploits for older versions of Gecko.
    • (marshall_law) IIRC the reasoning had to do with risk mitigation from carriers (it was a compromise of some kind?)
  • Can we confirm that there will be a back-up instance of Gecko in the event of failed updates? If so, what will the sequence of its application be?
  • Does being on 3G/Edge affect when we check for Gecko updates?
    • Should not, since updates will be free OTA via Carrier's private APN.
  • What, if anything, should we tell the user when a Gecko update is detected? Should we behave differently if the user is on a 3G/Edge connection when we detect that an update is available?
    • Download update and apply silently in background, same as Gonk process? Might be too intrusive for these more frequent (18 weeks) Gecko updates?
    • (marshall_law) see above
  • Do we have the technical ability to download the update in the background and only notify the user once the new version is available?
    • (marshall_law) yes
  • Should we do anything special if the phone has been turned off for a few days and is then started?
    • (marshall_law) IIRC, the existing Gecko update internals already have the logic for knowing how long it's been since the last update check. They should be able to detect this and do an update check as soon as the phone is on. We should confirm this, though.
  • Should we inform users about how big updates will be before downloading them? Do we have the ability to tell before doing the actual download?
    • (marshall_law) We do have the ability -- the update MAR format specifies update size. We don't currently have plans to inform about the size of an update, but feel free to chime in on any of the bugs listed on how that might work.
  • Do we require the user to plug in the phone if battery is below X %?
    • (sicking) See answer for Gonk
  • What prompts do we present to user?
    • (marshall_law) see above
  • Do we have a rollback strategy for failed installs? Previous April discussion w/ cjones indicated no...
    • (marshall_law) we do in Phase 2, which I don't think we plan to have for v1. Will need to check w/ cjones to verify though
  • What is time to install? Several minutes?
    • (marshall_law) It all depends on the size of the update, the speed of the internal disk, the device hardware.. :) We might want to profile this in a few configurations..
  • Can we avoid user friction by downloading silently, and only during periods of user inactivity? Would not want to slow down web browsing while update processed in background, for example.
    • (marshall_law) yes, see the bugs about prompting
  • How many device reboots are required in the process?
    • (marshall_law) for Gecko, there shouldn't be any. only a process restart is required. the only time a restart might happen is if /system is somehow left in read-write after the updater is finished, as a fail safe.
  • How large are these updates?
    • (marshall_law) they are binary diff'd, so potentially not "huge", but again this all depends on how big the update is. definitely smaller than if we were downloading fresh binaries.
  • If the user powers down the device while an update is silently downloading in background, can we resume download later on?
    • (sicking) See answer for Gonk
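
The staging sequence marshall_law describes under "What is the sequence of events" can be sketched roughly as follows. This is a simulation only, not the real updater: the `FakeSystemPartition` class, the function names, and the use of ordinary directories in place of /system/b2g and /data/local/updates are all assumptions made for illustration. Removing the staged directory after it is applied is also an assumption; the answer above does not say what happens to it.

```python
import shutil
from pathlib import Path


class FakeSystemPartition:
    """Hypothetical stand-in for /system that only tracks the mount mode."""

    def __init__(self, root: Path, remount_ro_fails: bool = False) -> None:
        self.root = root
        self.read_write = False
        self.remount_ro_fails = remount_ro_fails  # simulate the failure case
        self.reboot_requested = False

    def remount_rw(self) -> None:
        self.read_write = True

    def remount_ro(self) -> None:
        if self.remount_ro_fails:
            # Fail-safe: restart the device so /system is mounted read-only again.
            self.reboot_requested = True
        else:
            self.read_write = False


def run_updater(system: FakeSystemPartition, action) -> None:
    """Remount read-write, run one updater action, then go back to read-only."""
    system.remount_rw()
    try:
        action()
    finally:
        system.remount_ro()


def stage_update(system: FakeSystemPartition, downloaded: Path) -> None:
    """First updater run: copy the downloaded files into b2g/updated."""
    staged = system.root / "b2g" / "updated"
    run_updater(system, lambda: shutil.copytree(downloaded, staged, dirs_exist_ok=True))


def apply_staged_update(system: FakeSystemPartition) -> None:
    """Second updater run, after the b2g restart: make the staged files live."""
    live = system.root / "b2g"
    staged = live / "updated"

    def copy_into_place() -> None:
        for item in staged.iterdir():
            target = live / item.name
            if item.is_dir():
                shutil.copytree(item, target, dirs_exist_ok=True)
            else:
                shutil.copy2(item, target)
        shutil.rmtree(staged)  # assumption: staged copy is discarded once applied

    run_updater(system, copy_into_place)
```

Wrapping both runs in the same `run_updater` helper mirrors the note that the read-write/read-only remount happens for staging and for copying into place alike.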


Gaia Updates

Introduction

  • Gaia updates are related to anything that may modify the user interface and experience of the OS. The update interval will also be every 18 weeks, to align with Gecko updates.
    • (ladamski) To be clear, Gaia apps will be updated as part of the Gecko update, correct? See below.
  • Updates that happen to the Core Apps (Dialer, SMS, Camera, etc.) will happen silently and users will not be charged any carrier network fees for Gaia System and Core App updates (similar to Gecko updates via a private APN). All core apps should be updated simultaneously so that a single B2G version represents the full stack.

Open questions

  • How much are we testing Core app updates before delivering these updates to the APN?
    • This will be tested at the same level we are testing standard 3G connectivity to the carrier network? From CLee's etherpad outline...
  • Same questions from Gecko apply here:
  • What is the sequence of events (eg: prompt user to restart device, whereupon install process runs?)
    • How different is process from Gecko updates?
  • How often should we check for updates?
  • Will there be a back-up instance of Gaia in the event of failed updates? If so, what will the sequence of its application be?
  • Does being on 3G/Edge affect when we check for Gecko updates?
    • Should not, since updates will be free OTA via Carrier's private APN.
  • What, if anything, should we tell the user when a Gecko update is detected? Should we behave differently if the user is on a 3G/Edge connection when we detect that an update is available?
    • Download update and apply silently in background?
  • Do we have the technical ability to download the update in the background and only notify the user once the new version is available?
  • Should we do anything special if the phone has been turned off for a few days and is then started?
  • Do we need to have a mechanism for pushing extra-critical updates?
  • Should we inform users about how big updates will be before downloading them? Do we have the ability to tell before doing the actual download?
  • Do we require the user to plug in the phone if battery is below X %?
  • What prompts do we present to user?
  • Do we have a rollback strategy for failed installs? Previous April discussion w/ cjones indicated no...
  • What is time to install? Several minutes?
  • Can we avoid user friction by downloading silently, and only during periods of user inactivity? Would not want to slow down web browsing while update processed in background, for example.
  • How many device reboots are required in the process?
  • Do we provide link to changelog so user can review update details before installing?
  • How large are these updates?
  • Do we check for available device storage before downloading? If insufficient, how do we mitigate?
  • If the user powers down the device while an update is silently downloading in background, can we resume download later on?
  • How much user agency do we provide over installs? Can they defer? For how long? What affordances do we make for out of date Gaia + Core apps?
  • How can the user review the currently-installed version? From Settings?


Draft process for atomic Gecko+Gaia updates

Josh Carpenter, Aug 15

Overview

Design principles

  • Low-friction. Minimize user interruptions, connection speed impacts, etc.
  • Free. Avoid user charges.
  • Safe. Minimize chances and consequences of failed updates.
  • Patient. Support backwards compatibility for users who cannot update.
  • Friendly. Avoid presenting users with excess technical details.

Steps

Diagram: Gecko+Gaia Update Process v1 (PDF)

  1. Check for update
  2. Confirm available drive space
  3. Check connection
  4. Download
  5. Check Battery
  6. Install
  7. Follow-up
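
As a rough illustration of how these seven steps chain together, the sketch below runs them in order and stops at the first one that fails, which is where the process would enter a Wait state. The `Step` enum and `run_update_process` function are hypothetical names, and each step is reduced to a callable returning True or False; the real checks are described in the sections that follow.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class Step(Enum):
    CHECK_FOR_UPDATE = 1
    CONFIRM_DRIVE_SPACE = 2
    CHECK_CONNECTION = 3
    DOWNLOAD = 4
    CHECK_BATTERY = 5
    INSTALL = 6
    FOLLOW_UP = 7


def run_update_process(checks: Dict[Step, Callable[[], bool]]) -> Optional[Step]:
    """Run each step in order.

    Returns the step that failed (the process would wait and retry there),
    or None if every step succeeded.
    """
    for step in Step:  # Enum members iterate in definition order
        if not checks[step]():
            return step
    return None
```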

1. Check for update

Automatic (push)

  • Update server pushes silent "update available" notification to device.

Automatic (poll)

  • Device checks with server for available update at scheduled time/interval.

Manually

  • User initiates "Check for Updates" via UI input. Probably from Settings > Device > Device Information (although this has not been specced yet).
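
The three triggers, and the "is a poll due?" test, might be modelled along these lines. The names `Trigger`, `is_automatic` and `poll_is_due` are hypothetical, and the interval is left as a parameter because the actual polling schedule has not been decided here.

```python
from datetime import datetime, timedelta
from enum import Enum


class Trigger(Enum):
    PUSH = "push"      # server sends a silent "update available" notification
    POLL = "poll"      # device checks at a scheduled time/interval
    MANUAL = "manual"  # user chooses "Check for Updates"


def is_automatic(trigger: Trigger) -> bool:
    """Later steps fail silently for automatic triggers and may prompt for manual ones."""
    return trigger in (Trigger.PUSH, Trigger.POLL)


def poll_is_due(last_check: datetime, now: datetime, interval: timedelta) -> bool:
    """True once a full interval has elapsed since the last check.

    A device that was powered off past its scheduled check is therefore
    due as soon as it starts up again.
    """
    return now - last_check >= interval
```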

2. Confirm available drive space

  1. Check size of download.
  2. Check there is sufficient storage on device to download the update. Define minimum sufficient storage as some multiple of the update file size, in order to leave sufficient room for day to day device operation.
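
A minimal sketch of the "multiple of the update file size" rule follows. The multiplier of 2.0 is a placeholder, not a proposed value, and the function names are hypothetical; `shutil.disk_usage` is used only to show where the free-space figure could come from.

```python
import math
import shutil


def required_free_bytes(update_size: int, multiplier: float = 2.0) -> int:
    """Minimum free space needed: the update size times a safety multiplier."""
    return math.ceil(update_size * multiplier)


def has_sufficient_storage(update_size: int, free_bytes: int,
                           multiplier: float = 2.0) -> bool:
    """True if the device has room for the update plus day-to-day headroom."""
    return free_bytes >= required_free_bytes(update_size, multiplier)


def free_bytes_at(path: str) -> int:
    """Free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free
```

For example, with the placeholder multiplier a 10 MB update needs 20 MB free, so `has_sufficient_storage(10_000_000, 15_000_000)` is False.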

Sufficient

Proceed to next step.

Insufficient

Two possibilities: fail silently, or user prompt.

Fail Silently

Fail silently if update process was initiated Automatically (Push or Poll).

  • Update fails and goes into Wait state.
  • X minutes/hours/days (?) elapse -> Check sufficient storage again.

Prompt

Prompt user if update process was initiated Manually, and device is On and Unlocked. Contents as follows:

  • Image: Icon
  • Title: "Update available"
  • Body: "A FxOS update is available, but there is not enough space to download it. Free up space by deleting videos, files, etc."
  • Input: "OK"

User presses [OK]:

  • Update fails and goes into Wait state.
  • X minutes/hours/days (?) elapse -> Check sufficient storage again.

Because these prompts interrupt the user, we should consider a nuanced approach. For example, upon second prompt give user option to not be reminded again (although that creates fragmentation problems), or to choose between two varying reminder intervals (eg: 1 Day and 1 Week).

3. Check connection

Ensure updates are free for user by taking into account connection type before downloading.

No connection

The Network Status is one of following:

  • Airplane Mode
  • Searching
  • No Network

Update process can encounter this when:

  • Process is Manually initiated.
  • Process was Automatically initiated, but delayed due to insufficient drive space, and connection type has changed in meantime.

The Update process fails. Two possibilities (same as the two possibilities under "Paid", below).

WiFi

Proceed to next step.

Free (APN)

Proceed to next step.

Paid

User is on a paid data connection, probably Roaming. To avoid incurring charges, the update fails. Two possibilities:

Silent

Fail silently if update process was initiated Automatically (Push or Poll).

  • Update fails and goes into Wait state.
  • Exit Wait state when one of following occurs:
    • Connection type changes (eg: upon connection type change push the new type to the Updater)
    • Time interval passes (eg: check connection type every X hours/days)
    • New update push notification is received. Restart update process.

Prompt

Prompt user if update process was initiated Manually, and device is On and Unlocked. Contents as follows:

  • Image: Icon
  • Title: "Cannot download update" (verbiage TBD)
  • Body: "There is no data connection. FxOS cannot be updated. Connect to a WiFi or Data connection." (verbiage TBD)
  • Input: "OK"

User presses [OK]:

  • Update fails and goes into Wait state.
  • Exit Wait state when one of following occurs:
    • Connection type changes (eg: upon connection type change push the new type to the Updater)
    • Time interval passes (eg: check connection type every X hours/days)
    • New update push notification is received. Restart update process.

There's a lot of room to improve the flow of this scenario, e.g.:

  • Detect reason for failure (eg: Airplane mode vs Roaming) and tailor prompt accordingly (eg: present option to turn Airplane Mode off).
  • Check available WiFi connections and offer user chance to connect if an Open and/or Remembered network is detected.
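
The connection rules in this step can be summarised as a small decision function. This is a sketch under stated assumptions: the enum and function names are hypothetical, and the case of a manually initiated update on a locked device is not covered by the draft, so it is treated here as a silent failure.

```python
from enum import Enum


class Connection(Enum):
    NONE = "none"          # Airplane Mode, Searching, or No Network
    WIFI = "wifi"
    FREE_APN = "free_apn"  # carrier's private APN, no charge to the user
    PAID = "paid"          # e.g. Roaming


class Action(Enum):
    PROCEED = "proceed"
    WAIT_SILENTLY = "wait_silently"
    PROMPT_THEN_WAIT = "prompt_then_wait"


def connection_action(connection: Connection, manually_initiated: bool,
                      device_on_and_unlocked: bool) -> Action:
    """Decide what the updater does for the current connection type."""
    if connection in (Connection.WIFI, Connection.FREE_APN):
        return Action.PROCEED
    # No connection and Paid are handled the same way: the update fails.
    if manually_initiated and device_on_and_unlocked:
        return Action.PROMPT_THEN_WAIT
    return Action.WAIT_SILENTLY  # assumption: also used for manual + locked


def should_exit_wait(connection_changed: bool, interval_elapsed: bool,
                     new_push_received: bool) -> bool:
    """Any one of the three listed events ends the Wait state."""
    return connection_changed or interval_elapsed or new_push_received
```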

4. Download

Once initiated, downloads are as follows:

  • Silent & invisible (no visible UI such as progress bars)
  • Pause for user network activity
  • Automatically resume and complete after an interruption, e.g.:
    • Connection type changes to Roaming.
    • Connection type changes to Airplane Mode.
    • User turns off device.
    • Battery dies.

We should also create error handler for: Storage space dropped below minimum threshold during update download. eg: While update is downloading, user copies video to device storage via USB.

Once download is complete, proceed to next step.
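
One way to resume an interrupted download is an HTTP range request that asks only for the bytes not yet on disk, which (as noted earlier on this page) also needs server support. The sketch below is not Necko code: `resume_download` and the `fetch` callable are hypothetical, with `fetch` standing in for whatever performs the request and yields the response body in chunks.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def range_header(bytes_on_disk: int) -> Dict[str, str]:
    """HTTP Range header requesting everything after what we already have."""
    if bytes_on_disk <= 0:
        return {}
    return {"Range": f"bytes={bytes_on_disk}-"}


def resume_download(partial: Path,
                    fetch: Callable[[Dict[str, str]], Iterable[bytes]]) -> int:
    """Append the remaining bytes to `partial` and return its new size.

    Assumes the server honours the Range header. If it ignored the header
    and sent the whole file, appending would corrupt the download, so a
    real client would need to check the response status first.
    """
    offset = partial.stat().st_size if partial.exists() else 0
    with partial.open("ab") as out:
        for chunk in fetch(range_header(offset)):
            out.write(chunk)
    return partial.stat().st_size
```

Calling `resume_download` again after each interruption picks up from the current file size, so the same function covers the first attempt and every retry.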

5. Check Battery

In order to minimize risk of failed update, check battery level before starting installation. Defer or proceed depending on % remaining.

Given the following caveats:

  • System can detect current battery level, but not drain rate (varies with battery age)...
  • It will be difficult to estimate the amount of power required to complete an install, given variations in device specs, update sizes, etc...

...we should build healthy margins into any "minimum required battery" thresholds. eg: Minimum 30%.

  • If minimum threshold is met, proceed to next step.
  • If minimum threshold is not met, two possibilities:

Fail Silently

Put installation into Wait state until future condition is met

Need to flesh this out further.

Prompt user

Prompt user to plug in to power source in order to proceed. Offer option to cancel, or initiate automatically as soon as power connection is detected.

Need to flesh this out further.
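
Pending that further detail, the battery gate could look something like the sketch below. The 30% figure is the example threshold from this section; everything else is an assumption, in particular that the choice between failing silently and prompting follows the same "manually initiated, device On and Unlocked" rule used in the earlier steps. All names are hypothetical.

```python
from enum import Enum

MIN_BATTERY_PERCENT = 30  # example threshold from the text, not a final value


class BatteryDecision(Enum):
    PROCEED = "proceed"
    WAIT_SILENTLY = "wait_silently"
    PROMPT_TO_PLUG_IN = "prompt_to_plug_in"


def battery_decision(level_percent: int, manually_initiated: bool,
                     device_on_and_unlocked: bool,
                     minimum: int = MIN_BATTERY_PERCENT) -> BatteryDecision:
    """Proceed if the threshold is met; otherwise wait silently or prompt."""
    if level_percent >= minimum:
        return BatteryDecision.PROCEED
    # Assumption: prompt only when the user started the update and can see it.
    if manually_initiated and device_on_and_unlocked:
        return BatteryDecision.PROMPT_TO_PLUG_IN
    return BatteryDecision.WAIT_SILENTLY
```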

6. Install

Installation process begins. There are three possibilities, depending on the current state of the device:

Manual Install

  • The device is On and Unlocked.
  • Because installation is disruptive (it requires a B2G process restart), the user is first prompted before it begins.
  • User has the option to install immediately or to defer the install.

Need to flesh this out further.

Silent Install

Two possibilities:

Update when idle

  • The device is On, Locked, and the likelihood of use is low.
  • Can occur when the download has completed while the device is locked, or when the user has chosen to Defer from a Manual Install prompt.
  • One possibility is to delay install until the early morning hours, when the user is least likely to be using the device.
  • The Install process executes without turning on the screen.

Update at power-On

  • The device powers-On from Off state
  • Can occur when the user has chosen to Defer from a Manual Install prompt, and the device has subsequently been powered Off. When the device is turned On, the device does a Battery check, and then installs automatically & silently.

In both cases, the user is only made aware that the update has occurred once it is complete, and they Unlock the device.

Need to flesh this out further.
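
As a starting point for that, the three install paths can be expressed as one selection function. This is a sketch with several assumptions: the names are hypothetical, "likelihood of use is low" is approximated by an early-morning window of 02:00 to 05:00 (a placeholder), and an unlocked device whose user has already deferred simply waits rather than being prompted again.

```python
from enum import Enum


class InstallMode(Enum):
    PROMPT_USER = "prompt_user"                # Manual Install
    SILENT_WHEN_IDLE = "silent_when_idle"      # Update when idle
    SILENT_AT_POWER_ON = "silent_at_power_on"  # Update at power-On
    WAIT = "wait"                              # none of the conditions hold yet


def is_low_use_hour(hour: int) -> bool:
    """Placeholder heuristic: treat 02:00-04:59 as low-use hours."""
    return 2 <= hour < 5


def choose_install_mode(just_powered_on: bool, unlocked: bool,
                        user_deferred: bool, hour: int) -> InstallMode:
    """Pick an install path for a downloaded update that passed the battery check."""
    if just_powered_on and user_deferred:
        return InstallMode.SILENT_AT_POWER_ON
    if unlocked:
        # Assumption: do not re-prompt a user who has already deferred.
        return InstallMode.WAIT if user_deferred else InstallMode.PROMPT_USER
    if is_low_use_hour(hour):
        return InstallMode.SILENT_WHEN_IDLE
    return InstallMode.WAIT
```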

7. Follow-up

After install, we can consider informing the user that the update has occurred, and offer a link to further details (eg: "What's new in FxOS").