WebAPI/AudioChannels

From MozillaWiki

This API introduces the concept of a hierarchy of audio channels. The channels are prioritized so that it is possible to "silence all channels with priority lower than X".

The problems that we are trying to solve are:

  • When the user answers a phone call, the sound from all apps should be silenced
  • The alarm clock shouldn't be muted even if normal audio is muted. This is to prevent the user from oversleeping because they muted the phone the previous day.
  • When the user leaves an app, under normal circumstances the app should be muted.
  • Some apps need to be able to opt in to not getting muted when the user leaves the app, such as the music player app or the radio app.
  • When the volume keys are used, they should change the volume for different audio types depending on context. For example, while in the alarm app, the volume keys should adjust the alarm volume and not the "normal" volume.
  • When a video app starts playing audio, background music should be muted while video is playing.

See also: https://etherpad.mozilla.org/sound-stream-types

The channels are:

  • normal: UI sounds, web content, music, radio
  • content: music, video.
  • notification: New email, incoming SMS
  • alarm: Alarm clock, calendar alarms
  • ringer: Incoming phone calls.
  • telephony: Phone calls, voip calls.
  • publicnotification: Forced camera shutter sounds. This will not be in V1.

Whenever an audio channel is used, lower priority channels are automatically paused. The only exception is that "normal" and "content" have the same priority, which means that if the "content" channel is used, its audio is simply mixed with audio from the "normal" channel.

If two apps try to use the "content" channel at the same time, the foreground app wins. If both of the apps are background apps, then the last app to try to use the channel wins.
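The rule for competing "content" requests can be sketched as below. This is only an illustration: the `requests` list, its `app` and `foreground` fields, and the function name are hypothetical and not part of the proposed API.

```javascript
// Sketch only: `requests` is a hypothetical list of apps asking for the
// "content" channel, ordered from the oldest request to the newest.
function pickContentChannelOwner(requests) {
  if (requests.length === 0) {
    return null;
  }
  // A foreground app always wins over background apps.
  const foreground = requests.find((request) => request.foreground);
  if (foreground) {
    return foreground.app;
  }
  // All requesters are in the background: the most recent request wins.
  return requests[requests.length - 1].app;
}

const contentOwner = pickContentChannelOwner([
  { app: "music", foreground: false },
  { app: "radio", foreground: false },
]);
console.log(contentOwner); // "radio"
```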

We'll have separate mute and volume settings per channel. We'll additionally have a volume and mute setting for a "headphones" channel. The "headphones" channel is used both for normal headphones and for bluetooth headsets.

For now all sounds are directed through headphones/headset when they are plugged in. We discussed possibly making the alarm sound through both headphones and speaker, but we deemed this a non-v1 feature. Whenever

For now, no audio channel except "telephony" uses the built-in earpiece. All other channels always use the speaker or headphones/headset. We might introduce using the built-in earpiece for "normal" sounds in a future version.

When the volume up/down buttons on a bluetooth headset are pressed, we'll treat that exactly as when the on-device volume up/down buttons are pressed.

Application API

interface AudioChannelManager : EventTarget {
  // We might not need this headphones section for v1.
  readonly attribute boolean headphones;
  // Always fired before audio starts playing through the new channel.
  attribute EventHandler onheadphoneschange;

  // Not in V1. Makes the "telephony" channel go through the speaker.
  attribute boolean telephonySpeaker;

  // The channel to adjust the volume for when volume keys are pressed.
  attribute DOMString volumeControlChannel;
};
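As a rough illustration of how an app might use this interface, the sketch below points the volume keys at the "alarm" channel while an alarm screen is open and restores the earlier value afterwards. The helper names and the `fakeManager` stand-in are hypothetical. How an app obtains the real AudioChannelManager object is not specified here, so the helpers simply take it as a parameter.

```javascript
// Sketch only: `manager` is any object shaped like AudioChannelManager.
function useAlarmVolumeKeys(manager) {
  const previous = manager.volumeControlChannel;
  manager.volumeControlChannel = "alarm";
  // Return a function that restores the channel that was set before.
  return function restoreVolumeKeys() {
    manager.volumeControlChannel = previous;
  };
}

// Report the current headphones state whenever it changes.
function watchHeadphones(manager, onChange) {
  manager.onheadphoneschange = () => onChange(manager.headphones);
}

// Stand-in object so the sketch runs outside a device. The initial
// values are placeholders, not documented defaults.
const fakeManager = {
  headphones: false,
  volumeControlChannel: "normal",
  onheadphoneschange: null,
};

const restoreVolumeKeys = useAlarmVolumeKeys(fakeManager);
console.log(fakeManager.volumeControlChannel); // "alarm"
restoreVolumeKeys();
console.log(fakeManager.volumeControlChannel); // "normal"
```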

We'll additionally introduce a new attribute "mozAudioChannelType" on <audio> and <video> which selects which audio channel will be used for any played audio.

partial interface HTMLMediaElement {
  attribute DOMString mozAudioChannelType;

  // Returns true if this element is currently paused due to the channel being paused.
  // Not implemented yet.
  readonly attribute boolean mozchannelpaused;

  // These events only fire if the element is currently playing.
  attribute EventHandler onmozinterruptbegin;
  attribute EventHandler onmozinterruptend;
};

The mozchannelpaused property can change either because the app is put in the background, or because the channel is paused when a higher priority channel is used.
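A minimal sketch of selecting a channel and reacting to interruptions might look like the following. The `playOnChannel` helper, the `hooks` object and the `fakeAudio` stand-in are hypothetical; on a device the element would be a real <audio> or <video> element.

```javascript
// Sketch only: `element` is any object shaped like the HTMLMediaElement
// extension above.
function playOnChannel(element, channelType, hooks) {
  element.mozAudioChannelType = channelType;
  // Fired when this element is interrupted by another channel.
  element.onmozinterruptbegin = () => hooks.onInterrupted();
  // Fired when the interruption is over.
  element.onmozinterruptend = () => hooks.onResumed();
  return element;
}

// Stand-in element so the sketch runs outside a browser.
const fakeAudio = {
  mozAudioChannelType: "normal",
  onmozinterruptbegin: null,
  onmozinterruptend: null,
};

const eventLog = [];
playOnChannel(fakeAudio, "content", {
  onInterrupted: () => eventLog.push("interrupted"),
  onResumed: () => eventLog.push("resumed"),
});

// Simulate a higher priority channel starting and then stopping.
fakeAudio.onmozinterruptbegin();
fakeAudio.onmozinterruptend();
console.log(eventLog.join(", ")); // "interrupted, resumed"
```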

System and Browser API changes

We need to add settings for controlling volume and mute to any mozbrowser iframe. The proposal is to add this set of new properties to the browserAPI (expressed in WebIDL for simplicity):

interface BrowserAPI {
  // With this it's possible to mute and pause all the media elements in this mozbrowser iframe.
  attribute boolean audioPaused;

  // range: 0.0 to 1.0 - a custom volume
  attribute double audioVolume;

  // true if this iframe has any active, non-muted audio channels
  readonly attribute boolean hasSoundingChannels;

  // When this attribute is set to true, any normal channel will be 'converted'
  // to a content channel. This fixes the visibility problem.
  // The system app must set this boolean to true before changing the visibility to false
  // for the foreground app. With this in place, we can revert all the hacks we implemented
  // for visibility and normal channels.
  attribute boolean treatNormalAsContent;
};

In addition we want to emit events when activeAudioChannels or soundingAudioChannels attributes change.

All of these changes are meant to be built on top of the existing AudioChannelService. Doing that we keep the logic in gecko, but we allow custom settings:

  1. The browser can let websites such as Spotify and Grooveshark keep playing even when the device is locked, by setting treatNormalAsContent to true.
  2. The system app can mark the foreground app by setting treatNormalAsContent to true before changing its visibility.
  3. The settings app can offer ad-hoc configurations for audio management and apps.
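The ordering requirement on treatNormalAsContent (set it before the visibility changes) can be sketched as follows. The helper names and the `hide` callback are hypothetical; `frame` stands for any object shaped like the BrowserAPI interface above, and how the system app actually changes visibility is left to the caller.

```javascript
// Sketch only: send a frame to the background, optionally letting its
// "normal" audio keep playing as if it were "content" audio.
function sendToBackground(frame, keepPlaying, hide) {
  // The flag must be set before the visibility changes.
  frame.treatNormalAsContent = keepPlaying;
  hide(frame);
}

function setFrameVolume(frame, volume) {
  // audioVolume is documented as 0.0 to 1.0, so clamp out-of-range input.
  frame.audioVolume = Math.min(1.0, Math.max(0.0, volume));
}

// Stand-in frame that records the order of operations.
const steps = [];
const fakeFrame = {
  audioPaused: false,
  audioVolume: 1.0,
  set treatNormalAsContent(value) {
    steps.push("treatNormalAsContent=" + value);
  },
};

sendToBackground(fakeFrame, true, () => steps.push("hidden"));
console.log(steps.join(", ")); // "treatNormalAsContent=true, hidden"

setFrameVolume(fakeFrame, 1.7);
console.log(fakeFrame.audioVolume); // 1
```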

Volume control

There are several settings for audio volume:

  • audio.volume.content: Affects the "content" and "normal" audio channels. It's an integer between 0 and 15 where 0 means muted (but not paused).
  • audio.volume.notification: Affects the "notification" and "ringer" channels. It's an integer between 0 and 15.
  • audio.volume.alarm: Affects the "alarm" channel. It's an integer between 0 and 15.
  • audio.volume.telephony: Affects the "telephony" channel. It's an integer between 0 and 5.
  • audio.volume.bt_sco: Volume used when a bluetooth headset is connected. It's an integer between 0 and 15.

Note that headphones do not use the same audio setting as bluetooth. Instead headphones use the same volume as when headphones are not plugged in. We should fix this such that headphones are controlled through a dedicated volume control.
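The mapping from channel to volume setting listed above can be written down as a small lookup. This is only a sketch: the `VOLUME_SETTINGS` table and the `volumeUpdate` helper are hypothetical names, and audio.volume.bt_sco is left out because it depends on the output device rather than on the channel.

```javascript
// Channel -> settings key and maximum value, as listed above.
const VOLUME_SETTINGS = new Map([
  ["normal", { key: "audio.volume.content", max: 15 }],
  ["content", { key: "audio.volume.content", max: 15 }],
  ["notification", { key: "audio.volume.notification", max: 15 }],
  ["ringer", { key: "audio.volume.notification", max: 15 }],
  ["alarm", { key: "audio.volume.alarm", max: 15 }],
  ["telephony", { key: "audio.volume.telephony", max: 5 }],
]);

// Build a settings update for a channel, clamped to the allowed range.
function volumeUpdate(channel, requested) {
  const setting = VOLUME_SETTINGS.get(channel);
  if (!setting) {
    throw new Error("No volume setting for channel: " + channel);
  }
  const value = Math.min(setting.max, Math.max(0, Math.round(requested)));
  return { [setting.key]: value };
}

console.log(JSON.stringify(volumeUpdate("ringer", 20))); // {"audio.volume.notification":15}
console.log(JSON.stringify(volumeUpdate("telephony", 3))); // {"audio.volume.telephony":3}
```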

Todo: Describe policies for which volume is controlled when volume buttons are pressed.

Todo: Describe "silent mode".

Security model

In order to get access to anything more than the "normal" channel, the application needs to enumerate these channels in the permissions property in the app manifest. So something like:

"permissions": {
  ...
  "audio-channel-alarm": {
    "description": "..."
  },
  "audio-channel-notification": {
    "description": "..."
  },
  ...
}

This would enable the app to play sound through both the "notification" channel and the "alarm" channel.
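A permission check along these lines might look like the sketch below. It assumes that every channel other than "normal" maps to a permission named audio-channel-<name>, a pattern inferred from the two entries shown above. The `canUseChannel` helper and the sample manifest are hypothetical.

```javascript
// Sketch only: decide whether a manifest grants access to a channel.
function canUseChannel(manifest, channel) {
  // The "normal" channel needs no permission.
  if (channel === "normal") {
    return true;
  }
  const permissions = manifest.permissions || {};
  // Assumed naming pattern: "audio-channel-" followed by the channel name.
  return Object.prototype.hasOwnProperty.call(permissions, "audio-channel-" + channel);
}

const sampleManifest = {
  permissions: {
    "audio-channel-alarm": { description: "..." },
    "audio-channel-notification": { description: "..." },
  },
};

console.log(canUseChannel(sampleManifest, "alarm")); // true
console.log(canUseChannel(sampleManifest, "telephony")); // false
console.log(canUseChannel({}, "normal")); // true
```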

I'm not sure if we need to have prompts if non-privileged apps try to use channels beyond the "normal" or "content" channels.

Audio Competing Policy

AudioChannelType | Priority | Foreground | Background | Timing of Resuming | Faded Effect
normal | 0 | Play | [1] | On transition to foreground. | If a notification is fired, volume is reduced to 20% of the current level.
content | 1 | Play | [1], [2] | If muted/paused because of [1], resumed when [3]. If muted/paused because of [2], resumed on transition to foreground. | As above.
notification | 2 | Play | [1] | [3] | N/A
alarm | 3 | Play | [1] | [3] | N/A
telephony | 4 | Play | [1] | [3] | N/A
ringer | 5 | Play | [1] | [3] | N/A
public notification | 6 | Play | Play | N/A | N/A

[1] Muted or paused if any higher priority channel is playing.

[2] Muted or paused if any other content channel starts playing, in the foreground or in the background.

[3] No higher priority channel is playing.
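Rules [1] and [3] and the faded effect can be sketched as three small helpers. The function names are hypothetical, the priority numbers are taken from the table above, and the sketch does not model rule [2] or the foreground/background distinction, so how the helpers combine in a real implementation is left open.

```javascript
// Priorities as given in the table above.
const PRIORITY = new Map([
  ["normal", 0],
  ["content", 1],
  ["notification", 2],
  ["alarm", 3],
  ["telephony", 4],
  ["ringer", 5],
  ["publicnotification", 6],
]);

// Rule [1]: muted or paused while any higher priority channel is playing.
function isSuppressed(channel, playingChannels) {
  const own = PRIORITY.get(channel);
  return playingChannels.some((other) => PRIORITY.get(other) > own);
}

// Rule [3]: may resume once no higher priority channel is playing.
function canResume(channel, playingChannels) {
  return !isSuppressed(channel, playingChannels);
}

// Faded effect: "normal" and "content" drop to 20% when a notification fires.
function fadedVolume(channel, volume, notificationFired) {
  const fades = channel === "normal" || channel === "content";
  return fades && notificationFired ? volume * 0.2 : volume;
}

console.log(isSuppressed("content", ["alarm"])); // true
console.log(isSuppressed("ringer", ["telephony"])); // false
console.log(canResume("content", [])); // true
console.log(fadedVolume("normal", 1.0, true)); // 0.2
```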

Use cases and requirements

The set of use cases and requirements that we have tried to solve with the above APIs are:

Ability for the system app to do the following for the currently running apps, as well as for the browser to do for the currently running tabs:

  • Show which applications/tabs are currently playing audio
  • Show UI which mutes a specific app/tab
  • Enable a spotify-app/tab to be treated as a music app. I.e. it should be able to get the same benefits as if it had mozaudiochannel=content, including both the automatic muting of other content audio and the ability to play in the background.
  • Control which application/tab gets to play audio if there are several background applications that all are attempting to use the "content" channel, but no visible app using the "content" channel.

Ability for the system app to:

  • Turn down the volume of background audio rather than completely silence it when the notification channel is used
  • Figure out which volume to modify when the user is pressing the volume buttons.

Ability for apps to:

  • Act as if they are occupying a channel without actually using a media element playing on that channel. This is useful for example to prepare for audio which is about to start playing, or to prepare for turning on the camera.

Volume control

We also have a set of usage scenarios around volume control. We currently have 4 different volumes which makes it tricky to decide which one (or ones) to modify when the user presses the volume buttons. Here's a set of situations where we need to figure out which volume to control:

  • When on the home screen.
  • When on the lock screen.
  • When using the built-in alarm app.
  • When running a 3rd party app which is currently playing "normal" audio. For example playing background music.
  • When running a 3rd party app which on occasion plays sound effects.
  • When running a 3rd party app which we don't yet know if it plays sounds or not.
  • When using the built-in fmradio or music app and audio is playing.
  • When using the built-in fmradio or music app and audio is not playing.
  • User lowers the volume to a very low level, which likely affects either the content or the ringer volume. Suddenly there's an incoming notification. Since notifications use a separate volume, the sound can be very loud, which is surprising to the user.
  • User enters dialer and lowers volume. Then presses DTMF tone buttons and expects this to play at a low volume.

Refactor Audio Channel Service

(04/30/2015 updated)

  • Gecko: The refactoring of the audio channel service addresses issues that the existing mechanism in gecko could not solve. The changes below are needed in both gecko and gaia to build the new management.
    • New browser api: bug 1113086 - This is the major part of the new browser api. It lets the gaia system app control each iframe/app's audio channels, such as allowing or denying the channel types we have in b2g.
      • Almost complete; only the volume control of the FM Radio is missing.
    • Telephony: bug 1129882 - The new audio channel browser api is built on top of media elements, but the telephony api does not use a media element to produce sound. The telephony api therefore needs to be bound to a specific window (callscreen), just as other apps are.
      • Review finished; waiting to land.
    • VOIP: bug 1126224 - 3rd party apps will use the webrtc api to implement voip services. To fit the ux spec, these apps might behave differently from the gaia callscreen app, so we will probably need a new type called "voip" to distinguish them from the "telephony" audio channel.
      • Paused for now, since there is no consensus on whether we really need the "voip" type.
    • System: bug 1142933 - The ux spec defines a new audio channel type called "system" for sounds that the system itself uses, such as the keyboard, screen lock or screenshot sounds.
      • Review finished; waiting to land.
  • Gaia: The system app will use the browser api (bug 1113086) to control all the audio channels in gaia. This means the audio channel service moves from gecko to gaia, since gaia has more window-related information than gecko and can manage the channels better.
    • Audio Channel Manager: bug 1100822 - A new module in the system app that manages the audio channels.
      • Landed.
    • System: bug 1157140 - Manage the System app's audio channels in the AudioChannelManager.
      • Still figuring out a good way to manage the audio channels in the System app.
    • Callscreen: bug 1159610 - The Callscreen app needs to call the telephony's audio channel api.
    • Cleanup: bug 1139838 - Remove all the hacks for audio channel.
  • UX: Before the refactoring started, developers and ux met to discuss the current audio issues and future needs, and produced a sound spec that defines the audio channel behaviours.
    • Spec: bug 1068219 - While developers implement the new modules, ux updates the spec whenever we run into problems with the behaviours or coding limitations.