CloudServices/FirefoxMobileServices/ChannelService

From MozillaWiki

Revision as of 19:28, 25 March 2014

Last updated: 2014/03/25
THIS PAGE IS A WORKING DRAFT
The page may be difficult to navigate, and some information on its subject might be incomplete and/or evolving rapidly.
If you have any questions or ideas, please add them as a new topic on the discussion page.

Overview

Channel Service is a service for Mozilla Cloud Services that need to communicate with Firefox on the client-side (FxOS, Desktop, etc). It unifies bidirectional communication between the client and Mozilla Cloud Services.

Project Contacts

Principal Point of Contact (US) - Ben Bangert bbangert@mozilla.com

Principal Point of Contact (EU) - Tarek Ziade tziade@mozilla.com

Goals

To develop a single multiplexed communication channel between clients and Mozilla Cloud Services.

This should result in:

  • Client-side API for client-side code that needs to talk to a Mozilla Cloud Service
  • Server-side API for Cloud Services that need to talk to a client
  • Cleaner code in the client that is not entangled with a specific service
  • Cleaner client-side service code that does not need to concern itself with the channel back to Mozilla Cloud Services
  • Easier updates of client-side code that is not intermixed with channel code
  • Ability to update channel code for efficiency/performance/cost-savings without changing other client-side or Cloud Services server-side code

Use Cases

Push

The first candidate for using the ChannelService is Push, which is already on FxOS. At the moment, the code handling the socket connection is mixed in with the code handling the DOM APIs for Push. By splitting the channel out, future Push updates could be easier to land in the client.

Presence

The Presence project would benefit from having an easier client-side API to use for talking over a single shared channel.

Loop

Loop's server-side architecture is more complex than necessary because SimplePush is currently the only way to wake a FxOS client, and SimplePush cannot carry data (the token). ChannelService would make it easier and faster to deploy client-side code that can get the token, substantially simplifying both the server-side architecture and the requirements.

On the desktop, SimplePush is not available yet, and there is no channel that can be used.

Requirements

Firefox OS

  • Client-side JSM developer familiar with Push's implementation that can split out the socket handling portion
  • Client-side Push JSM developer to refactor the client-side portion to use the new FxOS Channel Service

Firefox Desktop

  • Platform devs that can land the client-side channel handling code

Fennec / etc

  • Same as for Firefox Desktop

Server-side

  • Server-side devs to implement the new Mozilla server-side portion
  • Server-side Push devs to restructure Push to utilize the Mozilla Cloud Channel Service

Design

API

Client

Clients (Firefox, FxOS, Fennec, etc.) will get a new internal API to register for use of the channel, subscribe to events, and send/receive messages over the channel.

The Client-side Channel API will allow other Mozilla service code running in the client to hook in via Events and direct calls:

Events (to register for):

  • OnConnect (called anytime the connection is established)
  • OnDisconnect (called if the connection is interrupted)
  • OnMessage (called when a message is received over the channel for this service)

API Calls:

  • Send(payload, serviceName) - Send the payload over the channel. It will end up in the server-side service handling it (e.g., if push.jsm wanted to send a call to the Push service, it would call Send('whatever', 'push')).

The channel code will initially connect and call 'register' to get a deviceID/key assigned to it. It will need to save both the deviceID and the key. On reconnects, the client will call 'authorize' with the deviceID/key before it is considered a valid client. The key acts as a password and is not shared.

If no client-side code registers for Events, the client will not open a channel to Mozilla as no client-side services need to utilize the channel.
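The client-side behaviour described above can be sketched roughly as follows. This is an illustrative model only, not the proposed JSM code: the class name `Channel`, the method names, the in-memory `wire` list standing in for the socket, and the placeholder deviceID/key values are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Channel:
    """Toy model of the client-side channel; the socket is faked with a list."""

    def __init__(self) -> None:
        # serviceName -> {event name -> callback}
        self.handlers: Dict[str, Dict[str, Callable]] = {}
        # Saved (deviceID, key) pair; None until 'register' has happened.
        self.credentials: Optional[Tuple[str, str]] = None
        self.connected = False
        self.wire: List[tuple] = []  # hypothetical stand-in for the socket

    def register_events(self, service_name: str, **callbacks: Callable) -> None:
        # The channel is only opened once some service registers for events.
        self.handlers[service_name] = callbacks
        if not self.connected:
            self.connect()

    def connect(self) -> None:
        if self.credentials is None:
            # First connection: 'register' yields a deviceID/key to save.
            self.wire.append(("register",))
            self.credentials = ("device-id-placeholder", "key-placeholder")
        else:
            # Reconnect: 'authorize' with the saved deviceID/key.
            self.wire.append(("authorize",) + self.credentials)
        self.connected = True
        self._fire("OnConnect")

    def disconnect(self) -> None:
        self.connected = False
        self._fire("OnDisconnect")

    def send(self, payload: str, service_name: str) -> None:
        self.wire.append(("data", service_name, payload))

    def receive(self, service_name: str, payload: str) -> None:
        # Deliver an inbound message only to the service it is addressed to.
        callback = self.handlers.get(service_name, {}).get("OnMessage")
        if callback is not None:
            callback(payload)

    def _fire(self, event: str) -> None:
        for callbacks in self.handlers.values():
            if event in callbacks:
                callbacks[event]()


received: List[str] = []
channel = Channel()
channel.register_events("push", OnMessage=received.append)
channel.send("whatever", "push")
channel.receive("push", "hello")
```

A `Channel` that no service has registered with never appends anything to `wire`, which mirrors the rule that the client opens no connection when nothing needs it.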

Overview of client architecture:

ClientChanService.jpg

Server-side Services

Cloud Services can utilize an API to communicate with clients, based on message-passing from inbound/outbound queues. The inbound message queue for a Cloud Service contains the body of the message and metadata:

Type: Data
Metadata:
  DeviceID: AABCDC-550e8400-e29b-41d4-a716-446655440000
Payload: ................

The payload is an opaque blob that should be significant to the Cloud Service utilizing the channel.

XXX: Determine a standard encoding for the Payload

The outbound message queue for a Cloud Service contains the body of the message and metadata on what device the message is intended for. It also contains retry policy information, and whether the message should be NACK'd in the event that it was not possible to deliver the message.

Metadata:
  MessageID: ce3488ea-cf7c-41c3-8110-9907b1fe80e8
  DeviceID: AABCDC-550e8400-e29b-41d4-a716-446655440000
  Retry: 1m,5m,1h
  Nack: True
Payload: ................

This indicates that delivery of the message should be attempted 3 times: 1 minute later, then 5 minutes after that, then 1 hour after that. If the message still cannot be delivered, a Nack is triggered and delivered over the inbound queue to the Cloud Service application.
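As a rough illustration of how a Retry value such as `1m,5m,1h` could be interpreted, the sketch below converts it into delays in seconds. The function names are hypothetical, only the `m` and `h` units that appear in the example above are handled, and it assumes each interval is measured from the previous attempt; the draft does not pin any of this down.

```python
from typing import List

# Only the units that appear in the example policy; others are not specified.
_UNIT_SECONDS = {"m": 60, "h": 3600}


def parse_retry(policy: str) -> List[int]:
    """Turn a policy like '1m,5m,1h' into per-attempt delays in seconds."""
    delays = []
    for part in policy.split(","):
        part = part.strip()
        if len(part) < 2 or part[-1] not in _UNIT_SECONDS or not part[:-1].isdigit():
            raise ValueError(f"bad retry interval: {part!r}")
        delays.append(int(part[:-1]) * _UNIT_SECONDS[part[-1]])
    return delays


def delivery_offsets(policy: str) -> List[int]:
    """Seconds after the original send at which each attempt would happen."""
    offsets, elapsed = [], 0
    for delay in parse_retry(policy):
        elapsed += delay
        offsets.append(elapsed)
    return offsets


print(parse_retry("1m,5m,1h"))       # [60, 300, 3600]
print(delivery_offsets("1m,5m,1h"))  # [60, 360, 3960]
```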

In addition to basic bidirectional messaging, ChannelService broadcasts:

  • DeviceID changes
  • Nacks

If a device's DeviceID changes, the Cloud Service should account for the change.

Device Change Message:

Type: DeviceIDChange
Metadata:
   OldDeviceID: AABCDC-550e8400-e29b-41d4-a716-446655440000
   NewDeviceID: ACDCBA-550e8400-e29b-41d4-a716-446655440000

It is not expected that device IDs will change frequently, as a change indicates a client has changed geographic regions (North America -> South America, Europe -> North America, etc). Device IDs may change more frequently in the future if the geographic region is further confined (East Europe <-> West Europe).

Nack Message:

Type: Nack
Metadata:
  MessageID: ce3488ea-cf7c-41c3-8110-9907b1fe80e8
  DeviceID: AABCDC-550e8400-e29b-41d4-a716-446655440000

If a retry made at least 5 minutes after the first attempt also fails, it is safe to assume the client is offline.
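A Cloud Service consuming its inbound queue has to tell the three message types above apart. The following is a minimal sketch of such a dispatcher, assuming messages arrive as dictionaries shaped like the examples; the `ServiceState` class and its fields are hypothetical, and treating every Nack as "device offline" is a simplification of the 5-minute rule.

```python
from typing import Dict, List, Set


class ServiceState:
    """Hypothetical per-service state keyed by DeviceID."""

    def __init__(self) -> None:
        self.inbox: Dict[str, List[str]] = {}  # DeviceID -> payloads received
        self.offline: Set[str] = set()         # DeviceIDs presumed offline
        self.undelivered: List[str] = []       # MessageIDs that were Nack'd

    def handle(self, message: dict) -> None:
        kind = message["Type"]
        meta = message["Metadata"]
        if kind == "Data":
            self.inbox.setdefault(meta["DeviceID"], []).append(message["Payload"])
            self.offline.discard(meta["DeviceID"])
        elif kind == "DeviceIDChange":
            old, new = meta["OldDeviceID"], meta["NewDeviceID"]
            if old == new:
                return
            # Move everything stored under the old ID to the new one.
            if old in self.inbox:
                self.inbox.setdefault(new, []).extend(self.inbox.pop(old))
            if old in self.offline:
                self.offline.remove(old)
                self.offline.add(new)
        elif kind == "Nack":
            self.undelivered.append(meta["MessageID"])
            self.offline.add(meta["DeviceID"])
        else:
            raise ValueError(f"unknown message type: {kind!r}")
```

Keying all state by DeviceID keeps the DeviceIDChange handler to a simple rename, which is one way for a service to "account for" the change.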

Platform Requirements

Firefox OS Modifications

Currently there's a PushService in the parent process. This proposal changes the structure.

  • A thread will be launched from the parent process (referred to as the 'Services' Thread)
  • The Services thread will have two portions running in it to begin with
    • channel.jsm - Channel handling code, split out from the current PushService.jsm, that handles the connection and implements a basic API for other code that is in the Services Thread
    • push.jsm - Push daemon refactored to utilize the Channel API. PBackground will likely need to be used to have the DOM API call into this thread hanging off the main thread

Cloud Services Channel Service

The Cloud Services Channel Service (fondly pronounced koos-koos) handles the server-side termination of the channel from each device.

The architecture of this service is shown below. This example shows how Push integrates; queues will be set up for Push, which Push will use to receive/send messages:

ServerChanService.jpg

Connection Nodes

These nodes are responsible for:

  • Routing inbound-client messages to the appropriate Services Queue (Push/etc)
  • Routing outbound-client messages received from workers to the client connected
  • Updating the database indicating what DeviceIDs are connected to this node
  • Holding very high quantities of connections to clients

By keeping the connection node's responsibilities simple, we can easily swap in more efficient connection node implementations as they become available.
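The connection node responsibilities listed above can be modelled in a few lines. This is a sketch under stated assumptions: the shared database is a plain dictionary mapping DeviceID to node id, service queues are in-memory deques, and a connected client is represented by a list of payloads "sent" to it. None of these names come from the design itself.

```python
from collections import deque
from typing import Deque, Dict, List


class ConnectionNode:
    def __init__(self, node_id: str, registry: Dict[str, str],
                 service_queues: Dict[str, Deque[dict]]) -> None:
        self.node_id = node_id
        self.registry = registry              # stand-in database: DeviceID -> node id
        self.service_queues = service_queues  # service name -> inbound queue
        self.clients: Dict[str, List[str]] = {}  # DeviceID -> payloads sent to client

    def client_connected(self, device_id: str) -> None:
        self.clients[device_id] = []
        self.registry[device_id] = self.node_id

    def client_disconnected(self, device_id: str) -> None:
        self.clients.pop(device_id, None)
        # Only clear the entry if it still points at this node.
        if self.registry.get(device_id) == self.node_id:
            del self.registry[device_id]

    def route_inbound(self, device_id: str, service_name: str, payload: str) -> None:
        # Client -> service: wrap the payload and queue it for the service.
        self.service_queues[service_name].append({
            "Type": "Data",
            "Metadata": {"DeviceID": device_id},
            "Payload": payload,
        })

    def route_outbound(self, device_id: str, payload: str) -> bool:
        # Worker -> client: returns False if the client is not on this node.
        outbox = self.clients.get(device_id)
        if outbox is None:
            return False
        outbox.append(payload)
        return True


registry: Dict[str, str] = {}
queues: Dict[str, Deque[dict]] = {"push": deque()}
node = ConnectionNode("node-1", registry, queues)
node.client_connected("device-a")
node.route_inbound("device-a", "push", "whatever")
```

Because the node only touches the registry, the service queues, and its own sockets, a replacement implementation would need to honour just those three interfaces.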

Workers

Workers are responsible for:

  • Looking up the ConnectionNode from the database that a message is addressed to via its DeviceID
  • Pulling messages from the Service Queues for delivery
  • Putting messages on the retry queue if the DeviceID is not available
  • Pulling messages from the retry queue for delivery
  • Delivering NACK messages to the Service Queue for messages that failed delivery

Workers can draw from multiple queues or a single queue depending on throughput requirements.
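The worker responsibilities above amount to a small decision procedure per message: deliver if the device is connected, otherwise retry, otherwise Nack. The sketch below illustrates one possible reading. The `Worker` class, the callable-per-device registry, and the representation of Retry as a list of remaining intervals are all assumptions for the example; real timing (actually waiting for each interval) is left out.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class Worker:
    def __init__(self, registry: Dict[str, Callable[[dict], None]]) -> None:
        # Stand-in for the database lookup: DeviceID -> deliver function
        # of the connection node currently holding that device.
        self.registry = registry
        self.retry_queue: Deque[dict] = deque()
        self.service_inbound: Deque[dict] = deque()  # Nacks go back to the service

    def process(self, message: dict) -> str:
        meta = message["Metadata"]
        deliver = self.registry.get(meta["DeviceID"])
        if deliver is not None:
            deliver(message)
            return "delivered"
        remaining: List[str] = meta.get("Retry", [])
        if remaining:
            # Consume one retry interval; a real worker would wait that long.
            retry = dict(message, Metadata=dict(meta, Retry=remaining[1:]))
            self.retry_queue.append(retry)
            return "retry"
        if meta.get("Nack"):
            self.service_inbound.append({
                "Type": "Nack",
                "Metadata": {"MessageID": meta["MessageID"],
                             "DeviceID": meta["DeviceID"]},
            })
            return "nack"
        return "dropped"

    def drain_retries(self) -> None:
        # Re-process each message currently waiting on the retry queue once.
        for _ in range(len(self.retry_queue)):
            self.process(self.retry_queue.popleft())
```

In this reading, a message with Retry `1m,5m,1h` gets one initial attempt plus three retries before the Nack is generated; whether the initial attempt counts toward the three deliveries is not settled by the text.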

Application Queues

Application queues are the interface used to send/receive messages to devices from Mozilla Services. All interaction with clients passes through the application queues.

Push Server Modifications

The Push server service will need to be modified to work with the Cloud Services Channel Service.

Code Repository

Links to the published code bases

Release Schedule

Predicted code delivery dates

QA

Points of Contact

Engineer - Name contact@info

Test Framework

Security and Privacy

Fill out the security & privacy bug template: https://bugzilla.mozilla.org/form.moz-project-review (https://wiki.mozilla.org/Websites/Kick-Off_Form)

For security reviews, there's: https://wiki.mozilla.org/Security/ReviewProcess

Points of Contact

Questionnaire Answers

1.1 Goal of Feature

2. Potential Threat Vectors and Mitigation Points

Review Status

Bugzilla Tracking # - see https://wiki.mozilla.org/Security/Reviews

Issues and Resolutions

Legal

Points of Contact

Operations

Points of Contact

Deployment Architecture

Bugzilla Tracking # -

Escalation Paths

Lifespan Support Plans

Logging and Metrics

Points of Contact

Tracking Element Definitions

Data Retention Plans

Dashboard URL

Customer Support

Points of Contact

Sumo Tags

Review Meeting

Documentation Internationalization