CloudServices/At a glance: Difference between revisions

Jump to navigation Jump to search
(*/update/*)
Line 48: Line 48:
Project Overview
Project Overview
==Requirements==
==Requirements==
* <i>List of requirements</i>


==Get Involved==
===Notifications===
<i>Instructions on how one can contribute to the project</i>
== Points of Contact ==
=== [[CloudServices/Glossary#Product_manager|PM]]: ===
Name: contact@info
IRC - #channel
Group Email - TBD
=== [[CloudServices/Glossary#Technical_writer|Docs]]: ===
Name: contact@info
IRC - #channel
Group Email - TBD
=== [[CloudServices/Glossary#Software_developer|Dev]]: ===
Name: contact@info
IRC - #channel
Group Email - TBD
=== [[CloudServices/Glossary#Quality_engineer|QA]]: ===
Name: contact@info
IRC - #channel
Group Email - TBD
=== [[CloudServices/Glossary#Operations_engineer|Ops]]: ===


Engineer - Name contact@info
The [[Services/Notifications|Notifications project]] needs Queuey for storing
messages on behalf of users that want to receive them. Each user gets their own
queue.


=== [[CloudServices/Glossary#Support_engineer|Support]]: ===
If a user needs to get notifications for general topics, the Notifications
application will create a queue and clients will poll multiple queues.


Name: contact@info
The first version could be considered a '''Message Store''' rather than a
''queue'' as it supports a much richer set of query semantics and does not let
public consumers remove messages. Messages can be removed via the entire queue
being deleted by the App or by expiring.


== Resources ==
Requirements:
=== Specs and Docs: ===


=== Code: ===
* Service App can create queues
* Service App can add messages to queue
* Messages on queues expire
* Clients may read any queue they are aware of


=== Public channel(s): ===


=== Internal/confidential channel(s): ===
===Socorro===  


=== Scheduled meetings ===
The second version allows authenticated applications to queue messages and for
clients to consume them. This model allows for a worker model where jobs are
added to a queue that multiple clients may be watching, and each message
will be given to an individual client.


=== Events/meetups: ===
Requirements:


* Service App can create queues, which can be partitioned
* Service App can add messages to queue, and specify partition to retain ordering
* Clients can ask for information about how many partitions a queue has
* Clients may read any queue, and its partitions that they are aware of


== Architecture ==


== Points of Contact ==
'''Queue Web Application'''
Join the #geo channel on [[IRC|Mozilla's IRC network]].  
 
[https://lists.mozilla.org/listinfo/dev-geolocation Join our mailing list]
    queuey (Python web application providing RESTful API)
 
'''Queue Storage Backend'''
   
    Cassandra
 
'''Coordination Service'''
   
    (Only used when consuming from the queue)
 
    Apache Zookeeper
 
'''Queue Consumption'''
   
    * Consumers coordinate with coordination service
    * Consumers split queues/partitions amongst themselves
    * Consumers record in coordination service the farthest they've processed
      in a queue/partition
    * Consumers rebalance queue/partition allocation when consumers are added or
      removed using coordination service
    * Consumption and rebalancing is done entirely client-side
 
Queuey is composed of two core parts:
 
* A web application (handles the RESTful API)
* Storage back-end (used by the web application to store queues/messages)
 
The storage back-end is pluggable, and the current default storage back-end is
Cassandra.
 
Unlike other MQ products, Queuey does not hand out messages amongst all clients
reading from the same queue. Every client is responsible for tracking how far
it has read into the queue, if only a single client should see a message, then
the queue should be partitioned, and clients should decide who reads which
partition of the queue.
 
Different performance and message guarantee characteristics can be configured by
changing the deployment strategy and Cassandra replication and read / write
options. Therefore multiple Queuey deployments will be necessary, and Apps
should use the deployment with the operational characteristics desired.
 
Since messages are stored and read by timestamp from Cassandra, and Cassandra
only has eventual consistency, there is an enforced delay in how soon a message
is available for reading to ensure a consistent picture of the queue. This
helps ensure that written messages will show up when reading the queue to avoid
'losing' messages by reading past where they appeared in the queue.
 
For more on what kind of probabilities are involved in various replication
factors and differing read/write CL's, see:
 
* [http://www.eecs.berkeley.edu/~pbailis/projects/pbs/ PBS: Probabilistically Bounded Staleness]
* [http://www.datastax.com/dev/blog/your-ideal-performance-consistency-tradeoff Your Ideal Performance: Consistency Tradeoff]
 
In the event that the clients are deleting messages in their queue as they've
been read, the delay is unimportant. Enforcing a proper delay is only required
when clients read but never delete messages (and thus track how far into the
queue they've read based on time-stamp).
 
When using queue's that are to be consumed, they must be declared up-front as
a ''partitioned'' queue. The amount of partitions should also be specified, and
new messages will be randomly partitioned. If messages should be processed in
order, they can be inserted into a single partition to enforce ordering. All
messages that are randomly partitioned should be considered loosely ordered.
 
== Queue Workers ==
 
A worker library called Qdo handles coordinating and processing messages off
queues in a job processing setup. The Qdo library utilizes Apache Zookeeper for
worker and queue coordination.
 
Workers coordinate to divide up the queue's and partitions in each queue so
that no queue/partition has multiple readers. This avoids the need for read
locking a queue, and how far into each host+queue+partition is stored in
Zookeeper.
 
This model is based exactly on how Apache Kafka workers divide up queues to
work on.
 
== API ==
 
API can be found on the [http://readthedocs.org/docs/queuey/en/latest/api.html queuey API docs] page.
 
== Engineers==
 
*Ben Bangert
*Hanno Schlichting
27

edits

Navigation menu