== Project Overview ==
== Requirements ==
=== Notifications ===
The [[Services/Notifications|Notifications project]] needs Queuey for storing
messages on behalf of users who want to receive them. Each user gets their own
queue. If a user needs to receive notifications for general topics, the
Notifications application will create a topic queue and clients will poll
multiple queues.

The first version could be considered a '''Message Store''' rather than a
''queue'', as it supports a much richer set of query semantics and does not let
public consumers remove messages. Messages can only be removed by expiring or
by the App deleting the entire queue.
Requirements:
* Service App can create queues
* Service App can add messages to a queue
* Messages on queues expire
* Clients may read any queue they are aware of
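The requirements above can be sketched as a small in-memory model. This is purely illustrative (not Queuey's real implementation or API): the service app creates queues and adds messages, messages carry an expiration, and clients may only read.

```python
import time
import uuid


class MessageStore:
    """Toy sketch of the Notifications message-store model: queues are
    created and written by the service app; clients read but never delete;
    messages disappear by expiring."""

    def __init__(self):
        self.queues = {}

    def create_queue(self, name):
        # Only the service app calls this; one queue per user or topic.
        self.queues[name] = []

    def add_message(self, name, body, ttl=3600):
        now = time.time()
        msg = {"id": uuid.uuid4().hex, "body": body,
               "ts": now, "expires": now + ttl}
        self.queues[name].append(msg)
        return msg["id"]

    def read(self, name, since=0.0):
        # Clients read but never delete; expired messages are hidden.
        now = time.time()
        return [m for m in self.queues[name]
                if m["expires"] > now and m["ts"] > since]
```

A client polling multiple queues (its own user queue plus any topic queues) would simply call `read` on each queue it is aware of.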
=== Socorro ===
The second version allows authenticated applications to queue messages and
clients to consume them. This supports a worker model where jobs are added to
a queue that multiple clients may be watching, and each message is given to
exactly one client.
Requirements:
* Service App can create queues, which can be partitioned
* Service App can add messages to a queue, and specify the partition to retain ordering
* Clients can ask how many partitions a queue has
* Clients may read any queue, and any of its partitions, that they are aware of
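A minimal sketch of the partitioned-queue model described by these requirements (illustrative only, not Queuey's actual implementation): messages land in a random partition unless the producer pins one to preserve ordering.

```python
import random


class PartitionedQueue:
    """Sketch of a queue declared with a fixed number of partitions.
    Producers may pin a partition for ordering; otherwise assignment
    is random and ordering is only loose."""

    def __init__(self, partitions):
        self.partitions = [[] for _ in range(partitions)]

    def partition_count(self):
        # Clients can ask how many partitions the queue has.
        return len(self.partitions)

    def add(self, body, partition=None):
        if partition is None:
            partition = random.randrange(len(self.partitions))
        self.partitions[partition].append(body)
        return partition

    def read(self, partition):
        return list(self.partitions[partition])
```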
== Architecture ==
'''Queue Web Application'''

queuey (Python web application providing a RESTful API)

'''Queue Storage Backend'''

Cassandra

'''Coordination Service'''

(only used when consuming from the queue)

Apache Zookeeper

'''Queue Consumption'''
* Consumers coordinate with the coordination service
* Consumers split queues/partitions amongst themselves
* Consumers record in the coordination service the farthest point they've processed in a queue/partition
* Consumers rebalance queue/partition allocation via the coordination service when consumers are added or removed
* Consumption and rebalancing are done entirely client-side
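The client-side split described above can be sketched as a deterministic assignment function. The algorithm here is an assumption (modeled on the Kafka-style division mentioned later in this page): each consumer sorts the member list it sees in the coordination service, so every consumer independently computes the same assignment.

```python
def assign_partitions(partitions, consumers):
    """Deal sorted partitions out round-robin to sorted consumers.
    Because both lists are sorted, every consumer that sees the same
    membership computes the same result -- no server-side arbiter
    is needed. Rebalancing is just re-running this with the new
    membership list."""
    members = sorted(consumers)
    assignment = {c: [] for c in members}
    for i, part in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(part)
    return assignment
```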
Queuey is composed of two core parts:
* A web application (handles the RESTful API)
* A storage back-end (used by the web application to store queues/messages)

The storage back-end is pluggable; the current default is Cassandra.
Unlike other MQ products, Queuey does not hand out messages amongst all clients
reading from the same queue. Every client is responsible for tracking how far
it has read into the queue. If only a single client should see each message,
the queue should be partitioned and clients should decide who reads which
partition.
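The client-side position tracking described above can be sketched as follows (an illustration, not a real Queuey client): each client remembers the timestamp of the last message it has seen and resumes from there; nothing is ever deleted from the queue.

```python
class Reader:
    """Each client tracks its own read position by message timestamp."""

    def __init__(self):
        self.last_ts = 0.0

    def poll(self, messages):
        # Return only messages newer than what we've already seen,
        # then advance our private high-water mark.
        fresh = [m for m in messages if m["ts"] > self.last_ts]
        if fresh:
            self.last_ts = max(m["ts"] for m in fresh)
        return fresh
```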
Different performance and message-guarantee characteristics can be configured
by changing the deployment strategy and the Cassandra replication and
read/write options. Multiple Queuey deployments will therefore be necessary,
and Apps should use the deployment with the operational characteristics they
need.
Since messages are stored and read by timestamp from Cassandra, and Cassandra
is only eventually consistent, there is an enforced delay before a message
becomes readable, ensuring a consistent picture of the queue. This guarantees
that written messages show up when the queue is read, avoiding 'lost' messages
caused by reading past the point where they later appear.
For more on the probabilities involved with various replication factors and
differing read/write consistency levels, see:
* [http://www.eecs.berkeley.edu/~pbailis/projects/pbs/ PBS: Probabilistically Bounded Staleness]
* [http://www.datastax.com/dev/blog/your-ideal-performance-consistency-tradeoff Your Ideal Performance: Consistency Tradeoff]
If clients delete messages from their queue as they read them, the delay is
unimportant. Enforcing a proper delay is only required when clients read but
never delete messages (and thus track how far into the queue they've read by
timestamp).
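The enforced delay can be sketched as a visibility horizon on the read path. The delay value here is an assumption, not a documented Queuey setting: messages newer than the horizon are hidden so that slower replicas have time to catch up, preventing a reader from advancing its timestamp past a write that has not yet replicated.

```python
import time


def visible_messages(messages, delay=5.0, now=None):
    """Hide messages younger than `delay` seconds. A client that only
    ever reads up to the horizon cannot 'lose' a message that appears
    late on the replica it happens to query."""
    if now is None:
        now = time.time()
    horizon = now - delay
    return [m for m in messages if m["ts"] <= horizon]
```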
Queues that are to be consumed must be declared up-front as ''partitioned''
queues, and the number of partitions must be specified. New messages are
assigned to a random partition; messages that must be processed in order can
be inserted into a single partition to enforce ordering. Randomly partitioned
messages should be considered only loosely ordered.
== Queue Workers ==
A worker library called Qdo handles coordinating workers and processing
messages off queues in a job-processing setup. Qdo uses Apache Zookeeper for
worker and queue coordination.
Workers coordinate to divide up the queues, and the partitions within each
queue, so that no queue/partition has multiple readers. This avoids the need
for read-locking a queue; how far each host+queue+partition has been processed
is stored in Zookeeper.
This model is based directly on how Apache Kafka workers divide up queues to
work on.
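The Zookeeper bookkeeping described above can be modeled with a small in-memory stand-in (illustrative only; this is not the real Qdo or Zookeeper API): it records, per host+queue+partition, the farthest point the owning worker has processed.

```python
class OffsetStore:
    """In-memory stand-in for the coordination service's offset records.
    Because each queue/partition has exactly one reader, a plain
    last-writer-wins map suffices -- no read locking is needed."""

    def __init__(self):
        self.offsets = {}

    def commit(self, host, queue, partition, offset):
        self.offsets[(host, queue, partition)] = offset

    def last(self, host, queue, partition):
        # A worker picking up a partition after a rebalance resumes here.
        return self.offsets.get((host, queue, partition), 0)
```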
== API ==
The API is documented on the [http://readthedocs.org/docs/queuey/en/latest/api.html queuey API docs] page.
== Engineers ==
* Ben Bangert
* Hanno Schlichting