a message guarantee that is generally "Deliver once, and exactly once." These
guarantees can be altered based on deployment configuration.

= Terminology =

; Cassandra
: Apache Cassandra, the default storage back-end for the queue
; CL
: Consistency Level
; DC
: Data-center
; MQ
: Message queue
; qdo
: Worker library for queuey
; queuey
: The message queue web application Python package that provides the RESTful API for the message queue.

= Engineers =

When using queues that are to be consumed, they must be declared up-front as
a ''partitioned'' queue. The number of partitions should also be specified, and
new messages will be randomly partitioned. If messages should be processed in
order, they can be inserted into a single partition to enforce ordering. All
messages that are randomly partitioned should be considered loosely ordered.

This model is based exactly on how Apache Kafka workers divide up queues to
work on.
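
For illustration, here is a minimal sketch of declaring a partitioned queue up-front through the REST API described in the API section below. The base URL, application name, and application key are assumptions made up for the example:

<pre>
import requests  # third-party HTTP library, used here for brevity

BASE_URL = 'http://localhost:5000'  # assumed queuey host
HEADERS = {
    'X-Application-Name': 'notifications',  # hypothetical application name
    'ApplicationKey': 'f25bfb8fe200475c8a0532a9cbe7651e',  # hypothetical key
}

# Declare the queue with 4 partitions before any consumers start;
# messages posted later will be spread randomly across the partitions.
resp = requests.post(
    BASE_URL + '/queue',
    headers=HEADERS,
    data={'queue_name': 'new_arrivals', 'partitions': 4},
)
resp.raise_for_status()
</pre>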

== Example Deployment Configurations ==

=== Multiple DC Reliability ===

This setup is for messages that are critical to deliver. Performance is
substantially slower, as writes will not return until a quorum of storage
nodes in each data-center acknowledges the message was written.

Depending on how urgent it is for the clients to see the message, the read
CL can be between ONE and LOCAL_QUORUM. Higher read availability
will be achieved by using ONE, though, with only a slight delay before a
written message will be seen, regardless of the data-center.

Using a read CL of ONE allows an entire data-center to become unavailable
while messages are still delivered; however, no new messages can be written,
as all data-centers must have quorums available to write messages. Changing
the write CL to a lower value adversely affects how clients may read the queue,
as they may need to track what has been read to ensure they don't miss messages
that are still propagating.

''Queuey''

* Deploy to webheads in each data-center

''Storage back-end''

* Select Cassandra
* Deploy Cassandra machines (3+) in each data-center
* Set write CL to EACH_QUORUM (see the sketch after this list)
* Set read CL between ONE and LOCAL_QUORUM
* Set delay as appropriate for the read/write CLs
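
As a rough illustration of these consistency settings, here is a sketch using the pycassa Cassandra client. The keyspace, host, and ''Messages'' column family names are assumptions for the example, not queuey's actual schema:

<pre>
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily
from pycassa.cassandra.ttypes import ConsistencyLevel

pool = ConnectionPool('MessageStore', ['cassandra-dc1:9160'])  # assumed keyspace/host

# Writes block until a quorum in *each* data-center acknowledges;
# reads return as soon as a single replica responds.
messages = ColumnFamily(
    pool, 'Messages',
    write_consistency_level=ConsistencyLevel.EACH_QUORUM,
    read_consistency_level=ConsistencyLevel.ONE,
)
</pre>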

=== High Throughput, Occasional Unavailability of Messages, Single DC ===

In this setup, Queuey behaves like Apache Kafka. The queuey service runs on the
same nodes as Cassandra, and each node runs Cassandra as a single node. This
provides no data durability beyond the disks, so if message persistence is
important, the drives should use RAID. If a node goes down, the messages on
that node are unavailable until the node returns.

Applications writing messages should hit a load balancer that has the queuey
nodes behind it, so writes will be unaffected by individual machine outages and
message producers will not need configuration updates to add/remove queuey
nodes.

Clients reading queues will have to know the individual hostnames of the queuey
nodes and track how far they've read into the queue per-host, as well as be
able to communicate directly with the queuey nodes (bypassing the load
balancer).
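
A sketch of what such a client might look like, keeping a separate read position per queuey host. The hostnames, application header, and queue name are assumptions, and the response shape (a JSON list of messages carrying a timestamp field) is likewise an assumption for the example:

<pre>
import requests  # third-party HTTP library

HOSTS = ['queuey1:5000', 'queuey2:5000']  # assumed individual queuey nodes
HEADERS = {'X-Application-Name': 'notifications'}  # hypothetical application

# Track how far we've read into the queue on each host.
last_seen = {host: 0 for host in HOSTS}

def poll_all(queue_name):
    for host in HOSTS:
        resp = requests.get(
            'http://%s/queue/%s' % (host, queue_name),
            headers=HEADERS,
            params={'since_timestamp': last_seen[host], 'order': 'ascending'},
        )
        resp.raise_for_status()
        for message in resp.json():
            # Advance this host's read position as messages are consumed.
            last_seen[host] = max(last_seen[host], message.get('timestamp', 0))
            yield message
</pre>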

''Queuey''

* Deploy to desired nodes

''Storage back-end''

* Select Cassandra
* Deploy as a single-node instance on every queuey node
* Set read/write CL to ONE, as there is only one node
* No delay needed

=== Good Throughput, Available Messages, Single DC ===

This is a good generic setup that provides message durability in the event a
few nodes are lost. Webheads run queuey, and all connect to the same Cassandra
cluster.

Message guarantees can be tweaked by altering the read/write CLs and the
enforced message availability delay. For example, to raise availability the
read/write CL can be set to ONE and a delay can be set that will provide a
fairly high 99% chance of seeing a message within 50ms of writing it (assuming
no nodes go down).

See http://www.eecs.berkeley.edu/~pbailis/projects/pbs/#demo for more details
on tweaking the delay and CLs to achieve the desired message delivery
guarantees.

''Queuey''

* Deploy to webheads

''Storage back-end''

* Select Cassandra
* Deploy a Cassandra cluster of 3+ machines
* Set write CL between ONE and QUORUM
* Set read CL between ONE and QUORUM
* Set delay as appropriate for the read/write CLs

= Initial User Requirements =

= API =

Applications allowed to use the '''Message Queue'''
will be given an application key that must be sent with every request, and
their IP must be on an accepted IP list for the given application key. Current
API documentation can be found on the [http://readthedocs.org/docs/queuey/en/latest/api.html queuey API docs] page.

All methods must include the header 'X-Application-Name', which indicates the name of the
application using the queue. Queues are unique within an application.

For methods requiring authentication (internal apps), the application key must
be sent as an HTTP header named 'ApplicationKey'.

== Internal Apps ==

These methods are authenticated by IP, and are intended for use by Services Applications.

'''POST /queue'''

Creates a new queue.

Params:
 ''queue_name'' (Optional) - Name of the queue to create
 ''partitions'' (Optional) - How many partitions the queue should have (defaults to 1)

'''DELETE /queue/{queue_name}'''

Deletes a given queue created by the App.

Params:
 ''delete'' (Optional) - If set to ''false'', then the queue contents will be deleted,
 but the queue will remain registered (see the sketch below).
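
For instance, a minimal sketch of clearing a queue's contents while keeping it registered, per the ''delete'' param above. The base URL, queue name, and headers are the same made-up assumptions as the earlier examples:

<pre>
import requests

resp = requests.delete(
    'http://localhost:5000/queue/new_arrivals',  # assumed host and queue
    headers={'X-Application-Name': 'notifications',
             'ApplicationKey': 'f25bfb8fe200475c8a0532a9cbe7651e'},  # hypothetical
    params={'delete': 'false'},  # purge contents but keep the queue registered
)
resp.raise_for_status()
</pre>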

'''POST /queue/{queue_name}'''

Creates a message on the given queue. The contents are expected to be
a JSON object (see the example below).

Raises an error if the queue does not exist.
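
A minimal sketch of publishing a message, under the same assumed base URL and headers as the earlier examples:

<pre>
import json
import requests

# The message body is a JSON object, per the spec above.
resp = requests.post(
    'http://localhost:5000/queue/new_arrivals',  # assumed host and queue
    headers={'X-Application-Name': 'notifications',
             'ApplicationKey': 'f25bfb8fe200475c8a0532a9cbe7651e'},  # hypothetical
    data=json.dumps({'event': 'signup', 'user_id': 42}),
)
resp.raise_for_status()
</pre>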

== Clients ==

'''GET /queue/{queue_name}'''

Returns messages for the queue.

Params:
 ''since_timestamp'' (Optional) - Return only messages newer than this timestamp,
 formatted as seconds since epoch in GMT
 ''limit'' (Optional) - Return at most N messages, defaults to 100
 ''order'' (Optional) - 'descending' or 'ascending', defaults to descending

Messages are returned in order of newest to oldest.
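
A short sketch of a client fetch using these params, under the same assumptions as the previous examples:

<pre>
import requests

resp = requests.get(
    'http://localhost:5000/queue/new_arrivals',  # assumed host and queue
    headers={'X-Application-Name': 'notifications'},  # hypothetical app name
    params={
        'since_timestamp': 1322521547,  # seconds since epoch, GMT
        'limit': 10,                    # at most 10 messages
        'order': 'ascending',           # oldest first
    },
)
resp.raise_for_status()
messages = resp.json()  # the response shape is an assumption here
</pre>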

'''GET /queue/{queue_name}/info'''

Returns information about the queue.

Example response:
 {
  'partitions': 4,
  'created': 1322521547
 }

= Design Background =