The ins and outs of queues
Looking at the following diagram, we can see where Queue Storage fits in with the rest of Windows Azure:
As one of the three simple storage options, Queue Storage is created
when a storage service is added to an account. The Queue Storage
endpoint is listed with the others when we view a storage service, as
shown here:
A single Azure account can
have any number of queues. Each queue is composed of messages, each of
which carries the data or processing instructions that need to be acted
upon by the back-end servers (refer to the next diagram).
There is no enforced limit to
the number of messages in a queue, although there is a practical limit
to the number of messages we'd want stacked up at any given time.
Messages with a long latency in the queue are an indication that either
the back-end processes need to be further optimized, or we need to scale
out some additional back-end servers.
Each message is simply an XML
document, in Atom format; messages are limited to 8 KB in size. If the
data to be processed are greater than 8 KB, the data should be stored in
a blob or table, and processing instructions sent in the message. In
addition to the messages, each queue can have up to 8 KB of metadata
associated with it. Metadata are stored as name-value pairs.
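The choice between carrying data in the message and pointing to data stored elsewhere can be sketched as follows. This is a minimal illustration, not part of any Azure library: `build_message` and `store_blob` are hypothetical names, and the size check ignores any encoding overhead that would reduce the usable space further.

```python
MAX_MESSAGE_BYTES = 8 * 1024  # the 8 KB message limit described above


def build_message(payload, store_blob):
    """Describe what to enqueue for `payload` (bytes).

    `store_blob` is a hypothetical callable that saves the bytes somewhere
    durable (for example a blob) and returns a string reference to them.
    """
    if len(payload) <= MAX_MESSAGE_BYTES:
        # Small enough: carry the data in the message itself.
        return {"kind": "inline", "body": payload}
    # Too large: store the data elsewhere and enqueue only a pointer to it.
    reference = store_blob(payload)
    return {"kind": "reference", "body": reference.encode("utf-8")}


# Usage with an in-memory stand-in for blob storage.
fake_blobs = {}


def fake_store(data):
    name = "blob-%d" % len(fake_blobs)
    fake_blobs[name] = data
    return name


small = build_message(b"resize image 42", fake_store)
large = build_message(b"x" * 20000, fake_store)
```

The back-end process would inspect the `kind` field and, for a reference, fetch the stored data before doing the work.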
The idea behind a queue is to
provide a way for different processes to communicate with one another
asynchronously. A queue is best used when there is no tight time
dependence between the completion of one action and the completion of
the next. In other words, a queue should be used when the message must
get processed, but no one is waiting for the processing to happen.
Queues can be used in many different scenarios, giving application
tiers the flexibility to perform a structured workflow
asynchronously.
Reasons to use a queue
If we have to justify to a project manager why we should implement queues, here are a few points we can use:
When any application
crashes, any in-process session information is lost. Messages in Queue
Storage, however, are persistent, and there is a recovery mechanism if a
message fails to process.
Queues
allow processes to scale independently. One common arrangement is for a
front-end process to call a back-end process, wait for the back end to
complete, and then perform its next action. In that arrangement there
is a 1:1 relationship between front-end and back-end processes. By
using a queue to pass processing instructions, processes can scale
independently of one another; the ratio of front-end to back-end
processes can be 1:10 or 10:1. This independent scalability allows our
application to absorb traffic surges more gracefully.
Using
multiple queues allows work to be segregated by importance. Queues
containing more important work can have more processes directed against
them, or can have special services written for them.
By
decoupling application layers, the different processes can be written in
different languages, and may exist in completely different locations.
Invisibility time and failover
In the points we discussed
in the previous section, we mentioned a recovery mechanism if a message
fails to process. Here's how that works:
A GET request
via the REST API or a query in the client library is prepared in our
application. The request or query should specify a parameter called visibilitytimeout. The visibilitytimeout sets the amount of time (in seconds) that the message will be invisible to subsequent processes.
A message is
"dequeued", that is, it is read from the queue and marked as invisible
to any further requests. The message technically stays in the queue, but
is inaccessible.
If
the message processing succeeds, the message is marked for deletion and
cleaned up later with garbage collection. It's important to delete the
message before the visibilitytimeout expires, or we risk another process dequeuing and processing the message.
If the message processing fails, the message becomes visible again after the visibilitytimeout
has expired. As message processing is FIFO (First In, First Out), the
next process reading from the queue will retrieve this message again,
and the process starts over.
The process is outlined in the following diagram:
Choosing the right value for the visibilitytimeout
parameter is important. If the invisibility time is set too short, the
message may become available again while it is still being processed.
On the other hand, if the invisibility time is too long, there will be
added latency should the message processing fail, because the message
cannot be retried until the timeout expires. Considering the
consequences, it's probably better to set the timeout a little longer.
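The dequeue, delete, and reappear cycle described in this section can be sketched with a small in-memory simulation. This is only an illustration of the behavior under simplifying assumptions: `SimulatedQueue` is a hypothetical class, time is passed in explicitly as `now` so the example is deterministic, and pop receipts are not modeled.

```python
import itertools


class SimulatedQueue:
    """In-memory stand-in for a queue with an invisibility timeout."""

    def __init__(self):
        self._messages = []              # kept in arrival order
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages.append(
            {"id": next(self._ids), "body": body, "visible_at": 0}
        )

    def get(self, now, visibility_timeout=30):
        """Return the first visible message and hide it until now + timeout."""
        for message in self._messages:
            if message["visible_at"] <= now:
                message["visible_at"] = now + visibility_timeout
                return message
        return None

    def delete(self, message_id):
        self._messages = [m for m in self._messages if m["id"] != message_id]


queue = SimulatedQueue()
queue.put("resize image 42")

first = queue.get(now=0, visibility_timeout=30)   # a worker takes the message
hidden = queue.get(now=10)                        # still invisible to others
retry = queue.get(now=31)                         # first worker failed; visible again
queue.delete(retry["id"])                         # second attempt succeeded
empty = queue.get(now=100)                        # nothing left to process
```

If the first worker had finished and deleted the message before second 30, the call at `now=31` would have returned nothing, which is the normal success path.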
Special handling for binary data
Binary data can be transmitted
in XML, so queued messages can contain binary data. However, when
messages are dequeued, their contents, including any binary data, are
returned Base64-encoded. Our application needs to decode the data
properly before processing it.
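As a minimal sketch of the encoding round trip, the following uses Python's standard `base64` module. The payload bytes are arbitrary, and how the encoded text is actually wrapped in the message XML is not shown here.

```python
import base64

original = bytes([0, 255, 16, 128, 7])                  # arbitrary binary payload

# Encode to text that is safe to embed in an XML message body.
encoded = base64.b64encode(original).decode("ascii")

# After dequeuing, decode back to the original bytes before processing.
decoded = base64.b64decode(encoded)
```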
Working with queues
The client class for working with queues via .NET code is Microsoft.WindowsAzure.StorageClient.CloudQueue.
The methods listed here are methods of this class, unless specified
otherwise. The documentation for this library can be found at http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storageclient.cloudqueue.aspx.
Documentation for the REST library for Queue Storage can be found at http://msdn.microsoft.com/en-us/library/dd179363.aspx. The base URI for accessing queues via the REST API is http://<account>.queue.core.windows.net. To perform an operation on a specific queue, the URI is http://<account>.queue.core.windows.net/<queue> and the different HTTP verbs (PUT, GET, DELETE) are used to determine the action.
When using the REST API, every operation has an optional timeout
parameter that sets the processing timeout of the operation. If the
operation does not complete by the timeout, it will fail. The default
value is 30 seconds, which is also the maximum value that can be set.
As with Table and Blob Storage, the optional x-ms-version header should also be used with Queue Storage requests.
Listing queues
Obtaining a list of queues
is technically an action performed against the account, rather than
against an individual queue. As such, the base URI and the client
library class are different from those used for the rest of the operations.
REST API
To list the queues in our account, a GET request is made to this URI: http://<account>.queue.core.windows.net?comp=list.
This will return a list of up to 5,000 queues in our account. To shape
the response, we can use some optional URI parameters to filter the
list:
Parameter | Description
---|---
prefix | Returns only those queues whose names begin with the specified prefix.
marker | Similar in function to the continuation tokens used for Table Storage. The marker parameter specifies where the query results begin. The results include a NextMarker parameter in the response body that can be used as the marker value for a subsequent query.
maxresults | Limits the query to the specified number of results.
include=metadata | If this parameter is included, each queue's metadata will be included as part of the response.
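The optional parameters and the marker-based paging can be sketched as follows. This is an illustration only: `list_queues_uri`, `list_all_queues`, and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever code sends the GET request and extracts the queue names and NextMarker value from the response.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def list_queues_uri(account, prefix=None, marker=None, maxresults=None,
                    include_metadata=False):
    """Build the list URI, adding only the optional parameters supplied."""
    params = [("comp", "list")]
    if prefix is not None:
        params.append(("prefix", prefix))
    if marker is not None:
        params.append(("marker", marker))
    if maxresults is not None:
        params.append(("maxresults", maxresults))
    if include_metadata:
        params.append(("include", "metadata"))
    return "http://%s.queue.core.windows.net?%s" % (account, urlencode(params))


def list_all_queues(account, fetch_page):
    """Follow NextMarker values until the service reports no more pages.

    `fetch_page` is a hypothetical callable: URI -> (queue_names, next_marker).
    """
    names, marker = [], None
    while True:
        page, marker = fetch_page(list_queues_uri(account, marker=marker))
        names.extend(page)
        if not marker:
            return names


# Usage with canned pages instead of real HTTP calls.
FAKE_PAGES = {None: (["orders", "orders-priority"], "m1"),
              "m1": (["thumbnails"], None)}


def fake_fetch(uri):
    marker = parse_qs(urlparse(uri).query).get("marker", [None])[0]
    return FAKE_PAGES[marker]


all_names = list_all_queues("myaccount", fake_fetch)
```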
Client library
The CloudQueueClient class is used to perform queue-related actions against the storage account. Documentation for this class can be found at http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storageclient.cloudqueueclient.aspx.
To obtain a list of queues, we use the ListQueues method, which has three overloads, as described in the following table:
Method signature | Description
---|---
ListQueues() | Returns all queues in our account.
ListQueues(<prefix>) | Returns all queues whose names begin with the specified prefix.
ListQueues(<prefix>, <QueueListingDetails>) | Returns all queues whose names begin with the specified prefix, and includes the specified level of detail. QueueListingDetails is an enum with three values specifying the level of detail: All (return all available details for each queue), Metadata (include metadata only), and None (return no details).
Creating queues
Because queues are addressable via URI, their names must be valid DNS names. There are four basic rules regarding queue names:
The queue name can contain only letters, numbers, and "-"
The queue name must begin and end with a letter only
Queue names must be lowercase
A queue name must be at least three characters, but shouldn't be longer than 63 characters
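A quick client-side check can catch invalid names before a request is sent. The sketch below encodes only the four rules exactly as listed above; `is_valid_queue_name` is a hypothetical helper, and the service itself remains the final authority on what it accepts.

```python
import re

# Lowercase letters, digits, and "-", beginning and ending with a letter,
# and 3 to 63 characters in total (1 + 1..61 + 1).
_QUEUE_NAME = re.compile(r"[a-z][a-z0-9-]{1,61}[a-z]")


def is_valid_queue_name(name):
    """Return True if `name` satisfies the four rules listed above."""
    return _QUEUE_NAME.fullmatch(name) is not None
```

For example, `is_valid_queue_name("order-queue")` passes, while `"Orders"` (uppercase), `"ab"` (too short), and `"-orders"` (leading dash) are rejected.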
Metadata names must be valid
C# identifiers. Metadata names are case insensitive when created or
queried, but the case is preserved when the results are returned.
REST API
A PUT request is made to the base URI, naming the queue to be created. Queue metadata is passed in the headers, using x-ms-meta-<name>:<value>. Metadata names must follow the same naming rules as C# identifiers.
If the named queue exists,
the queue service checks the metadata to see if the two queues are
identical. If the metadata match, a 204 "No Content" response code is
received. If the metadata do not match, a 409 "Conflict" is returned.
Client library
To create a queue with the client library, we create an instance of the CloudQueue class whose name is the name we want the queue to have. We then call the Create method to create the queue. Metadata are added as properties of the CloudQueue instance.
Deleting queues
A queue is not immediately deleted when the Delete method succeeds. Instead, the queue is marked as unavailable and is cleaned up at a later time via garbage collection.
REST API
The HTTP DELETE method is used to delete the queue specified in the URI.
Client library
To delete a queue via the client library, we create an instance of the CloudQueue class pointing to the queue we want to delete, and call the Delete method.
Setting metadata
As users, we can define
metadata that describe the queue. Note that metadata are added to the
queue, not to the messages. We can use queue metadata to identify
the characteristics of a queue, such as which processes add messages to
it or work on its messages, or the types of messages that pass through it.
REST API
To add/delete metadata via the REST API, a PUT request is made against the URI http://<account>.queue.core.windows.net/<queue>?comp=metadata. Metadata are specified in the request header as x-ms-meta-<name>:<value>. If no metadata are specified in the header, all metadata are deleted from the queue.
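Turning name-value pairs into request headers is mechanical, as this minimal sketch shows. `metadata_headers` is a hypothetical helper; authentication and the other headers a real request needs are left out.

```python
def metadata_headers(metadata):
    """Map name-value pairs to x-ms-meta-<name> request headers."""
    return {"x-ms-meta-%s" % name: str(value)
            for name, value in metadata.items()}


headers = metadata_headers({"owner": "billing", "priority": 2})
no_headers = metadata_headers({})
```

Note that sending the PUT request with the empty result would, per the behavior described above, remove all metadata from the queue.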
Client library
To start, we create an instance of the CloudQueue class, referencing a specific queue. Then we create a NameValueCollection containing the metadata. We then add this to the Metadata property of our instance, and call the SetMetadata method.
Getting metadata
The whole point of
setting metadata is to be able to retrieve the metadata for later usage.
Let's now see how to retrieve the metadata.
REST API
To retrieve the metadata, we use a GET request to the URI http://<account>.queue.core.windows.net/<queue>?comp=metadata. Metadata are returned as x-ms-meta-<name>:<value> headers. To assist in processing, the x-ms-approximate-message-count:<count> header is also returned.
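Separating the metadata headers and the approximate message count from the other response headers can be sketched like this. `parse_queue_headers` is a hypothetical helper that works on a plain dictionary of header names and values, and the sample headers are made up for illustration.

```python
META_PREFIX = "x-ms-meta-"
COUNT_HEADER = "x-ms-approximate-message-count"


def parse_queue_headers(headers):
    """Return (metadata dict, approximate message count or None)."""
    metadata = {}
    count = None
    for name, value in headers.items():
        lowered = name.lower()          # header names are case insensitive
        if lowered.startswith(META_PREFIX):
            # Keep the metadata name's original case, drop the prefix.
            metadata[name[len(META_PREFIX):]] = value
        elif lowered == COUNT_HEADER:
            count = int(value)
    return metadata, count


sample = {
    "Content-Length": "0",
    "x-ms-meta-Owner": "billing",
    "x-ms-approximate-message-count": "17",
}
metadata, approximate_count = parse_queue_headers(sample)
```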
Client library
When we create an instance of the CloudQueue class pointing to a specific queue, the metadata are accessible as the Metadata property of the queue.
Working with messages
As message manipulations are actually actions performed against a queue, the message methods are also part of the CloudQueue class.
Documentation for the REST library can be found at http://msdn.microsoft.com/en-us/library/dd135717.aspx. The base URI for accessing messages via the REST API is http://<account>.queue.core.windows.net/<queue>/messages.
To address a specific message by its ID, the URI is http://<account>.queue.core.windows.net/<queue>/messages/<messageid>?popreceipt=<string-value>. The different HTTP verbs (POST, GET, DELETE)
are used to determine the action. Note that the specific queue name is
specified as part of the URI. Message properties are specified in the
request body, which is in Atom format. Response bodies are also in Atom
format.
Operation | REST API | Client library
---|---|---
Put messages | A message is added to the end of a queue by submitting a POST request to http://<account>.queue.core.windows.net/<queue>/messages. The message is XML and is posted in the request body. Messages are limited to 8 KB in length, and must be able to be UTF-8 encoded. The optional messagettl querystring parameter sets the time to live (in seconds) for the message. The default TTL is seven days, which is also the maximum value. A message that resides in a queue for longer than its TTL is deleted. | A message is created as an instance of CloudQueueMessage, and is added to a queue by calling the AddMessage method. There are two overloads: AddMessage(<message>) and AddMessage(<message>,<time-to-live>).
Get messages | The GET method dequeues messages from the specified queue for processing. Messages are returned in the response body in XML format; the format is the same as that specified for the Put Message request. There are two optional querystring parameters: numofmessages sets the number of messages to be returned, from 1 (the default) to 32; visibilitytimeout sets the time in seconds that the retrieved messages will be invisible, with a default of 30 seconds and a maximum of two hours. When messages are dequeued via a GET request, they are made invisible to other processes. Included in the response properties is a PopReceipt, which is a message identifier that must be passed back in the DELETE request. | To dequeue the next message in the queue, the GetMessage method is used. There are two overloads: GetMessage() and GetMessage(<visibilitytimeout>). To dequeue a number of messages, GetMessages(<numofmessages>) or GetMessages(<numofmessages>,<visibilitytimeout>) is used.
Peek messages | Peeking works the same as getting messages, with one required parameter in the querystring. To peek at messages, we make a GET request to the URI http://<account>.queue.core.windows.net/<queue>/messages?peekonly=true. The only optional querystring parameter is numofmessages. Peeking at messages is similar to getting messages, but when we peek, a message is not marked as invisible. This allows us to examine the contents of a queue (such as how long messages have been hanging around) without affecting queue processing. | As with GET, there are two methods we can call: PeekMessage() peeks at the next message in the queue, while PeekMessages(<numofmessages>) is used to peek at multiple messages.
Delete messages | To delete a message, we make a DELETE request to the URI http://<account>.queue.core.windows.net/<queue>/messages/<messageid>?popreceipt=<string-value>. If we want to delete all messages in a queue, we make a DELETE request to http://<account>.queue.core.windows.net/<queue>/messages. If there are a lot of messages, the command may time out before it completes. The DELETE method is not transactional, so in this case the DELETE request can be reissued until all messages have been deleted. It is important for applications working with messages to delete them if processing is successful. Otherwise, once the visibility timeout expires, the messages will be available for processing again. When a delete operation is successful, messages are not immediately deleted. Messages are marked for deletion, which makes them unavailable to any process, and are cleaned up later by garbage collection. | In the client library, we use the DeleteMessage method. There are two overloads: DeleteMessage(<message>) and DeleteMessage(<messageid>,<popreceipt>). All messages are cleared from a queue by calling the Clear() method.
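The REST requests for the four message operations can be sketched as small helpers that return the HTTP verb and URI. This is an illustration under stated assumptions: the function names are hypothetical, the limits are the ones quoted in this section, and request bodies, headers, and authentication are omitted.

```python
from urllib.parse import quote, urlencode

MAX_BATCH = 32                           # numofmessages upper bound
MAX_VISIBILITY_SECONDS = 2 * 60 * 60     # two hours
MAX_TTL_SECONDS = 7 * 24 * 60 * 60       # seven days


def messages_uri(account, queue):
    return "http://%s.queue.core.windows.net/%s/messages" % (account, queue)


def put_request(account, queue, messagettl=None):
    uri = messages_uri(account, queue)
    if messagettl is not None:
        if not 0 < messagettl <= MAX_TTL_SECONDS:
            raise ValueError("messagettl out of range")
        uri += "?" + urlencode({"messagettl": messagettl})
    return ("POST", uri)


def get_request(account, queue, numofmessages=1, visibilitytimeout=30,
                peek=False):
    if not 1 <= numofmessages <= MAX_BATCH:
        raise ValueError("numofmessages must be between 1 and 32")
    params = [("numofmessages", numofmessages)]
    if peek:
        # Peeking never hides messages, so no visibility timeout applies.
        params.append(("peekonly", "true"))
    else:
        if not 0 < visibilitytimeout <= MAX_VISIBILITY_SECONDS:
            raise ValueError("visibilitytimeout out of range")
        params.append(("visibilitytimeout", visibilitytimeout))
    return ("GET", messages_uri(account, queue) + "?" + urlencode(params))


def delete_request(account, queue, messageid, popreceipt):
    # The pop receipt returned by the GET must accompany the DELETE.
    return ("DELETE", "%s/%s?%s" % (messages_uri(account, queue),
                                    quote(messageid, safe=""),
                                    urlencode({"popreceipt": popreceipt})))
```

Validating the ranges on the client side is a design choice in this sketch; it fails fast with a clear error instead of waiting for the service to reject the request.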