Microsoft Azure: Enterprise Application Development - Worker Roles

6/21/2011 4:41:01 PM

Worker role internals

Building worker roles is fairly simple: they are just class libraries that inherit from the Microsoft.WindowsAzure.ServiceRuntime.RoleEntryPoint class. Worker roles are automatically started when their host instance is started. During startup, code in the OnStart() method is executed. The OnStart() method returns a Boolean value. If it returns true, the role is started and the Run() method is called, whereas if OnStart() returns false, the role is stopped.

Our worker tasks should be coded in the Run() method, and we should not return from the Run() method. If we do, Azure will restart the worker role. Instead, and despite our best instincts, the code in the Run() method should be enclosed inside an infinite loop. The way to stop a worker role is to stop the host instance. For this reason, worker roles that need to function independently should all be separated into individual instances.

When our worker role instance is being shut down by Azure, the OnStop() method is called. We can add cleanup code to this method if necessary, or just a return statement. OnStop() is not called in the event of application or hardware failure. The Azure Fabric will wait for 20 seconds to receive a return code. If more than 20 seconds elapse, the role will be killed regardless of its status or where it is in executing the code.
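The lifecycle described above can be sketched as a bare role class. This is a minimal illustration, not the Jupiter Motors role: the class name LifecycleWorkerRole and the stop-signal field are assumptions introduced here, and the ten-second pause is arbitrary.

```vb
Imports System.Diagnostics
Imports System.Threading
Imports Microsoft.WindowsAzure.ServiceRuntime

' LifecycleWorkerRole is a hypothetical name used only for this sketch.
Public Class LifecycleWorkerRole
    Inherits RoleEntryPoint

    ' Signalled by OnStop() so that Run() stops picking up new work.
    Private ReadOnly _stopRequested As New ManualResetEvent(False)

    Public Overrides Function OnStart() As Boolean
        ' One-time setup goes here. Returning False stops the role.
        Return MyBase.OnStart()
    End Function

    Public Overrides Sub Run()
        ' Never return from Run(); the loop keeps the role alive.
        While True
            If Not _stopRequested.WaitOne(0) Then
                Trace.WriteLine("Doing one unit of work", "Information")
            End If
            Thread.Sleep(10000)
        End While
    End Sub

    Public Overrides Sub OnStop()
        ' Keep cleanup short: the fabric waits only a limited time for this method to return.
        _stopRequested.Set()
        MyBase.OnStop()
    End Sub
End Class
```

Checking the signal before each unit of work is one way to avoid starting something new while the instance is shutting down; work already in progress still has to finish inside the shutdown window.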

We can use any .NET language to develop worker roles. This is different from web roles, which can be programmed in .NET and non-.NET languages. Even if we're using Azure to host a server for a non-.NET language, there is still a little bit of plumbing that needs to be done in a .NET language. There are "solution accelerators" and examples for plumbing most of the major alternative web servers.

Uses of worker roles

Worker roles can be used to perform a variety of functions. Some of the uses of worker roles include:

Processing messages contained in Queue Storage (thread-pool pattern)
Retrieving data from remote web services
Hosting non-IIS servers such as Jetty (http://blogs.msdn.com/b/dachou/archive/2010/03/21/run-java-with-jetty-in-windows-azure.aspx), PHP (http://blog.maartenballiauw.be/post/2010/04/08/Running-PHP-on-Windows-Azure.aspx), and other web servers (http://blog.smarx.com/posts/using-other-web-servers-on-windows-azure)
Serving as the TCP endpoint for FTP services (http://blog.maartenballiauw.be/post/2010/03/15/Using-FTP-to-access-Windows-Azure-Blob-Storage.aspx)
Mounting an Azure CloudDrive VHD
Accessing files on a CloudDrive VHD

Although worker roles can be used to host web servers for non-.NET languages, they can only be developed in .NET languages.

The uses of worker roles are limited only by the collective imagination of the Azure ecosystem. Microsoft has opened an Azure App Marketplace at http://pinpoint.microsoft.com/en-US/windowsazure/resources, where applications developed specifically for Azure are listed, most of which are worker roles or have worker role components. Before embarking on extensive worker role development, it might be a good idea to look through the App Marketplace.

Externally facing worker roles

One of the Azure features touted by Microsoft is that non-IIS servers can be used on Azure. Servers such as Tomcat and Jetty, as well as communications protocols such as FTP, have all been implemented on Azure. The mechanism by which these have been accomplished is externally facing worker roles. Worker roles can serve as TCP endpoints, and using the System.Net.Sockets.TcpListener class, we can create listeners for a number of protocols or ports. While IIS may be the primary web server on Azure, externally facing worker roles provide us with a great deal of options should we need to expand beyond IIS, or if we wish to utilize a non-.NET language.
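As a rough sketch of the TcpListener approach, the role below echoes back one line of text per connection. The class name EchoWorkerRole and the endpoint name "EchoEndpoint" are assumptions made for this example; the endpoint name would have to match a TCP input endpoint declared for the role in the service definition.

```vb
Imports System.IO
Imports System.Net
Imports System.Net.Sockets
Imports Microsoft.WindowsAzure.ServiceRuntime

' EchoWorkerRole is a hypothetical role used only for illustration.
Public Class EchoWorkerRole
    Inherits RoleEntryPoint

    Public Overrides Sub Run()
        ' "EchoEndpoint" is an assumed endpoint name declared for this role.
        Dim _endpoint As IPEndPoint =
            RoleEnvironment.CurrentRoleInstance.InstanceEndpoints("EchoEndpoint").IPEndpoint
        Dim _listener As New TcpListener(_endpoint)
        _listener.Start()

        While True
            ' Blocks until a client connects; one client at a time keeps the sketch short.
            Using _client As TcpClient = _listener.AcceptTcpClient()
                Using _stream As NetworkStream = _client.GetStream()
                    Dim _reader As New StreamReader(_stream)
                    Dim _writer As New StreamWriter(_stream)
                    _writer.AutoFlush = True

                    ' Read a single line and send it straight back.
                    Dim _line As String = _reader.ReadLine()
                    If _line IsNot Nothing Then
                        _writer.WriteLine("ECHO: " & _line)
                    End If
                End Using
            End Using
        End While
    End Sub
End Class
```

A real protocol server would hand each accepted client to its own thread instead of serving connections one after another.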

Thread-pool pattern

In the thread-pool pattern (http://en.wikipedia.org/wiki/Thread_pool_pattern), work that needs to be done accumulates in a queue, and one or more threads process the work. As one unit of work is completed, the thread requests the next unit in the queue. When all the work is complete, the thread can rest or monitor until there is more work. Extending this pattern to Azure does not require a great deal of imagination, with Queue Storage serving as the work queue, and a worker role serving as the thread that processes work. Others have described this pattern as the work-queue pattern.
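To make the pattern concrete, here is a small in-process sketch, assuming .NET 4 for ConcurrentQueue. It uses ordinary threads and an in-memory queue in place of worker role instances and Queue Storage; the module name, item names, and thread count are all invented for the illustration.

```vb
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading

Module ThreadPoolPatternSketch
    ' The work queue; on Azure, Queue Storage plays this part.
    Private ReadOnly _work As New ConcurrentQueue(Of String)()

    ' Each thread keeps taking the next unit of work until the queue is empty.
    Private Sub ProcessQueue()
        Dim _item As String = Nothing
        While _work.TryDequeue(_item)
            Console.WriteLine("Thread {0} processed {1}",
                              Thread.CurrentThread.ManagedThreadId, _item)
        End While
    End Sub

    Sub Main()
        For i As Integer = 1 To 10
            _work.Enqueue("order-" & i.ToString())
        Next

        ' Three threads stand in for three worker role instances.
        Dim _workers As New List(Of Thread)()
        For n As Integer = 1 To 3
            Dim _thread As New Thread(AddressOf ProcessQueue)
            _workers.Add(_thread)
            _thread.Start()
        Next

        ' Wait for all the work to be finished.
        For Each _thread As Thread In _workers
            _thread.Join()
        Next
    End Sub
End Module
```

Each item is processed exactly once, but which thread handles which item varies from run to run, which is the behavior we should expect from multiple worker role instances sharing one queue.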

Managing worker roles

Azure is an elastic system, meaning resources can fairly easily scale up or shrink based on demand for those resources. Because the costs of Azure are based on resource utilization, there is a balance between cost and performance for our Azure applications; hence, managing roles is an essential part of a well-run Azure application.

So how do we know when to scale up a worker role? The answer depends largely on the overall system architecture. If we're experiencing high traffic, experiencing significant lags in processing time, and a queue is filling faster than it can be processed, it's probably time to increase the worker roles.

On the other hand, our system design may include a rate-limiting step to maintain system resources downstream. Or, our application may employ the singleton pattern to avoid data concurrency issues. In these cases, we'll have to look at other mechanisms to increase performance under high loads.

The initial number of instances for a particular role is specified using the Instances element in the ServiceConfiguration.cscfg file, as seen here:

<Role name="JupiterMotorsWorkerRole">
  <Instances count="1" />
  <ConfigurationSettings>
    <Setting name="DiagnosticsConnectionString" value="UseDevelopmentStorage=true" />
  </ConfigurationSettings>
</Role>

When we deploy our application, we can control the number of instances by adjusting the count attribute of the Instances element. Should we need to scale up or down after the application is deployed, we can either change the number of instances in the Azure portal, or we can use the Service Management API (http://msdn.microsoft.com/en-us/library/ee460799.aspx). Changing the number of instances by editing the ServiceConfiguration.cscfg file is not recommended, as edits to this file will cause the roles to restart, similar to how editing the web.config file causes an ASP.NET web app to restart. Using the portal to increase the number of instances does not cause an application to restart, but if we decrease the number of instances, we do not have any control over which ones are shut down.

For a more automated solution, we can use another worker role to monitor the length of a queue, and if a queue becomes too backed-up, increase the number of worker roles. When the queue becomes depleted, the number of worker roles can be reduced by the monitoring role.
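The decision such a monitoring role has to make can be sketched as a pure function from queue length to a desired instance count. Everything here is an assumption for illustration: the module name, the figure of 100 messages per instance, and the limits of 1 and 8 instances. In a real role the queue length could come from the storage client (for example, the queue's approximate message count), and applying the new count would still go through the portal or the Service Management API.

```vb
Imports System

Module QueueScalingSketch
    ' Illustrative thresholds only; these are not recommended values.
    Private Const MessagesPerInstance As Integer = 100
    Private Const MinInstances As Integer = 1
    Private Const MaxInstances As Integer = 8

    ' Returns the number of worker instances we would like for a given queue length.
    Public Function DesiredInstanceCount(ByVal queueLength As Integer) As Integer
        ' Ceiling division: 1-100 messages need 1 instance, 101-200 need 2, and so on.
        Dim _needed As Integer = (queueLength + MessagesPerInstance - 1) \ MessagesPerInstance
        ' Clamp the result so we never drop to zero or scale without limit.
        Return Math.Max(MinInstances, Math.Min(MaxInstances, _needed))
    End Function

    Sub Main()
        For Each _length As Integer In New Integer() {0, 50, 250, 5000}
            Console.WriteLine("{0} messages -> {1} instance(s)",
                              _length, DesiredInstanceCount(_length))
        Next
    End Sub
End Module
```

With these assumed thresholds, queue lengths of 0, 50, 250, and 5000 map to 1, 1, 3, and 8 instances. The upper limit matters because every added instance adds cost.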

Best practices

When building worker roles, the standard best practices still apply: object-oriented design, reusable code, and so on.

An important additional practice is to add logging information. Worker roles operate invisibly, so they can be difficult to debug. All we may have are the symptoms of the malfunctioning role, with no indication of why the role is malfunctioning.
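One simple way to leave a trail is to trace the start, finish, and failure of every unit of work. The sketch below uses the standard System.Diagnostics.Trace class; ProcessWorkItem is a hypothetical stand-in for the role's real work, and the console listener is only there so the sketch produces visible output when run locally.

```vb
Imports System
Imports System.Diagnostics

Module WorkerLoggingSketch
    ' ProcessWorkItem is a hypothetical stand-in for the role's real work.
    Private Sub ProcessWorkItem(ByVal item As String)
        If String.IsNullOrEmpty(item) Then
            Throw New ArgumentException("Work item is empty.")
        End If
    End Sub

    Public Sub ProcessWithLogging(ByVal item As String)
        Trace.WriteLine("Starting work item: " & item, "Information")
        Try
            ProcessWorkItem(item)
            Trace.WriteLine("Finished work item: " & item, "Information")
        Catch ex As Exception
            ' Record what failed and why, including the stack trace.
            Trace.TraceError("Work item '{0}' failed: {1}", item, ex)
        End Try
    End Sub

    Sub Main()
        ' Echo trace output to the console for local experimentation.
        Trace.Listeners.Add(New ConsoleTraceListener())
        ProcessWithLogging("1001,3")
        ProcessWithLogging("")
    End Sub
End Module
```

Logging the work item alongside the exception means that a failure can later be tied back to the specific message that caused it.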

Another consideration in worker role design is limiting one worker role to one function. For instance, we shouldn't design worker roles to process work from more than one queue, even though it may be more cost-effective to do so. Limiting the number of functions a worker role performs allows us to scale only what is a bottleneck and not affect the processing of other data.

We also need to design for parallel operation. If we have multiple instances of the same worker role, we need to ensure they will not trip over each other and process the same data or miss other work.
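Queue Storage helps here through its visibility timeout: a message that one instance has retrieved is hidden from the others for a period, and reappears only if it is not deleted in time. The sketch below assumes the same StorageClient library used elsewhere in this article; the 60-second timeout and the retry limit of 3 are illustrative values, and the module and method names are invented.

```vb
Imports System
Imports System.Diagnostics
Imports Microsoft.WindowsAzure.StorageClient

Module ParallelQueueSketch
    ' Processes at most one message; safe to call from several role instances at once.
    Public Sub ProcessNextMessage(ByVal queue As CloudQueue)
        ' Hide the message from other instances for 60 seconds (an assumed value).
        Dim _msg As CloudQueueMessage = queue.GetMessage(TimeSpan.FromSeconds(60))
        If _msg Is Nothing Then
            Return
        End If

        ' DequeueCount rises each time the message is handed out; 3 is an assumed limit.
        If _msg.DequeueCount > 3 Then
            Trace.TraceWarning("Discarding message after repeated failures: {0}", _msg.AsString)
            queue.DeleteMessage(_msg)
            Return
        End If

        ' ... do the actual work for this message here ...

        ' Delete only after the work succeeds. If this instance fails first,
        ' the message becomes visible again and another instance picks it up.
        queue.DeleteMessage(_msg)
    End Sub
End Module
```

Because a message can be delivered more than once when an instance is slow or fails, the work itself should be safe to repeat.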

When creating an externally facing worker role, we need to keep in mind that Azure is load balanced, and subsequent requests may not be processed by the same role instance. State needs to be maintained in a way that it can be accessed by one role instance on one request, and a different role instance on a different request. One method to achieve this is to serialize state information into blobs.
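A minimal sketch of the blob approach follows, assuming the state has already been serialized to a string and that each client has some session identifier. The module name, the container name "sessionstate", and the use of development storage are assumptions for this example.

```vb
Imports Microsoft.WindowsAzure
Imports Microsoft.WindowsAzure.StorageClient

Module SessionStateBlobSketch
    ' "sessionstate" is an assumed container name.
    Private Function GetContainer() As CloudBlobContainer
        Dim _account = CloudStorageAccount.DevelopmentStorageAccount
        Dim _client As CloudBlobClient = _account.CreateCloudBlobClient()
        Dim _container As CloudBlobContainer = _client.GetContainerReference("sessionstate")
        _container.CreateIfNotExist()
        Return _container
    End Function

    ' Any instance can save the state under the session's identifier...
    Public Sub SaveState(ByVal sessionId As String, ByVal state As String)
        GetContainer().GetBlobReference(sessionId).UploadText(state)
    End Sub

    ' ...and any other instance can read it back on a later request.
    ' This call throws if no state has been saved for the session yet.
    Public Function LoadState(ByVal sessionId As String) As String
        Return GetContainer().GetBlobReference(sessionId).DownloadText()
    End Function
End Module
```

Since the blob lives outside any single role instance, it does not matter which instance the load balancer chooses for the next request.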

The Jupiter Motors worker role

When an RV is finished, a Jupiter Motors driver takes the RV from our factory to the customer. Because custom RVs can cost around US$400,000, and very high-end RVs can cost around US$1,000,000, it's important the customer takes ownership of the RV quickly. When a customer takes ownership of the vehicle, the vehicle is removed from Jupiter Motors' insurance and begins the billing cycles to the customer. In order to take ownership, the customer inspects the vehicle, and accepts delivery by filling out a small form in a custom application on the driver's laptop. The custom application calls our web service and updates the status of the order.

Rather than updating the database directly, the web service places a message in a queue to be processed by our worker role. We are not expecting a high degree of traffic through the web service, but the customer acceptance is a zero-failure process. The built-in failover mechanisms of a queue make it a very attractive way to meet that zero-failure requirement without much extra work.

Building the Jupiter Motors worker role

Our Jupiter Motors worker role will take a message from our queue and will update the order status in our portal database. This process will occur with our local application simulating a handheld device with a connection to the Internet. The application will capture the OrderHeaderID and OrderStatusID from our portal database (via our WCF Web Service) and build a string for our queue message. The string will be in a simple format of [OrderHeaderID],[OrderStatusID]. Let's see how we can accomplish the task of reading this message and updating the database from our queue message.
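For orientation, the sending side of this exchange might look like the sketch below. It is not the article's web service code: the module and routine names are invented, and it assumes development storage and a queue named orderupdatequeue, matching the queue the worker role reads from.

```vb
Imports Microsoft.WindowsAzure
Imports Microsoft.WindowsAzure.StorageClient

Module OrderMessageSketch
    ' Builds the "[OrderHeaderID],[OrderStatusID]" payload, for example "1001,3".
    Public Function BuildOrderMessage(ByVal orderHeaderId As Integer,
                                      ByVal orderStatusId As Integer) As String
        Return String.Format("{0},{1}", orderHeaderId, orderStatusId)
    End Function

    ' Places the order update on the queue for the worker role to pick up.
    Public Sub EnqueueOrderUpdate(ByVal orderHeaderId As Integer,
                                  ByVal orderStatusId As Integer)
        Dim _account = CloudStorageAccount.DevelopmentStorageAccount
        Dim _client = _account.CreateCloudQueueClient()
        Dim _queue As CloudQueue = _client.GetQueueReference("orderupdatequeue")
        _queue.CreateIfNotExist()
        _queue.AddMessage(New CloudQueueMessage(BuildOrderMessage(orderHeaderId, orderStatusId)))
    End Sub
End Module
```

Keeping the message-building logic in its own function makes the format easy to test without touching storage.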

First, we need to add a worker role to our project. We do this in the same manner that we added our WCF role: by right-clicking our roles folder in our JupiterMotors cloud application, choosing Add, and selecting the New Worker Role Project... option.

We're going to name our worker role as JupiterMotorsWorkerRole, as shown in the next screenshot:

At this point, we can see that our worker role is created for us with an app.config file and a WorkerRole.vb file. A worker role executes much like a service in the background. There is no visual aspect of the worker role. It is exactly what the name claims: a worker. All of our code in the sample application will be placed in the WorkerRole.vb file. We are not limited to keeping all of our code in this one file, though. We can create classes within the worker role project and split up the code if we want to. Our code for the Jupiter Motors worker role calls only one routine, so it is simple enough to keep within the WorkerRole class generated for us.

Notice that the WorkerRole class starts out with a routine Public Overrides Sub Run(). This is where our executable code will reside.

Imports System.Data
Imports System.Diagnostics
Imports System.Linq
Imports System.Net
Imports System.Threading
Imports Microsoft.WindowsAzure.Diagnostics
Imports Microsoft.WindowsAzure.ServiceRuntime
Imports System.Data.SqlClient
Imports Microsoft.WindowsAzure
Imports Microsoft.WindowsAzure.StorageClient

Public Class WorkerRole
    Inherits RoleEntryPoint

    ' The Run() method is where the work is performed. We construct an infinite loop to ensure the role
    ' stays running.
    Public Overrides Sub Run()
        ' This is a sample implementation for JupiterMotorsWorkerRole. Replace with your logic.
        Trace.WriteLine("JupiterMotorsWorkerRole entry point called.", "Information")

        Dim _account = CloudStorageAccount.DevelopmentStorageAccount()
        Dim _client = _account.CreateCloudQueueClient()
        Dim _queue As CloudQueue = _client.GetQueueReference("orderupdatequeue")
        _queue.CreateIfNotExist()

        While (True)
            Thread.Sleep(10000)
            Trace.WriteLine("Working", "Information")

            'Gets a message from the queue
            Dim _msg As CloudQueueMessage = _queue.GetMessage()
            If Not _msg Is Nothing Then
                'Parse message to get the orderHeaderId and orderStatusId
                Dim _orderHeaderId As Integer
                Dim _orderStatusId As Integer
                Dim _separatorPosition As Integer
                Dim _messageLength As Integer
                _messageLength = Len(_msg.AsString)
                '1-based position of the comma separator
                _separatorPosition = _msg.AsString.IndexOf(",") + 1
                'Everything before the comma is the order header ID
                _orderHeaderId = CInt(Left(_msg.AsString, _separatorPosition - 1))
                'Everything after the comma is the order status ID
                _orderStatusId = CInt(Right(_msg.AsString, _messageLength - _separatorPosition))

                'Call routine to update the order status
                UpdateOrderStatus(_orderHeaderId, _orderStatusId)

                'Delete the message from the queue once order is updated
                _queue.DeleteMessage(_msg)
            End If
            _msg = Nothing
        End While
    End Sub

    ' OnStart() runs only once, when the role is initially started. This is a good method to set up
    ' any diagnostic connections, connection limits, etc.
    Public Overrides Function OnStart() As Boolean
        ' Set the maximum number of concurrent connections
        ServicePointManager.DefaultConnectionLimit = 12

        DiagnosticMonitor.Start("DiagnosticsConnectionString")

        ' For information on handling configuration changes
        ' see the MSDN topic at http://go.microsoft.com/fwlink/?LinkId=166357.
        AddHandler RoleEnvironment.Changing, AddressOf RoleEnvironmentChanging

        Return MyBase.OnStart()
    End Function

    ' RoleEnvironmentChanging is executed after configuration changes are made, but before the changes are applied.
    ' Setting e.Cancel=true allows the role to be recycled. We can make the recycle conditional on some other
    ' value by modifying this method.
    Private Sub RoleEnvironmentChanging(ByVal sender As Object, ByVal e As RoleEnvironmentChangingEventArgs)
        ' If a configuration setting is changing
        If (e.Changes.Any(Function(change) TypeOf change Is RoleEnvironmentConfigurationSettingChange)) Then
            ' Set e.Cancel to true to restart this role instance
            e.Cancel = True
        End If
    End Sub

    Public Sub UpdateOrderStatus(ByVal iOrderHeaderId As Integer, ByVal iOrderStatusId As Integer)
        Dim _connStr As String = My.Settings.ConnectionString

        ' The Using blocks close the connection and dispose of the command, even if the update fails
        Using _SQLcon As New SqlConnection(_connStr)
            Using _SQLcmd As New SqlCommand()
                _SQLcon.Open()
                With _SQLcmd
                    .CommandText = "UpdateOrderStatusForOrderHeaderID"
                    .CommandType = CommandType.StoredProcedure
                    .Connection = _SQLcon
                    .Parameters.AddWithValue("@OrderHeaderID", iOrderHeaderId)
                    .Parameters.AddWithValue("@OrderStatusID", iOrderStatusId)
                    .ExecuteNonQuery()
                End With
            End Using
        End Using
    End Sub
End Class

An important piece of the worker role is the If Not _msg Is Nothing Then... statement. This will make sure our code is executed only when there is a message in the queue that was picked up by the worker role. Without this, we would receive an Object reference not set to an instance of an object error. Other than that, the worker role is a very straightforward class to run executable code.
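The null check guards against an empty queue, but not against a message whose text is malformed. As a hedged sketch of one way to add that protection, the helper below validates the [OrderHeaderID],[OrderStatusID] format before any conversion is attempted. The module and function names are invented for this example and are not part of the Jupiter Motors code.

```vb
Imports System

Module OrderMessageParserSketch
    ' Returns True and fills both IDs only when the message is exactly two integers
    ' separated by a single comma.
    Public Function TryParseOrderMessage(ByVal message As String,
                                         ByRef orderHeaderId As Integer,
                                         ByRef orderStatusId As Integer) As Boolean
        If message Is Nothing Then
            Return False
        End If

        Dim _parts As String() = message.Split(","c)
        If _parts.Length <> 2 Then
            Return False
        End If

        Return Integer.TryParse(_parts(0), orderHeaderId) AndAlso
               Integer.TryParse(_parts(1), orderStatusId)
    End Function

    Sub Main()
        Dim _headerId As Integer
        Dim _statusId As Integer
        ' A well-formed message parses; a malformed one is rejected without an exception.
        Console.WriteLine(TryParseOrderMessage("1001,3", _headerId, _statusId))
        Console.WriteLine(TryParseOrderMessage("not-a-message", _headerId, _statusId))
    End Sub
End Module
```

The first call prints True and the second prints False. A role using a helper like this could log and set aside a rejected message instead of letting an exception escape from Run().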
