Four; we need four ways to do background jobs in Azure.
At the time of this writing, I count four (4) ways to create background or scheduled jobs in Windows Azure. Azure is getting pretty big, and while four seems like a lot, I may well have missed one or two others. Do we need this many ways to run background work? Is this another case of Microsoft delivering multiple ways of doing the same thing? [Linq to SQL, EF . . . ahem].
While a case could be made that this is the result of various teams building the same functionality, I think each of the current offerings provides slightly different functionality, and it is important to have them. That said, you can certainly find many cases of overlap.
A worker role is not so much a scheduled job engine as it is an always-on machine that can process work. When you create a worker role you implement a Run method, which is invoked at machine startup. From there you are free to create threads to do a variety of processing tasks: process messages from an Azure queue or Service Bus Topic/Queue, periodically execute arbitrary code, almost anything you can imagine. You write the code and deploy it to any number of worker role instances, which are essentially virtual machines pre-configured and set up with your code.
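To make the shape of that Run method concrete, here is a minimal sketch of the loop in Python. It is not the worker role API (that is .NET); the in-memory `queue.Queue` stands in for an Azure queue, and the names `run`, `handle`, and `stop_event` are made up for illustration.

```python
import queue
import threading


def run(work_queue, handle, stop_event, poll_seconds=0.1):
    """Hypothetical stand-in for a worker's Run method.

    Polls the queue until stop_event is set and hands each
    message to the handle callback.
    """
    while not stop_event.is_set():
        try:
            message = work_queue.get(timeout=poll_seconds)
        except queue.Empty:
            continue  # nothing to do yet; poll again
        handle(message)
        work_queue.task_done()


if __name__ == "__main__":
    work = queue.Queue()      # stand-in for an Azure queue
    stop = threading.Event()
    processed = []

    worker = threading.Thread(target=run, args=(work, processed.append, stop))
    worker.start()

    for item in ("resize-image", "send-email"):
        work.put(item)

    work.join()   # block until both messages have been handled
    stop.set()    # ask the loop to exit
    worker.join()
    print(processed)
```

In a real worker the loop would pull from a durable queue and the instance count would be set in the deployment, but the structure is the same: start once, then keep looking for work.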
A worker role allows you to write arbitrary code and run it at scale. Because you are running on a virtual machine, you have dedicated memory and CPU for your processing. Workers can be scaled up or scaled out, so you can choose more memory and processing power for individual instances as well as the number of instances to run. Additionally, because the virtual machine is dedicated to your worker role, you are able to install third-party software on the machine as part of your deployment using startup tasks.
Of all the options, this one gives you the most control over scaling the job processing nodes and controlling the software available for your code to use. However, for smaller jobs, it can be overkill.
The scheduler service currently provides two actions to be invoked on a schedule: making an HTTP request or sending a message to an Azure queue. You can set up one-time jobs that are invoked immediately or started at a future date and time. You can also set up recurring jobs with typical scheduling options: timed intervals, days of the week, monthly, etc.
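As a rough illustration of what a recurring job on a timed interval means, the sketch below computes the next few run times from a start time and an interval. This is only my own arithmetic, not how the scheduler service is implemented, and `next_runs` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def next_runs(start, every, count, now):
    """Return the next `count` run times at or after `now`.

    start: first scheduled run
    every: interval between runs (a positive timedelta)
    """
    if now <= start:
        first = start
    else:
        elapsed = now - start
        steps = -(-elapsed // every)  # ceiling division on timedeltas
        first = start + steps * every
    return [first + i * every for i in range(count)]


if __name__ == "__main__":
    runs = next_runs(
        start=datetime(2014, 1, 1, 0, 0),
        every=timedelta(hours=6),
        count=3,
        now=datetime(2014, 1, 1, 7, 30),
    )
    for run_time in runs:
        print(run_time.isoformat())
```

Day-of-week and monthly recurrences need calendar logic on top of this, which is exactly the sort of thing you would rather let a scheduling service handle.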
To set up a scheduled job you must first create a job collection associated with a data center region. This is because the jobs will be running in cloud services in those data centers. Once the job collection is created you can add a number of jobs to the collection. The number of jobs and their frequency are determined by the scaling selection (currently free or standard options). Jobs can be created using the Windows Azure Management Portal or using the REST API. History is provided for job executions, and monitoring lets you know whether the jobs are failing or succeeding.
The scheduler service enables you to trigger custom code through HTTP or an Azure queue on a particular schedule. The key component is the scheduling capability; the bulk of the work happens in your own application. One benefit of this scheduler is that it is not tied to Azure-only targets, since the HTTP endpoint you invoke can be any URL. This service is especially useful when your web application may be dormant due to inactivity and therefore not loaded into memory. I would also expect future releases to provide additional actions, such as sending a message to a Service Bus Topic or Queue, though I do not know if anything has been stated regarding future plans. This offering will only get more useful as the number of action types increases.
The Mobile Services scheduler was released in preview before the Azure Scheduler service and provided a solution for the common problem of needing back-end code to run without user interaction or at specific times. You can manage the scripts for your scheduled job alongside your other code, even putting them under source control using the Git integration. If you are focused on building a mobile service only, then this option may be your best choice, as it keeps your application assets in one place and simplifies the management experience.
An alternative approach would be to use the Azure Scheduler Service to call a custom API exposed by your Mobile Service. You might choose this option if you need the more robust scheduling capabilities of the Azure Scheduler Service or if you are already using that scheduler for other work. This approach requires either that your API permissions allow everyone to invoke the custom operation or that you provide your Azure mobile credentials in an HTTP header from the scheduler. At the time of this writing the management portal does not support defining custom headers for HTTP(S) actions. In order to define headers for your action you must use the REST API to create your job in the Azure Scheduler. I’m sure this will get added to the portal over time.
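For a sense of what the scheduler would need to send, here is a hedged sketch of the request itself, built with Python's standard library. It only constructs the request and does not send it. The URL, the key, and the `build_job_request` helper are placeholders, and `X-ZUMO-APPLICATION` is the application key header as I understand Mobile Services to expect it; check the name against your own service before relying on it.

```python
import json
import urllib.request


def build_job_request(url, app_key, payload):
    """Build (but do not send) a POST to a custom Mobile Services API."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header name for the application key.
            "X-ZUMO-APPLICATION": app_key,
        },
    )


if __name__ == "__main__":
    request = build_job_request(
        "https://example.invalid/api/cleanup",  # placeholder URL
        "my-application-key",                   # placeholder key
        {"cleanup": True},
    )
    print(request.get_method(), request.full_url)
    for name, value in request.header_items():
        print(name, value)
```

Whatever creates the scheduler job through the REST API would have to carry the same header and body in the job's HTTP action.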
Web jobs enable you to upload a script file (bash, python, bat, cmd, php, etc.) to your web application and run those scripts as your job. When you define the job you can execute it ad hoc, on a schedule or continuously. The continuous option is interesting because it will restart your job after the executable exits from each run. Another unique aspect of these jobs is that the continuous jobs run on all the instances of your website. Additionally, there is a Web Jobs SDK for .NET which provides quick and easy access to Azure Storage and queues so you can run your jobs when new items are added to storage or queues. This extends the reach of your job beyond the website with minimal coding effort required. Like the Azure Scheduler Service you get a history view of your job executions and can review the details of successes or failures as well as detailed logs if you used them.
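The restart behavior of a continuous job can be pictured with a small supervisor loop. This is only a sketch of the idea, not how the Web Jobs host works internally; `run_continuously` is a made-up name, and the `max_runs` limit exists only so the example terminates, whereas a real continuous job keeps being restarted.

```python
import subprocess
import sys
import time


def run_continuously(command, max_runs, delay_seconds=0.0):
    """Run `command`, and start it again each time it exits.

    Returns the exit code of every run.
    """
    exit_codes = []
    for _ in range(max_runs):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        time.sleep(delay_seconds)  # brief pause before the restart
    return exit_codes


if __name__ == "__main__":
    # A stand-in "script" that does one unit of work and exits.
    job = [sys.executable, "-c", "print('processed one batch')"]
    print(run_continuously(job, max_runs=3))
```

This is why a continuous job can be written as a simple script that does one pass and exits: the platform takes care of running it again.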
If you are building a website and need some code to run in the background without being triggered by user actions on the site or requiring a third-party service to invoke your API, then Web Jobs will fit the bill. Like the Azure Mobile Service scheduler these scripts have the benefit of running in the context of your website and can read configuration, work with files in the site directories, etc. The SDK is a nice addition and can really simplify your life if you are also going to work with Azure storage as part of your web job. One of the other benefits to using this model is the variety of supported script types which opens up the libraries and commands available in each of those environments.
As with Azure Mobile Services, you could use the Azure Scheduler Service to send an HTTP request to your website and have an endpoint that handles your work. This may narrow your options for the job's programming language to the one you are already using for your site, so you may not be able to take advantage of the libraries in other scripting/coding environments.
Continuous jobs, because they run locally on all instances of your site, are really one of the unique characteristics of web jobs in my opinion. Using an outside scheduler would generally enable you to invoke an endpoint on only one server.
As I said earlier, it’s quite possible there are schedulers I don’t know about in areas of Azure I don’t tend to use, such as Hadoop or Media Services. It’s also possible we’ll see more schedulers or job engines come online for new or existing services. I think the cast of characters right now mostly makes sense and each provides some unique functionality. My hope is that there will be a logic to it all: that the notion of running something on a schedule will be centralized in the Azure Scheduler Service, and that the focus will be placed there to expand the scheduling options even more and to add targets, both Azure-specific actions and actions for add-ons from other vendors.
What do you think? Is this scheduler/job overkill? Should there be one scheduler to rule them all?