Batch service workflow and resources

In this overview of the core components of the Azure Batch service, we discuss the high-level workflow that Batch developers can use to build large-scale parallel compute solutions, along with the primary service resources that are used.

Whether you're developing a distributed computational application or service that issues direct REST API calls or you're using one of the Batch SDKs, you'll use many of the resources and features discussed here.

Tip

For a higher-level introduction to the Batch service, see "What is Azure Batch?" Also see the latest Batch service updates.

Basic workflow

The following high-level workflow is typical of nearly all applications and services that use the Batch service for processing parallel workloads:

  1. Upload the data files that you want to process to an Azure Storage account. Batch includes built-in support for accessing Azure Blob storage, and your tasks can download these files to compute nodes when the tasks are run.
  2. Upload the application files that your tasks will run. These files can be binaries or scripts and their dependencies, and are executed by the tasks in your jobs. Your tasks can download these files from your Storage account, or you can use the application packages feature of Batch for application management and deployment.
  3. Create a pool of compute nodes. When you create a pool, you specify the number of compute nodes for the pool, their size, and the operating system. When each task in your job runs, it's assigned to execute on one of the nodes in your pool.
  4. Create a job. A job manages a collection of tasks. You associate each job with a specific pool where that job's tasks will run.
  5. Add tasks to the job. Each task runs the application or script that you uploaded to process the data files it downloads from your Storage account. As each task completes, it can upload its output to Azure Storage.
  6. Monitor job progress and retrieve the task output from Azure Storage.
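
The six steps above can be sketched as a small in-memory model. This is only an illustration of how storage, pools, jobs, and tasks relate to one another; it does not call the Batch service or any Batch SDK. Every name in it (`storage`, `Pool`, `Job`, `add_task`, `word_count`, `run_workflow`) is hypothetical, a plain dictionary stands in for an Azure Storage container, and worker threads stand in for compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Toy stand-ins for Batch resources; none of these names come from a Batch SDK.
storage = {}  # plays the role of a Storage container: blob name -> contents


@dataclass
class Pool:
    node_count: int  # step 3: number of compute nodes in the pool


@dataclass
class Job:
    pool: Pool  # step 4: a job is associated with one pool
    tasks: list = field(default_factory=list)

    def add_task(self, input_blob, output_blob):
        # Step 5: each task names the input it reads and the output it writes.
        self.tasks.append((input_blob, output_blob))

    def run(self, app):
        def run_task(task):
            input_blob, output_blob = task
            data = storage[input_blob]        # task downloads its input
            storage[output_blob] = app(data)  # task uploads its output
            return output_blob

        # Each worker thread stands in for one compute node in the pool.
        with ThreadPoolExecutor(max_workers=self.pool.node_count) as nodes:
            return list(nodes.map(run_task, self.tasks))


def word_count(text):
    # Step 2: the "application" that every task runs.
    return len(text.split())


def run_workflow():
    # Step 1: upload the input data.
    storage["in/a.txt"] = "one two three"
    storage["in/b.txt"] = "four five"

    pool = Pool(node_count=2)  # step 3
    job = Job(pool)            # step 4
    for name in ("a", "b"):    # step 5
        job.add_task(f"in/{name}.txt", f"out/{name}.count")

    finished = job.run(word_count)
    # Step 6: retrieve the task output from storage.
    return {blob: storage[blob] for blob in finished}
```

Calling `run_workflow()` returns a dictionary that maps each output blob name to the word count of its input. In a real solution, each of these steps would instead go through the Batch and Storage APIs, and the tasks would run on separate virtual machines rather than threads.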

Note

You need a Batch account to use the Batch service. Most Batch solutions also use an associated Azure Storage account for file storage and retrieval.

Batch service resources

The following topics discuss the resources of Batch that enable your distributed computational scenarios.

Next steps

  • Learn about the Batch APIs and tools available for building Batch solutions.
  • Learn the basics of developing a Batch-enabled application using the Batch .NET client library or Python. These quickstarts guide you through a sample application that uses the Batch service to execute a workload on multiple compute nodes and that uses Azure Storage for workload file staging and retrieval.
  • Download and install Batch Explorer for use while you develop your Batch solutions. Use Batch Explorer to help create, debug, and monitor Azure Batch applications.
  • See community resources including Stack Overflow, the Batch Community repo, and the Azure Batch forum.