This blog post is now out of date. Container applications can now be used with Batch in two ways, either of which should be used instead of the samples described below. See the documentation for how to run container applications natively in Batch or by using Batch Shipyard.
Docker is a tool to package, deploy, and run your application inside a container. With the introduction of Linux VM support in the Batch service, it's possible to run container-based tasks on Azure Batch, using Docker Hub as the packaging and deployment mechanism.
Azure also provides container hosting as a service, with Marathon and Chronos. If you need a hosting environment for your container-based application, Azure Container Service is the right choice. On the other hand, if you need a scheduler to run repetitive compute jobs, choose the Batch service, which lets you package and deploy job binaries and data in container format.
We’ve added two samples on GitHub to showcase how to use Docker technology on Batch. The first shows how to create a Batch pool of compute nodes and turn it into a Docker Swarm cluster. You can then use an SSH tunnel to connect to the swarm from your local machine and interact with the cluster, as sketched below.
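To give a rough idea of the tunneling step, here is a minimal Python sketch; every address and port in it is a placeholder (the real values come from the node's remote login settings and the port the Swarm manager listens on):

```python
import subprocess

# Forward a local port to the Swarm manager running on a pool node so a
# local Docker client can target the cluster, e.g. with
# DOCKER_HOST=tcp://localhost:2375.
subprocess.run([
    "ssh", "-N",
    "-L", "2375:localhost:2375",    # local port -> Swarm manager port (placeholder)
    "-p", "50000",                  # the node's NAT'd SSH port (placeholder)
    "azureuser@<node-public-ip>",   # login user and address (placeholders)
])
```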
The second sample shows an end-to-end workflow for running Docker containers as tasks on a Batch pool.
Both samples start by creating a pool that installs the Docker engine on each VM. The pool VMs are based on Ubuntu 14.04. The pool is created with a start task that executes Docker_starttask.sh. This script adds the Docker repository to the package source list, installs the Docker engine, and runs the post-configuration steps. The script assumes the VM is based on Ubuntu 14.04, which is the default pool configuration; it will not work with other Linux distributions as-is, but it is easy to adapt (especially for other versions of Ubuntu). Note that the start task has the RunElevated flag set to true, so the commands in the script need no sudo.
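To make this concrete, here is a minimal sketch of the pool setup using the Azure Batch Python SDK (the samples themselves may differ). The account details, pool ID, and VM size are placeholders, staging of the start-task script via resource files is omitted, and note that current SDK versions express the RunElevated flag as a user identity with an admin elevation level:

```python
import azure.batch.models as batchmodels
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials

# Placeholder account details -- substitute your own.
credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")
client = BatchServiceClient(
    credentials, batch_url="https://mybatchaccount.westus.batch.azure.com")

# Run as an elevated (admin) auto-user; older SDK versions expressed this
# as the RunElevated flag mentioned above.
admin = batchmodels.UserIdentity(
    auto_user=batchmodels.AutoUserSpecification(
        elevation_level=batchmodels.ElevationLevel.admin))

pool = batchmodels.PoolAddParameter(
    id="docker-pool",          # placeholder pool ID
    vm_size="STANDARD_D2_V2",  # placeholder VM size
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="Canonical", offer="UbuntuServer",
            sku="14.04.5-LTS", version="latest"),
        node_agent_sku_id="batch.node.ubuntu 14.04"),
    target_dedicated_nodes=2,
    start_task=batchmodels.StartTask(
        # Docker_starttask.sh installs the Docker engine on the node.
        command_line="bash Docker_starttask.sh",
        user_identity=admin,
        wait_for_success=True))

client.pool.add(pool)
```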
The second sample then creates a job with a job preparation task. That task runs "docker pull" to pull the image down from Docker Hub. The job preparation task is guaranteed to run on a node before any other task of the job runs there, so when the actual tasks start, the image is already in place. Another option is to put a "docker pull" as the first command in each task's command line.
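Continuing the sketch above (reusing client and admin), the job might look like this; the job ID and image name are placeholders:

```python
job = batchmodels.JobAddParameter(
    id="docker-job",  # placeholder job ID
    pool_info=batchmodels.PoolInformation(pool_id="docker-pool"),
    job_preparation_task=batchmodels.JobPreparationTask(
        # Pull the image once per node before any task of the job runs there.
        command_line="docker pull ubuntu:14.04",  # placeholder image
        user_identity=admin,  # docker commands need root on the node
        wait_for_success=True))

client.job.add(job)
```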
The sample finally adds tasks to the job. Each task's command line is a one-liner that feeds the commands to the container. Note that the container is run with the "-i" (interactive) flag, which attaches it to the console's STDIN/STDOUT/STDERR so that all output is captured in the task's stdout and stderr.
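A task along these lines illustrates the pattern, again continuing the sketch; the image and the commands fed to the container are placeholders:

```python
task = batchmodels.TaskAddParameter(
    id="task-1",
    # "-i" attaches the container to the console so its output flows into
    # the Batch task's stdout.txt and stderr.txt.
    command_line=(
        "docker run -i ubuntu:14.04 "
        "/bin/bash -c 'echo hello from the container; uname -a'"),
    user_identity=admin)

client.task.add(job_id="docker-job", task=task)
```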
The container finishes when the script is done, and the task is in turn marked as completed. docker run exits with the code returned from the container itself, so you can read the task exit code from the task execution info property to determine the result.
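For example, still assuming the sketch above, the exit code can be read back like this:

```python
import time

# Poll until the task completes; docker run propagates the container's
# exit code, which Batch records in the task's execution info.
while True:
    task = client.task.get("docker-job", "task-1")
    if task.state == batchmodels.TaskState.completed:
        break
    time.sleep(10)

print("task exit code:", task.execution_info.exit_code)
```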
Since the container runs within the host environment instead of a VM sandbox, normal monitoring tools work: commands like top and htop on the host show the task's CPU usage, stdout/stderr capture the output from the container, and the host can connect to the internet to download needed resources.
One thing worth mentioning is that the container doesn't share files with the host by default. If the container needs a resource file to start, simply add the "-v" option to the docker command line to mount a host directory into the container.
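For example, a task command line along these lines mounts the task's working directory into the container (AZ_BATCH_TASK_WORKING_DIR is a standard Batch environment variable; the /data path inside the container is arbitrary, and the command is wrapped in a shell so the variable gets expanded):

```python
task = batchmodels.TaskAddParameter(
    id="task-2",
    # Mount the task's working directory (where Batch downloads resource
    # files) into the container at /data.
    command_line=(
        "/bin/bash -c 'docker run -i "
        "-v $AZ_BATCH_TASK_WORKING_DIR:/data "
        "ubuntu:14.04 ls /data'"),
    user_identity=admin)

client.task.add(job_id="docker-job", task=task)
```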