Big Compute for large engineering simulations

This post shares our experience working closely with a partner to bring a large engineering simulation solution to Azure.

One of the best aspects of working at a company like Microsoft is the great customers and partners you get to meet and work with. Today, we would like to share our experience working with one of those partners, and talk about the great solution that we helped them bring to their customers.

The partner we will be talking about is Schlumberger, and the solution that they just released is their high-resolution reservoir simulator, INTERSECT, running on Azure. This is Schlumberger’s first fully commercial software as a service (SaaS) solution for the cloud, and in this blog post you will learn not just about this solution, but also about why Schlumberger decided that Azure is the right cloud platform to host it on for the initial launch in North America and Europe.

About Schlumberger

If you are not familiar with Schlumberger, it’s probably because you are not too familiar with the oil and gas industry. Otherwise, we are certain that you would know them quite well (and you can probably skip this section). Simply put, Schlumberger is the largest oilfield services company in the world. It employs 115,000 people in more than 85 countries. Its products and services span from exploration through production, and it supplies technology, integrated project management and information solutions to customers working in the oil and gas industry worldwide.

The Challenge: engineering models keep getting larger

Our team has been working with partners like Schlumberger to enable Big Compute scenarios on Azure (think: hyperscale meets high-performance computing), building on our previous experience and expertise with HPC workloads on Windows Server (we also work on Microsoft HPC Pack).

One of the most interesting workloads that we have been looking forward to enabling on Azure is the simulation of large and complex engineering models. These models have not traditionally been feasible in the cloud (at least not efficiently) because they require specialized hardware and an array of different technologies that are not part of a “normal” cloud offering. But, at the same time, these models are starting to outgrow the on-premises resources that companies and service providers have at their disposal. This is how Owen Brazell, HP Solutions Architect for Reservoir Simulation at Schlumberger, explained this trend to us:

The development of more complex fields and increased extraction challenges in the oil and gas industry means that we now need to run simulations over much larger numbers of cells in order to get the higher resolution studies that we need to properly quantify the physical processes involved in hydrocarbon extraction. These simulations now run from the traditional low numbers of million-cell problems, up to the hundreds of millions, and even billion-cell problems. The only way to solve such large problems is to use large-scale HPC systems.

Furthermore, large-scale compute resources of the magnitude and characteristics required to run these models are not always readily available. Again, in Owen’s words:

Our INTERSECT simulator is tuned for hydrocarbon reservoirs with complex geological structures, highly heterogeneous formations, challenging wells and completion configurations and those that demand advanced production controls. INTERSECT has been designed for high resolution, large cell count models. INTERSECT is one of our core engines delivering the complex science needed to accurately and fully understand the reservoir through running multiple iterations of the same or similar model at a very fine scale. This is best achieved with leading edge technology that I certify for use with INTERSECT. It was one of the first simulators that was designed from the ground up for large-scale parallelism and as such is an ideal tool for users running models with multi-million to billion cells. However, to make the best use of a simulator like INTERSECT, users must have access to large-scale HPC resources, which often disenfranchises smaller companies from the technology.

So, in essence, what partners like Owen and his customers are confronted with is a combination of an ever-growing demand for more and faster compute resources (driven by the ever-growing size of the simulations), paired with the difficulty of getting access to the right type of resources. Fortunately, this is just the kind of challenge that we have been working hard to solve in Azure.

The Contender: true HPC capabilities on Azure

The accessibility and scalability offered by cloud computing should be the answer to the challenges mentioned earlier, but for this to be true, the backend infrastructure must have the right technology to enable the simulator to really perform. This is where Schlumberger and Microsoft, working together, have created and delivered the most appropriate technology solution, as Owen comments:

The work done on both Azure VMs and Microsoft MPI to reduce the latency across nodes has meant that we can run scale tests using INTERSECT from several hundreds of thousands of cells up to a billion cells, showing the same kind of results and scalability as running on bare metal. We chose Azure for our commercial launch in North America and Europe because of their presence in those markets, their willingness to work tightly with our engineers to build a great solution for our customers, they have an offering that supports low latency networks over RDMA (InfiniBand) which has a notable impact on the scalability of MPI based applications. INTERSECT is our best in class reservoir simulator capable of scaling to a billion cells; it cannot effectively achieve this scale without fast low latency networks.

What Owen was referring to is the high performance A8 and A9 compute instances on Azure. These instances provide the performance and scalability of a world-class supercomputing center, to anyone, on demand. This is possible not just because they offer a large amount of memory and HPC-class CPUs, but more significantly because they are connected to a QDR InfiniBand backend network. This network can achieve MPI latency under 3 microseconds and offer non-blocking throughput of up to 32 Gbps, thanks to the remote direct memory access (RDMA) technology that our team has made available on Azure.
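To put those two figures in perspective, a common first-order way to estimate the time of a single point-to-point message is the latency-bandwidth model: time = latency + size / bandwidth. The sketch below simply plugs in the 3 microsecond and 32 Gbps numbers quoted above as bounds. The function name and the message sizes are illustrative assumptions, and real MPI performance depends on many more factors than this model captures.

```python
# First-order latency-bandwidth estimate of point-to-point message time.
# Assumptions: the 3 us latency and 32 Gbps throughput are the bounds quoted
# in the text; the message sizes in the loop are illustrative only.

LATENCY_S = 3e-6       # MPI latency bound from the text, in seconds
BANDWIDTH_BPS = 32e9   # throughput bound from the text, in bits per second


def message_time(size_bytes: int,
                 latency_s: float = LATENCY_S,
                 bandwidth_bps: float = BANDWIDTH_BPS) -> float:
    """Estimated seconds to deliver one message of size_bytes."""
    return latency_s + (size_bytes * 8) / bandwidth_bps


if __name__ == "__main__":
    for size in (8, 4 * 1024, 1024 * 1024):
        t = message_time(size)
        latency_share = LATENCY_S / t
        print(f"{size:>8} bytes: {t * 1e6:8.2f} us "
              f"(latency is {latency_share:.0%} of the total)")
```

Under this model, latency dominates the cost of small messages, which is why a low-latency network matters so much for tightly coupled MPI applications that exchange many small messages between nodes.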

Aside from the specialized hardware configuration, these compute instances have been fine-tuned to achieve the best possible performance when running CPU-intensive and memory-intensive workloads, just as bare-metal systems in on-premises HPC clusters usually are. The following chart shows the results of comparing the time it took to run the same model on one of Schlumberger’s on-premises clusters and on A9 instances on Azure:

10 million cell comparison: Azure vs. on-premises cluster

But what about scale? Well, that is where Microsoft has a true advantage over specialized cloud vendors that offer similar HPC cloud solutions. Our A8 and A9 instances are already available in several regions across the globe. Together, these regions add up to tens of thousands of cores (and growing) of HPC-class hardware.

What all this means is that parallel applications like INTERSECT can scale favorably to thousands of cores, on a true HPC platform that is available on demand, and one which offers the scalability required. That is exactly what Owen and Schlumberger needed for their customers.

The Solution: the first reservoir simulation service on the cloud

To make INTERSECT available on Azure, we worked with Schlumberger to understand the needs of their customers. It was paramount to Schlumberger that moving to the cloud and Azure must not change the workflow that their customers were following. The second aspect where we spent a great deal of time was reviewing and challenging the security of the cloud, and developing the additional levels of security that Schlumberger felt were required to make their solution a success.

Thanks to the availability of this service, Schlumberger’s customers can now submit simulations to the cloud directly from Petrel, the industry-leading reservoir modelling tool and the simulation environment that Schlumberger provides for generating the models, as well as for visualizing the results. This means that a reservoir engineer never leaves the environment that she or he is familiar with. The engineer just needs to select where the simulation should run: locally, on an on-premises cluster, or on the cloud. After the simulation has run, the engineer can analyze the results back in Petrel, as usual.

But perhaps what is most exciting about this solution is that these capabilities are available on a subscription-based model, making it possible for companies of all sizes, in all parts of the world (including those customers who are unable to afford their own HPC environment), to have access to high science through an enterprise-class simulation solution like INTERSECT. This, in particular, is one of the greatest advantages of a cloud solution.

Some lessons learned

There is always something new to be learned with every project. One of the concerns that both Schlumberger and we had was the time it would take to upload the simulation data (anywhere from 1 GB to 100 GB) and download the results data (20 GB to 300 GB), to and from the cloud. What we learned is that data submission and retrieval accounted for only around 3% to 7% of the total time it took to run a job (from beginning to end). The larger the job, the smaller the percentage, as shown in this chart:

Full job times on Azure A9 instances

Of course, it is important to note that the speed and quality of a customer’s internet connection will vary, but the results that we observed during our testing were quite encouraging.
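For a rough sense of how connection speed affects that percentage, the arithmetic can be sketched in a few lines. This is a hypothetical back-of-the-envelope calculator, not the method used in the testing described above: the function name, the link speeds and the 40-hour simulation time are assumptions chosen purely for illustration, and only the data sizes fall within the ranges mentioned earlier.

```python
def transfer_share(upload_gb: float, download_gb: float,
                   link_mbps: float, compute_hours: float) -> float:
    """Fraction of end-to-end job time spent uploading and downloading data.

    Assumes decimal units (1 GB = 8e9 bits, 1 Mbps = 1e6 bits/s) and a link
    that sustains its nominal speed, which real connections rarely do.
    """
    transfer_seconds = (upload_gb + download_gb) * 8e9 / (link_mbps * 1e6)
    compute_seconds = compute_hours * 3600
    return transfer_seconds / (transfer_seconds + compute_seconds)


if __name__ == "__main__":
    # Hypothetical job: 10 GB in, 50 GB out, 40 hours of simulation time.
    for mbps in (50, 100, 1000):
        share = transfer_share(10, 50, mbps, 40)
        print(f"{mbps:>5} Mbps link: transfers take {share:.1%} of the job")
```

Because the compute time grows with the size of the model while the link speed stays fixed, a calculation like this also shows why slower connections raise the transfer share and longer-running jobs lower it.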

Some final words

Our final words are a call to action: we want to hear from you. We really do. Our team is made up of a small group of dedicated engineers who love HPC and what the cloud has to offer, and who want to work directly with you to understand and enable your workload on Azure. It does not matter how simple or complex your workload might be, or if it is a “traditional” HPC workload or not. If it requires a lot of compute power, then it is our type of workload, and Azure is the right cloud for it.
