AWS CFD Shootout: Single Instances (Again)

Hey there,

It’s Robin from CFD Engine & I’ve been comparing AWS instance types again, so bear with me.

They’ve added some new instances since the last time I did this exercise, so I thought I should check them out.

Results were mixed: some were great, but there were also some stinkers.

In fact, it’s now perfectly possible to pay 4 times more than you need to, just by choosing the wrong instance from what, at face value, seem to be very similar options.

This isn’t a deep dive into memory bandwidths, cores-per-socket or chip architectures. I just wanted to find out whether the new instances were any good for CFD, which (if any) I should use, and which ones are akin to lighting my money on fire.

And of course, to share it all with you, so let’s go…

A bit of background

Last time out, I compared the speed (& cost) of running a 22.5 million cell model in simpleFoam on different-sized instances from the following families:

c6i instances based on Intel’s Ice Lake chips;
c5a instances powered by AMD’s 2nd Gen EPYC (Rome) chips;
c6g instances built on AWS’s in-house Graviton2 ARM chips.

Back then, the ARM instances were the cheapest across the board (in some cases half the cost of the others). However, they were also often the slowest.

The Ice Lake instances were fastest across the board, but in most cases they were a little too expensive to be cost-competitive with ARM.

The EPYC (Rome) instances were neither quick nor cheap & I struggled to think of a reason to use them.

So what’s new?

Since then, three new instance families have been released:

the c7g family, featuring AWS’s latest Graviton3 ARM chips, plus DDR5 memory & up to 64 cores per instance;
the c6a family, based on AMD’s 3rd Gen EPYC (Milan) chips, now with up to 96 cores per instance;
the hpc6a, a one-off EPYC (Milan) based instance, squarely aimed at the HPC market (including CFD). It’s only available in a 96-core version & it’s intended to be clustered (but we’re not doing that today).

Would these new instances be good for us? I ran my super-simple benchmark tests again to find out.

“Methodology”

As mentioned, I recorded the time taken to solve 500 iterations of a 22.5 million cell model in simpleFoam (OpenFOAM v2106 on Ubuntu 20.04) across the new instance types – that’s about it.

I used the binaries from ESI/OpenCFD on the AMD instances but, as there aren’t any binaries for ARM, the code was compiled out-of-the-box on those instances. Both ARM & AMD instances used the Ubuntu system libraries for OpenMPI & Scotch.

Cases were decomposed (scotch) to match the number of physical cores on each tested instance.
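
If you want to repeat that step, here’s a rough Python sketch of it. It assumes you already know how many threads each physical core exposes on your instance (check with lscpu rather than trusting a table), and the function names are made up for illustration. The dictionary it writes is a minimal decomposeParDict that only sets the subdomain count and the scotch method.

```python
def physical_cores(vcpus: int, threads_per_core: int) -> int:
    """Physical core count from the vCPU count and threads per core.

    threads_per_core is an input you supply (e.g. read from lscpu);
    nothing here queries AWS or the machine.
    """
    if threads_per_core < 1 or vcpus % threads_per_core != 0:
        raise ValueError("vcpus must be a whole multiple of threads_per_core")
    return vcpus // threads_per_core


def decompose_par_dict(n_subdomains: int) -> str:
    """Text of a minimal system/decomposeParDict using the scotch method."""
    return (
        "FoamFile\n"
        "{\n"
        "    version     2.0;\n"
        "    format      ascii;\n"
        "    class       dictionary;\n"
        "    object      decomposeParDict;\n"
        "}\n"
        "\n"
        f"numberOfSubdomains {n_subdomains};\n"
        "\n"
        "method          scotch;\n"
    )


if __name__ == "__main__":
    # Illustrative values only: 64 vCPUs with one thread per core.
    cores = physical_cores(vcpus=64, threads_per_core=1)
    print(decompose_par_dict(cores))
```

Save the output as system/decomposeParDict in the case, run decomposePar, then launch the solver with the same number of MPI ranks.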

Timings were for the solution only, including the read-in phase & a single write.
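
To pull a comparable number from your own runs, one option is to read the last ClockTime line from the solver log. This is a sketch, not necessarily the exact method used for these results, and it assumes the log contains the usual "ExecutionTime = ... s  ClockTime = ... s" lines. The function name and the sample log values are hypothetical.

```python
import re

# Matches solver log lines such as:
#   ExecutionTime = 105.32 s  ClockTime = 106 s
_TIME_LINE = re.compile(
    r"ExecutionTime\s*=\s*([\d.]+)\s*s\s+ClockTime\s*=\s*([\d.]+)\s*s"
)


def final_clock_time(log_text: str) -> float:
    """Return the last reported ClockTime (wall-clock seconds) in a log."""
    matches = _TIME_LINE.findall(log_text)
    if not matches:
        raise ValueError("no ExecutionTime/ClockTime lines found")
    # Each match is (execution_time, clock_time); take the final clock time.
    return float(matches[-1][1])


if __name__ == "__main__":
    # Made-up log excerpt, for illustration only.
    sample = (
        "Time = 499\n"
        "ExecutionTime = 100.10 s  ClockTime = 101 s\n"
        "Time = 500\n"
        "ExecutionTime = 100.30 s  ClockTime = 102 s\n"
    )
    print(final_clock_time(sample))
```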

You can take a look at the time & cost data for all the runs (including the previous ones) & come to your own conclusions – but here’s my summary…
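
If you’d rather crunch your own numbers, the cost of a run is just the wall-clock time multiplied by the hourly price. Here’s a minimal Python sketch of that sum. The instance names, times & prices in it are placeholders (not my results or AWS’s prices), and it ignores extras like storage, data transfer & instance start-up time.

```python
def cost_per_run(wall_seconds: float, price_per_hour: float) -> float:
    """Cost of one run: hours used multiplied by the hourly price."""
    return wall_seconds / 3600.0 * price_per_hour


def rank_by_cost(runs: dict) -> list:
    """Sort {name: (wall_seconds, price_per_hour)} by cost, cheapest first."""
    costs = [(name, cost_per_run(t, p)) for name, (t, p) in runs.items()]
    return sorted(costs, key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder numbers, for illustration only.
    runs = {
        "instance-a": (3600.0, 2.0),  # slower, cheaper per hour
        "instance-b": (1800.0, 3.0),  # faster, pricier per hour
    }
    for name, cost in rank_by_cost(runs):
        print(f"{name}: {cost:.2f} per run")
```

Note that the pricier-per-hour placeholder comes out cheaper per run, which is the whole point of doing the comparison.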

The Headlines

ARM

The new ARM (c7g) instances were much (~30%) faster than the previous ones & only slightly (~7%) more expensive, producing the cheapest solutions across all of the instance types I’ve tested 👏
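
To see why that combination wins, here’s the arithmetic as a quick Python sketch. “~30% faster” can be read two ways (30% less run time, or 30% more throughput), so it does both; the function names are just for illustration. With a ~7% price increase the cost per run comes out at roughly 0.75 or 0.82 of the old figure, so cheaper either way.

```python
def cost_ratio_from_time_saving(price_increase: float, time_saving: float) -> float:
    """New/old cost per run if the run takes `time_saving` less time."""
    return (1.0 + price_increase) * (1.0 - time_saving)


def cost_ratio_from_speedup(price_increase: float, speedup: float) -> float:
    """New/old cost per run if throughput rises by `speedup`."""
    return (1.0 + price_increase) / (1.0 + speedup)


if __name__ == "__main__":
    # Approximate figures from the text: ~7% pricier, ~30% faster.
    print(round(cost_ratio_from_time_saving(0.07, 0.30), 3))
    print(round(cost_ratio_from_speedup(0.07, 0.30), 3))
```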

EPYC (Milan)

The Milan story has two sides…

The standard EPYC (Milan) instances (c6a family) were unremarkable. They were slightly better than the previous EPYC (Rome) c5a instances, but they still weren’t interesting.

The HPC instance, however, was a different story. It was almost the fastest instance, but it was also weirdly cheap, making it very cost-competitive & extremely capable 👏

Caveat

There is a catch though…

Whilst the new instance types are “generally available”, they aren’t available everywhere. The new ARM instances are currently only in North Virginia & Oregon, whilst the HPC instances are only in Ohio.

My guess is that they’ll eventually be rolled out to other regions, but as of today, those are your options.

Conclusions

The ARM instances are now even more cost-effective for CFD & the HPC instance is a great new addition (thanks to its weird pricing).

Would I recommend them for CFD? Definitely.

Would I switch from running in Europe to running in North Virginia / Ohio / Oregon to use them? Probably not…but maybe 🤔

Would I recommend that you do a similar benchmark for a representative case of your own? Absolutely. YMMV.

You could easily waste a wad of cash by running on the wrong instance types & I wouldn’t want that for you.

Have you tried any of these instances or done a similar exercise? What did you find? Was there a “best” instance for your use case? Care to share?

Drop me a note, I’m keen to hear your experiences & how they stack up with what I saw.

Until next week, stay safe,