|Published on:||July 10th, 2018|
|With:||Eric Miller & Wim Slagter|
|Description:||In this episode, your host and Co-Founder of PADT, Eric Miller, is joined by Wim Slagter, Director of HPC & Cloud Alliances at ANSYS, Inc., for an interview and discussion of the ANSYS HPC Benchmark Program: its capabilities, benefits, and much more.
If you have any questions, comments, or would like to suggest a topic for the next episode, shoot us an email at email@example.com; we would love to hear from you!
It's 6:30 a.m. and a dark shadow looms in Eric's doorway. I wait until Eric finishes his Monday morning company updates. "Eric, check this out: on the CUBE HVPC w16i-k20x we built for our latest customer, ANSYS Mechanical scaled to 16 cores on our test run." Eric's left eyebrow rises slightly. I know I have him now; I have his full and complete attention.
This is why: Eric knows, and probably many of you reading this also know, that solving differential equations in parallel and distributed fashion, with graphics processing units thrown in, makes our hearts skip a beat. The finite element method used for solving these equations is both CPU intensive and I/O intensive. This is headline-news stuff to us geek types. We love scratching our way across the compute grid to squeeze every bit of performance out of our hardware!
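If you want to see why every extra core matters (and why the gains eventually taper off), here is a minimal Python sketch of Amdahl's law. The 5% serial fraction is purely an illustrative assumption, not a measured property of ANSYS Mechanical:

```python
# Amdahl's law: the upper bound on parallel speedup for a solver run.
# The serial fraction below is an illustrative assumption, not a
# measured property of any ANSYS solver.

def amdahl_speedup(cores: int, serial_fraction: float) -> float:
    """Best-case speedup on `cores` cores when `serial_fraction`
    of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 14, 16):
        s = amdahl_speedup(n, serial_fraction=0.05)  # assume 5% serial work
        print(f"{n:2d} cores -> at most {s:4.1f}x speedup "
              f"({100.0 * s / n:5.1f}% parallel efficiency)")
```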
Oh, and yes, a lower time to solve is better! No GPUs were harmed in these tests; only one NVIDIA Tesla K20X GPU was used.
I have been gathering and hoarding years' worth of ANSYS Mechanical benchmark data. Why? I'm not sure, really; after all, I am a wannabe ANSYS analyst. It wasn't until a couple of weeks ago that I woke up to the why again. My CUBE HVPC team sold a dual-socket Intel Ivy Bridge based workstation to a customer out of Washington state. Once we got the order, our Supermicro reseller's phone started bouncing off the desk. After some back and forth, the parts arrived directly from Supermicro in California. Yes, designed in the U.S.A. And they show up in one big box:
As per normal is as normal does, I ran our series of ANSYS benchmarks. You know the type: benchmarks that perform coupled-physics simulations and solve really large matrices. So I ran the ANSYS v14sp-5 benchmark, the ANSYS FLUENT benchmarks, and some benchmarks for this customer, the types of runs they want to use the new machine for. When I talked these results over with Eric, he thought now was the perfect time to release the flood of benchmark data. Well, some of it, a smidge. I admit the data gets overwhelming, so I have trimmed the charts and graphs down to the bare minimum. So what makes this recipe for the fastest ANSYS Mechanical workstation so special? What is truly exciting enough to tip me over in my overstuffed black leather chair?
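For the curious, a sweep like this is easy to script. Below is a rough sketch: the executable name (`ansys145`) and the input deck name are assumptions for illustration, and you should check the batch switches against your own installation:

```python
# Sketch of a core-count sweep for an MAPDL benchmark run in batch mode.
# The executable name (ansys145) and input deck (v14sp-5.dat) are
# assumptions for illustration; adjust them for your installation.
import subprocess
import time

CORE_COUNTS = [2, 4, 8, 14, 16]

for np in CORE_COUNTS:
    cmd = [
        "ansys145",            # MAPDL launcher for the release under test
        "-b",                  # batch mode
        "-dis",                # distributed-memory parallel solve
        "-np", str(np),        # number of cores
        "-i", "v14sp-5.dat",   # benchmark input deck
        "-o", f"v14sp-5_np{np}.out",
    ]
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    print(f"{np:2d} cores: {elapsed:8.1f} s wall clock")
```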
Not only is it the fastest ANSYS Mechanical workstation running on CUBE HVPC hardware, it uses two Intel CPUs built on a 22-nanometer process. Additionally, this is the first time we have had an Intel dual-socket workstation keep getting faster all the way up to its maximum core count when solving in ANSYS Mechanical APDL.
Previously, the fastest time was on the CUBE HVPC w16i-GPU workstation listed below, and it peaked at 14 cores.
Unfortunately, we only had time for two runs, at 14 and 16 cores, before we shipped the system off. But you can see how fast it was in this table. It was close to the previous system at 14 cores, but blew past it at 16, whereas the older system actually got clogged up and slowed down:
|Run Time (Sec)|
|Cores Used||Config B||Config C||Config D|
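One thing worth pulling out of timing tables like this is the point where adding cores stops helping. A tiny sketch, with made-up timings (not our measured data) just to show the idea:

```python
# Find the core count where adding cores stops helping.
# The timings below are made-up placeholders, not our measured data.
timings = {8: 410.0, 12: 300.0, 14: 265.0, 16: 290.0}  # cores -> seconds

best_cores = min(timings, key=timings.get)
print(f"Best run: {best_cores} cores at {timings[best_cores]:.0f} s")
for cores, secs in sorted(timings.items()):
    slower = secs > timings[best_cores] and cores > best_cores
    flag = "  <- slower than fewer cores!" if slower else ""
    print(f"{cores:2d} cores: {secs:6.1f} s{flag}")
```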
And here are the results as a bar graph for all the runs with this benchmark:
We can't wait to build one of these with more than one motherboard, maybe a 32-core system with InfiniBand connecting the two nodes. That should allow some very fast run times on some very, very large problems.
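When we do build that two-node box, the launch would look something like the sketch below. The hostnames are hypothetical, and the `-machines` host:cores syntax should be double-checked against the MAPDL documentation for your release:

```python
# Sketch of launching a distributed MAPDL solve across two nodes over
# InfiniBand. Hostnames are hypothetical; verify the -machines syntax
# (host:cores pairs, colon-separated) against your release's docs.
import subprocess

machines = "node1:16:node2:16"   # 32 cores total across two boxes
subprocess.run([
    "ansys145", "-b",
    "-dis",                      # distributed solve via MPI
    "-machines", machines,       # host:cores list for both nodes
    "-i", "v14sp-5.dat",
    "-o", "v14sp-5_32core.out",
], check=True)
```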
ANSYS v14sp-5 (ANSYS R14) Benchmark Details
Here are the details and the data of the March 8, 2013 workstation:
Configuration C = CUBE HVPC w16i-GPU
Here are the details from the new, November 1, 2013 workstation:
Configuration D = CUBE HVPC w16i-k20x
You can view the output from the run on the newer box (Configuration D) here:
Here is a picture of the Configuration D machine with the info on its guts:
The one (or two) CPU that rules them all: http://ark.intel.com/products/76161/
Intel® Xeon® Processor E5-2687W v2
The GPUs that just keep getting better and better:
If you are as impressed as we are, then it is time for you to try out this next iteration of the Intel chip, configured for simulation by PADT, on your problems. There is no reason for you to be using a CAD box or a bloated web server as your HPC workstation for running ANSYS Mechanical and solving in ANSYS Mechanical APDL. Give us a call; our team will take the time to understand the types of problems you run, the IT environment you run in, and custom-configure the right system for you:
or call 480.813.4884
Note: The information and data contained in this article were compiled and generated on September 12, 2013 by PADT, Inc. on CUBE HVPC hardware using FLUENT 14.5.7. Please remember that hardware and software change with new releases, and you should always try to run your own benchmarks, on your own typical problems, to understand how performance will impact you.
A potential customer of ours was interested in a CUBE HVPC mini-cluster. They requested that I run benchmarks and gather some data on two CPUs. The CPUs were benchmarked on two of our CUBE HVPC systems: one mini-cluster has dual Intel® Xeon® E5-2690 CPUs and the other has quad AMD® Opteron™ 6308 CPUs. The benchmarks were run on a single server, using a total of 16 cores on each machine. The same DDR3-1600 ECC registered RAM, Supermicro LSI 2208 RAID controller, and Hitachi SAS2 15k RPM hard drives were used in each system.
The models we used can be downloaded from the ANSYS FLUENT benchmark page: http://www.ansys.com/Support/Platform+Support/Benchmarks+Overview/ANSYS+Fluent+Benchmarks
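If you want to reproduce runs like these, a batch FLUENT invocation is straightforward to script. The journal commands and launch flags below follow common FLUENT batch usage, but treat the details as assumptions and verify them against your release:

```python
# Sketch of a batch FLUENT run for the truck_poly_14m benchmark case.
# The journal commands and flags follow common FLUENT batch usage;
# verify them against your release before trusting the numbers.
import subprocess

journal = """\
/file/read-case truck_poly_14m.cas
/solve/iterate 25
/parallel/timer/usage
/exit
yes
"""
with open("bench.jou", "w") as f:
    f.write(journal)

subprocess.run([
    "fluent", "3ddp",      # 3-D double-precision solver
    "-t16",                # 16 parallel processes
    "-g",                  # no GUI
    "-i", "bench.jou",     # drive the run from the journal file
], check=True)
```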
That was the question Eric posed to me after he reviewed the data and read this blog article before posting. I told him, "Yes, I am sure. Data is data, and I even triple-checked." I re-ran several of the benchmarks to see if the solve times came out the same on these two CUBE HVPC machines; they did. I went on to tell Eric, "For example, let's dig into the data for the External Flow Over a Truck Body with a Polyhedral Mesh (truck_poly_14m) benchmark and see what we find."
|Quad-socket Supermicro motherboard||4 x 4c AMD Opteron 6308 @ 3.5 GHz|
|Dual-socket Supermicro motherboard||2 x 8c Intel Xeon E5-2690 @ 2.9 GHz|
The dual-socket Intel Xeon E5-2690 motherboard is impressive; a machine like this might have made the TOP500 list of the world's fastest computers ten years ago. Anyway, after each solve I captured the timing data, and as you can see below, the AMD Opteron wall-clock time was faster than the Intel Xeon wall-clock time.
So why did the AMD Opteron 6308 CPUs pull away from the Intel on the ANSYS FLUENT solve times? Let's take a look at a couple of reasons why this happened. I will let you draw your own conclusions.
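A quick aside on comparing FLUENT wall times: the standard FLUENT benchmark metric is a "rating," the number of times the benchmark could run in 24 hours, so higher is better. A small sketch (the wall times here are placeholders, not our measured results):

```python
# Convert wall-clock time into the standard ANSYS FLUENT benchmark
# rating: how many times the benchmark could run in 24 hours.
# The wall times below are placeholders, not our measured results.
SECONDS_PER_DAY = 86400.0

def rating(wall_seconds: float) -> float:
    return SECONDS_PER_DAY / wall_seconds

runs = {
    "4 x 4c AMD Opteron 6308 @ 3.5 GHz":   1500.0,  # hypothetical seconds
    "2 x 8c Intel Xeon E5-2690 @ 2.9 GHz": 1650.0,  # hypothetical seconds
}
for machine, secs in runs.items():
    print(f"{machine}: {secs:.0f} s wall clock -> rating {rating(secs):.1f}")
```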
Let us look at the details of what is on the motherboards as well. Four data paths versus two can make a difference:
|Feature||Dual-socket Supermicro motherboard, 2 x 8c Intel Xeon E5-2690 @ 2.9 GHz||Quad-socket Supermicro motherboard, 4 x 4c AMD Opteron 6308 @ 3.5 GHz|
|Processor Technology||32-nanometer||32-nanometer SOI (silicon-on-insulator)|
|Interconnect||Quick Path Interconnect: two links at up to 8 GT/s per link, up to 16 GB/s per-direction peak bandwidth per port||HyperTransport™ Technology: four x16 links at up to 6.4 GT/s per link|
|Memory||Integrated DDR3 memory controller, up to 51.2 GB/s memory bandwidth per socket||Integrated DDR3 memory controller, up to 51.2 GB/s memory bandwidth per socket|
|Memory Channels||Quad-channel support||Quad-channel support|
|Packaging||LGA2011-0||Socket G34, 1944-pin organic Land Grid Array (LGA)|
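The "four data paths versus two" point is easy to put numbers on. Both CPUs run quad-channel DDR3-1600, so each socket tops out at the same theoretical bandwidth, but the quad-socket AMD box has twice as many sockets feeding memory, which matters a lot for a memory-hungry CFD solver:

```python
# Theoretical peak memory bandwidth per socket and per system.
# Both CPUs here use quad-channel DDR3-1600 (8 bytes per transfer).
CHANNELS = 4
TRANSFER_RATE_MT_S = 1600      # DDR3-1600: 1600 mega-transfers/second
BYTES_PER_TRANSFER = 8         # 64-bit channel

per_socket_gbs = CHANNELS * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000.0
print(f"Per socket: {per_socket_gbs:.1f} GB/s")   # 51.2 GB/s

for name, sockets in [("2 x Intel Xeon E5-2690", 2),
                      ("4 x AMD Opteron 6308", 4)]:
    print(f"{name}: {sockets * per_socket_gbs:.1f} GB/s aggregate")
```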
Here is the pricing for each CPU, taken from the Newegg and Ingram Micro websites. Prices were captured on September 12, 2013.
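To turn prices into something comparable, cost per core is the quick metric. The prices in this sketch are placeholders for illustration, not the Newegg or Ingram Micro figures:

```python
# Rough cost-per-core comparison. The prices below are placeholders
# for illustration; substitute the vendor quotes from the date you care
# about.
systems = {
    # name: (sockets, $ per CPU, cores per CPU)
    "2 x Intel Xeon E5-2690 (16 cores)": (2, 2000.0, 8),
    "4 x AMD Opteron 6308 (16 cores)":   (4, 500.0, 4),
}
for name, (sockets, price, cores) in systems.items():
    total = sockets * price
    print(f"{name}: ${total:,.0f} total CPU cost, "
          f"${total / (sockets * cores):,.0f} per core")
```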
PADT offers a line of high performance computing (HPC) systems specifically designed for CFD and FEA number crunching, aimed at a balance between cost and performance. We call this concept High Value Performance Computing, or HVPC. These systems have allowed PADT and our customers to carry out larger simulations, with greater accuracy, in less time, and at a lower cost than name-brand solutions. This leaves you more cash to buy more hardware or software.
Let CUBE HVPC by PADT, Inc. quote you a configuration today!