Part 3: The ANSYS FLUENT Performance Comparison Series – CUBE Numerical Simulation Appliances by PADT, Inc.
November 22, 2016
External Flow Over a Truck Body with a Polyhedral Mesh (truck_poly_14m)
- External flow over a truck body using a polyhedral mesh
- This test case has around 14 million polyhedral cells
- Uses the Detached Eddy Simulation (DES) model with the segregated implicit solver
ANSYS Benchmark Test Case Information
- ANSYS HPC Licensing Packs required for this benchmark
- I used three (3) HPC Packs to unlock all of the cores used during the ANSYS Fluent Test Cases of the CUBE appliances shown in the Figure 1 chart.
- I did use four (4) HPC Packs for the two 256 core benchmarks shown in the data, but only because I wanted those data points for testing.
- The best average seconds per iteration goes to the 2015 CUBE Intel® Xeon® e5-2667 V3, at 0.625 seconds per iteration using 128 compute cores.
- The 2015 CUBE Intel® Xeon® e5-2667 V3 outperformed the 256 core AMD Opteron™ series ANSYS Fluent 17.2 benchmarks.
- Please note that different numbers of CUBE Compute Nodes were used in this test. However, like-for-like CPU times are also shown for single nodes at 64 cores.
- To relate this ANSYS Fluent test case to the real world: a completely new ANSYS HPC customer is likely to have up to two (2) of the entry level Intel CUBE Compute Nodes rather than an eight (8) CUBE compute node configuration.
- Please contact your local ANSYS Software Sales Representative for more information on purchasing ANSYS HPC Packs. You too may be able to speed up your solve times by unlocking additional compute power!
- What is a CUBE? For more information regarding our Numerical Simulation workstations and clusters, please contact our CUBE Hardware Sales Representative at SALES@PADTINC.COM. Each system is designed, tested and configured within your budget. We are happy to help and to listen to your specific needs.
Figure 1 – ANSYS 17.2 FLUENT Test Case Graph
|ANSYS FLUENT External Flow Over a Truck Body with a Polyhedral Mesh (truck_poly_14m) Test Case|
|Number of cells||14,000,000|
The CPU Information
Yes, I am still impressed with the day after day, 24×7 performance of these AMD Opteron™ CPUs! After years of operation, the AMD Opteron™ series of processors are still relevant and powerful numerical simulation processors. Heavy sigh… As you can see for yourselves in the ANSYS Fluent Test Case data below, the 2012 AMD Opteron™ and 2013 AMD Opteron™ CPUs can still hang in there with the Intel® Xeon® CPUs. However, one Intel CPU node vs. four AMD CPU nodes?
I thought a more realistic test case scenario would be to drop the number of AMD Compute Nodes down to four. Indeed, I could have thrown more of the CUBE Compute Nodes with the AMD Opteron™ series CPUs at the problem. That is why you can see one 256 core benchmark score where I put all 64 cores on each node to the test. As one would hope, unleashing ANSYS Fluent on 256 cores did drop the iteration solve time for the test case on the CUBE Compute Appliances.
Realistically a brand new ANSYS HPC customer is not likely to have:
a) Vast quantities of cores (AMD or Intel) & compute nodes for optimal distributed numerical solving
b) ANSYS HPC licensing for 512 cores
c) The available circuit breakers to provide power
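As a rough illustration of how the HPC Pack counts mentioned earlier map onto core counts, here is a minimal Python sketch. It assumes the pack tiers commonly quoted for ANSYS HPC Packs around release 17.x (8, 32, 128, 512 and 2048 cores for one through five packs). Those tier values and the helper name `packs_needed` are assumptions for illustration and are not taken from this article, so please confirm current licensing terms with your ANSYS Software Sales Representative.

```python
# Assumed ANSYS HPC Pack tiers (packs -> maximum cores), circa release 17.x.
# These values are an assumption for illustration, not an official table.
ASSUMED_PACK_TIERS = {1: 8, 2: 32, 3: 128, 4: 512, 5: 2048}


def packs_needed(cores: int) -> int:
    """Return the smallest pack count whose assumed tier covers `cores`."""
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    for packs in sorted(ASSUMED_PACK_TIERS):
        if cores <= ASSUMED_PACK_TIERS[packs]:
            return packs
    raise ValueError(f"{cores} cores exceeds the assumed tier table")


if __name__ == "__main__":
    # Core counts that appear in this benchmark write-up.
    for cores in (64, 128, 256, 512):
        print(f"{cores:>4} cores -> {packs_needed(cores)} HPC Pack(s)")
```

Under these assumed tiers, 128 cores fits within three packs while 256 cores needs a fourth, which lines up with the pack counts I used for the runs above.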
The Intel® Xeon® CPUs used for this ANSYS Fluent Test Case
- Intel® Xeon® Processor E5-2690 v4 (35M Cache, 2.60 GHz)
- Intel® Xeon® Processor E5-2667 v4 (25M Cache, 3.20 GHz)
- Intel® Xeon® Processor E5-2667 v3 (20M Cache, 3.20 GHz)
- Intel® Xeon® Processor E5-2667 v2 (25M Cache, 3.30 GHz)
The Estimated Wattage?
No, the lights did not dim… but here is a quick comparison of estimated maximum power draw in watts. Energy use shows up in noise (decibels) and in dollars ($$$) saved or spent.
Less & More!
Overall, the estimated average power consumption of a CUBE Compute Node has dropped, a real sign of progress over the past four years!
- 2012 CUBE AMD Numerical Simulation Appliance with the Opteron™ 6278 – Four (4) Compute Nodes
- Estimated CUBE Configuration @ Full Power: ~8000 Watts
- 2013 CUBE AMD Numerical Simulation Appliance with the Opteron™ 6380 – Four (4) Compute Nodes
- Estimated CUBE Configuration @ Full Power: ~7000 Watts
- 2015 CUBE Numerical Simulation Appliance with the Intel® Xeon® e5-2667 V3 – Eight (8) Compute Nodes
- Estimated CUBE Configuration @ Full Power: ~4000 Watts
- 2016 CUBE Numerical Simulation Appliance with the Intel® Xeon® e5-2667 V4 – One (1) Compute Node
- Estimated CUBE Configuration @ Full Power: ~900 Watts
- 2016 CUBE Numerical Simulation Appliance with the Intel® Xeon® e5-2690 V4 – Two (2) Compute Nodes
- Estimated CUBE Configuration @ Full Power: ~1200 Watts
Figure 2 – Estimated CUBE compute node power consumption as configured for this ANSYS FLUENT Test Case.
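To make the wattage estimates a little easier to compare, the sketch below divides each configuration's estimated full-power draw by its total core count. The core counts are my reading of the hardware specifications in this post (nodes x sockets x cores per socket). The four-node count for the 2013 AMD system and the per-node reading of "2 x 14c" for the e5-2690 V4 appliance are assumptions, and the class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeConfig:
    name: str
    nodes: int
    sockets_per_node: int
    cores_per_socket: int
    est_watts: float  # estimated full-power draw of the whole configuration

    @property
    def total_cores(self) -> int:
        return self.nodes * self.sockets_per_node * self.cores_per_socket

    @property
    def watts_per_core(self) -> float:
        return self.est_watts / self.total_cores


# Wattage figures are the estimates listed in this post.
CONFIGS = [
    CubeConfig("2012 AMD Opteron 6278", 4, 4, 16, 8000),
    CubeConfig("2013 AMD Opteron 6380", 4, 4, 16, 7000),  # node count assumed
    CubeConfig("2015 Intel Xeon e5-2667 V3", 8, 2, 8, 4000),
    CubeConfig("2016 Intel Xeon e5-2667 V4", 1, 2, 8, 900),
    CubeConfig("2016 Intel Xeon e5-2690 V4", 2, 2, 14, 1200),  # 14c per socket, per node assumed
]

if __name__ == "__main__":
    for cfg in CONFIGS:
        print(f"{cfg.name:<28} {cfg.total_cores:>4} cores  "
              f"{cfg.watts_per_core:6.1f} W/core")
```

Watts per core is only a rough yardstick. It says nothing about how much work each core gets through per iteration, so it is best read alongside the seconds-per-iteration results rather than on its own.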
The CUBE phenomenon
2012 AMD Opteron™ 6278
- 4 x Compute Node CUBE HPC Appliance
- 4 x 16c @2.4GHz/ea
- Quad Socket motherboard
- DDR3-1866 MHz ECC REG
- 5 x 600GB SAS2 15k RPM
- 40Gbps Infiniband QDR High Speed Interconnect
2013 CUBE AMD Opteron™ 6380
- 4 x Compute Node CUBE HPC Appliance
- 4 x 16c @2.5GHz/ea
- Quad Socket motherboard
- DDR3-1866 MHz ECC REG
- 3 x 600GB SAS2 15k RPM
- 40Gbps Infiniband QDR High Speed Interconnect
2014 CUBE Intel® Xeon® e5-2667 V2
- 1 x CUBE HPC Workstation
- 2 x 8c @3.3GHz/ea – Intel® Xeon® e5-2667 V2
- Dual Socket motherboard
- DDR3-1866 MHz ECC REG
- 3 x 600GB SAS2 15k RPM
2015 CUBE Intel® Xeon® e5-2667 V3
- 8 x Compute Node CUBE HPC Appliance
- 2 x 8c @3.2GHz/ea – Intel® Xeon® e5-2667 V3
- Dual Socket motherboard
- DDR4-2133 MHz ECC REG
- 4 x 600GB SAS3 15k RPM
2016 CUBE Intel® Xeon® e5-2667 V4
- 1 x CUBE HPC Workstation
- 2 x 8c @3.2GHz/ea – Intel® Xeon® e5-2667 V4
- Dual Socket motherboard
- DDR4-2400 MHz LRDIMM
- 6 x 600GB SAS3 15k RPM
2016 CUBE Intel® Xeon® e5-2690 V4
- 1 x 1U CUBE APPLIANCE – 2 Compute Nodes
- 2 x 14c @2.6GHz/ea – Intel® Xeon® e5-2690 V4
- Dual Socket motherboard
- DDR4-2400 MHz LRDIMM
- 4 x 600GB SAS3 15k RPM – RAID 10
- 56Gbps Infiniband FDR CPU High Speed Interconnect
- 10Gbps Ethernet Low Latency
Operating Systems Used
- Linux 64-bit
- Windows 7 Professional 64-Bit
- Windows 10 Professional 64-Bit
- Windows Server 2012 R2 Standard Edition w/HPC
It Is All About The Data
Test Metric – Average Seconds Per Iteration
- Fastest Time: 0.625 seconds per iteration – 2015 CUBE Intel® Xeon® e5-2667 V3
- ANSYS FLUENT 17.2
|Cores||2014 CUBE Intel® Xeon® e5-2667 V2 (1 x Node)||2015 CUBE Intel® Xeon® e5-2667 V3 (8 x Nodes)||2016 CUBE Intel® Xeon® e5-2667 V4 (1 x Node)||2016 CUBE Intel® Xeon® e5-2690 V4 (2 x Nodes)||2012 AMD Opteron™ 6278 (4 x Nodes)||2013 CUBE AMD Opteron™ 6380 (4 x Nodes)|
* One (1) CUBE Compute Node with 4 x AMD Opteron™ Series CPUs, for a total of 64 cores, was used to derive these two ANSYS Fluent Benchmark data points (Baseline).
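To put the seconds-per-iteration metric into everyday terms, here is a small Python sketch that converts it into iterations per hour and into wall-clock time for a run of a given length. The 0.625 second figure is the fastest time reported above. The 1,000 iteration run length and the helper names are assumptions chosen purely for illustration, and the `speedup` helper is included only to show how a result could be compared against a baseline such as the single node 64 core runs.

```python
def iterations_per_hour(sec_per_iter: float) -> float:
    """Convert average seconds per iteration into iterations per hour."""
    if sec_per_iter <= 0:
        raise ValueError("sec_per_iter must be positive")
    return 3600.0 / sec_per_iter


def wall_clock_minutes(sec_per_iter: float, iterations: int) -> float:
    """Estimate solve time in minutes for a fixed number of iterations."""
    if sec_per_iter <= 0 or iterations < 0:
        raise ValueError("sec_per_iter must be positive and iterations non-negative")
    return sec_per_iter * iterations / 60.0


def speedup(baseline_sec_per_iter: float, sec_per_iter: float) -> float:
    """Ratio of a baseline time to a faster (or slower) time per iteration."""
    if baseline_sec_per_iter <= 0 or sec_per_iter <= 0:
        raise ValueError("times must be positive")
    return baseline_sec_per_iter / sec_per_iter


if __name__ == "__main__":
    fastest = 0.625    # seconds per iteration, 2015 CUBE e5-2667 V3 on 128 cores
    run_length = 1000  # hypothetical iteration count, for illustration only
    print(f"{iterations_per_hour(fastest):.0f} iterations per hour")
    print(f"{wall_clock_minutes(fastest, run_length):.1f} minutes "
          f"for {run_length} iterations")
```

At 0.625 seconds per iteration that works out to 5,760 iterations per hour, so a hypothetical 1,000 iteration run would take a little over ten minutes.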
PADT offers a line of high performance computing (HPC) systems specifically designed for CFD and FEA number crunching aimed at a balance between cost and performance. We call this concept High Value Performance Computing, or HVPC. These systems have allowed PADT and our customers to carry out larger simulations, with greater accuracy, in less time, at a lower cost than name-brand solutions. This leaves you more cash to buy more hardware or software.