Making Solids Water Tight in ANSYS Spaceclaim for ANSYS Workbench Meshing

Occasionally when solid geometry is imported from CAD into ANSYS SpaceClaim the geometry will come in as solids, but when a mesh is generated on the solids the mesh will appear to “leak” into the surrounding space. Below is an assembly that was imported from CAD into SpaceClaim. In the SpaceClaim Structure Window all of the parts can be seen to be solid components.

When the mesh is generated in ANSYS Mechanical, it appears that the assembly has been meshed successfully.

However, when you look at the mesh a little closer, the mesh can be missing from some of the surfaces and not displayed correctly on others.

Additionally, if you create a cross-section through the mesh, the mesh on some of the parts will “leak” outside of the part boundaries and will look like the image below.

Based on the mesh color, the mesh of the part in the center of the assembly has grown outside of the surfaces of the part.
To repair the part you need to go back to SpaceClaim and rebuild it. First you need to hide the rest of the parts.

Next, create a sketch plane that passes through the problem part.

In the sketch mode create a rectangle that surrounds the part. When you return to 3D mode in SpaceClaim, that rectangle will become a surface that passes through the part.

Now use the Pull tool in SpaceClaim to turn that surface into a part that completely surrounds the part to be repaired, making sure to turn on the “No Merge” option for the pull before you begin.

After you have pulled the surface into a solid, it should look like the image below, where the original part is completely buried inside the new part.

Now you will use the Combine tool to divide the box with the original part. Select Combine from the Tool Bar, then select the box that you created in the previous step. The cutter will be activated and you will move the cursor around until the original part is highlighted inside the box. Select it with the left mouse button. The Combine tool will then give you the option to select the part of the box that you want to remove. Select the part that surrounds the original part. After it is finished, close the combine tool and the Structure Tree and 3D window will now look like the following:

Now move the new solid that was created with the Combine tool into the location of the original part and turn off the original one and re-activate the other parts of the assembly. The assembly and Structure Tree should now look like the pictures below.

Now save the project, re-open the meshing tool, and re-generate the mesh. The mesh should now be correct and not “leaking” beyond the part boundaries.

Cellular Design Strategies in Nature: A Classification

What types of cellular designs do we find in nature?

Cellular structures are an important area of research in Additive Manufacturing (AM), including work we are doing here at PADT. As I described in a previous blog post, the research landscape can be broadly classified into four categories: application, design, modeling and manufacturing. In the context of design, most of the work today is driven by software that represents complex cellular structures efficiently, as well as analysis tools that enable optimization of these structures in response to environmental conditions and some desired objective. In most of these software packages, the designer is given a choice of unit cell with which to construct the entity being designed. However, it is not always apparent what the best unit cell choice is, and this is where I think a biomimetic approach can add much value. As with most biomimetic approaches, the first step is to frame a question and observe nature as a student. And the first question I asked is the one described at the start of this post: what types of cellular designs do we find in the natural world around us? In this post, I summarize my findings.

Design Strategies

In a previous post, I classified cellular structures into 4 categories. However, this only addressed “volumetric” structures where the objective of the cellular structure is to fill three-dimensional space. Since then, I have decided to frame things a bit differently based on my studies of cellular structures in nature and the mechanics around these structures. First is the need to allow for the discretization of surfaces as well: nature does this often (animal armor or the wings of a dragonfly, for example). Secondly, a simple but important distinction from a modeling standpoint is whether the cellular structure in question uses beam- or shell-type elements in its construction (or a combination of the two). This has led me to expand my 4 categories into 6, which I now present in Figure 1 below.

Figure 1. Classification of cellular structures in nature: Volumetric – Beam: Honeycomb in bee construction (Richard Bartz, Munich Makro Freak & Beemaster Hubert Seibring), Lattice structure in the Venus flower basket sea sponge (Neon); Volumetric – Shell: Foam structure in douglas fir wood (U.S. National Archives and Records Administration), Periodic Surface similar to what is seen in sea urchin skeletal plates (Anders Sandberg); Surface: Tessellation on glyptodon shell (Author’s image), Scales on a pangolin (Red Rocket Photography for The Children’s Museum of Indianapolis)

Setting aside the “why” of these structures for a future post, here I wish to only present these 6 strategies from a structural design standpoint.

  1. Volumetric – Beam: These are cellular structures that fill space predominantly with beam-like elements. Two sub-categories may be further defined:
    • Honeycomb: Honeycombs are prismatic, 2-dimensional cellular designs extruded in the 3rd dimension, like the well-known hexagonal honeycomb shown in Fig 1. All cross-sections through the 3rd dimension are thus identical. Though the hexagonal honeycomb is most well known, the term applies to all designs that have this prismatic property, including square and triangular honeycombs.
    • Lattice and Open Cell Foam: Freeing up the prismatic requirement on the honeycomb brings us to a fully 3-dimensional lattice or open-cell foam. Lattice designs tend to embody higher stiffness levels while open cell foams enable energy absorption, which is why these may be further separated, as I have argued before. Nature tends to employ both strategies at different levels. One example of a predominantly lattice-based strategy is the Venus flower basket sea sponge shown in Fig 1; trabecular bone is another.
  2. Volumetric – Shell:
    • Closed Cell Foam: Closed cell foams are open-cell foams with enclosed cells. This typically involves a membrane like structure that may be of varying thickness from the strut-like structures. Plant sections often reveal a closed cell foam, such as the douglas fir wood structure shown in Fig 1.
    • Periodic Surface: Periodic surfaces are fascinating mathematical structures that often have multiple orders of symmetry similar to crystalline groups (but on a macro-scale) that make them strong candidates for design of stiff engineering structures and for packing high surface areas in a given volume while promoting flow or exchange. In nature, these are less commonly observed, but seen for example in sea urchin skeletal plates.
  3. Surface:
    • Tessellation: Tessellation describes covering a surface with non-overlapping cells (as we do with tiles on a floor). Examples of tessellation in nature include the armored shells of several animals including the extinct glyptodon shown in Fig 1 and the pineapple and turtle shell shown in Fig 2 below.
    • Overlapping Surface: Overlapping surfaces are a variation on tessellation where the cells are allowed to overlap (as we do with tiles on a roof). The most obvious example of this in nature is scales – including those of the pangolin shown in Fig 1.
Figure 2. Tessellation design strategies on a pineapple and the map Turtle shell [Scans conducted at PADT by Ademola Falade]
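For readers who want to tag scanned or modeled specimens with this scheme, the six strategies can be captured in a small lookup table. The following is a minimal Python sketch; the names `STRATEGIES` and `classify` are illustrative assumptions and not part of any library, and the examples are simply the ones cited in Figures 1 and 2.

```python
# The six cellular design strategies, grouped by the two-level
# classification used in this post. Values are the natural examples
# cited in the figures.
STRATEGIES = {
    "Volumetric - Beam": {
        "Honeycomb": ["bee honeycomb"],
        "Lattice and Open Cell Foam": ["Venus flower basket sea sponge",
                                       "trabecular bone"],
    },
    "Volumetric - Shell": {
        "Closed Cell Foam": ["douglas fir wood"],
        "Periodic Surface": ["sea urchin skeletal plates"],
    },
    "Surface": {
        "Tessellation": ["glyptodon shell", "pineapple", "turtle shell"],
        "Overlapping Surface": ["pangolin scales"],
    },
}


def classify(strategy):
    """Return the top-level category that a named strategy belongs to."""
    for category, members in STRATEGIES.items():
        if strategy in members:
            return category
    raise KeyError("Unknown strategy: %s" % strategy)
```

Calling `classify("Tessellation")` returns `"Surface"`, which is the first-level question a designer or analyst would ask before choosing beam or shell elements.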

What about Function then?

This separation into 6 categories is driven from a designer’s and an analyst’s perspective – designers tend to think in volumes and surfaces and the analyst investigates how these are modeled (beam- and shell-elements are at the first level of classification used here). However, this is not sufficient since it ignores the function of the cellular design, which both designer and analyst need to also consider. In the case of tessellation on the skin of an alligator, for example, as shown in Fig 3, was it selected for protection, ease of motion or for controlling temperature and fluid loss?

Figure 3. Varied tessellation on an alligator conceals a range of possible functions (CC0 public domain)

In a future post, I will attempt to develop an approach to classifying cellular structures that derives not from its structure or mechanics as I have here, but from its function, with the ultimate goal of attempting to reconcile the two approaches. This is not a trivial undertaking since it involves de-confounding multiple functional requirements, accounting for growth (nature’s “design for manufacturing”) and unwrapping what is often termed as “evolutionary baggage,” where the optimum solution may have been sidestepped by natural selection in favor of other, more pressing needs. Despite these challenges, I believe some first-order themes can be discerned that can in turn be of use to the designer in selecting a particular design strategy for a specific application.


This is by no means the first attempt at a classification of cellular structures in nature and while the specific 6 part separation proposed in this post was developed by me, it combines ideas from a lot of previous work, and three of the best that I strongly recommend as further reading on this subject are listed below.

  1. Gibson, Ashby, Harley (2010), Cellular Materials in Nature and Medicine, Cambridge University Press; 1st edition
  2. Naleway, Porter, McKittrick, Meyers (2015), Structural Design Elements in Biological Materials: Application to Bioinspiration. Advanced Materials, 27(37), 5455-5476
  3. Pearce (1980), Structure in Nature is a Strategy for Design, The MIT Press; Reprint edition

As always, I welcome all inputs and comments – if you have an example that does not fit into any of the 6 categories mentioned above, please let me know by messaging me on LinkedIn and I shall include it in the discussion with due credit. Thanks!

License Usage and Reporting with ANSYS License Manager Release 18.0

Remember the good old days of having to peruse through hundreds and thousands of lines of text in multiple files to see ANSYS license usage information?  Trying to hit Ctrl+F and search for license names.  Well those days were only about a couple months ago and they are over…well for the most part.

With the ANSYS License Manager Release 18.0, we have some pretty nifty built in license reporting tools that help to extract information from the log files so the administrator can see anything from current license usage to peak usage and even any license denials that occur.  Let’s take a look at how to do this:

First thing is to open up the License Management Center:

  • In Windows you can find this by going to Start>Programs>ANSYS Inc License Manager>ANSYS License Management Center
  • On Linux you can find this in the ansys directory /ansys_inc/shared_files/licensing/start_lmcenter

This will open up your License Manager in your default browser as shown below.   For the reporting just take a look at the Reporting Section.  We’ll cover each of these 4 options below.

License Management Center at Release 18.0

License Reporting Options




As the title says, this is where you’ll go to see a breakdown of the current license usage.  What is great is that you can see all the licenses that you have on the server, how many licenses of each are being used and who is using them (through the color of the bars).  Please note that PADT has access to several ANSYS Licenses.  Your list will only include the licenses available for use on your server.

Scrolling page that shows Current License Usage and Color Coded Usernames

You can also click on Show Tabular Data to see a table view that you can then export to Excel if you want to do your own manipulation of the data.

Tabular Data of Current License Usage – easy to export





In this section you can not only isolate the license usage to a specific time period, but also filter by license type. Use the first drop-down to define a time range: the previous 1 month, 1 year, all available data, or your own custom range.

Isolate License Usage to Specific Time Period

Once you hit Generate you will be able to then isolate by license name as shown below.  I’ve outlined some examples below as well.  The axis on the left shows number of licenses used.

Filter Time History by License Name

1 month history of ANSYS Mechanical Enterprise

 1 month history of ANSYS CFD

Custom Date Range history of ANSYS SpaceClaim Direct Modeler





This section allows you to see the peak usage of a particular license during a particular time period and filter it by date range. The first step is to isolate a date range as before, for example 1 month. Then you can select which month you want to look at data for.

Selecting specific month to look at Peak License Usage

Then you can isolate the data to whether or not you want to look at an operational period of 24/7, Monday to Friday 24/5 or even Monday to Friday 9am-5pm.  This way you can isolate license usage between every day of the week, working week or normal working hours in a week. Again, axis on left shows number of licenses.
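To make the three operational periods concrete, here is a minimal Python sketch of how peak concurrent usage changes with the filter. It does not read the License Manager's log files. The checkout records, the license name `mech_enterprise`, and the 15-minute sampling step are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical checkout records: (license name, checkout time, checkin time).
SESSIONS = [
    # Monday 6 March 2017, inside working hours
    ("mech_enterprise", datetime(2017, 3, 6, 9, 0), datetime(2017, 3, 6, 12, 0)),
    ("mech_enterprise", datetime(2017, 3, 6, 10, 0), datetime(2017, 3, 6, 11, 0)),
    # Tuesday 7 March 2017, evening
    ("mech_enterprise", datetime(2017, 3, 7, 18, 0), datetime(2017, 3, 7, 20, 0)),
    ("mech_enterprise", datetime(2017, 3, 7, 18, 0), datetime(2017, 3, 7, 19, 0)),
    ("mech_enterprise", datetime(2017, 3, 7, 18, 30), datetime(2017, 3, 7, 19, 30)),
    # Saturday 11 March 2017
    ("mech_enterprise", datetime(2017, 3, 11, 8, 0), datetime(2017, 3, 11, 20, 0)),
    ("mech_enterprise", datetime(2017, 3, 11, 9, 0), datetime(2017, 3, 11, 10, 0)),
    ("mech_enterprise", datetime(2017, 3, 11, 9, 0), datetime(2017, 3, 11, 11, 0)),
    ("mech_enterprise", datetime(2017, 3, 11, 9, 30), datetime(2017, 3, 11, 10, 30)),
]


def in_period(t, period):
    """True if time t falls inside the chosen operational period."""
    if period == "24/7":
        return True
    is_weekday = t.weekday() < 5  # Monday is 0, Friday is 4
    if period == "24/5":
        return is_weekday
    if period == "9-5":
        return is_weekday and 9 <= t.hour < 17
    raise ValueError("Unknown period: %s" % period)


def peak_usage(sessions, license_name, start, end, period="24/7",
               step=timedelta(minutes=15)):
    """Sample usage on a fixed time grid and return the highest count seen."""
    peak = 0
    t = start
    while t < end:
        if in_period(t, period):
            in_use = sum(1 for name, out, back in sessions
                         if name == license_name and out <= t < back)
            peak = max(peak, in_use)
        t += step
    return peak
```

With these made-up records, the same month gives a different answer for each filter, which is the reason to pick the operational period deliberately before deciding whether more licenses are needed.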

Isolating data to 24/7, Weekdays or Weekday Working Hours

 Peak License Usage in March 2017 of ANSYS Mechanical Enterprise (24/7)

Peak License Usage in February 2017 of ANSYS CFD (Weekdays Only)




If any of the users who are accessing the License Manager get license denials due to insufficient licenses or for any other reason, this will be displayed in this section. Since PADT rarely, if ever, gets License Denials, this section is blank for us. The procedure is identical to the above sections: isolate the data to a time period, then filter it down to the quantities you are interested in.

Isolate data with Time Period as other sections



Although these 4 options don’t include every conceivable filtering method, they should allow managers and administrators to filter through the license usage in many different ways without needing to manually go through all the log files. This is a very convenient and easy set of options to extract the information.

Please let us know if you have any questions on this or anything else with ANSYS.

DesignCon 2017 Trends in Chip, Board, and System Design

Considered the “largest gathering of chip, board, and systems designers in the country,” with over 5,000 attendees this year and over 150 technical presentations and workshops, DesignCon exhibits state-of-the-art trends in the high-speed communications and semiconductor communities.

Here are the top 5 trends I noticed while attending DesignCon 2017:

1. Higher data rates and power efficiency.

This is of course a continuing trend and the most obvious. Still, I like to see this trend alive and well because I think this gets a bit trickier every year. Aiming towards 400 Gbps solutions, many vendors and papers were demonstrating 56 Gbps and 112 Gbps channels, with no less than 19 sessions with 56 Gbps or more in the title. While IC manufacturers continue to develop low-power chips, connector manufacturers are offering more vented housings as well as integrated sinks to address thermal challenges.

2. More conductor-based signaling.

PAM4 was everywhere on the exhibition floor and there were 11 sessions with PAM4 in the title. Shielded twinaxial cables, such as Samtec’s Twinax Flyover and Molex’s BiPass, were the predominant conductor-based technology.

A touted feature of twinax is the ability to route over components and free up PCB real estate (but there is still concern for enclosing the cabling). My DesignCon 2017 session, titled Replacing High-Speed Bottlenecks with PCB Superhighways, would also fall into this category. Instead of using twinax, I explored the idea of using rectangular waveguides (along with coax feeds), which you can read more about here. I also offered a modular concept that reflects similar routing and real estate advantages.

3. Less optical-based signaling.

Don’t get me wrong, optical-based signaling is still a strong solution for high-speed channels. Many of the twinax solutions are being designed to be compatible with fiber connections and, as Teledyne put it in their QPHY-56G-PAM4 option release at DesignCon, Optical Internetworking Forum (OIF) and IEEE are both rapidly standardizing PAM4-based interfaces. Still, the focus from the vendors was on lower cost conductor-based solutions. So, I think the question of when a full optical transition will be necessary still stands.
With that in mind, this trend is relative to what I saw only a couple years back. At DesignCon 2015, it looked as if the path forward was going to be fully embracing optical-based signaling. This year, I saw only one session on fiber and, as far as I could tell, none on photonic devices. That’s compared to DesignCon 2015 with at least 5 sessions on fiber and photonics, as well as a keynote session on silicon photonics from Intel Fellow Dr. Mario Paniccia.

4. More Physics-based Simulations.

As margins continue to shrink, the demand for accurate simulation grows. Dr. Zoltan Cendes, founder of Ansoft, shared the difficulties of electromagnetic simulation over the past 40+ years and how Ansoft (now ANSYS) has improved accuracy, simplified the simulation process, and significantly reduced simulation time. To my personal delight, he also had a rectangular waveguide in his presentation (and I think we were the only two). Dr. Cendes sees high-speed electrical design at a transition point, where engineers have been or will ultimately need to place physics-based simulations at the forefront of the design process, or as he put it, “turning signal integrity simulation inside out.” A closer look at Dr. Cendes’ keynote presentation can be found in DesignNews.

5. More Detailed IC Models.

This may or may not be a trend yet, but improving IC models (including improved data sheet details) was a popular topic among presenters and attendees alike; so if nothing else it was a trend of comradery. There were 12 sessions with IBIS-AMI in the title. In truth, I don’t typically attend these sessions, but since behavioral models (such as IBIS-AMI) impact everyone at DesignCon, this topic came up in several sessions that I did attend even though they weren’t focused on this topic. Perhaps with continued development of simulation solutions like ANSYS’ Chip-Package-System, Dr. Cendes’ prediction will one day make a comprehensive physics-based design (to include IC models) a practical reality. Until then, I would like to share an interesting quote from George E. P. Box that was restated in one of the sessions: “Essentially all models are wrong, but some are useful.” I think this is good advice that I use for clarity in the moment and excitement for the future.

By the way, the visual notes shown above were created by Kelly Kingman on the spot during presentations. As an engineer, I was blown away by this. I have a tendency to obsess over details but she somehow captured all of the critical points on the fly with great graphics that clearly relay the message. Amazing!

How To Update The Firmware Of An Intel® Solid-State Drive DC P3600

How To Update The Firmware Of An Intel® Solid-State Drive DC P3600 in four easy steps!

The Dr. says to keep that firmware fresh! So in this How To blog post I show you how to verify and/or update the firmware on a 1.2TB Intel® Solid-State Drive DC P3600 Series NVMe MLC card.

CUBE Workstation Specifications – The Tester

PADT, Inc. – CUBE w32i Numerical Simulation Workstation

  • 2 x 16c @2.6GHz/ea. (INTEL XEON e5-2697A V4 CPU), 40M Cache, 9.6GT, 145 Watt/each
  • Dual Socket Super Micro X10DAi motherboard
  • 8 x 32GB DDR4-2400MHz ECC REG DIMM
  • 1 x NVIDIA QUADRO M2000 – 4GB GDDR5
  • 1 x  Intel® DC P3600 1.2TB, NVMe PCIe 3.0, MLC AIC 20nm
  • Windows 7 Ultimate Edition 64-bit

Step 1: Prepping

Check for and download the latest downloads for the Intel® Solid-State Drive DC P3600 here:

You will need the latest downloads of the:

  • Intel® Solid State Drive Toolbox

  • Intel® SSD Data Center Tool

  • Intel® SSD Data Center Family for NVMe Drivers

Step 2: Installation

After installing the Intel® Solid State Drive Toolbox and the Intel® SSD Data Center Tool, reboot the workstation and move on to the next step.


INTEL SSD Toolbox Install

Step 3: Trust But Verify

Check the status of the 1.2TB NVMe card by running the Intel SSD Data Center Tool. I am using Windows 7 Ultimate 64-bit as the operating system and running the tool from an elevated command prompt.

Right-Click –> Run As…Administrator
Command Line Text: isdct show -intelssd

INTEL DATA Center Command Line Tool

As the image below indicates, the firmware for this 1.2TB NVMe card is happy and its firmware is up to date! Yay!

If you have more than one SSD take note of the Drive Number.

  • Pro Tip – In this example the INTEL DC P3600 is Drive number zero. You can gather this information from the output syntax. –> Index : 0
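On a workstation with several SSDs, picking the drive number out of the output by eye gets tedious. Below is a minimal Python sketch that parses "Key : Value" text into one dictionary per drive. The sample text is made up for illustration: only the `Index : 0` line is taken from this post, and the header lines, the other field names (`Firmware`, `ProductFamily`) and their values are assumptions about what the tool prints.

```python
# Hypothetical sample of "isdct show -intelssd" style output.
# Only "Index : 0" comes from the post; everything else is illustrative.
SAMPLE_OUTPUT = """\
- Intel SSD DC P3600 Series EXAMPLESERIAL0 -
Index : 0
Firmware : EXAMPLE01
ProductFamily : Intel SSD DC P3600 Series

- Some Other Drive EXAMPLESERIAL1 -
Index : 1
Firmware : EXAMPLE02
ProductFamily : Some Other SSD
"""


def parse_drives(text):
    """Split 'Key : Value' output into a list of per-drive dictionaries."""
    drives = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("-") and line.endswith("-"):
            # A header line such as "- Drive name -" starts a new drive.
            current = {"Title": line.strip("- ")}
            drives.append(current)
        elif " : " in line and current is not None:
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return drives


def find_index(drives, family_text):
    """Return the Index values of drives whose ProductFamily matches."""
    return [d["Index"] for d in drives
            if family_text in d.get("ProductFamily", "")]
```

With the sample above, `find_index(parse_drives(SAMPLE_OUTPUT), "P3600")` picks out the drive number to pass to the load command. The real output should be checked against these field names before relying on them.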

Below is what the command line output text looks like while the firmware process is running.

C:\isdct>isdct.exe load -intelssd 0
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): y
Updating firmware…
The selected Intel SSD contains current firmware as of this tool release.

isdct.exe load -intelssd 0
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): n
Canceled.

isdct.exe load -f -intelssd 0
Updating firmware…
The selected Intel SSD contains current firmware as of this tool release.

isdct.exe load -intelssd 0
WARNING! You have selected to update the drives firmware!
Proceed with the update? (Y|N): y
Updating firmware…
Firmware update successful.

Step 4: Reboot Workstation

The firmware update process has been completed.

shutdown /r

Using External Data in ANSYS Mechanical to Apply Tabular Loads with Multiple Variables

ANSYS Mechanical is great at applying tabular loads that vary with a single independent variable, say time or Z. But what if you want a tabular load that varies in multiple directions and in time? You can use the External Data tool to do just that. You can also create a table with a single variable and modify it in the Command Editor.

In the Presentation below, I show how to do all of this in a step-by-step description.


You can also download the presentation here.
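A load that varies in space and time has to come from somewhere, and a short script is often the easiest way to produce the text file. The Python sketch below writes point coordinates plus one pressure column per time point. The column layout, the `pressure` function and the time points are assumptions chosen for illustration, so match them to whatever you define in the External Data setup.

```python
import csv
import io
import math

TIMES = [0.0, 0.5, 1.0]  # hypothetical time points, in seconds


def pressure(x, z, t):
    """Hypothetical load: linear in x, sinusoidal in z, ramped in time."""
    return 1000.0 * x * (1.0 + math.sin(math.pi * z)) * t


def build_table(xs, zs, times=TIMES):
    """Return a header row plus one row per point: x, y, z, then one
    pressure value for each time point."""
    rows = [["x", "y", "z"] + ["p_t%g" % t for t in times]]
    for x in xs:
        for z in zs:
            rows.append([x, 0.0, z] + [pressure(x, z, t) for t in times])
    return rows


def to_csv(rows):
    """Serialize the rows as comma-delimited text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Writing `to_csv(build_table(xs, zs))` to a file gives one table that covers every point and time step, rather than a separate single-variable table for each.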

Experiences with Developing a “Somewhat Large” ACT Extension in ANSYS

With each release of ANSYS the customization toolkit continues to evolve and grow.  Recently I developed what I would categorize as a decent sized ACT extension.    My purpose in this post is to highlight a few of the techniques and best practices that I learned along the way.

Why I chose C#?

Most ACT extensions are written in Python. Python is a wonderfully useful language for quickly prototyping and building applications, frankly of all shapes and sizes. Its dynamic type system, plethora of libraries, large ecosystem and native support directly within the ACT console make it a natural choice for most ACT work. So, why choose to move to C#?

The primary reasons I chose to use C# instead of Python for my ACT work were the following:

  1. I prefer the slightly stronger type safety afforded by the more strongly typed language. Having a definitive compilation step forces me to show my code first to a compiler.  Only if and when the compiler can generate an assembly for my source do I get to move to the next step of trying to run/debug.  Bugs caught at compile time are the cheapest and generally easiest bugs to fix.  And, by definition, they are the most likely to be fixed.  (You’re stuck until you do…)
  2. The C# development experience is deeply integrated into the Visual Studio developer tool. This affords not only a great editor in which to write the code, but more importantly perhaps the world’s best debugger to figure out when and how things went wrong. While it is possible to both edit and debug Python code in Visual Studio, the C# experience is vastly superior.

The Cost of Doing ACT Business in C#

Unfortunately, writing an ACT extension in C# does incur some development cost in terms of setting up the development environment to support the work. When writing an extension solely in Python you really only need a decent text editor. Once you set up your ACT extension according to the documented directory structure protocol, you can just edit the Python script files directly within that directory structure. If you recall, ACT requires an XML file to define the extension and then a directory with the same name that contains all of the assets defining the extension like scripts, images, etc… This “defines” the extension.

When it comes to laying out the requisite ACT extension directory structure on disk, C# complicates things a bit.  As mentioned earlier, C# involves a compilation step that produces a DLL.  This DLL must then somehow be loaded into Mechanical to be used within the extension.  To complicate things a little further, Visual Studio uses a predefined project directory structure that places the build products (DLLs, etc…) within specific directories of the project depending on what type of build you are performing.   Therefore the compiled DLL may end up in any number of different directories depending on how you decide to build the project.  Finally, I have found that the debugging experience within Visual Studio is best served by leaving the DLL located precisely wherever Visual Studio created it.

Here is a summary list of the requirements/problems I encountered when building an ACT extension using C#

  1. I need to somehow load the produced DLL into Mechanical so my extension can use it.
  2. The DLL that is produced during compilation may end up in any number of different directories on disk.
  3. An ACT Extension must conform to a predefined structural layout on the filesystem. This layout does not map cleanly to the Visual Studio project layout.
  4. The debugging experience in Visual Studio is best served by leaving the produced DLL exactly where Visual Studio left it.

The solution that I came up with to solve these problems was twofold.

First, the issue of loading the proper DLL into Mechanical was solved by using a combination of environment variables on my development machine in conjunction with some Python programming within the ACT main python script.  Yes, even though the bulk of the extension is written in C#, there is still a python script to sort of boot-load the extension into Mechanical.  More on that below.

Second, I decided to completely rebuild the ACT extension directory structure on my local filesystem every time I built the project in C#. To accomplish this, I created what are known in Visual Studio as post-build events, which let you specify an action to occur automatically after the project is successfully built. This action can be quite generic. In my case, the “action” was to locally run a Python script and provide it with a few arguments on the command line. More on that below.

Loading the Proper DLL into Mechanical

As I mentioned above, even an ACT extension written in C# requires a bit of Python code to bootstrap it into Mechanical.  It is within this bit of Python that I chose to tackle the problem of deciding which dll to actually load.  The code I came up with looks like the following:

Essentially what I am doing above is querying for the presence of a particular environment variable that is on my machine.  (The assumption is that it wouldn’t randomly show up on end user’s machine…) If that variable is found and its value is 1, then I determine whether or not to load a debug or release version of the DLL depending on the type of build.  I use two additional environment variables to specify where the debug and release directories for my Visual Studio project exist.  Finally, if I determine that I’m running on a user’s machine, I simply look for the DLL in the proper location within the extension directory.  Setting up my python script in this way enables me to forget about having to edit it once I’m ready to share my extension with someone else.  It just works.
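For readers who want to try the same idea, here is a minimal Python sketch of the selection logic just described. It is not the original listing. The environment variable names (`MYEXT_DEV_MACHINE`, `MYEXT_DEBUG_DIR`, `MYEXT_RELEASE_DIR`), the DLL name, and the `debug_build` flag are all hypothetical, and how the build type is detected is left to the caller.

```python
import os


def select_dll_path(extension_dir, debug_build, dll_name="MyExtension.dll",
                    environ=None):
    """Decide which copy of the extension DLL to load.

    On the developer's machine (flagged by a hypothetical environment
    variable) the DLL is loaded straight from the Visual Studio output
    directory. Everywhere else it is loaded from the extension directory.
    """
    env = os.environ if environ is None else environ
    if env.get("MYEXT_DEV_MACHINE") == "1":
        # Two more hypothetical variables point at the build output folders.
        key = "MYEXT_DEBUG_DIR" if debug_build else "MYEXT_RELEASE_DIR"
        return os.path.join(env[key], dll_name)
    # End-user machine: the DLL ships inside the extension directory.
    return os.path.join(extension_dir, dll_name)
```

In an ACT script the returned path would then be handed to whatever mechanism loads the assembly, for example IronPython's `clr.AddReferenceToFileAndPath`. Passing `environ` explicitly makes the function easy to test without touching the real environment.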

Rebuilding the ACT Extension Directory Structure

The final piece of the puzzle involves rebuilding the ACT extension directory structure upon the completion of a successful build.  I do this for a few different reasons.

  1. I always want to have a pristine copy of my extension laid out on disk in a manner that could be easily shared with others.
  2. I like to store all of the various extension assets, like images, XML files, python files, etc… within the Visual Studio Project. In this way, I can force the project to be out of date and in need of a rebuild if any of these files change.  I find this particularly useful for working with the XML definition file for the extension.
  3. Having all of these files within the Visual Studio Project makes tracking things within a version control system like SVN or git much easier.

As I mentioned before, to accomplish this task I use a combination of local python scripting and post build events in Visual Studio.  I won’t show the entire python code, but essentially what it does is programmatically work through my local file system where the C# code is built and extract all of the files needed to form the ACT extension.  It then deletes any old extension files that might exist from a previous build and lays down a completely new ACT extension directory structure in the specified location.  The definition of the post build event is specified within the project settings in Visual Studio as follows:

As you can see, all I do is call out to the system python interpreter and pass it a script with some arguments.  Visual Studio provides a great number of predefined variables that you can use to build up the command line for your script.  So, for example, I pass in a string that specifies what type of build I am currently performing, either “Debug” or “Release”.  Other strings are passed in to represent directories, etc…
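To give a feel for what such a post-build script does, here is a minimal Python sketch, not the script used for this extension. The function name, the list of asset file types, the flat assets folder, and the choice to copy the DLL only for Release builds are assumptions. The one layout rule taken from above is that ACT expects an XML file next to a directory of the same name.

```python
import os
import shutil

# Assumed asset types; adjust to whatever the extension actually ships.
ASSET_EXTENSIONS = (".py", ".xml", ".png", ".bmp")


def rebuild_extension(name, build_type, target_dir, assets_dir, output_dir):
    """Delete any previous copy of the extension and lay down a fresh one.

    name        extension name (hypothetical), used for <name>.xml and <name>/
    build_type  "Debug" or "Release", as passed in by the post-build event
    target_dir  directory where Visual Studio wrote the compiled DLL
    assets_dir  project directory holding the XML, scripts and images
    output_dir  directory in which the ACT extension layout is rebuilt
    """
    ext_dir = os.path.join(output_dir, name)
    xml_out = os.path.join(output_dir, name + ".xml")

    # Remove whatever a previous build left behind.
    if os.path.isdir(ext_dir):
        shutil.rmtree(ext_dir)
    if os.path.isfile(xml_out):
        os.remove(xml_out)
    os.makedirs(ext_dir)

    # The XML definition sits beside the extension directory.
    shutil.copy2(os.path.join(assets_dir, name + ".xml"), xml_out)

    # Every other recognized asset goes inside the extension directory.
    for fname in sorted(os.listdir(assets_dir)):
        if fname == name + ".xml":
            continue
        if fname.lower().endswith(ASSET_EXTENSIONS):
            shutil.copy2(os.path.join(assets_dir, fname),
                         os.path.join(ext_dir, fname))

    # Debug builds leave the DLL where Visual Studio created it.
    if build_type == "Release":
        shutil.copy2(os.path.join(target_dir, name + ".dll"),
                     os.path.join(ext_dir, name + ".dll"))
    return ext_dir
```

The post-build event would call a script like this with Visual Studio macros such as `$(ConfigurationName)` and `$(TargetDir)` filling in the build type and DLL location.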

The Synergies of Using Both Approaches

Finally, I will conclude with a note on the synergies you can achieve by using both of the approaches mentioned above.  One of the final enhancements I made to my post build script was to allow it to “edit” some of the text based assets that are used to define the ACT extension.  A text based asset is something like an XML file or python script.  What I came to realize is that certain aspects of the XML file that define the extension need to be different depending upon whether or not I wish to debug the extension locally or release the extension for an end user to consume.  Since I didn’t want to have to remember to make those modifications before I “released” the extension for someone else to use, I decided to encode those modifications into my post build script.  If the post build script was run after a “debug” build, I coded it to configure the extension for optimal debugging on my local machine.  However, if I built a “release” version of the extension, the post build script would slightly alter the XML definition file and the main python file to make it more suitable for running on an end user machine.   By automating it in this way, I could easily build for either scenario and confidently know that the resulting extension would be optimally configured for the particular end use.
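One simple way to implement this kind of per-build editing is token substitution. In the Python sketch below, the `@TOKEN@` placeholders, the setting names and their values are all hypothetical. The point is only that one template can yield a debug-flavored or a release-flavored asset.

```python
# Hypothetical settings that differ between a local debug build and a
# release build intended for an end user's machine.
SETTINGS = {
    "Debug": {"@DLL_SOURCE@": "environment", "@LOG_LEVEL@": "verbose"},
    "Release": {"@DLL_SOURCE@": "extension", "@LOG_LEVEL@": "quiet"},
}


def configure_asset(template_text, build_type):
    """Replace each placeholder token with the value for this build type."""
    text = template_text
    for token, value in SETTINGS[build_type].items():
        text = text.replace(token, value)
    return text


# A hypothetical two-line text asset containing both placeholders.
TEMPLATE = 'DLL_SOURCE = "@DLL_SOURCE@"\nLOG_LEVEL = "@LOG_LEVEL@"\n'
```

Running `configure_asset(TEMPLATE, "Release")` as part of the post-build step means the templates in the project never need hand editing before an extension is shared.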


Now that I have some experience in writing ACT extensions in C#, I must honestly say that I prefer it over Python.  Much of the “extra plumbing” that one must invest in to get a C# extension up and running can be automated using the techniques described within this post.  After the requisite automation is set up, the development process is really straightforward.  From that point onward, the increased debugging fidelity, added type safety and familiarity of a C-based language make the development experience that much better!  Also, there are some cool things you can do in C# that I’m not 100% sure you can accomplish in Python alone.  More on that in later posts!

If you have ideas for an ACT extension to better serve your business needs and would like to speak with someone who has developed some extensions, please drop us a line.  We’d be happy to help out however we can!


Connection Groups and Your Sanity in ANSYS Mechanical

You kids don’t know how good you have it with automatic contact creation in Mechanical.  Back in my day, I’d have to use the contact wizard in MAPDL or show off my mastery of the ESURF command to define contacts between parts.  Sure, there were some macros somewhere on the interwebs that would go through and loop for surfaces within a particular offset, but for the sake of this stereotypical “old-tyme” rant, I didn’t use them (I actually didn’t, I was just TOO good at using ESURF to need anyone else’s help).


Hey, it gets me from point A to B

In Mechanical contact is automatically generated based on a set of rules contained in the ‘Connection Group’ object:


It might look a little overwhelming, but really the only thing you’ll need to play around with is the ‘Tolerance Type’.  This can be either ‘Slider’ or ‘Value’ (or use sheet thickness if you’re working with shells).  What this controls is the face offset value for which Mechanical will automatically build contact.  So in the picture shown above, faces that are within 5.9939E-3 in of each other will automatically have contact created.  You can play around with the slider value to change the tolerance:


As you can see, the smaller the tolerance slider the larger the ‘acceptable’ gap becomes.  If you change the Tolerance Type to be ‘Value’ then you can just directly type in a number.

Typically the default values do a pretty good job of automatically defining contact.  However, what happens if you have a large assembly with a lot of thin parts?  Then what you run into is nonsensical contact between parts that don’t actually touch (full disclosure, I actually had to modify the contact settings to have the auto-generated contact do something like this…but I have seen this in other assemblies with very thin/slender parts stacked on top of each other):


In the image above, we see that contact has been defined between the bolt head and a plate when there is clearly a washer present.  So we can fix this by going in and specifying a value of 0, meaning that only surfaces that are touching will have contact defined.  But now let’s say that some parts of your assembly aren’t touching (maybe it’s bad CAD, maybe it’s a welded assembly, maybe you suppressed parts that weren’t important).


The brute force way to handle this would be to set the auto-detection value to be 0 and then go back and manually define the missing contacts using the options shown in the image above.  Or, what we could do is modify the auto-contact to be broken up into groups and apply appropriate rules as necessary.  The other benefit to this is if you’re working in large assemblies, you can retain your sanity by having contact generated region by region.   In the words of the original FE-guru, Honest Abe, it’s easier to manage things when they’re logically broken up into chunks.


Said No One Ever

Sorry…that was bad.  I figured in the new alt-fact world with falsely-attributed quotes to historical leaders, I might as well make something up for the oft-overlooked FE-crowd.

So, how do you go about implementing this?  Easy, first just delete the default connection group (right-mouse-click on it and select delete).  Next, just select a group of bodies and click the ‘Connection Group’ button:


In the image series above, I selected all the bolts and washers, clicked the connection group, and now I have created a connection group that will only automatically generate contact between the bolts and washers.  I don’t have to worry about contact being generated between the bolt and plate.  Rinse, lather, and repeat the process until you’ve created all the groups you want:


ALL the Connection Groups!

Now that you have all these connection groups, you can fine-tune the auto-detection rules to meet the ‘needs’ of those individual body groups.  Just zooming in on one of the groups:


By default, when I generate contact for this group I’ll get two contact pairs:


While this may work, let’s say I don’t want a single contact pair for the two dome-like structures, but two.  That way I can just change the behavior on the outer ‘ring’ to be frictionless and force the top to be bonded:


I modified the auto-detection tolerance to be a user-defined distance (note that when you type in a number and move your mouse over into the graphics window you will see a bulls-eye that indicates the search radius you just defined).  Next, I told the auto-detection not to group any auto-detected contacts together.  The result is I now get 3 contact pairs defined:


Now I can just modify the auto-generated contacts so that the pair shown in the middle picture of the series above is frictionless.  I could certainly just manually define the contact regions, but if you have an assembly of dozens/hundreds of parts it’s significantly easier to have Mechanical build up all the contact regions and then you just have to modify individual contact pairs to have the type/behavior/etc you want (bonded, frictionless, symmetric, asymmetric, custom pinball radius, etc).  This is also useful if you have bodies that need to be connected via face-to-edge or edge-to-edge contact (then you can set the appropriate priority as to which, if any, of those types should be preserved over others).

So the plus side to doing all of this is that after any kind of geometry update you shouldn’t have much, if any, contact ‘repair’ to do.  All the bodies/rules have already been fine tuned to automatically build what you want/need.  You also know where to look to modify contacts (although using the ‘go to’ functionality makes that pretty easy as well).  That way you can define all these connection groups, leave everything as bonded and do a preliminary solve to ensure things look ‘okay’.  Then go back and start introducing some more reality into the simulation by allowing certain regions to move relative to each other.

The downside to doing your contacts this way is you risk missing an interface because you’re now defining the load path.  To deal with that you can just insert a dummy-modal environment into your project, solve, and check that you don’t have any 0-Hz modes.
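If you copy the frequency list out of the dummy modal solve, the check itself is simple enough to script. The sketch below is only an illustration: the function names, the tolerance of 1e-3 Hz, and the idea of passing in an expected number of rigid body modes are my own assumptions, not anything Mechanical provides.

```python
def count_zero_hz_modes(frequencies_hz, tol_hz=1e-3):
    """Count modes whose frequency is numerically zero (|f| <= tol_hz)."""
    return sum(1 for f in frequencies_hz if abs(f) <= tol_hz)


def unexpected_zero_modes(frequencies_hz, expected_rigid_modes=0, tol_hz=1e-3):
    """Return how many zero-Hz modes exceed what the supports should allow.

    A result of 0 suggests every body is tied into the load path; anything
    larger suggests a part is free to float and a contact is missing.
    """
    extra = count_zero_hz_modes(frequencies_hz, tol_hz) - expected_rigid_modes
    return max(extra, 0)
```

For a fully constrained assembly you would expect `unexpected_zero_modes(freqs)` to return 0; a free body in 3D can contribute up to six rigid body modes, so a nonzero count is a good reason to go look at the mode shapes.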

Exploring High-Frequency Electromagnetic Theory with ANSYS HFSS

I recently had the opportunity to present an interesting experimental research paper at DesignCon 2017, titled Replacing High-Speed Bottlenecks with PCB Superhighways. The motivation behind the research was to develop a new high-speed signaling system using rectangular waveguides, but the most exciting aspect for me personally was salvaging a (perhaps contentious) 70 year old first-principles electromagnetic model. While it took some time to really understand how to apply the mathematics to design, their application led to an exciting convergence of theory, simulation, and measurement.

One of the most critical aspects of the design was exciting the waveguide with a monopole probe antenna. Many different techniques have been developed to match the antenna impedance to the waveguide impedance at the desired frequency, as well as increase the bandwidth. Yet, all of them rely on assumptions and empirical measurement studies. Optimizing a design to nanometer precision empirically would be difficult at best and even if the answer was found it wouldn’t inherently reveal the physics. To solve this problem, we needed a first-principles model, a simulation tool that could quickly iterate designs accurately, and some measurements to validate the simulation methodology.

A rigorous first-principles model was developed by Robert Collin in 1960, but this solution has since been forgotten and replaced by simplified rules. Unfortunately, these simplified rules are unable to deliver an optimal design or offer any useful insight to the critical parameters. In fairness, Collin’s equations are difficult to implement in design and validating them with measurement would be tedious and expensive. Because of this, empirical measurements have been considered a faster and cheaper alternative. However, we wanted the best of both worlds… we wanted the best design, for the lowest cost, and we wanted the results quickly.

For this study, we used ANSYS HFSS to simulate our designs. Before exploring new designs, we first wanted to validate our simulation methodology by correlating results with available measurements. We were able to demonstrate a strong agreement between Collin’s theory, ANSYS HFSS simulation, and VNA measurement.

Red simulated S-parameters strongly correlated with blue measurements.

To perform a series of parametric studies, we swept thousands of antenna design iterations across a wide frequency range of 50 GHz for structures ranging from 50-100 guide wavelengths long. High-performance computing gave us the ability to solve return loss and insertion loss S-parameters within just a few minutes for each design iteration by distributing across 48 cores.

Sample Parametric Design Sweep

Finally, we used the lessons we learned from Collin’s equations and the parametric study to develop a new signaling system with probe antenna performance never before demonstrated. You can read the full DesignCon paper here. The outcome also pertains to RF applications in addition to potentially addressing Signal Integrity concerns for future high-speed communication channels.

Rules-of-thumb are important to fast and practical design, but their application can often be limited. Competitive innovation demands we explore beyond these limitations, but the only way to match the speed and accuracy of design rules is to use simulations capable of offering fast design exploration with the same reliability as measurement. ANSYS HFSS gave us the ability not only to optimize our design, but also to learn about the physics that explain our design and to accurately predict the behavior of new innovative designs.

Importing Material Properties from Solidworks into ANSYS Mechanical…Finally!

Finally! One of the most common questions we get from our customers who use Solidworks is “Why can’t I transfer my materials from Solidworks? I have to type in the values all over again every time.”  Unfortunately, until now, ANSYS has not been able to read the Solidworks material library to get that information.

There is great news with ANSYS 18.  ANSYS is now able to import the material properties from Solidworks and use them in an analysis within Workbench.  Let’s see how it works.

I have a Solidworks assembly that I downloaded from Grabcad.  The creator had pre-defined all the materials for this model as you can see below.

Once you bring in the geometry into Workbench, just ensure that the Material Properties item is checked under the Geometry cell’s properties.  If you don’t see the panel, just right-click on the geometry cell and click on Properties.

Once you are in ANSYS Mechanical, for example, you will see that the parts are already pre-defined with the material specified in Solidworks.

The trick now is to find out where this material is getting stored. If we go to Engineering Data, the only thing we will see is Structural Steel. However when we go to Engineering Data Sources that is where we see a new material library called CADMaterials.  That will show you a list of all the materials and their properties that were imported from a CAD tool such as Solidworks, Creo, NX, etc.

You can of course copy the material and store it for future use in ANSYS like any other material.  This will save you from having to manually define all the materials for a part or assembly from scratch within ANSYS.

Please let us know if you have any questions and we’ll be happy to answer them for you.

ANSYS Video Tips: ANSYS SpaceClaim 18.0 Skin Surface Tool Changes

At 18.0, ANSYS SpaceClaim made some changes to the very useful tool that lets you create a surface patch on scan or STL data.  In this video we show how to create corner points for a surface patch boundary and how to get an accurate measurement of how far the surface you create deviates from the STL or scan data underneath.

How-To: ANSYS 18 RSM CLIENT SETUP on Windows 2012 R2 HPC

We put this simple how-to together for users to speed up the process of getting your Remote Solve Manager client up and running on Microsoft Windows 2012 R2 HPC.

Download the step-by-step slides here:


You might also be interested in a short article on the setup and use of monitoring for ANSYS R18 RSM.

Monitoring Jobs Using ANSYS RSM 18.0

If you are an ANSYS RSM (Remote Solve Manager) user, you’ll find some changes in version 18.0. Most of the changes, which are improvements to the installation and configuration process, are under the hood from a user standpoint. One key change for users, though, is how you monitor a running job. This short entry shows how to do it in version 18.0.

Rather than bring up the RSM monitor window from the Start menu as was done in prior versions, in 18.0 we launch the RSM job monitor directly from the Workbench window, by clicking on Jobs > Open Job Monitor… as shown here:

When a solution has been submitted to RSM for solution on a remote cluster or workstation, it will show up in the resulting Job Monitor window, like this:

Hopefully this saves some effort in trying to figure out where to monitor jobs you have submitted to RSM. Happy solving!

PADT Named ANSYS North American Channel Partner of the Year and Becomes an ANSYS Certified Elite Channel Partner

The ANSYS Sales Team at PADT was honored last week when we were recognized four times at the recent kickoff meeting for the ANSYS North American Sales organization.  The most humbling of those trips up to the stage was when PADT was recognized as the North American Channel Partner of the Year for 2016.  It was humbling because there are so many great partners that we have had the privilege of working with for almost 20 years now.  Our team worked hard, and our customers were fantastic, so we were able to make strides in adding capability at existing accounts, finding new customers that could benefit from ANSYS simulation tools, and expanding our reach further in Southern California.  It helps that simulation driven product development actually works, and ANSYS tools allow it to work well.

Here we are on stage, accepting the award:

PADT Accepts the Channel Partner of the Year Award. (L-R: ANSYS CEO Ajei Gopal, ANSYS VP Worldwide Sales and Customer Excellence Rick Mahoney, ANSYS Director of WW Channel Ravi Kumar, PADT Co-Owner Ward Rand, PADT Co-Owner Eric Miller, PADT Software Sales Manager Bob Calvin, ANSYS VP Sales for the Americas Ubaldo Rodriguez)

We were also recognized two other times: for exceeding our sales goals and for making the cut to the annual President’s Club retreat.   As a reminder, PADT sells the full multiphysics product line from ANSYS in Southern California, Arizona, New Mexico, Colorado, Utah, and Nevada.  This is a huge geographic area with a very diverse set of industries and customers.

In addition, ANSYS, Inc. announced that PADT was one of several Channel Partners who had obtained Elite Certified Channel Partner status. This will allow PADT to provide our customers with better services and gives our team access to more resources within ANSYS, Inc.

Once we made it back from the forests and hills of Western Pennsylvania we were able to get a picture with the full sales team.  Great job guys:

We could not have had such a great 2016 without the support of everyone at PADT: the sales team, the application engineers, the support engineers, business operations, and everyone else that pitches in.   We look forward to making more customers happy in 2017 and coming back with additional hardware.

ANSYS HPC Distributed Parallel Processing Decoded: CUBE Workstation


Meanwhile, in the real world, the land of the missing middle: to read and learn more about the missing middle, please read this article by Dr. Stephen Wheat.

This blog post is about distributed parallel processing performance in the missing-middle world of science, tech, engineering and numerical simulation. I will be using two of PADT, Inc.’s very own CUBE workstations along with ANSYS 17.2 to illustrate facts and findings on the ANSYS HPC benchmarks. I will also show you how to decode and extract key bits of data out of your own ANSYS benchmark out files. This information will help you locate and describe the performance hows and whys on your own numerical simulation workstations and HPC clusters. With this information about your numerical simulation hardware, you will be able to trust and verify your decisions, and to understand and explain the best upgrade path for your own unique situation. The example I am providing in this post illustrates a “worst case” scenario.

You already know you need to speed up the parallel processing solve times of your models. “No I am not ready with my numerical simulation results. No I am waiting on Matt to finish running the solve of his model.” “Matt said that it will take four months to solve this model using this workstation. Is this true?!”

  1. How do I know what to upgrade? Or, as you often find yourself asking: what do I really need to buy?
    1. One or three ANSYS HPC Packs?
    2. Purchase more compute power? NVIDIA Tesla K80 GPU accelerators? RAM? A Subaru or Volvo?
  2. I have no budget. Are you sure? Often IT departments set aside a certain amount of money for component upgrades and parts. Information you learn in these findings may help justify a $250-$5000 upgrade for you.
  3. These two machines as configured will not break the very latest HPC performance speed records. This exercise is a live real-world example of what you would see in the HPC missing-middle market.
  4. Benchmarks were performed months after a hardware and software workstation refresh was completed using NO BUDGET, zip, zilch, nada, none.

Backstory regarding the two real-world internal CUBE FEA Workstations.

  1. These two CUBE Workstations were configured on a tight budget. Only the minimum components were purchased by PADT, Inc.
  2. These two internal CUBE workstations have been in live production, in use daily for one or two years.
    1. Twenty-four hours a day seven days a week.
  3. These two workstations were both in desperate need of some sort of hardware and operating system refresh.
  4. As part of Microsoft’s upgrade initiative in 2016, Windows 10 Professional was upgraded for free! FREE!

Again, join me in this post and read about the journey of two CUBE workstations being reborn and able to produce impressive ANSYS benchmarks to appease the sense of winning in pure geek satisfaction.

Uh-oh?! $$$

As I mentioned, one challenge that I set for myself on this mission is that I would not allow myself to purchase any new hardware or software. What? That is correct; my challenge was that I would not allow myself to purchase new components for the refresh.

How would I ever succeed in my challenge? Think and then think again.

Harvesting the components of old workstations that had been piling up in the IT Lab over the past year! That was the solution. This idea just might be what I needed to succeed in my NO BUDGET challenge. First, utilize existing compute components from old tired machines that had shown up in the IT boneyard. Talk to your IT department; you never know what they might find or remember that they had lying around in their own IT boneyard. Next, I would also use any RMA’d parts that I could find that had trickled in over the past year. Indeed, by utilizing these old feeder workstations, I was on my way to succeeding in my no budget challenge. The leftovers? Please do not email me for handouts of the discarded, not-worthy components. There is nothing left, none; those components are long gone, a nice benefit of our recent in-house PADT Tech Recycle event.

*** Public Service Announcement *** Please remember to reuse, recycle and erase old computer parts from the landfills.

CUBE Workstation Specifications

PADT, Inc. – CUBE w12i-k Numerical Simulation Workstation

(INTERNAL PADT CUBE Workstation “CUBE #10”)
1 x CUBE Mid-Tower Chassis (SQ edition)

2 x 6c @3.4GHz/ea (INTEL XEON e5-2643 V3 CPU)

Dual Socket motherboard

16 x 16GB DDR4-2133 MHz ECC REG DIMM

1 x SMC LSI 3108 Hardware RAID Controller – 12 Gb/s

4 x 600GB SAS2 15k RPM – 6 Gb/s – RAID0

3 x 2TB SAS2 7200 RPM Hard Drives – 6 Gb/s (Mid-Term Storage Array – RAID5)

NVIDIA QUADRO K6000 (NVidia Driver version 375.66)

2 x LED Monitors (1920 x 1080)

Windows 10 Professional 64-bit

ANSYS 17.2


PADT, Inc. – CUBE w16i-k Numerical Simulation Workstation

(INTERNAL PADT CUBE Workstation “CUBE #14”)
1 x CUBE Mid-Tower Chassis

2 x 8c @3.2GHz/ea (INTEL XEON e5-2667 V4 CPU)

Dual Socket motherboard

8 x 32GB DDR4-2400 MHz ECC REG DIMM

1 x SMC LSI 3108 Hardware RAID Controller – 12 Gb/s

4 x 600GB SAS3 15k RPM 2.5” 12 Gb/s – RAID0

2 x 6TB SAS3 7.2k RPM 3.5” 12 Gb/s – RAID1

NVIDIA QUADRO K6000 (NVidia Driver version 375.66)

2 x LED Monitors (1920 x 1080)

Windows 10 Professional 64-bit

ANSYS 17.2


The ANSYS sp-5 Ball Grid Array Benchmark

ANSYS Benchmark Test Case Information

  • BGA (V17sp-5)
    • Analysis Type Static Nonlinear Structural
    • Number of Degrees of Freedom 6,000,000
    • Equation Solver Sparse
    • Matrix Symmetric
  • ANSYS 17.2
  • ANSYS HPC Licensing Packs required for this benchmark –> (2) HPC Packs
  • Please contact your local ANSYS Software Sales Representative for more information on purchasing ANSYS HPC Packs. You too may be able to speed up your solve times by unlocking additional compute power!
  • What is a CUBE? For more information regarding our Numerical Simulation workstations and clusters please contact our CUBE Hardware Sales Representative at SALES@PADTINC.COM Designed, tested and configured within your budget. We are happy to help and to listen to your specific needs.

Comparing the data from the 12 core CUBE vs. a 16 core CUBE with and without GPU Acceleration enabled.

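ANSYS 17.2 Benchmark: SP-5 Ball Grid Array (Time Spent Computing Solution, in seconds)

| Cores | CUBE w12i-k 2643 v3 | CUBE w12i-k 2643 v3 w/GPU Acceleration | Total Speedup w/GPU | CUBE w16i-k 2667 V4 | CUBE w16i-k 2667 V4 w/GPU Acceleration | Total Speedup w/GPU |
|---|---|---|---|---|---|---|
| 2 | 878.9 | 395.9 | 2.22 X | 888.4 | 411.2 | 2.16 X |
| 4 | 485.0 | 253.3 | 1.91 X | 499.4 | 247.8 | 2.02 X |
| 6 | 386.3 | 228.2 | 1.69 X | 386.7 | 221.5 | 1.75 X |
| 8 | 340.4 | 199.0 | 1.71 X | 334.0 | 196.6 | 1.70 X |
| 10 | 269.1 | 184.6 | 1.46 X | 266.0 | 180.1 | 1.48 X |
| 11 | 235.7 | 212.0 | 1.11 X | - | - | - |
| 12 | 230.9 | 171.3 | 1.35 X | 226.1 | 166.8 | 1.36 X |
| 14 | - | - | - | 213.2 | 173.0 | 1.23 X |
| 15 | - | - | - | 200.6 | 152.8 | 1.31 X |
| 16 | - | - | - | 189.3 | 166.6 | 1.14 X |

11/15/2016 & 1/5/2017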
CUBE w12i-k v17sp-5 Benchmark Graph 2017
CUBE w16i-k v17sp-5 Benchmark Graph 2017
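The “Total Speedup w/GPU” figures in the benchmark table are simply the CPU-only solve time divided by the GPU-accelerated solve time. Here is a small sketch that recomputes them for the w12i-k column; the dictionary layout and function name are my own, while the numbers are copied from the table.

```python
# Time Spent Computing Solution (seconds) for CUBE w12i-k, keyed by core count:
# (cores only, cores + GPU acceleration)
W12IK_SOLVE_TIMES = {
    2: (878.9, 395.9),
    4: (485.0, 253.3),
    6: (386.3, 228.2),
    8: (340.4, 199.0),
    10: (269.1, 184.6),
    11: (235.7, 212.0),
    12: (230.9, 171.3),
}


def gpu_speedup(cpu_only_s, with_gpu_s):
    """Speedup factor from enabling GPU acceleration at the same core count."""
    return cpu_only_s / with_gpu_s


def speedup_table(times):
    """Return {cores: speedup rounded to two decimals} for a set of benchmark runs."""
    return {cores: round(gpu_speedup(cpu, gpu), 2) for cores, (cpu, gpu) in times.items()}


if __name__ == "__main__":
    for cores, factor in speedup_table(W12IK_SOLVE_TIMES).items():
        print(f"{cores:>2} cores: {factor:.2f} X")
```

Running the same function over your own benchmark times is a quick way to see where adding cores or a GPU stops paying off.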

Initial impressions

  1. I was very pleased with the results of this experiment. Using the “Am I compute bound or I/O bound?” overall parallel performance indicators, the data showed healthy workstations that were both I/O bound. I assumed the I/O bound issue would happen. During several of the benchmarks, the data reveals almost complete system bandwidth saturation: upwards of ~82 GB/s of bandwidth during the in-core distributed solve!
  2. I was pleasantly surprised to see a 1.7X or greater solve speedup using one ANSYS HPC licensing pack and GPU Acceleration!

Now for the when and where of numerical simulation performance bottlenecks. Like the clock ticking on the wall, over the years I have kept coming back to the question, “Is your numerical simulation compute hardware compute bound or I/O bound?” This quick benchmark result will show the general parallel performance of the workstation and help you find the performance sweet spot for your own numerical simulation hardware.

As a reminder, to determine the answer to that question you need to record the results of your CPU Time For Main Thread, Time Spent Computing Solution and Total Elapsed Time. If the CPU Time For Main Thread result is about the same as the Total Elapsed Time result, the compute hardware is in a Compute Bound situation. If the Total Elapsed Time result is larger than the CPU Time For Main Thread, then the compute hardware is I/O bound. I did the same analysis with these two CUBE workstations. I am pickier than most when it comes to tuning my compute hardware, so often I will use a percentage around 95 percent. The percentage column below (CPU Time For Main Thread as a percentage of Total Elapsed Time) determines if the workstation is Compute Bound or I/O bound. Generally, what I have found in the industry is that a percentage of greater than 90% indicates the workstation is either Compute Bound, I/O bound or, in a worst-case scenario, both.
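The percentage used in the result tables can be reproduced with one line of arithmetic: CPU Time For Main Thread divided by Total Elapsed Time, times 100. The sketch below does that and flags a run as compute bound at a chosen threshold. The function names are mine, and the default threshold of 90 is an assumption read off the Compute Bound column of the tables, so adjust it to your own level of pickiness.

```python
def main_thread_percent(cpu_main_s, elapsed_s):
    """CPU Time For Main Thread as a percentage of Total Elapsed Time."""
    return 100.0 * cpu_main_s / elapsed_s


def is_compute_bound(cpu_main_s, elapsed_s, threshold=90.0):
    """True when the main thread was busy for at least `threshold` percent of the wall time."""
    return main_thread_percent(cpu_main_s, elapsed_s) >= threshold


if __name__ == "__main__":
    # CUBE #10, 2 cores, no GPU: 914.2 s CPU main thread, 917.0 s elapsed
    print(f"{main_thread_percent(914.2, 917.0):.2f} %")
```

The larger the gap between the two times, the more of the wall clock was spent waiting on something other than the CPU, which is the hint to go look at memory and I/O.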

**** Result sets data garnered from the ANSYS results.out files on these two CUBE workstations using ANSYS Mechanical distributed parallel solves.

Data mine that ANSYS results.out file!

The data is all there, at your fingertips waiting for you to trust and verify.

Compute Bound or I/O bound

Results 1 – Compute Cores Only


“CUBE #10”

| Cores | CPU Time For Main Thread | Time Spent Computing Solution | Total Elapsed Time | % | Compute Bound | IO Bound |
|---|---|---|---|---|---|---|
| 2 | 914.2 | 878.9 | 917.0 | 99.69 | YES | NO |
| 4 | 517.2 | 485.0 | 523.0 | 98.89 | YES | NO |
| 6 | 418.8 | 386.3 | 422.0 | 99.24 | YES | NO |
| 8 | 374.7 | 340.4 | 379.0 | 98.87 | YES | NO |
| 10 | 302.5 | 269.1 | 307.0 | 98.53 | YES | NO |
| 11 | 266.6 | 235.7 | 273.0 | 97.66 | YES | NO |
| 12 | 259.9 | 230.9 | 268.0 | 96.98 | YES | NO |

“CUBE #14”

| Cores | CPU Time For Main Thread | Time Spent Computing Solution | Total Elapsed Time | % | Compute Bound | IO Bound |
|---|---|---|---|---|---|---|
| 2 | 925.8 | 888.4 | 927.0 | 99.87 | YES | NO |
| 4 | 532.1 | 499.4 | 535.0 | 99.46 | YES | NO |
| 6 | 420.3 | 386.7 | 425.0 | 98.89 | YES | NO |
| 8 | 366.4 | 334.0 | 370.0 | 99.03 | YES | NO |
| 10 | 299.7 | 266.0 | 303.0 | 98.91 | YES | NO |
| 12 | 258.9 | 226.1 | 265.0 | 97.70 | YES | NO |
| 14 | 244.3 | 213.2 | 253.0 | 96.56 | YES | NO |
| 15 | 230.3 | 200.6 | 239.0 | 96.36 | YES | NO |
| 16 | 219.6 | 189.3 | 231.0 | 95.06 | YES | NO |

Results 2 – GPU Acceleration + Cores


“CUBE #10”

| Cores + GPU | CPU Time For Main Thread | Time Spent Computing Solution | Total Elapsed Time | % | Compute Bound | IO Bound |
|---|---|---|---|---|---|---|
| 2 | 416.3 | 395.9 | 435.0 | 95.70 | YES | YES |
| 4 | 271.8 | 253.3 | 291.0 | 93.40 | YES | YES |
| 6 | 251.2 | 228.2 | 267.0 | 94.08 | YES | YES |
| 8 | 219.9 | 199.0 | 239.0 | 92.01 | YES | YES |
| 10 | 203.2 | 184.6 | 225.0 | 90.31 | YES | YES |
| 11 | 227.6 | 212.0 | 252.0 | 90.32 | YES | YES |
| 12 | 186.0 | 171.3 | 213.0 | 87.32 | NO | YES |

“CUBE #14”

| Cores + GPU | CPU Time For Main Thread | Time Spent Computing Solution | Total Elapsed Time | % | Compute Bound | IO Bound |
|---|---|---|---|---|---|---|
| 2 | 427.2 | 411.2 | 453.0 | 94.30 | YES | YES |
| 4 | 267.9 | 247.8 | 286.0 | 93.67 | YES | YES |
| 6 | 245.4 | 221.5 | 259.0 | 94.75 | YES | YES |
| 8 | 219.6 | 196.6 | 237.0 | 92.66 | YES | YES |
| 10 | 201.8 | 180.1 | 222.0 | 90.90 | YES | YES |
| 12 | 191.2 | 166.8 | 207.0 | 92.37 | YES | YES |
| 14 | 195.2 | 173.0 | 217.0 | 89.95 | NO | YES |
| 15 | 172.6 | 152.8 | 196.0 | 88.06 | NO | YES |
| 16 | 177.1 | 166.6 | 213.0 | 83.15 | NO | YES |

Identifying Memory, I/O, Parallel Solver Balance and Performance

Results 3 – Compute Cores Only


“CUBE #10”

| Cores | Ratio of nonzeroes in factor (min/max) | Ratio of flops for factor (min/max) | Time (cpu & wall) for numeric factor | Time (cpu & wall) for numeric solve | Effective I/O rate (MB/sec) for solve | Effective I/O rate (GB/sec) for solve | % GPU Accelerated The Solve | Maximum RAM used in GB |
|---|---|---|---|---|---|---|---|---|
| 2 | 0.9376 | 0.8399 | 662.822706 | 5.609852 | 19123.88932 | 19.1 | No GPU | 78 |
| 4 | 0.8188 | 0.8138 | 355.367914 | 3.082555 | 35301.9759 | 35.3 | No GPU | 85 |
| 6 | 0.6087 | 0.6913 | 283.870728 | 2.729568 | 39165.1946 | 39.2 | No GPU | 84 |
| 8 | 0.3289 | 0.4771 | 254.336758 | 2.486551 | 43209.70175 | 43.2 | No GPU | 91 |
| 10 | 0.5256 | 0.644 | 191.218882 | 1.781095 | 60818.51624 | 60.8 | No GPU | 94 |
| 11 | 0.5078 | 0.6805 | 162.258872 | 1.751974 | 61369.6918 | 61.4 | No GPU | 95 |
| 12 | 0.3966 | 0.5287 | 157.315184 | 1.633994 | 65684.23821 | 65.7 | No GPU | 96 |

“CUBE #14”

| Cores | Ratio of nonzeroes in factor (min/max) | Ratio of flops for factor (min/max) | Time (cpu & wall) for numeric factor | Time (cpu & wall) for numeric solve | Effective I/O rate (MB/sec) for solve | Effective I/O rate (GB/sec) for solve | % GPU Accelerated The Solve | Maximum RAM used in GB |
|---|---|---|---|---|---|---|---|---|
| 2 | 0.9376 | 0.8399 | 673.225225 | 6.241678 | 17188.03613 | 17.2 | No GPU | 78 |
| 4 | 0.8188 | 0.8138 | 368.869242 | 3.569551 | 30485.70397 | 30.5 | No GPU | 85 |
| 6 | 0.6087 | 0.6913 | 286.269409 | 2.828212 | 37799.17161 | 37.8 | No GPU | 84 |
| 8 | 0.3289 | 0.4771 | 251.115087 | 2.701804 | 39767.17792 | 39.8 | No GPU | 91 |
| 10 | 0.5256 | 0.644 | 191.964388 | 1.848399 | 58604.0123 | 58.6 | No GPU | 94 |
| 12 | 0.3966 | 0.5287 | 155.623476 | 1.70239 | 63045.28808 | 63.0 | No GPU | 96 |
| 14 | 0.5772 | 0.6414 | 147.392121 | 1.635223 | 66328.7728 | 66.3 | No GPU | 101 |
| 15 | 0.6438 | 0.5701 | 139.355605 | 1.484888 | 71722.92484 | 71.7 | No GPU | 101 |
| 16 | 0.5098 | 0.6655 | 130.042438 | 1.357847 | 78511.36377 | 78.5 | No GPU | 103 |

Results 4 – GPU Acceleration + Cores


“CUBE #10”

| Cores + GPU | Ratio of nonzeroes in factor (min/max) | Ratio of flops for factor (min/max) | Time (cpu & wall) for numeric factor | Time (cpu & wall) for numeric solve | Effective I/O rate (MB/sec) for solve | Effective I/O rate (GB/sec) for solve | % GPU Accelerated The Solve | Maximum RAM used in GB |
|---|---|---|---|---|---|---|---|---|
| 2 | 0.9381 | 0.8405 | 178.686155 | 5.516205 | 19448.54863 | 19.4 | 95.78 | 78 |
| 4 | 0.8165 | 0.8108 | 124.087864 | 3.031092 | 35901.34876 | 35.9 | 95.91 | 85 |
| 6 | 0.6116 | 0.6893 | 122.433584 | 2.536878 | 42140.01391 | 42.1 | 95.74 | 84 |
| 8 | 0.3365 | 0.475 | 112.33829 | 2.351058 | 45699.89654 | 45.7 | 95.81 | 91 |
| 10 | 0.5397 | 0.6359 | 103.586986 | 1.801659 | 60124.33358 | 60.1 | 95.95 | 94 |
| 11 | 0.5123 | 0.6672 | 137.319938 | 1.635229 | 65751.09125 | 65.8 | 85.17 | 95 |
| 12 | 0.4132 | 0.5345 | 97.252285 | 1.562337 | 68696.85627 | 68.7 | 95.75 | 97 |

“CUBE #14”

| Cores + GPU | Ratio of nonzeroes in factor (min/max) | Ratio of flops for factor (min/max) | Time (cpu & wall) for numeric factor | Time (cpu & wall) for numeric solve | Effective I/O rate (MB/sec) for solve | Effective I/O rate (GB/sec) for solve | % GPU Accelerated The Solve | Maximum RAM used in GB |
|---|---|---|---|---|---|---|---|---|
| 2 | 0.9381 | 0.8405 | 200.007118 | 6.054831 | 17718.44411 | 17.7 | 94.96 | 78 |
| 4 | 0.8165 | 0.8108 | 122.200896 | 3.357233 | 32413.68282 | 32.4 | 95.20 | 85 |
| 6 | 0.6116 | 0.6893 | 122.742966 | 2.624494 | 40733.2138 | 40.7 | 94.91 | 84 |
| 8 | 0.3365 | 0.475 | 114.618006 | 2.544626 | 42223.539 | 42.2 | 94.97 | 91 |
| 10 | 0.5397 | 0.6359 | 105.4884 | 1.821352 | 59474.26914 | 59.5 | 95.18 | 94 |
| 12 | 0.4132 | 0.5345 | 96.750618 | 1.988799 | 53966.06502 | 54.0 | 94.96 | 97 |
| 14 | 0.5825 | 0.6382 | 106.573973 | 1.989103 | 54528.26599 | 54.5 | 88.96 | 101 |
| 15 | 0.6604 | 0.566 | 91.345275 | 1.374242 | 77497.60151 | 77.5 | 92.21 | 101 |
| 16 | 0.5248 | 0.6534 | 107.672641 | 1.301668 | 81899.85539 | 81.9 | 85.07 | 103 |

The ANSYS results.out file – The decoding continues

CUBE w12i-k (“CUBE #10”)

  1. Elapsed Time Spent Computing The Solution
    1. This value determines how efficient or balanced the hardware solution is for running in distributed parallel solving.
    2. Fastest Solve Time For CUBE 10
      1. 12 out of 12 Cores w/GPU @ 171.3 seconds Time Spent Computing The Solution
  2. Elapsed Time
    1. This value is the actual time to complete the entire solution process. The clock on the wall time.
    2. Fastest Time For CUBE10
      1. 12 out of 12 w/GPU @ 213.0 seconds
  3. CPU Time For Main Thread
    1. This value indicates the RAW number crunching time of the CPU.
    2. Fastest Time For CUBE10
      1. 12 out of 12 w/GPU @186.0 seconds
  4. GPU Acceleration
    1. The NVidia Quadro K6000 accelerated ~96% of the matrix factorization flops
    2. Actual percentage of GPU accelerated flops = 95.7456
  5. Cores and storage solver performance 12 out of 12 cores and using 1 NVidia Quadro K6000
    1. ratio of nonzeroes in factor (min/max) = 0.4132
    2. ratio of flops for factor (min/max) = 0.5345
      1. These two values above indicate to me that the system is well taxed from a compute power/hardware viewpoint.
    3. Effective I/O rate (MB/sec) for solve = 68696.856274 (or ~69 GB/sec)
      1. No issues here; this indicates that the workstation has ample bandwidth available for the solving.

CUBE w16i-k (“CUBE #14”)

  1. Elapsed Time Spent Computing The Solution
    1. This value indicates how efficient or balanced the hardware solution is when running in distributed parallel solving.
    2. Fastest Time For CUBE w16i-k “CUBE #14”
      1. 15 out of 16 Cores w/GPU @ 152.8 seconds
  2. Elapsed Time
    1. This value is the actual time to complete the entire solution process. The clock on the wall time.
    2. CUBE w16i-k “CUBE #14”
      1. 15 out of 16 Cores w/GPU @ 196.0 seconds
  3. CPU Time For Main Thread
    1. This value indicates the RAW number crunching time of the CPU.
    2. CUBE w16i-k “CUBE #14”
      1. 15 out of 16 Cores w/GPU @ 172.6 seconds
  4. GPU Acceleration Percentage
    1. The NVIDIA QUADRO K6000 accelerated ~92% of the matrix factorization flops
    2. Actual percentage of GPU accelerated flops = 92.2065
  5. Cores and storage solver performance: 15 out of 16 cores and one NVIDIA Quadro K6000
    1. ratio of nonzeroes in factor (min/max) = 0.6604
    2. ratio of flops for factor (min/max) = 0.566
      1. These two values indicate to me that the system is well taxed from a compute power/hardware viewpoint.
    3. Please note when reviewing these two data points that solver performance is balanced when both values are as close to 1.0000 as possible.
      1. As the compute hardware becomes less efficient, these values move farther away from 1.0000.
    4. Effective I/O rate (MB/sec) for solve = 77497.6 MB/sec (or ~78 GB/sec)
      1. No issues here; this indicates that the workstation has ample bandwidth with fast I/O performance for in-core SPARSE Solver solving.
    5. Maximum amount of RAM used by the ANSYS distributed solve
      1. 103 GB of RAM needed for in-core solve
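
Pulling these values out of the output file by hand gets tedious across many runs. The sketch below shows one way to collect them with regular expressions. It assumes the file contains lines of the form `label = value` using the labels quoted above; the sample text, the dictionary keys, and the function name are hypothetical, and the real results.out layout may differ.

```python
import re

# Hypothetical excerpt built from the values quoted above; not a real results.out file.
SAMPLE = """\
ratio of nonzeroes in factor (min/max) = 0.6604
ratio of flops for factor (min/max) = 0.566
Effective I/O rate (MB/sec) for solve = 77497.6
"""

# Short key -> label text to search for (labels as quoted in this article).
LABELS = {
    "nonzero_ratio": "ratio of nonzeroes in factor (min/max)",
    "flops_ratio": "ratio of flops for factor (min/max)",
    "io_rate_mb_s": "Effective I/O rate (MB/sec) for solve",
}


def extract_metrics(text):
    """Return a dict of the metrics found in text; missing labels are skipped."""
    metrics = {}
    for key, label in LABELS.items():
        # Escape the label because it contains parentheses and slashes.
        pattern = re.escape(label) + r"\s*=\s*([0-9]+(?:\.[0-9]+)?)"
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            metrics[key] = float(match.group(1))
    return metrics


metrics = extract_metrics(SAMPLE)
for key, value in metrics.items():
    print(f"{key}: {value}")
```

To use it on a real run, read the file contents into a string and pass that string to `extract_metrics` in place of `SAMPLE`.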

Conclusions Summary And Upgrade Path Suggestions

It is important to locate the bottleneck in your numerical simulation hardware. Using the data in the ANSYS results.out files, you can logically determine your worst parallel performance inhibitor and plan how to resolve whatever is slowing down your distributed numerical simulation solve.

I/O Bound and/or Compute Bound Summary

  • I/O Bound
    • Both CUBE w12i-k “CUBE #10” and w16i-k “CUBE #14” are I/O Bound.
      • This happens almost immediately when GPU Acceleration is enabled.
      • When GPU Acceleration is not enabled, being I/O bound is no longer an issue for solving performance. However, solve times suffer because available compute power goes unused.
  • Compute Bound
    • Both CUBE w12i-k “CUBE #10” and w16i-k “CUBE #14” would benefit from additional Compute Power.
    • CUBE w12i-k “CUBE #10” would get the most bang for the buck by adding in the additional compute power.

Upgrade Path Recommendations

CUBE w12i-k “CUBE #10”

  1. I/O:
    1. Hard Drives
    2. Remove & replace the previous generation hard drives
      1. 3.5″ SAS2.0 6Gb/s 15k RPM Hard Drives
    3. Hard Drives could be upgraded to Enterprise Class SSD or PCIe NVMe
      1. COST = HIGH
    4. Hard Drives could be upgraded to SAS 3.0 12 Gb/s Drives
      1. COST = MEDIUM
  2. RAM:
    1. Remove and replace the previous generation RAM
    2. Currently all available RAM slots are populated.
      1. The optimum for these two CPUs is four slots of RAM per CPU. Currently eight slots of RAM per CPU are installed.
    3. RAM speed: 2133 MHz ECC REG DIMMs
      1. Upgrade RAM to DDR4-2400MHz LRDIMM RAM
      2. COST = HIGH
  3. GPU Acceleration
    1. Install a dedicated GPU Accelerator card such as an NVidia Tesla K40 or K80
    2. COST = HIGH
  4. CPU:
    1. Remove and replace the current previous generation CPUs:
    2. Currently installed: dual Intel Xeon E5-2643 v3
    3. Upgrade the CPUs to the v4 (Broadwell) CPUs
      1. COST = HIGH

CUBE w16i-k “CUBE #14”

  1. I/O: Hard Drives SAS3.0 15k RPM Hard Drives 12Gbps 2.5”
    1. Replace the current 2.5” SAS3 12Gb/s 15k RPM Drives with Enterprise Class SSDs or PCIe NVMe disks
      1. COST = HIGH
    2. Replace the 2.5″ SAS3 12 Gb/s hard drives with 3.5″ hard drives.
      1. COST = HIGH
    3. INTEL 1.6TB P3700 HHHL AIC NVMe
  2. Currently a total of four Hard Drives are installed
    1. Increase existing hard drive count from four hard drives to a total of six or eight.
    2. Change RAID configuration to RAID 50
      1. COST = HIGH
  3. RAM:
    1. Using DDR4-2400MHz ECC REG DIMMs
      1. Upgrade RAM to DDR4-2400MHz LRDIMM RAM
      2. COST = HIGH

Considering RAM: When determining how much system RAM you need to perform a six million degree of freedom ANSYS numerical simulation, add the following additional amounts to the Maximum Amount of RAM used number indicated in your ANSYS results.out file.

  • ANSYS reserves ~5% of your RAM
  • Office products can use an additional ~10-15% on top of the above number
  • Operating System: add an additional ~5-10%
  • Other programs? For example, open up your Windows Task Manager and look at how much RAM your anti-virus program is consuming. Add the amount of RAM consumed by these other RAM vampires.
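
The arithmetic in the list above can be sketched as a small calculation. This is a rough estimate, not a sizing rule: it assumes every percentage is applied to the Maximum Amount of RAM used number, and the function and parameter names are made up for illustration.

```python
# Overhead ranges (low, high) as fractions, taken from the list above.
# Assumption: each percentage is applied to the maximum RAM used by the solve.
OVERHEADS = {
    "ANSYS reserve": (0.05, 0.05),
    "Office products": (0.10, 0.15),
    "Operating system": (0.05, 0.10),
}


def ram_estimate_gb(max_ram_used_gb, other_programs_gb=0.0):
    """Return a (low, high) estimate of the system RAM to plan for, in GB.

    other_programs_gb is the RAM you measured for anti-virus and similar tools.
    """
    low_fraction = sum(low for low, _ in OVERHEADS.values())
    high_fraction = sum(high for _, high in OVERHEADS.values())
    low = max_ram_used_gb * (1.0 + low_fraction) + other_programs_gb
    high = max_ram_used_gb * (1.0 + high_fraction) + other_programs_gb
    return low, high


# 103 GB is the in-core figure reported for the CUBE w16i-k solve.
low, high = ram_estimate_gb(103.0)
print(f"Plan for roughly {low:.1f} to {high:.1f} GB of system RAM")
```

Under these assumptions the 103 GB solve needs about 124 to 134 GB before counting other programs, so the next standard memory configuration above that range is the one to look at.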

Terms & Definition Goodies:

  • Compute Bound
    • A condition that occurs when your CPU processing power sits idle while the CPU waits for the next set of instructions to calculate. This occurs most often when hardware bandwidth is unable to feed the CPU more data to calculate.
  • CPU Time For Main Thread
    • CPU time (or process time) is the amount of time for which a central processing unit (CPU) was used for processing instructions of a computer program or operating system, as opposed to, for example, waiting for input/output (I/O) operations or entering low-power (idle) mode.
  • Effective I/O rate (MB/sec) for solve
    • The total bandwidth, input plus output, used during the parallel distributed solve to move data between storage and the CPU.
    • For example, the in-core 16 core + GPU solve using the CUBE w16i-k reached an effective I/O rate of 82 GB/s.
    • Theoretical system level bandwidth possible is ~96 GB/s
  • IO Bound
    • A condition in which the input/output of the system hardware (reading, writing, and the flow of data through the system) has become inefficient and/or detrimental to running an efficient parallel analysis.
  • Maximum total memory used
    • The maximum amount of memory used during your analysis.
  • Percentage (%) GPU Accelerated The Solve
    • The percentage of the matrix factorization flops in your distributed solve that were accelerated by the Graphics Processing Unit (GPU). The overall impact of the GPU will be diminished if the system bandwidth of your compute hardware is slow and saturated.
  • Ratio of nonzeroes in factor (min/max)
    • A performance indicator of how efficient and balanced the solver is on your compute hardware. In this example the solver performance is most efficient when this value is as close to 1.0 as possible.
  • Ratio of flops for factor (min/max)
    • A performance indicator of how efficient and balanced the solver is on your compute hardware. In this example the solver performance is most efficient when this value is as close to 1.0 as possible.
  • Time (cpu & wall) for numeric factor
    • A performance indicator used to determine how the compute hardware bandwidth is affecting your solve times. When time (cpu & wall) for numeric factor & time (cpu & wall) for numeric solve values are somewhat equal it means that your compute hardware I/O bandwidth is having a negative impact on the distributed solver functions.
  • Time (cpu & wall) for numeric solve
    • A performance indicator used to determine how the compute hardware bandwidth is affecting your solve times. When time (cpu & wall) for numeric solve & time (cpu & wall) for numeric factor values are somewhat equal it means that your compute hardware I/O bandwidth is having a negative impact on the distributed solver functions.
  • Total Speedup w/GPU
    • Total performance gain for a compute task when using a Graphics Processing Unit (GPU).
  • Time Spent Computing Solution
    • The actual clock on the wall time that it took to compute the analysis.
  • Total Elapsed Time
    • The actual clock on the wall time that it took to complete the analysis.
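
The two (min/max) ratios defined above can be illustrated with a toy calculation. This sketch assumes each ratio is the smallest per-process value divided by the largest across the distributed processes. The per-process numbers are invented for illustration and do not come from any results.out file.

```python
def min_max_ratio(per_process_values):
    """Return min/max of the per-process workloads; 1.0 means perfectly balanced."""
    if not per_process_values:
        raise ValueError("need at least one per-process value")
    largest = max(per_process_values)
    if largest <= 0:
        raise ValueError("workloads must be positive")
    return min(per_process_values) / largest


# Hypothetical workloads (for example, factor flops per process, arbitrary units).
balanced = [100, 98, 101, 99]
unbalanced = [100, 40, 95, 60]

print(f"balanced:   {min_max_ratio(balanced):.4f}")
print(f"unbalanced: {min_max_ratio(unbalanced):.4f}")
```

A ratio well below 1.0, like the second case, means at least one process carries far less work than the busiest one, which is why the article treats values near 1.0 as the sign of a balanced solve.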