Reputation: 21
I am working on a python script to set up an input file for a solid mechanics simulation software. The part of the script I'm struggling with is where I format nodal data (node numbers and the corresponding 3D coordinates) from a numpy array to string format, with one node's data per line. I've been working on improving the run time of the script, and this is by far the slowest portion of the whole thing. I originally used np.array2string, but found that it gets pretty slow above about 100,000 nodes.
The numpy array with nodal data is called 'nodes', and is an Nx4 array, where N is the number of nodes in the model and can vary from run to run. There is some additional formatting of the data in 'nodeString' that takes place later in the code to remove extraneous brackets, parentheses and commas, but that is relatively quick and pretty much the same between all methods below.
I've tried a couple different settings for the parameters of array2string:
np.set_printoptions(threshold=np.inf)
nodeString = np.array2string(nodes, precision=4, suppress_small=True, separator=',') # original syntax
nodeString = np.array2string(nodes, suppress_small=True, separator=',')
nodeString = np.array2string(nodes, precision=4, suppress_small=False, separator=',')
I've tried array_str instead:
np.set_printoptions(threshold=np.inf)
nodeString = np.array_str(nodes, precision=4, suppress_small=True)
I've also tried just writing the numpy array to a text file and opening it back up:
np.set_printoptions(threshold=np.inf)
fmt = ('%s', '%.4f', '%.4f', '%.4f')
np.savetxt('temp.txt', nodes, delimiter=',', fmt=fmt)
with open('temp.txt', 'r') as file:
    nodeString = file.read()
(Figure: comparison of processing time vs. number of nodes for the different numpy array-to-string techniques.)
(The run time reported in the figure above is in seconds.) By far, the fastest technique I've found is to save the data to a file and then read it back in. This really surprises me, and I wonder if I'm doing something wrong with the native numpy functions like array2string that is hurting their performance. I'm a Mechanical Engineer, and I've been told we code by brute force rather than by elegance, so if someone has a better way of doing what I'm trying to do, or an explanation of why it's faster to write and read a file than just to reformat in memory, I'd appreciate any insight. Thanks!
Upvotes: 2
Views: 276
Reputation: 2174
Instead of writing and reading from a file, read and write to a StringIO object:
from io import StringIO
sb = StringIO()
np.savetxt(sb, nodes, delimiter=',', fmt=fmt)
nodeString = sb.getvalue()
I believe this will save you time by avoiding writing to and reading from the hard drive; everything stays in memory instead.
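Here's a self-contained sketch of the idea, with a small made-up `nodes` array standing in for your data (the `%d` format for the node-number column is my assumption; your original `%s` works the same way here):

```python
from io import StringIO

import numpy as np

# Hypothetical stand-in for your Nx4 nodal data:
# column 0 is the node number, columns 1-3 are the 3D coordinates.
nodes = np.array([[1, 0.5, 1.25, 2.0],
                  [2, 3.5, 4.75, 5.0]])

# Node number as an integer, coordinates to 4 decimal places.
fmt = ('%d', '%.4f', '%.4f', '%.4f')

# savetxt accepts any file-like object, so an in-memory buffer
# works in place of a real file on disk.
sb = StringIO()
np.savetxt(sb, nodes, delimiter=',', fmt=fmt)
nodeString = sb.getvalue()

print(nodeString)
# 1,0.5000,1.2500,2.0000
# 2,3.5000,4.7500,5.0000
```

This keeps the fast formatting path of savetxt that your timing plot shows, while skipping the temp-file round trip entirely.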
Upvotes: 1