Timebird

Reputation: 199

np.load for several files

I need to load several files with the .npy extension and write their contents to a new array.

Why doesn't this code work? What should I improve?

A = np.empty(0)
for file in files:
    inf_from_every_file = np.load(file)
    A = np.append(A, inf_from_every_file)

Upvotes: 1

Views: 3591

Answers (1)

WhoIsJack

Reputation: 1498

As has been pointed out in the comments, it is not clear what you mean by "doesn't work". Provided that files contains file names or paths (e.g. 'a_file.npy') and that these files actually exist in the current working directory or at the given paths, your code should run without errors.

However, because np.append is called without an axis argument, your code flattens the loaded arrays and joins them into one big 1D array. This is most likely not what you want, so I will assume that this is your problem.

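To solve this problem with your approach, you would have to look at the axis keyword of np.append. Note that once axis is set, all arrays must have the same number of dimensions, so A can no longer start out as np.empty(0).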
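A minimal sketch of the difference, using two small made-up arrays (a and b are hypothetical stand-ins for the contents of two .npy files):

```python
import numpy as np

# Two hypothetical 2x3 arrays standing in for the contents of two .npy files.
a = np.arange(6).reshape(2, 3)
b = np.arange(6, 12).reshape(2, 3)

# Without axis, np.append flattens both inputs before joining them.
flat = np.append(a, b)

# With axis=0, the arrays are joined along the first axis and stay 2D.
rows = np.append(a, b, axis=0)

print(flat.shape)  # (12,)
print(rows.shape)  # (4, 3)
```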

However, appending to numpy arrays in a loop is generally a bad idea, since a complete copy of the array is created during each append. Therefore, you should either collect the arrays in a list and join them once with np.concatenate:

import numpy as np

A = []
for fname in files:
    A.append(np.load(fname))  # appending to a Python list is cheap
A = np.concatenate(A)  # join once, along the first axis
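Keep in mind that np.concatenate joins along an existing axis (the first one by default), so the result has no separate axis for "which file". If one entry per file is wanted, np.stack may be a better fit, assuming all arrays have the same shape. A small sketch with made-up arrays in place of loaded files:

```python
import numpy as np

# Hypothetical stand-ins for three loaded arrays, each of shape (2, 3).
loaded = [np.full((2, 3), i) for i in range(3)]

joined = np.concatenate(loaded)  # joins along the existing first axis
stacked = np.stack(loaded)       # adds a new leading axis, one entry per array

print(joined.shape)   # (6, 3)
print(stacked.shape)  # (3, 2, 3)
```

With np.stack, stacked[i] is the array that came from the i-th list entry.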

...or you should initialize the entire empty array from the start and then add values to it by indexing:

first = np.load(files[0])  # load the first file to get shape and dtype
A = np.empty((len(files),) + first.shape, dtype=first.dtype)
for index, fname in enumerate(files):
    A[index, ...] = np.load(fname)

Note that the line initializing A gets a bit complicated if you don't know the shape of the arrays that should be loaded. Here, I am loading the first array from the list to get the shape information. If you already know that the shape is e.g. (100,100), you can instead use

A = np.empty( (len(files), 100, 100) )
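To try the preallocation approach end to end, here is a self-contained sketch that first writes three small .npy files to a temporary directory. The file names, shapes and values are made up for illustration, and all files are assumed to share the same shape. np.empty creates a float64 array unless a dtype is given, so the sketch passes the dtype of the first file.

```python
import os
import tempfile

import numpy as np

# Write three small hypothetical .npy files to a temporary directory.
tmpdir = tempfile.mkdtemp()
files = []
for i in range(3):
    path = os.path.join(tmpdir, f"part_{i}.npy")
    np.save(path, np.full((2, 2), i, dtype=np.int64))
    files.append(path)

# Preallocate using the shape and dtype of the first file, then fill by index.
first = np.load(files[0])
A = np.empty((len(files),) + first.shape, dtype=first.dtype)
for index, fname in enumerate(files):
    A[index, ...] = np.load(fname)

print(A.shape)  # (3, 2, 2)
print(A.dtype)  # int64
```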

Upvotes: 1
