sco1

Reputation: 12214

Insert field into structured array at a specific column index

I'm currently using np.loadtxt to load some mixed data into a structured numpy array. I do some calculations on a few of the columns to output later. For compatibility reasons I need to maintain a specific output format so I'd like to insert those columns at specific points and use np.savetxt to export the array in one shot.

A simple setup:

import numpy as np

x = np.zeros((2,), dtype='i4,f4,S10')  # 'S10' replaces the legacy 'a10' alias
x[:] = [(1,2.,'Hello'),(2,3.,'World')]

newcol = ['abc','def']

For this example I'd like to make newcol the 2nd column. I'm very new to Python (coming from MATLAB). From my searching all I've been able to find so far are ways to append newcol to the end of x to make it the last column, or x to newcol to make it the first column. I also turned up np.insert but it doesn't seem to work on a structured array because it's technically a 1D array (from my understanding).

What's the most efficient way to accomplish this?

EDIT1:

I investigated np.savetxt a little further and it seems like it can't be used with a structured array, so I'm assuming I would need to loop through and write each row with f.write. I could specify each column explicitly (by field name) with that approach and not have to worry about the order in my structured array, but that doesn't seem like a very generic solution.

For the above example my desired output would be:

1, abc, 2.0, Hello
2, def, 3.0, World

Upvotes: 2

Views: 207

Answers (1)

gg349

Reputation: 22671

This is a way to add a field to the array, at the position you require:

from numpy import zeros, empty


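def insert_dtype(x, position, new_dtype, new_column):
    # Only structured arrays carry named fields.
    if x.dtype.fields is None:
        raise ValueError("'x' must be a structured numpy array")
    # Build the new dtype description with the extra field at `position`.
    new_desc = x.dtype.descr
    new_desc.insert(position, new_dtype)
    y = empty(x.shape, dtype=new_desc)
    # Copy the existing fields by name, then fill in the new one.
    for name in x.dtype.names:
        y[name] = x[name]
    y[new_dtype[0]] = new_column
    return y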


x = zeros((2,), dtype='i4,f4,S10')  # 'S10' replaces the legacy 'a10' alias
x[:] = [(1, 2., 'Hello'), (2, 3., 'World')]

new_dt = ('my_alphabet', '|S3')
new_col = ['abc', 'def']

x = insert_dtype(x, 1, new_dt, new_col)

Now x looks like

array([(1, 'abc', 2.0, 'Hello'), (2, 'def', 3.0, 'World')], 
  dtype=[('f0', '<i4'), ('my_alphabet', 'S3'), ('f1', '<f4'), ('f2', 'S10')])

The solution is adapted from here.
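For reference, here is a minimal Python 3 sketch of the same idea. The helper name insert_field and its signature are made up for this illustration and are not part of NumPy. It builds the new dtype from the field names and per-field dtypes rather than from dtype.descr, and it uses unicode ('U') string fields so that the values come back as plain str instead of bytes.

```python
import numpy as np


def insert_field(arr, position, name, values, dtype=None):
    """Return a copy of structured array `arr` with a new field at `position`.

    Hypothetical helper written for this example, not a NumPy function.
    """
    if arr.dtype.names is None:
        raise ValueError("arr must be a structured numpy array")
    values = np.asarray(values, dtype=dtype)
    # List the existing fields as (name, dtype) pairs, then splice in the new one.
    fields = [(n, arr.dtype[n]) for n in arr.dtype.names]
    fields.insert(position, (name, values.dtype))
    out = np.empty(arr.shape, dtype=fields)
    # Copy the old columns by name, then fill the new column.
    for n in arr.dtype.names:
        out[n] = arr[n]
    out[name] = values
    return out


x = np.array([(1, 2.0, "Hello"), (2, 3.0, "World")], dtype="i4,f4,U10")
y = insert_field(x, 1, "my_alphabet", ["abc", "def"])
print(y.dtype.names)
print(y.tolist())
```

Like the function above, this copies every column into a freshly allocated array, so it is fine for data that fits in memory but is not an in-place operation.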

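To print the recarray to file, you could use something like the following (note that rec2csv has been removed from recent matplotlib releases, so this needs an older version):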

from matplotlib.mlab import rec2csv
rec2csv(x,'foo.txt')
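If matplotlib is not available, np.savetxt can also write a 1-D structured array, provided fmt is a single string with one format specifier per field. The sketch below assumes unicode ('U') string fields (bytes fields would be written as b'...' under Python 3), and the field names and temporary file path are chosen only for illustration.

```python
import os
import tempfile

import numpy as np

# Structured array laid out in the desired column order (illustrative field names).
rec = np.array(
    [(1, "abc", 2.0, "Hello"), (2, "def", 3.0, "World")],
    dtype=[("f0", "i4"), ("my_alphabet", "U3"), ("f1", "f4"), ("f2", "U10")],
)

# One format specifier per field; the separators are part of the fmt string itself.
path = os.path.join(tempfile.mkdtemp(), "foo.txt")
np.savetxt(path, rec, fmt="%d, %s, %.1f, %s")

with open(path) as fh:
    written = fh.read()
print(written)
```

With a multi-specifier fmt string, the delimiter argument is ignored, so the commas have to be written into fmt directly.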

Upvotes: 2
