krishnab

Reputation: 10060

slice a numpy structured 1-d array to get part of a record

I have a 1-D numpy structured array and I want to extract just part of one record, but I can't figure out how to slice it. Here is my code:

import numpy as np

summary_stat_list = ['mean', 'variance', 'median', 'kurtosis', 'skewness']
model_summary_stats = np.zeros(5,dtype=[('statistic',
                                                       'object'),
                                           ('f1', 'float'),
                                           ('f2', 'float'),
                                           ('f3', 'float'),
                                           ('m1', 'float'),
                                           ('m2', 'float'),
                                           ('m3', 'float'),
                                           ('t3', 'float'),
                                           ('t2', 'float'),
                                           ('t1', 'float'),
                                           ('prom1', 'float'),
                                           ('prom2', 'float')])
for r in range(model_summary_stats.shape[0]):
    model_summary_stats['statistic'][r] = summary_stat_list[r]

Now, the array looks like this:

[('mean', 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
('variance', 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
('median', 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
('kurtosis', 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
('skewness', 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)]

My question is: how can I get all but the first element of the first record? That is, for the 'mean' record, I want just the numeric entries.

I was trying something like

model_summary_stats[0]['f1':]

or:

model_summary_stats[0][1:]

but neither of these works. Any suggestions?

Upvotes: 4

Views: 2241

Answers (1)

hpaulj

Reputation: 231425

Slicing does not work with field name indexing. You have to use a list of the desired field names instead:

model_summary_stats[0][['f1','f2','f3',etc]]

You can also build that list from the dtype. Note that dtype.names is a tuple, so convert it to a list before indexing with it:

list(model_summary_stats.dtype.names[1:])
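
As a minimal sketch of the two steps above, here is a cut-down, hypothetical version of the array (the name `stats`, the reduced field set, and the sample values are made up for illustration):

```python
import numpy as np

# Small stand-in for model_summary_stats (illustrative names and values).
stats = np.zeros(2, dtype=[('statistic', object),
                           ('f1', float), ('f2', float), ('m1', float)])
stats['statistic'] = ['mean', 'variance']
stats['f1'] = [1.0, 4.0]
stats['f2'] = [2.0, 5.0]
stats['m1'] = [3.0, 6.0]

# dtype.names is a tuple; multi-field indexing needs a list of names.
numeric_names = list(stats.dtype.names[1:])

# Select every field except 'statistic' from the first record.
mean_record = stats[0][numeric_names]

# Convert the selected record to a plain Python tuple of floats.
mean_values = mean_record.tolist()
```

The result of the multi-field index is still a structured record, which is why the last line converts it before treating the values as ordinary numbers.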

You should keep in mind that this kind of multifield indexing is poorly developed. It's ok for retrieving values, but you can't set values this way. And you can't do math across columns.
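
One possible workaround for the "no math across columns" limitation, assuming a copy of the data is acceptable, is to stack the numeric fields into an ordinary 2-D float array. The array `stats` and its values below are hypothetical and only serve the illustration:

```python
import numpy as np

# Small stand-in for model_summary_stats (illustrative names and values).
stats = np.zeros(2, dtype=[('statistic', object),
                           ('f1', float), ('f2', float), ('m1', float)])
stats['statistic'] = ['mean', 'variance']
stats['f1'] = [1.0, 4.0]
stats['f2'] = [2.0, 5.0]
stats['m1'] = [3.0, 6.0]

numeric_names = list(stats.dtype.names[1:])

# Each field is a 1-D float array; stack them as columns.
# The result has shape (n_records, n_numeric_fields) and is a copy.
numeric = np.column_stack([stats[name] for name in numeric_names])

# Ordinary array math now works across the former fields.
row_sums = numeric.sum(axis=1)
```

Because `numeric` is a copy, changes made to it are not written back to the structured array.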

http://docs.scipy.org/doc/numpy/user/basics.rec.html#accessing-multiple-fields-at-once

A different dtype might be more useful:

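# Option 1: all numeric values in a single float subarray
dt = np.dtype([('statistic',object),('values',(float,11))])
# Option 2: numeric values split into two groups
dt = np.dtype([('statistic',object),('values',(float,8)),('prom',(float,3))])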

or whatever grouping makes the most sense when processing the data.
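
To show why a subarray field can be more convenient, here is a short sketch using the first of the two dtypes. The variable name `stats` and the values written into the first record are made up for the example:

```python
import numpy as np

# One object field for the label, one 11-element float subarray for the numbers.
dt = np.dtype([('statistic', object), ('values', (float, 11))])
stats = np.zeros(5, dtype=dt)
stats['statistic'] = ['mean', 'variance', 'median', 'kurtosis', 'skewness']

# stats['values'] is a regular (5, 11) float view, so it can be assigned to.
stats['values'][0] = np.arange(11)

# The numeric part of the 'mean' record is now a plain 1-D float array.
mean_values = stats[0]['values']

# Math across the numeric entries works directly.
row_sums = stats['values'].sum(axis=1)
```

With this layout the numeric entries of a record can be sliced like any array, for example `stats[0]['values'][1:4]`.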

Upvotes: 5
