AlmaThom

Reputation: 133

Split numpy array by unique values in column

I have a large array that I imported from a csv (np.recfromcsv) that I want to divide into smaller arrays by an ID column in said array. For example my array(a) looks like:

[(842, 129826, 2018, 7246, '1/4/2009', 452, '1/4/2009', 452, '1/4/2009')
 (863, 129827, 2018, 7246, '1/7/2009', 452, '1/7/2009', 452, '1/7/2009')
 (890, 129828, 2019, 7246, '1/11/2009', 452, '1/11/2009', 452, '1/11/2009')
 ...,
 (339, 131268, 1085, 4211, '12/1/2009', 220, '12/2/2009', 220, '12/1/2009')
 (376, 131535, 1085, 4211, '12/8/2009', 220, '12/9/2009', 220, '12/8/2009')
 (470, 131536, 1087, 4211, '12/28/2009', 220, '12/29/2009', 220, '12/28/2009')]

And I would like to split this into arrays based on the third column (2018, 2019, 1085, etc.). I've been trying to find a way to use numpy's vsplit method with a list I generated of unique ID values (id_list = list(set(a['id']))), but I get the error: ValueError: vsplit only works on arrays of 2 or more dimensions. This makes me think the np.recfromcsv tool doesn't generate dimensions properly. Should I be using a different import tool?
I have also tried doing this in a simple loop:

for e in id_list:
    name = "id" + str(e)
    name = a[a['id']==e]

But this generates an error: SyntaxError: can't assign to operator. I know the problem is the dynamic variable, but I see no other way to achieve this without overwriting the array for each ID.

I'd really appreciate advice on how to figure this out.

Upvotes: 3

Views: 3226

Answers (1)

Saullo G. P. Castro

Reputation: 58885

To read a column from a recarray you do not pass the index, but the name, for example:

my_col = a['id']

So your command becomes:

id_list = list(set(a['id']))
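
As a minimal sketch of how those unique IDs could be used without dynamic variable names, the sub-arrays can be stored in a dictionary keyed by ID. The small array below is a hypothetical stand-in for the result of np.recfromcsv; only the 'id' field name comes from the question, the other field names are assumptions.

```python
import numpy as np

# Hypothetical stand-in for the structured array returned by np.recfromcsv.
# Only the "id" field name is taken from the question; "row" and "key" are made up.
a = np.array(
    [
        (842, 129826, 2018),
        (863, 129827, 2018),
        (890, 129828, 2019),
        (339, 131268, 1085),
        (376, 131535, 1085),
    ],
    dtype=[("row", "i8"), ("key", "i8"), ("id", "i8")],
)

# Build one sub-array per unique ID using a boolean mask,
# and keep them in a dict instead of creating variables dynamically.
groups = {int(e): a[a["id"] == e] for e in np.unique(a["id"])}

# Look up the rows for a given ID.
rows_2018 = groups[2018]
```

Each value in groups is itself a structured array, so groups[2018]['key'] still works as a column lookup.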

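As an observation: recfromcsv() works properly. It returns a one-dimensional structured array (or record array) with one record per row, and each field behaves like a 1D array, which is why vsplit complains about the number of dimensions. You could instead try np.loadtxt() with delimiter=',', which returns a 2D array, although with mixed numeric and text columns like yours you would need to read everything with a single dtype such as str.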
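
To illustrate the point about dimensions, here is a rough sketch that checks the shape of a structured array and then splits it with np.split, which does accept 1D input. The sample array and the field names other than 'id' are assumptions for illustration, not the asker's real data.

```python
import numpy as np

# Hypothetical structured array; only the "id" field name comes from the question.
a = np.array(
    [
        (842, 129826, 2018),
        (339, 131268, 1085),
        (863, 129827, 2018),
        (890, 129828, 2019),
        (376, 131535, 1085),
    ],
    dtype=[("row", "i8"), ("key", "i8"), ("id", "i8")],
)

# A structured array has one axis: one record per row.
n_dims = a.ndim

# Sort by ID so that equal IDs are contiguous (stable keeps the original row order).
order = np.argsort(a["id"], kind="stable")
sorted_a = a[order]

# return_index gives the first position of each unique ID in the sorted array.
ids, starts = np.unique(sorted_a["id"], return_index=True)

# Split at every boundary except the first (which is index 0).
parts = np.split(sorted_a, starts[1:])
```

Here parts[i] holds the rows whose ID is ids[i]. This avoids one boolean mask per ID, which may matter for a large array with many distinct IDs.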

Upvotes: 1
