user14438759

How do I slice an array of any size?

I have a 2D array called X and a 1D array holding the class of each row of X. I want to take the same first N percent of elements from each class and store them in a new array, in a simple way without for loops.

For example, for the following 2D array X:

[[0.612515  0.385088 ]
 [0.213345  0.174123 ]
 [0.432596  0.8714246]
 [0.700230  0.730789 ]
 [0.455105  0.128509 ]
 [0.518423  0.295175 ]
 [0.659871  0.320614 ]
 [0.459677  0.940614 ]
 [0.823733  0.831789 ]
 [0.236175  0.10750  ]
 [0.379032  0.241121 ]
 [0.512535  0.8522193]]

Output is 3.

Then I'd like to store the first 3 indices that belong to class 0 and the first 3 that belong to class 1, maintaining the order in which the indices occur, giving the following output:

First 3 from each class: [1 0 0 1 0 1]

New_X = 
        [[0.612515  0.385088 ]
         [0.213345  0.174123 ]
         [0.432596  0.8714246]
         [0.700230  0.730789 ]
         [0.455105  0.128509 ]
         [0.518423  0.295175 ]]

Upvotes: 1

Views: 51

Answers (1)

David

Reputation: 8308

First, 30% of the 6 elements in each class is only 2 elements, even when rounding up with np.ceil (0.3 * 6 = 1.8, which rounds up to 2).
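To check that arithmetic, here is a small sketch that counts the elements per class and applies the 30% cut. It assumes the labels are the y array defined at the end of this answer and that they are non-negative integers, which is what np.bincount needs. The names counts and n_keep are only illustrative.

```python
import numpy as np

# Labels copied from the y array defined later in this answer.
y = np.array([1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0])

counts = np.bincount(y)                      # number of elements in each class
n_keep = np.ceil(0.3 * counts).astype(int)   # 0.3 * 6 = 1.8, rounded up to 2
print(counts, n_keep)                        # [6 6] [2 2]
```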

Second, I'll assume both arrays are NumPy arrays (numpy.ndarray).

Given the 2 arrays, we can find the desired indices using np.where and array y in the following way:

in_ = sorted([*np.where(y==0)[0][:np.ceil(0.3*6).astype(int)],
              *np.where(y==1)[0][:np.ceil(0.3*6).astype(int)]])  # [0, 1, 2, 3]

Now we can simply slice X like so:

X[in_]
# array([[0.612515 , 0.385088 ],
#        [0.213345 , 0.174123 ],
#        [0.432596 , 0.8714246],
#        [0.70023  , 0.730789 ]])
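If you need this for any number of classes and any class sizes without hard-coding 6 or writing a Python loop, one possible sketch is below. It is a different technique from the np.where line above: it ranks each element within its class using a stable argsort, then keeps the elements whose rank is below the per-class quota. The function name first_fraction_indices and its parameters are hypothetical, and it assumes y is a 1D array.

```python
import numpy as np

def first_fraction_indices(y, frac):
    """Indices of the first ceil(frac * class_size) elements of each class, in original order."""
    # inverse maps every element of y to the position of its class in the sorted unique classes
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    quota = np.ceil(frac * counts).astype(int)        # how many to keep per class

    # A stable sort groups indices by class while preserving order of occurrence
    order = np.argsort(inverse, kind="stable")
    starts = np.cumsum(counts) - counts                # offset of each class in `order`
    rank_sorted = np.arange(len(y)) - np.repeat(starts, counts)

    # Scatter the within-class ranks back to the original positions
    rank = np.empty(len(y), dtype=int)
    rank[order] = rank_sorted

    return np.flatnonzero(rank < quota[inverse])

y = np.array([1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0])
idx_30 = first_fraction_indices(y, 0.3)   # [0 1 2 3]
idx_50 = first_fraction_indices(y, 0.5)   # [0 1 2 3 4 5]
```

X[idx_30] then selects the same four rows as X[in_], and y[idx_50] gives [1 0 0 1 0 1], the three-per-class output shown in the question.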

The definitions of X and y are:

import numpy as np

X = np.array([[0.612515 , 0.385088 ],
       [0.213345 , 0.174123 ],
       [0.432596 , 0.8714246],
       [0.70023  , 0.730789 ],
       [0.455105 , 0.128509 ],
       [0.518423 , 0.295175 ],
       [0.659871 , 0.320614 ],
       [0.459677 , 0.940614 ],
       [0.823733 , 0.831789 ],
       [0.236175 , 0.1075   ],
       [0.379032 , 0.241121 ],
       [0.512535 , 0.8522193]])
y = np.array([1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0])

Edit

The expression np.where(y==0)[0][:np.ceil(0.3*6).astype(int)] does the following:

  1. np.where(y==0)[0] returns all the indices where y==0
  2. Since you wanted only 30%, we slice those indices to keep the first 30% of them with [:np.ceil(0.3*6).astype(int)], where 6 is the number of elements in the class
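The two steps above can be run separately to see the intermediate values. This is only a sketch using the y array from this answer; idx0, n_keep and first0 are illustrative names.

```python
import numpy as np

y = np.array([1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0])

idx0 = np.where(y == 0)[0]              # step 1: all positions where y == 0
n_keep = np.ceil(0.3 * 6).astype(int)   # 30% of the 6 elements in the class, rounded up
first0 = idx0[:n_keep]                  # step 2: keep only the first n_keep positions

print(idx0)     # [ 1  2  4  8 10 11]
print(first0)   # [1 2]
```

Doing the same for y == 1 gives [0 3], and sorting the combined indices gives [0, 1, 2, 3].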

Upvotes: 1
