shankar

Reputation: 39

How to count the occurrences of a particular element in each row of an ndarray?

[['fake' 'fake' 'fake' 'fake' 'fake']
 ['real' 'real' 'real' 'real' 'real']
 ['real' 'real' 'fake' 'fake' 'real']
 ...
 ['real' 'real' 'real' 'real' 'real']
 ['fake' 'fake' 'fake' 'fake' 'fake']
 ['fake' 'fake' 'fake' 'real' 'fake']]

This is my data set. For each row of the ndarray, I need to find out whether the count of fake or real predictions is greater, and store the result in a third array. Is there a NumPy function for this kind of operation?

Upvotes: 0

Views: 78

Answers (2)

Valdi_Bo

Reputation: 30971

Assuming that your array (arr) contains only the strings 'fake' and 'real', you can run:

moreReal = (arr == 'real').sum(axis=1) > arr.shape[1] / 2

Details:

  • (arr == 'real') - converts your array into a bool array (True where the element is 'real').
  • sum(axis=1) - sums each row; since True counts as 1, this is the number of 'real' elements per row.
  • ... > arr.shape[1] / 2 - checks whether the number of 'real' elements in a given row is greater than half of the row length.

The result is:

array([False,  True,  True,  True, False, False])

i.e. of the six rows shown, rows 1, 2, and 3 (counting from 0) have more real entries than fake.
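
Since the question asks for the result to be stored in another array, here is a minimal sketch of one way to turn the bool result into 'real'/'fake' labels with np.where. It assumes the array holds only the six rows visible in the question (the elided rows are not reproduced), and the names real_counts, more_real and majority are made up for illustration:

```python
import numpy as np

# Assumption: only the six rows visible in the question are used here.
arr = np.array([
    ['fake', 'fake', 'fake', 'fake', 'fake'],
    ['real', 'real', 'real', 'real', 'real'],
    ['real', 'real', 'fake', 'fake', 'real'],
    ['real', 'real', 'real', 'real', 'real'],
    ['fake', 'fake', 'fake', 'fake', 'fake'],
    ['fake', 'fake', 'fake', 'real', 'fake'],
])

# Number of 'real' entries in each row (True counts as 1).
real_counts = (arr == 'real').sum(axis=1)

# True where 'real' is the strict majority of the row.
more_real = real_counts > arr.shape[1] / 2

# Map the bool array to a label per row (hypothetical result array).
majority = np.where(more_real, 'real', 'fake')
print(real_counts)
print(majority)
```

With an odd number of columns, as here, a tie cannot occur, so every row gets a clear label.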

Edit

If your starting point is a plain Python list of lists, first create a NumPy array from it:

arr = np.array([
    ['fake', 'fake', 'fake', 'fake', 'fake'],
    ...
])

Then, if you want to generate a bool array, run:

isReal = arr == 'real'

Upvotes: 1

Lith

Reputation: 803

Assuming your data is a 2-D array of shape (n, m), you could do:

import numpy as np
# Example array
a = np.array([['fake', 'fake', 'fake', 'fake', 'fake'],
              ['real', 'real' ,'real' ,'real' ,'real'],
              ['real', 'real', 'fake', 'fake' ,'real']])

# True means 'fake' is the strict majority in that row.
# With an even number of columns, a tie gives False and so falls into the 'real' category.
print(np.sum((a == 'fake'), axis = 1) > a.shape[1] // 2)
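
If ties should not silently fall into either category, one option is to compare the two counts directly. This is only a sketch: the 4-column array and the 'tie' label are made up for illustration and are not part of the question's data.

```python
import numpy as np

# Hypothetical example with an even number of columns, so ties are possible.
a = np.array([['fake', 'fake', 'real', 'real'],
              ['real', 'real', 'real', 'fake'],
              ['fake', 'fake', 'fake', 'real']])

# Count each category per row.
fake_counts = np.sum(a == 'fake', axis=1)
real_counts = np.sum(a == 'real', axis=1)

# Label each row; 'tie' is an assumed label for equal counts.
labels = np.where(fake_counts > real_counts, 'fake',
                  np.where(real_counts > fake_counts, 'real', 'tie'))
print(labels)
```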

Upvotes: 1
