how to get a numpy array from frequency and indices

Question

I have a numpy array like this:

nparr = np.asarray([[u'fals', u'nazi', u'increas', u'technolog', u'equip', u'princeton', 
                     u'realiti', u'civilian', u'credit', u'ten'],
                    [u'million', u'thousand', u'nazi', u'stick', u'visibl', u'realiti', 
                     u'west', u'singl', u'jack', u'charl']])

What I need to do is to calculate the frequency of each item, and have another numpy array with the corresponding frequency of each item in the same position.

Since my array has shape (2, 10), I need a numpy array of shape (2, 10) that holds the frequency values instead. The output for the array above would be:

[[1, 2, 1, 1, 1, 1, 2, 1, 1, 1]
 [1, 1, 2, 1, 1, 2, 1, 1, 1, 1]]

What I have done so far:

unique, indices, count = np.unique(nparr, return_index=True, return_counts=True)

This way, though, count holds the frequency of each unique value only, so it does not have the same shape as the original array.

Mad Physicist · Accepted Answer

You need to use return_inverse rather than return_index:

_, i, c = np.unique(nparr, return_inverse=True, return_counts=True)

_ is a convention to denote discarded return values. You don't need the unique values to know where the counts go.
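To see what the two kept arrays hold, here is a small sketch on a toy 1-D array. The names words, uniq, inv and counts are only illustrative and are not from the question; the unique values are kept here just to show how the inverse indices relate to them.

```python
import numpy as np

# Toy input, chosen only for illustration
words = np.asarray(['b', 'a', 'b'])

uniq, inv, counts = np.unique(words, return_inverse=True, return_counts=True)

print(uniq)       # sorted unique values: ['a' 'b']
print(inv)        # position of each input item within uniq: [1 0 1]
print(counts)     # occurrences of each unique value: [1 2]
print(uniq[inv])  # indexing with inv rebuilds the input: ['b' 'a' 'b']
```

Because inv maps every input element to its slot in the unique array, indexing counts with it gives the count of every input element.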

You can get the counts arranged in the order of the original array with a simple indexing operation. Depending on the NumPy version, the inverse indices may come back flattened, so reshape the result to the original shape:

c[i].reshape(nparr.shape)
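Putting the pieces together, a minimal end-to-end sketch with the array from the question might look like this. The variable names inverse, counts and freq are my own choice, and the u'' prefixes are dropped since they are optional in Python 3.

```python
import numpy as np

nparr = np.asarray([['fals', 'nazi', 'increas', 'technolog', 'equip', 'princeton',
                     'realiti', 'civilian', 'credit', 'ten'],
                    ['million', 'thousand', 'nazi', 'stick', 'visibl', 'realiti',
                     'west', 'singl', 'jack', 'charl']])

# inverse: index of each element within the sorted unique values
# counts:  number of occurrences of each unique value
_, inverse, counts = np.unique(nparr, return_inverse=True, return_counts=True)

# Look up the count for every element, then restore the (2, 10) shape
freq = counts[inverse].reshape(nparr.shape)

print(freq)
# [[1 2 1 1 1 1 2 1 1 1]
#  [1 1 2 1 1 2 1 1 1 1]]
```

The reshape is harmless when the inverse indices already have the input's shape, so the same line should work whether or not they come back flattened.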
