Colin

Reputation: 277

Find unique values across each row

I have a 2-dimensional numpy array, say as follows:

[["cat","dog","dog","mouse","man"],
["rhino","rhino","bat","rhino","dino","dino"],
["zebra","alien","alien","alien","alien"]]

I want to apply numpy.unique along each row in order to count the number of occurrences of each label. Unfortunately, I don't think this is possible, as numpy.unique would return vectors of different lengths:

[["cat","dog","mouse","man"],
["rhino","bat","dino"],
["zebra","alien"]]
(and similarly for the counts)

So this obviously won't work as a single regular array.

Does anybody know of a way I can get around this problem?

Upvotes: 2

Views: 72

Answers (1)

piRSquared

Reputation: 294586

Try this:

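import pandas as pd

# Rows of unequal length are padded with None by the DataFrame constructor
a = pd.DataFrame([["cat","dog","dog","mouse","man"],
                  ["rhino","rhino","bat","rhino","dino","dino"],
                  ["zebra","alien","alien","alien","alien"]])

# Unique labels per row; dropna() keeps the padding out of the result
a.apply(lambda x: pd.Series(x.dropna().unique()), axis=1)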
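
If the per-row counts are what you are after, one possible sketch is to call numpy.unique with return_counts=True on one row at a time and collect the results in a list, so the differing lengths never have to fit into one array. This assumes the rows can be kept as plain Python lists; the names rows and per_row_counts are made up for illustration.

```python
import numpy as np

# Same data as in the question, kept as a list of lists (hypothetical name)
rows = [["cat", "dog", "dog", "mouse", "man"],
        ["rhino", "rhino", "bat", "rhino", "dino", "dino"],
        ["zebra", "alien", "alien", "alien", "alien"]]

per_row_counts = []
for row in rows:
    # np.unique returns the sorted unique labels and how often each occurs
    labels, counts = np.unique(row, return_counts=True)
    # Store one {label: count} mapping per row, converted to plain Python types
    per_row_counts.append(dict(zip(labels.tolist(), counts.tolist())))

for mapping in per_row_counts:
    print(mapping)
```

Each entry of per_row_counts is a dictionary for one row, so rows with different numbers of distinct labels are not a problem.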

Upvotes: 1
