The most frequent pattern of specific columns in Pandas.DataFrame in python

Question

I know how to get the most frequent element of a list of lists, e.g.

a = [[3,4], [3,4], [3,4], [1,2], [1,2], [1,1], [1,3], [2,2], [4,2]]
print(max(a, key=a.count))

This should print [3, 4], even though the most frequent number is 1 for the first element and 2 for the second element.

My question is how to do the same kind of thing with a pandas DataFrame.

For example, I'd like to know the implementation of the following method get_max_freq_elem_of_df:

import pandas as pd

def get_max_freq_elem_of_df(df):
    # do some things
    return freq_list

df = pd.DataFrame([[3,4], [3,4], [3,4], [1,2], [1,2], [1,1], [1,3], [2,2], [4,2]])
x = get_max_freq_elem_of_df(df)
print(x)  # => should print [3, 4]

Please note that the DataFrame.mode() method does not work here, because it computes the mode of each column independently. For the above example, df.mode() returns [1, 2], not [3, 4].
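To make the column-wise behaviour concrete, here is a small sketch using the same data as above. The variable names (column_modes, first_row, pair_count) are only illustrative, and the snippet uses Python 3 print syntax.

```python
import pandas as pd

df = pd.DataFrame([[3, 4], [3, 4], [3, 4], [1, 2], [1, 2], [1, 1], [1, 3], [2, 2], [4, 2]])

# mode() is evaluated per column, so each value is the most common
# entry of its own column, not a row that occurs in the data most often.
column_modes = df.mode()
first_row = column_modes.iloc[0].tolist()
print(first_row)

# Count how many rows actually equal the pair built from the column modes.
m0, m1 = first_row
pair_count = int(((df[0] == m0) & (df[1] == m1)).sum())
print(pair_count)
```

The pair built from the per-column modes appears in fewer rows than [3, 4] does, which is why mode() is not the right tool for this question.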

Update

I have explained why DataFrame.mode() doesn't work.

DSM · Accepted Answer

You could use groupby.size and then find the max:

>>> df.groupby([0,1]).size()
0  1
1  1    1
   2    2
   3    1
2  2    1
3  4    3
4  2    1
dtype: int64
>>> df.groupby([0,1]).size().idxmax()
(3, 4)
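Putting this together, the function from the question could look roughly like the sketch below. This is one possible wrapper around the groupby approach, not a definitive implementation: the optional columns parameter is an addition for illustration (it is not in the question), and the code uses Python 3 syntax.

```python
import pandas as pd

def get_max_freq_elem_of_df(df, columns=None):
    """Return the most frequent combination of values as a list.

    `columns` is a hypothetical convenience parameter; by default
    all columns of the frame are used.
    """
    cols = list(df.columns) if columns is None else list(columns)
    # One group per distinct combination of values, sized by row count.
    counts = df.groupby(cols).size()
    best = counts.idxmax()
    # With a single grouping column, idxmax returns a scalar, not a tuple.
    if len(cols) == 1:
        return [best]
    return list(best)

df = pd.DataFrame([[3, 4], [3, 4], [3, 4], [1, 2], [1, 2], [1, 1], [1, 3], [2, 2], [4, 2]])
x = get_max_freq_elem_of_df(df)
print(x)
```

Two caveats: if several combinations are tied for the highest count, idxmax returns only the first one it finds, and groupby by default leaves out rows whose key columns contain NaN.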
