Most frequent *rolling* value on an object/str pandas DataFrame column

Question

I would like to have a new column with the most frequent rolling value from another str/object column.

          date     name state
0   2024-02-29    Alice    CA
1   2024-02-27      Bob    HI
2   2024-02-29    Cindy    ID
3   2024-02-25      Dan    MT
4   2024-02-29  Elliott    CA
..       ...        ...   ...

I am trying to get the most frequent rolling state (for each row).

I have tried different combinations (and subsets) of

.rolling()
.apply()
.mode()
mode() from different libraries
custom mode() function

which usually generate one of a handful of errors complaining that the column is non-numeric. I understand what the error is telling me (that it expects to aggregate with a numeric function such as .mean() or .sum()), but it's not even getting to the function passed to .apply():

def fail_now(x):
    raise Exception('wow! we made it here!')

>>> df['state'].rolling(window=25).apply(fail_now)
...
pandas.errors.DataError: No numeric types to aggregate

>>> df[['state']].rolling(window=25).apply(fail_now)
...
pandas.errors.DataError: Cannot aggregate non-numeric type: object

>>> df[['state']].rolling(window=25)['state'].apply(fail_now)
...
pandas.errors.DataError: No numeric types to aggregate

I also tried many other things, including the raw flag in .apply(), with no luck.
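For reference, one workaround that might apply here is to encode the strings as integer codes first, so that .rolling() only ever sees numbers, and then map the winning code back to its label. The following is a minimal sketch under that assumption, not a definitive solution: the helper name rolling_mode and the sample data are hypothetical, and ties are broken in favour of whichever value appears first in the column (a consequence of how pd.factorize numbers the codes and how argmax picks the first maximum).

```python
import numpy as np
import pandas as pd


def rolling_mode(series, window, min_periods=None):
    """Most frequent value per rolling window of a string/object Series.

    Hypothetical helper: strings are encoded as integer codes so that
    .rolling() only has to aggregate numeric data.
    """
    # factorize() numbers values by first appearance; missing values get -1.
    codes, uniques = pd.factorize(series)
    codes_s = pd.Series(codes, index=series.index, dtype="float64")

    def _mode_code(window_codes):
        # Drop the -1 placeholders for missing values before counting.
        valid = window_codes[window_codes >= 0].astype(np.int64)
        if valid.size == 0:
            return np.nan
        # argmax returns the lowest code among tied counts.
        return np.bincount(valid).argmax()

    rolled = codes_s.rolling(window=window, min_periods=min_periods).apply(
        _mode_code, raw=True
    )

    # Map the winning codes back to labels; incomplete windows stay NaN.
    out = np.full(len(series), np.nan, dtype=object)
    mask = rolled.notna().to_numpy()
    labels = np.asarray(uniques, dtype=object)
    out[mask] = labels[rolled.to_numpy()[mask].astype(np.int64)]
    return pd.Series(out, index=series.index, name=series.name)


# Hypothetical sample data, with a small window to keep the output readable.
df = pd.DataFrame({"state": ["CA", "HI", "CA", "MT", "MT", "MT"]})
df["state_mode"] = rolling_mode(df["state"], window=3)
print(df)
```

With window=3, the first two rows have no complete window and stay NaN. Passing min_periods=1 would fill them from partial windows instead.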
