Reputation: 49
I've got a df with 10 columns which I'm running with streamlit to create a dashboard. One of the things I do with the df is use idxmin():
df.loc[df.groupby('ID').created_date.idxmin()]
In my original df there are multiple rows for each ID, so I'm using idxmin() to return only one row per ID: the oldest record. However, I keep getting the error TypeError: reduction operation 'argmin' not allowed for this dtype.
I've read up on it and it seems converting the ID column to a numeric dtype should work, since it's currently an object dtype. However, a lot of the IDs cannot be converted. For example, these are the first 5 IDs in my df:
ID
0 5F8306CE-5331-449F-9035-87D0C370E3A9
1 14720
2 FFDE5CB4-5DFD-48B7-8682-959124A11990
3 29927
4 00055450
The IDs that have hyphens throw up the error ValueError: Unable to parse string...
I also cannot change these IDs to get rid of the hyphens or anything, as they relate to real data. How else could I return one row for each ID based on the oldest created_date while keeping all 10 columns in the df?
Upvotes: 0
Views: 1688
Reputation: 580
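You need to make sure that the created_date column is not an object dtype. If it is, convert it to datetime. The error comes from the column that idxmin() is applied to, not from the ID column used for grouping, so the IDs can stay as they are.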
In order to recreate your issue, I used the following steps:
import pandas as pd
# list of dates
dts = ['2021-12-12', '2022-12-03', '2022-09-22', '2022-01-01', '2022-08-12']
# list of IDs (numeric and string)
ids = ['54-44-ff-12', 14729, 'FF-24-11-CD', 29927, '00055450']
# create a pandas dataframe with these values
df = pd.DataFrame(columns=['created_date', 'ID'])
df['created_date'] = dts
df['ID'] = ids
# check data types
print(df.dtypes)
>>> created_date object
>>> ID object
>>> dtype: object
# running `idxmin()` on this would throw an error
df.loc[df.groupby('ID').created_date.idxmin()]
>>> TypeError: reduction operation 'argmin' not allowed for this dtype
# let's change created_date to datetime
df['created_date'] = pd.to_datetime(df['created_date'])
# now `idxmin()` runs without any issue
df.loc[df.groupby('ID').created_date.idxmin()]
>>> created_date ID
>>> 0 2021-12-12 29927
>>> 2 2022-09-22 FF-24-11-CD
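The example above has only one row per ID, so here is a minimal sketch with repeated IDs, closer to the situation in the question. The sample data and the `value` column are made up for illustration (a stand-in for your other columns); only `ID` and `created_date` come from the question. It also shows `sort_values` plus `drop_duplicates` as a possible alternative to `idxmin()`, which should give the same rows.

```python
import pandas as pd

# Hypothetical sample data: several rows per ID, IDs of mixed types (str and int)
df = pd.DataFrame({
    "ID": ["5F83-A", 14720, "5F83-A", 14720, "00055450"],
    "created_date": ["2022-03-01", "2021-07-15", "2021-12-12", "2022-01-01", "2022-08-12"],
    "value": [1, 2, 3, 4, 5],  # stand-in for the remaining columns
})

# Convert only the date column; the ID column is left untouched as object dtype
df["created_date"] = pd.to_datetime(df["created_date"])

# idxmin() gives the index label of the earliest date within each ID group.
# sort=False keeps groups in order of first appearance, so the mixed
# int/str IDs never need to be ordered against each other.
oldest = df.loc[df.groupby("ID", sort=False)["created_date"].idxmin()]

# Alternative: sort by date, then keep the first row seen for each ID
oldest_alt = df.sort_values("created_date").drop_duplicates("ID", keep="first")

print(oldest)
print(oldest_alt)
```

Both results keep every column of the original df. If some date strings in the real data cannot be parsed, `pd.to_datetime(..., errors="coerce")` turns them into NaT instead of raising, though you would then want to check how many rows were affected.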
Upvotes: 1