BAC83
BAC83

Reputation: 891

Selecting multiple rows in a Pandas df based on an numpy array of datetime64[ns]

Headache time.

I have a dataframe where an important column is of the type 'datetime64[ns]'. There are >1500000 rows, the datetime column contains about 3000 unique 'start times' (runs of a piece of equipment).

After performing some groupbys and data analysis I have ID'd ~ 30 runs of interest (identifiable by their run/start_time).analysis_df.start_time.unique() creates a numpy array of these datetimes.

I would like to use to pull out the raw data for these ~30 runs, and was hoping to use isin a la Filter dataframe rows if value in column is in a set list of values, however I have fallen foul of ISIN function does not work for dates and I'm a bit stuck.

I've looked into converting formats: Converting between datetime, Timestamp and datetime64 but the fact that Wes refers to this aspect as 'Hell' is doesn't fill me with hope! https://stackoverflow.com/a/13753918/2658194

Any help on how to achieve this 'filter' would be greatly appreciated - if anyone has a different (better!) approach i'd be happy to change up.

Thanks

Upvotes: 0

Views: 133

Answers (1)

jpp
jpp

Reputation: 164823

ISIN function does not work for dates

That's not true. The question you have quoted is 5 years old. You can use a list, series or array. Here's a demonstration, tested on Pandas 0.19.2.

df = pd.DataFrame({'A': pd.to_datetime(['2018-01-01', '2018-11-30', '2018-12-24'])})

L = ['2018-01-01', '2018-12-24']
S = pd.to_datetime(L)
A = S.values

res1 = df[df['A'].isin(L)]
res2 = df[df['A'].isin(S)]
res3 = df[df['A'].isin(A)]

assert (res1.values == res2.values).all()
assert (res1.values == res3.values).all()

print(res1)

#            A
# 0 2018-01-01
# 2 2018-12-24

Upvotes: 2

Related Questions