Reputation: 355
I have a data frame created with Pandas. It has 3 columns. One of them has the date in the format %Y%m%d%H. I need to find the rows that match a date with the format %Y%m%d.
I tried
df.loc[df["MESS_DATUM"] == 20170807]
which doesn't work. Only when I do
df.loc[df["MESS_DATUM"] == 2017080723]
it works for that single line. But I need the other lines containing the date only (without the hour). I know there is something like .str.contains(""). Is there something similar for numeric values, or a way to use wildcards in the lines above?
Upvotes: 0
Views: 109
Reputation: 210882
We can "integer divide" the MESS_DATUM column by 100, which drops the two trailing hour digits:
df.loc[df["MESS_DATUM"]//100 == 20170807]
Demo:
In [29]: df
Out[29]:
MESS_DATUM
0 2017080719
1 2017080720
2 2017080721
3 2017080722
4 2017080723
In [30]: df.dtypes
Out[30]:
MESS_DATUM int64
dtype: object
In [31]: df["MESS_DATUM"]//100
Out[31]:
0 20170807
1 20170807
2 20170807
3 20170807
4 20170807
Name: MESS_DATUM, dtype: int64
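For reference, here is a minimal self-contained sketch of the integer-division filter. The sample values and the extra "value" column are made up for illustration; only MESS_DATUM comes from the question.

```python
import pandas as pd

# Hypothetical sample data: hours from three different days
df = pd.DataFrame({
    "MESS_DATUM": [2017080623, 2017080719, 2017080723, 2017080800],
    "value": [1.0, 2.0, 3.0, 4.0],  # illustrative column, not from the question
})

# Floor division by 100 strips the HH digits, leaving YYYYMMDD
matches = df.loc[df["MESS_DATUM"] // 100 == 20170807]
print(matches)
```

This only works while the column is an integer dtype and every value really has the ten-digit %Y%m%d%H layout.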
But I would consider converting it to datetime dtype:
df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"].astype(str), format='%Y%m%d%H')
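Once the column is datetime, selecting a whole day no longer needs any arithmetic. A small sketch, assuming made-up sample values, that compares the date part (via .dt.normalize(), which resets the time to midnight) against the wanted day:

```python
import pandas as pd

# Hypothetical sample data in the %Y%m%d%H integer layout
df = pd.DataFrame({"MESS_DATUM": [2017080622, 2017080719, 2017080723, 2017080800]})
df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"].astype(str), format="%Y%m%d%H")

# Keep every row that falls on 2017-08-07, whatever the hour
target = pd.Timestamp("2017-08-07")
same_day = df.loc[df["MESS_DATUM"].dt.normalize() == target]
print(same_day)
```

The same idea extends to date ranges with ordinary comparisons (>=, <) on the datetime column.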
If df["MESS_DATUM"] is of float dtype, then we can use the following trick:
In [41]: pd.to_datetime(df["MESS_DATUM"].astype(str).str.split('.').str[0],
format='%Y%m%d%H')
Out[41]:
0 2017-08-07 19:00:00
1 2017-08-07 20:00:00
2 2017-08-07 21:00:00
3 2017-08-07 22:00:00
4 2017-08-07 23:00:00
Name: MESS_DATUM, dtype: datetime64[ns]
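A float column often means there are missing values in it. The sketch below assumes that case (the sample values, including the NaN, are invented) and adds errors="coerce" so that anything unparseable becomes NaT instead of raising:

```python
import numpy as np
import pandas as pd

# Hypothetical float column with one missing value
df = pd.DataFrame({"MESS_DATUM": [2017080719.0, np.nan, 2017080723.0]})

# "2017080719.0" -> "2017080719"; missing values end up as NaT
as_text = df["MESS_DATUM"].astype(str).str.split(".").str[0]
parsed = pd.to_datetime(as_text, format="%Y%m%d%H", errors="coerce")
print(parsed)
```

If the column has no missing values, casting with .astype("int64") before .astype(str) is a simpler alternative to splitting on the dot.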
Upvotes: 2