BallerNacken

Reputation: 355

How to find a partial numeric value in column in Pandas?

I have a data frame created with Pandas. It has 3 columns. One of them has the date in the format %Y%m%d%H. I need to find the rows that match a date with the format %Y%m%d.

I tried

df.loc[df["MESS_DATUM"] == 20170807]

which returns no rows. Only when I include the hour, as in

df.loc[df["MESS_DATUM"] == 2017080723]

does it work, and then only for that single row. But I need all rows for that date, regardless of the hour. I know there is something like .str.contains(""). Is there something similar for numeric values, or a way to use wildcards in the lines above?

Upvotes: 0

Views: 109

Answers (1)

MaxU - stand with Ukraine

Reputation: 210882

We can integer-divide the MESS_DATUM column by 100, which drops the two trailing hour digits:

df.loc[df["MESS_DATUM"]//100 == 20170807]

Demo:

In [29]: df
Out[29]:
   MESS_DATUM
0  2017080719
1  2017080720
2  2017080721
3  2017080722
4  2017080723

In [30]: df.dtypes
Out[30]:
MESS_DATUM    int64
dtype: object

In [31]: df["MESS_DATUM"]//100
Out[31]:
0    20170807
1    20170807
2    20170807
3    20170807
4    20170807
Name: MESS_DATUM, dtype: int64
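
As a self-contained sketch of the filter itself, here is the integer-division approach on a frame that spans several days. The sample values are made up for illustration and are not from the question. A string-prefix comparison is shown next to it as a rough numeric counterpart to .str.contains; it assumes every value has exactly ten digits.

```python
import pandas as pd

# Hypothetical sample data spanning three days (not from the question).
df = pd.DataFrame(
    {"MESS_DATUM": [2017080623, 2017080700, 2017080712, 2017080723, 2017080800]}
)

# Integer division by 100 strips the two trailing hour digits,
# so every hour of a given day collapses to the same %Y%m%d integer.
by_division = df.loc[df["MESS_DATUM"] // 100 == 20170807]

# Alternative: cast to string and compare the leading eight characters.
by_prefix = df.loc[df["MESS_DATUM"].astype(str).str.startswith("20170807")]

print(by_division)
```

Both selections should pick the same rows here. The integer version avoids the string conversion, so it is probably the better default for an int64 column.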

But I would consider converting it to datetime dtype:

df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"].astype(str), format='%Y%m%d%H')
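
Once the column is datetime, selecting a whole day could look like the sketch below. The sample values and the `target` name are assumptions for illustration. `.dt.normalize()` resets each timestamp to midnight, so it can be compared against a single date.

```python
import pandas as pd

# Hypothetical sample data (not from the question).
df = pd.DataFrame({"MESS_DATUM": [2017080619, 2017080700, 2017080723, 2017080801]})

# Parse the %Y%m%d%H integers into proper timestamps.
df["MESS_DATUM"] = pd.to_datetime(df["MESS_DATUM"].astype(str), format="%Y%m%d%H")

# Compare the date part only: normalize() sets the time to 00:00:00.
target = pd.Timestamp("2017-08-07")
matches = df.loc[df["MESS_DATUM"].dt.normalize() == target]

print(matches)
```
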

If df["MESS_DATUM"] is of float dtype, then we can use the following trick:

In [41]: pd.to_datetime(df["MESS_DATUM"].astype(str).str.split('.').str[0], 
                        format='%Y%m%d%H')
Out[41]:
0   2017-08-07 19:00:00
1   2017-08-07 20:00:00
2   2017-08-07 21:00:00
3   2017-08-07 22:00:00
4   2017-08-07 23:00:00
Name: MESS_DATUM, dtype: datetime64[ns]
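
A float column often comes with missing values, and those turn into the string "nan", which the format above cannot parse. A minimal sketch of one way to handle that, assuming it is acceptable for unparseable entries to become NaT, is to pass errors="coerce". The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical float column with one missing value (not from the question).
df = pd.DataFrame({"MESS_DATUM": [2017080719.0, float("nan"), 2017080723.0]})

# "2017080719.0" -> "2017080719"; the missing value becomes the string "nan".
as_text = df["MESS_DATUM"].astype(str).str.split(".").str[0]

# errors="coerce" turns anything that does not match the format into NaT
# instead of raising.
parsed = pd.to_datetime(as_text, format="%Y%m%d%H", errors="coerce")

print(parsed)
```
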

Upvotes: 2
