dim_yf_95
dim_yf_95

Reputation: 59

Drop dataframe rows by comparing timestamps

I have a dataframe that looks like the one below. Since the dataframe has data from many different days, I want to keep only those with a specific date. So I tried to check %d, %m for days and months respectively. I used the code follows but it doesn't drop the rows I wanted to. Thanks in advance!

2018-08-29T00:03:09      12310                  
2018-08-23T00:08:10      21231              
2018-08-29T00:13:10      19.6                   
2018-08-29T00:18:10     19321                   
2018-08-30T00:23:10     182134                  
2018-02-29T00:28:10     172319                  
2018-09-19T00:33:10     1734                    
2018-08-29T00:38:10     1764                    
2018-08-29T00:43:10     169743                  
2018-11-17T00:48:10     16747                   
2018-08-29T00:53:10     17830

What I have so far

temp = df['Timestamps'][0]
tempTS = datetime.datetime.strptime(str(df['Timestamps'][0]), "%Y-%m-%dT%H:%M:%S")    
firstDay = tempTS.strftime("%d")
firstMonth = tempTS.strftime("%m")


for i in df['Timestamps']:
    timestamp = datetime.datetime.strptime(str(i), "%Y-%m-%dT%H:%M:%S")        
    if ((timestamp.strftime("%d") != firstDay) and (timestamp.strftime("%m") != firstMonth):            
        df.drop(df.index[i])

Upvotes: 0

Views: 633

Answers (2)

Trenton McKinney
Trenton McKinney

Reputation: 62393

Selecting dates by Boolean

  • Use pandas.Series.dt.
    • The .dt accessor has many methods, which makes it easy to select specific components of a datetime value.
import pandas as pd

# create dataframe
data = {'DateTime': ['2018-08-29T00:03:09', '2018-08-23T00:08:10', '2018-08-29T00:13:10', '2018-08-29T00:18:10', '2018-08-30T00:23:10', '2018-02-28T00:28:10', '2018-09-19T00:33:10', '2018-08-29T00:38:10', '2018-08-29T00:43:10', '2018-11-17T00:48:10', '2018-08-29T00:53:10'],
        'Value': [12310.0, 21231.0, 19.6, 19321.0, 182134.0, 172319.0, 1734.0, 1764.0, 169743.0, 16747.0, 17830.0]}

df = pd.DataFrame(data)

# convert to datetime
df.DateTime = pd.to_datetime(df.DateTime, format='%Y-%m-%dT%H:%M:%S')

# conditions
first_day = df.DateTime.dt.day[0]
first_month = df.DateTime.dt.month[0]

# select rows not equal to conditions
df[(df.DateTime.dt.month != first_month) & (df.DateTime.dt.day != first_day)]

             DateTime     Value
5 2018-02-28 00:28:10  172319.0
6 2018-09-19 00:33:10    1734.0
9 2018-11-17 00:48:10   16747.0

# select rows equal to conditions
df[(df.DateTime.dt.month == first_month) & (df.DateTime.dt.day == first_day)].to_clipboard(sep='\\s+')

              DateTime     Value
0  2018-08-29 00:03:09   12310.0
2  2018-08-29 00:13:10      19.6
3  2018-08-29 00:18:10   19321.0
7  2018-08-29 00:38:10    1764.0
8  2018-08-29 00:43:10  169743.0
10 2018-08-29 00:53:10   17830.0

Upvotes: 1

Milad Yousefi
Milad Yousefi

Reputation: 309

df.drop(df.index[i])

is not correct for dropping row. you're passing a date in string as i. you should write

 df.drop(df[df['Timestamps'] == i].index[0])

don't forget to set inplace = True if you want to completely remove row from datafame. so the complete line of code is:

df.drop(df[df['Timestamps'] == i].index[0],inplace=True)

Upvotes: 0

Related Questions