Reputation: 43840
I'd like to filter out weekend data and only look at data for weekdays (mon(0)-fri(4)). I'm new to pandas; what's the best way to accomplish this?
import datetime
from pandas import *
data = read_csv("data.csv")
data.my_dt
Out[52]:
0 2012-10-01 02:00:39
1 2012-10-01 02:00:38
2 2012-10-01 02:01:05
3 2012-10-01 02:01:07
4 2012-10-01 02:02:03
5 2012-10-01 02:02:09
6 2012-10-01 02:02:03
7 2012-10-01 02:02:35
8 2012-10-01 02:02:33
9 2012-10-01 02:03:01
10 2012-10-01 02:08:53
11 2012-10-01 02:09:04
12 2012-10-01 02:09:09
13 2012-10-01 02:10:20
14 2012-10-01 02:10:45
...
I'd like to do something like:
weekdays_only = data[data.my_dt.weekday() < 5]
AttributeError: 'numpy.int64' object has no attribute 'weekday'
but this doesn't work; I haven't quite grasped how datetime objects in a column are accessed.
The eventual goal is to arrange the data hierarchically by weekday and hour range, something like:
monday, 0-6, 7-12, 13-18, 19-23
tuesday, 0-6, 7-12, 13-18, 19-23
Upvotes: 16
Views: 19820
Reputation: 8683
A faster way would be to use DatetimeIndex.weekday, like so:
import pandas as pd
temp = pd.DatetimeIndex(data['my_dt'])  # build a DatetimeIndex from the column
data['weekday'] = temp.weekday  # Monday=0 ... Sunday=6
This is much faster, especially for a large number of rows. For further info, check this answer.
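For completeness, here is a minimal self-contained sketch of this approach carried through to the filtering step. The sample frame is hypothetical and only stands in for data.csv; the column name my_dt is taken from the question.

```python
import pandas as pd

# Hypothetical sample standing in for data.csv; my_dt holds timestamp strings.
data = pd.DataFrame({
    "my_dt": [
        "2012-10-01 02:00:39",  # Monday
        "2012-10-05 13:15:00",  # Friday
        "2012-10-06 09:30:00",  # Saturday
        "2012-10-07 18:45:10",  # Sunday
    ]
})

# Build a DatetimeIndex from the column and read its vectorized weekday attribute.
temp = pd.DatetimeIndex(data["my_dt"])
data["weekday"] = temp.weekday  # Monday=0 ... Sunday=6

# Keep only Monday through Friday.
weekdays_only = data[data["weekday"] < 5]
```

In more recent pandas versions, pd.to_datetime(data["my_dt"]).dt.weekday should give the same values without building the intermediate index.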
Upvotes: 10
Reputation: 516
Your call to the function "weekday" does not work because it operates on the index of data.my_dt, which is an int64 array (this is where the error message comes from).
You could create a new column in data containing the weekdays using something like:
data['weekday'] = data['my_dt'].apply(lambda x: x.weekday())
Then you can filter for weekdays with:
weekdays_only = data[data['weekday'] < 5]
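As for the eventual goal of grouping by weekday and hour range, one possible sketch follows. It assumes my_dt has already been parsed to datetimes, uses a hypothetical sample frame in place of data.csv, and bins hours with pd.cut, which is my own choice rather than something from the question. The hour_range column name is also just illustrative.

```python
import pandas as pd

# Hypothetical sample standing in for data.csv, already parsed to datetimes.
data = pd.DataFrame({
    "my_dt": pd.to_datetime([
        "2012-10-01 02:00:39",  # Monday
        "2012-10-01 08:10:00",  # Monday
        "2012-10-02 14:30:00",  # Tuesday
        "2012-10-06 20:00:00",  # Saturday, dropped below
    ])
})

# Row-wise weekday, as described above (Monday=0 ... Sunday=6).
data["weekday"] = data["my_dt"].apply(lambda x: x.weekday())
weekdays_only = data[data["weekday"] < 5].copy()

# Bin the hour of day into the ranges from the question.
# Bin edges are right-inclusive: (-1, 6], (6, 12], (12, 18], (18, 23].
hours = weekdays_only["my_dt"].apply(lambda x: x.hour)
weekdays_only["hour_range"] = pd.cut(
    hours,
    bins=[-1, 6, 12, 18, 23],
    labels=["0-6", "7-12", "13-18", "19-23"],
)

# Count rows per (weekday, hour range); the result has a two-level index.
counts = weekdays_only.groupby(["weekday", "hour_range"], observed=True).size()
```

The .copy() is there so that adding hour_range does not trigger a chained-assignment warning on the filtered frame.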
I hope this helps.
Upvotes: 28