Reputation: 399
I have a DataFrame like this:
Date X
....
2014-01-02 07:00:00 16
2014-01-02 07:15:00 20
2014-01-02 07:30:00 21
2014-01-02 07:45:00 33
2014-01-02 08:00:00 22
....
2014-01-02 23:45:00 0
....
1) My "Date" column is a datetime and has a value for every 15 minutes of the day.
What I want is to remove ALL rows where the time is NOT between 08:00 and 18:00.
2) Some days are missing in the data. How can I add the missing days to my DataFrame and fill them with the value 0 for X?
My approach: create a new Series between two dates with a 15-minute frequency and concat my X column with the newly created Series. Is that right?
Edit: Problem with my second question:
#create new full DF without missing dates and reindex
full_range = pandas.date_range(start='2014-01-02', end='2017-11-14', freq='15min')
df = df.reindex(full_range,fill_value=0)
df.head()
Output:
Date X
2014-01-02 00:00:00 1970-01-01 0
2014-01-02 00:15:00 1970-01-01 0
2014-01-02 00:30:00 1970-01-01 0
2014-01-02 00:45:00 1970-01-01 0
2014-01-02 01:00:00 1970-01-01 0
That didn't work, as you can see.
The "Date" column is not an index, by the way; I need it as a column in my df.
And why did it use "1970-01-01"? 1970 as the year makes no sense to me.
Upvotes: 0
Views: 281
Reputation: 40878
What I want is to remove ALL Rows where the time is NOT between 08:00 and 18:00 o'clock.
Create a mask with datetime.time. Example:
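from datetime import time
import numpy as np
import pandas as pd

# Example data: 15-minute index with placeholder (uninitialized) values
idx = pd.date_range('2014-01-02', freq='15min', periods=10000)
df = pd.DataFrame({'x': np.empty(idx.shape[0])}, index=idx)
t1 = time(8)
t2 = time(18)
times = df.index.time  # NumPy array of datetime.time objects
# Use >= and <= so rows at exactly 08:00 and 18:00 are kept
mask = (times >= t1) & (times <= t2)
df = df.loc[mask]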
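Since the question says "Date" is a regular column rather than the index, here is a minimal sketch of the same filter under that assumption. The sample data and the names `tod`, `filtered` and `filtered_bt` are made up for illustration. It shows two variants: a mask built from the column's `.dt.time`, and `between_time` on a temporary index, which includes both endpoints by default.

```python
from datetime import time

import pandas as pd

# Hypothetical sample data: one day of 15-minute readings, "Date" as a column.
dates = pd.date_range("2014-01-02", periods=96, freq="15min")
df = pd.DataFrame({"Date": dates, "X": range(96)})

# Variant 1: boolean mask on the time of day of each row.
tod = df["Date"].dt.time
mask = (tod >= time(8)) & (tod <= time(18))
filtered = df.loc[mask]

# Variant 2: temporarily index by "Date", filter, then restore the column.
filtered_bt = (
    df.set_index("Date")
      .between_time("08:00", "18:00")
      .reset_index()
)
```

Both variants keep "Date" as a column. The second one returns a fresh 0-based index, while the first keeps the original row labels.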
Some days are missing in the data...how could I put the missing days in my DataFrame and fill them with the value 0 as X?
Build the full range with pd.date_range() (see above), then call .reindex() on df and specify fill_value=0.
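A note on the attempt in the question's edit: reindex matches on the index, not on the "Date" column. With the default integer index, none of the datetime labels match, so every row is most likely treated as new and filled with 0 in both columns, and 0 in a datetime column displays as 1970-01-01 (the Unix epoch). A minimal sketch of one way around this, assuming "Date" must stay a column; the sample data and the name `filled` are hypothetical:

```python
import pandas as pd

# Hypothetical data with a gap: 2014-01-03 is missing entirely.
dates = pd.date_range("2014-01-02", periods=96, freq="15min").append(
    pd.date_range("2014-01-04", periods=96, freq="15min")
)
df = pd.DataFrame({"Date": dates, "X": 1})

# Every 15-minute slot between the first and last timestamp.
full_range = pd.date_range(df["Date"].min(), df["Date"].max(), freq="15min")

filled = (
    df.set_index("Date")                   # reindex aligns on the index
      .reindex(full_range, fill_value=0)   # missing slots get X = 0
      .rename_axis("Date")                 # restore the index name
      .reset_index()                       # turn "Date" back into a column
)
```

Only X receives the fill value here, because "Date" is the index while the reindex runs.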
Answering your questions in comments:
np.empty creates an uninitialized array. I was just using it to build some "example" data that is basically garbage. Here idx.shape is the shape of your index, a tuple; for a one-dimensional index it is (length,). So np.empty(idx.shape[0]) creates an uninitialized 1d array with the same length as idx.
times = df.index.time creates a variable (a NumPy array) called times. df.index.time is the time of day for each element in the index of df. You can explore this yourself by breaking the code down into pieces and experimenting with it on your own.
Upvotes: 2