Reputation: 1939
I have the following DataFrame:
>>> a = pd.DataFrame({'values':[random.randint(-10,10) for i in range(10)]})
>>> a
values
0 -3
1 -8
2 -2
3 3
4 8
5 6
6 -5
7 0
8 8
9 -4
I would like to replace its index with datetime values. I am trying to do that with the following code:
>>> times = [datetime.datetime(2018,1,2,12,40,0) + datetime.timedelta(seconds=i) for i in range(10)]
>>> times
[datetime.datetime(2018, 1, 2, 12, 40), datetime.datetime(2018, 1, 2, 12, 40, 1), datetime.datetime(2018, 1, 2, 12, 40, 2), datetime.datetime(2018, 1, 2, 12, 40, 3), datetime.datetime(2018, 1, 2, 12, 40, 4), datetime.datetime(2018, 1, 2, 12, 40, 5), datetime.datetime(2018, 1, 2, 12, 40, 6), datetime.datetime(2018, 1, 2, 12, 40, 7), datetime.datetime(2018, 1, 2, 12, 40, 8), datetime.datetime(2018, 1, 2, 12, 40, 9)]
>>> a.reindex(times)
values
2018-01-02 12:40:00 NaN
2018-01-02 12:40:01 NaN
2018-01-02 12:40:02 NaN
2018-01-02 12:40:03 NaN
2018-01-02 12:40:04 NaN
2018-01-02 12:40:05 NaN
2018-01-02 12:40:06 NaN
2018-01-02 12:40:07 NaN
2018-01-02 12:40:08 NaN
2018-01-02 12:40:09 NaN
As you can see, reindex discards the values I had and fills the column with NaN instead. How can I change the index of this DataFrame so that it looks like this:
values
2018-01-02 12:40:00 -3
2018-01-02 12:40:01 -8
2018-01-02 12:40:02 -2
2018-01-02 12:40:03 3
2018-01-02 12:40:04 8
2018-01-02 12:40:05 6
2018-01-02 12:40:06 -5
2018-01-02 12:40:07 0
2018-01-02 12:40:08 8
2018-01-02 12:40:09 -4
Upvotes: 4
Views: 2370
Reputation: 298
Code
import random
import datetime
import pandas as pd
a = pd.DataFrame({'values':[random.randint(-10,10) for i in range(10)]})
a['times'] = [datetime.datetime(2018,1,2,12,40,0) + datetime.timedelta(seconds=i) for i in range(10)]
a = a.set_index('times')
Result
times values
2018-01-02 12:40:00 -2
2018-01-02 12:40:01 -3
2018-01-02 12:40:02 5
2018-01-02 12:40:03 -9
2018-01-02 12:40:04 -6
2018-01-02 12:40:05 2
2018-01-02 12:40:06 1
2018-01-02 12:40:07 -1
2018-01-02 12:40:08 5
2018-01-02 12:40:09 3
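As for why the original call returned only NaN: reindex does not relabel rows. It looks up each new label in the existing index, and none of the timestamps exist in 0..9, so every row comes back missing. A minimal sketch of the difference, using the fixed values from the question instead of random ones (the names reindexed and relabeled are only for illustration):

```python
import datetime
import pandas as pd

# Fixed values copied from the question, so the result is reproducible
values = [-3, -8, -2, 3, 8, 6, -5, 0, 8, -4]
a = pd.DataFrame({'values': values})
times = [datetime.datetime(2018, 1, 2, 12, 40, 0) + datetime.timedelta(seconds=i)
         for i in range(10)]

# reindex matches the new labels against the old index (0..9);
# no timestamp is found there, so every value becomes NaN
reindexed = a.reindex(times)

# Adding the timestamps as a column and promoting it to the index
# relabels the rows by position and keeps the data
relabeled = a.assign(times=times).set_index('times')
```

Here assign is used instead of a['times'] = ... only to leave the original frame untouched; the effect on the result is the same.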
Upvotes: 2
Reputation: 25249
As long as times has the same length as the DataFrame (len(df); here df is the question's a), you can pass it to set_index:
df = df.set_index([times])
Out[64]:
values
2018-01-02 12:40:00 -3
2018-01-02 12:40:01 -8
2018-01-02 12:40:02 -2
2018-01-02 12:40:03 3
2018-01-02 12:40:04 8
2018-01-02 12:40:05 6
2018-01-02 12:40:06 -5
2018-01-02 12:40:07 0
2018-01-02 12:40:08 8
2018-01-02 12:40:09 -4
Or assign it directly to the index:
In [67]: df.index = times
In [68]: df
Out[68]:
values
2018-01-02 12:40:00 -3
2018-01-02 12:40:01 -8
2018-01-02 12:40:02 -2
2018-01-02 12:40:03 3
2018-01-02 12:40:04 8
2018-01-02 12:40:05 6
2018-01-02 12:40:06 -5
2018-01-02 12:40:07 0
2018-01-02 12:40:08 8
2018-01-02 12:40:09 -4
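If the timestamps are evenly spaced, as they are in the question, the list can probably be built with pd.date_range instead of a comprehension. This is only a sketch under that assumption (one-second spacing, start time taken from the question); tying periods to len(df) keeps the two lengths in step:

```python
import pandas as pd

# Fixed values copied from the question
df = pd.DataFrame({'values': [-3, -8, -2, 3, 8, 6, -5, 0, 8, -4]})

# One timestamp per row, one second apart, starting at 12:40:00
times = pd.date_range('2018-01-02 12:40:00',
                      periods=len(df),
                      freq=pd.Timedelta(seconds=1))

# Direct assignment replaces the labels in place;
# pandas raises ValueError if the lengths differ
df.index = times
```

The result is a DatetimeIndex, so time-based selection such as df.loc['2018-01-02 12:40:03'] should work on it.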
Upvotes: 2