Reputation: 440
I have a dataframe like this:
Band1 lat1 lon1
latitude level longitude
41.0 1000 19.50 23.0 41.015335 19.548331
19.50 44.0 41.015335 19.565497
19.50 12.0 41.015335 19.582663
19.75 35.0 41.015335 19.668494
19.75 83.0 41.015335 19.685660
I want to add following column to multiIndex ( this is DatetimeIndex
Type ):
DatetimeIndex(['1979-01-01 00:00:00', '1979-01-01 01:00:00',
'1979-01-01 02:00:00', '1979-01-01 03:00:00',
'1979-01-01 04:00:00', '1979-01-01 05:00:00',
'1979-01-01 06:00:00', '1979-01-01 07:00:00',
'1979-01-01 08:00:00', '1979-01-01 09:00:00',
...
'2019-12-30 15:00:00', '2019-12-30 16:00:00',
'2019-12-30 17:00:00', '2019-12-30 18:00:00',
'2019-12-30 19:00:00', '2019-12-30 20:00:00',
'2019-12-30 21:00:00', '2019-12-30 22:00:00',
'2019-12-30 23:00:00', '2019-12-31 00:00:00'],
dtype='datetime64[ns]', length=179305, freq=None)
I tried procedure described here but it takes hours of looping without result ( probably due to large number of rows - 179305 in this case. Desired results would be:
Band1 lat1 lon1
latitude level longitude time
41.0 1000 19.50 '1979-01-01 00:00:00' 23.0 41.015335 19.548331
'1979-01-01 01:00:00' 23.0 41.015335 19.548331
'1979-01-01 02:00:00' 23.0 41.015335 19.548331
'1979-01-01 03:00:00' 23.0 41.015335 19.548331
'1979-01-01 04:00:00' 23.0 41.015335 19.548331
... ... ... ...
19.60 '1979-01-01 00:00:00' 44.0 41.015335 19.565497
'1979-01-01 01:00:00' 44.0 41.015335 19.565497
'1979-01-01 02:00:00' 44.0 41.015335 19.565497
'1979-01-01 03:00:00' 44.0 41.015335 19.565497
'1979-01-01 04:00:00' 44.0 41.015335 19.565497
... ....
19.65 12.0 41.015335 19.582663
19.75 35.0 41.015335 19.668494
19.75 83.0 41.015335 19.685660
... ... ...
46.5 850 23.00 1280.0 46.491333 23.015891
23.00 1390.0 46.491333 23.033057
23.00 1508.0 46.491333 23.050223
23.00 1519.0 46.491333 23.067389
23.00 1544.0 46.491333 23.084556
Main issue is speed, hence for
looping is not an option. Any help is appreciated.
Upvotes: 0
Views: 60
Reputation: 150735
You want append
option in set_index
:
# toy data
idx = pd.MultiIndex.from_arrays([list('aabbcc'), list('111111')], names=['x','y'])
df = pd.DataFrame(np.arange(18).reshape(-1,3),
index=idx,
columns=list('abc'))
times = [11,22]
# calculate multiplicity of the last index
multi = len(df.index)//len(times)
df = (df.assign(time=np.tile(times, multi)) # replace [0,1,2,3,4] with your datetime series
.set_index('time', append=True)
)
Output:
a b c
x y time
a 1 11 0 1 2
22 3 4 5
b 1 11 6 7 8
22 9 10 11
c 1 11 12 13 14
22 15 16 17
Upvotes: 2