Stacey
Stacey

Reputation: 5107

Shifting dataframe index to match another dataframe

I have two data-frames. I am trying to update one data-frame with another.

df1 looks like:

            col2  col3 dept
date                       
2020-05-08    29    21    A
2020-05-09    56    12    B
2020-05-10    82    15    C
2020-05-11    13     9    D
2020-05-12    35    13    E
2020-05-13    53    87    F
2020-05-14    25     9    G
2020-05-15    23    63    H

and df2 looks like:

            col2 dept
date                 
2020-05-08    64    A
2020-05-09    41    B
2020-05-10    95    C
2020-05-11    58    D
2020-05-12    89    E
2020-05-13    37    F
2020-05-14    24    G
2020-05-15    67    H

I can update df1 with df2 no problem when the dates in the index of both data-frames match using:

df1 = df1.set_index('dept', append=True)
df2 = df2.set_index('dept', append=True)
df1.update(df2)
df1 = df1.reset_index(level='dept')
df2 = df2.reset_index(level='dept')

The above brings dept into the index and I update df1 with df2

So col2 in df1 is updated with the output being:

           dept  col2  col3
date                       
2020-05-07    A    64    21
2020-05-08    B    41    12
2020-05-09    C    95    15
2020-05-10    D    58     9
2020-05-11    E    89    13
2020-05-12    F    37    87
2020-05-13    G    24     9
2020-05-14    H    67    63

However - when the index dates are different for example when df2 looks like:

            col2 dept
date                 
2020-05-07    64    A
2020-05-08    41    B
2020-05-09    95    C
2020-05-10    58    D
2020-05-11    89    E
2020-05-12    37    F
2020-05-13    24    G
2020-05-14    67    H

Using the above code df1 does not update with the values from df2. How can I update a date-frame with another data-frame when the index is different? A solution is to make the dates in df2 the same as df1 then that will be fine (but how would you do that).

Upvotes: 1

Views: 636

Answers (1)

Mayank Porwal
Mayank Porwal

Reputation: 34056

You want to update df1 from df2 on a combination of indexes(date and dept).

Your same code would have worked fine, only if you would have had matching indexes across dataframes.

The indexes for df1 after df1 = df1.set_index('dept', append=True) are:

In [160]: df1.index.tolist()                                                                                                                                                                                
Out[160]: 
[('2020-05-08', 'A'),
 ('2020-05-09', 'B'),
 ('2020-05-10', 'C'),
 ('2020-05-11', 'D'),
 ('2020-05-12', 'E'),
 ('2020-05-13', 'F'),
 ('2020-05-14', 'G'),
 ('2020-05-15', 'H')]

Taking your second df2 for example. Index for df2 after df2 = df2.set_index('dept', append=True) are:

In [159]: df2.index.tolist()                                                                                                                                                                                
Out[159]: 
[('2020-05-07', 'A'),
 ('2020-05-08', 'B'),
 ('2020-05-09', 'C'),
 ('2020-05-10', 'D'),
 ('2020-05-11', 'E'),
 ('2020-05-12', 'F'),
 ('2020-05-13', 'G'),
 ('2020-05-14', 'H')]

You can see that indexes of both dataframes do not match for any row. Hence, it did not update. If you would have had matching indexes, you would get desired output from your code itself.

Upvotes: 3

Related Questions