Reputation: 987
I have a series of dataframes containing rainfall data from a selection of rain gauges that were operational at overlapping times over the last twenty years. For example, the first worked between 2001 and 2004, the second between 2003 and 2008, and the third between 2007 and 2015. They all have the date as their index, but I can't figure out how to merge them while keeping all of the index values, even when I use the following, which I thought would work:
RG1_2 = RG1.merge(RG2, left_index=True, right_index=True)
I had expected this to produce a dataframe with an index running from 2001 to 2008 and two columns containing the recorded data. Instead, it returns 2003 to 2008, i.e. the index values from the second dataframe. Any ideas?
Many thanks in advance!
Upvotes: 1
Views: 2071
Reputation: 162
Instead of using
RG1_2 = RG1.merge(RG2, left_index=True, right_index=True)
try this:
RG1_2 = RG1.merge(RG2, how='outer', left_index=True, right_index=True)
That keeps the union of both indexes, so the result runs from 2001 to 2008.
Upvotes: 0
Reputation: 97
I think you should try a merge with an outer join:
result = pd.merge(RG1, RG2, on='date', how='outer')
and here is a link with some examples: pandas merge examples
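Note that on='date' only works if 'date' is a column or the name of the index in both dataframes. A minimal sketch of that case, assuming a reasonably recent pandas and using made-up values (rg1 and rg2 are hypothetical stand-ins for RG1 and RG2, and rename_axis is only there to give the index the name 'date'):

```python
import pandas as pd

# Hypothetical stand-ins for RG1 and RG2, with an unnamed date index.
rg1 = pd.DataFrame({"RG1": [1.0, 2.0]},
                   index=pd.to_datetime(["2003-01-01", "2004-01-01"]))
rg2 = pd.DataFrame({"RG2": [3.0, 4.0]},
                   index=pd.to_datetime(["2004-01-01", "2005-01-01"]))

# Give the index a name so that on='date' has something to refer to.
rg1 = rg1.rename_axis("date")
rg2 = rg2.rename_axis("date")

# Outer join on the index level called 'date'.
result = pd.merge(rg1, rg2, on="date", how="outer")
print(result)
```

If the index has no name, it is probably simpler to use left_index=True and right_index=True with how='outer' instead.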
Upvotes: 1
Reputation: 4607
I think you should try an outer join. The default merge is an inner join, so only the index values present in both dataframes are kept, which seems to be what is happening in your case.
RG1_2 = RG1.merge(RG2, left_index=True, right_index=True, how='outer')
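To see the difference between the two join types, here is a small sketch with made-up data: one value per year, with rg1 and rg2 as hypothetical stand-ins for RG1 and RG2 covering 2001 to 2004 and 2003 to 2008. The numbers are illustrative only, not real rainfall.

```python
import pandas as pd

# Hypothetical gauge 1: one reading per year, 2001 to 2004.
rg1 = pd.DataFrame(
    {"RG1": [10.0, 12.0, 9.0, 11.0]},
    index=pd.to_datetime(["2001-01-01", "2002-01-01",
                          "2003-01-01", "2004-01-01"]),
)

# Hypothetical gauge 2: one reading per year, 2003 to 2008.
rg2 = pd.DataFrame(
    {"RG2": [8.0, 7.0, 13.0, 14.0, 6.0, 5.0]},
    index=pd.to_datetime(["2003-01-01", "2004-01-01", "2005-01-01",
                          "2006-01-01", "2007-01-01", "2008-01-01"]),
)

# Default (inner) join: only dates present in both frames survive.
inner = rg1.merge(rg2, left_index=True, right_index=True)

# Outer join: the union of both indexes, with NaN where a gauge has no reading.
outer = rg1.merge(rg2, left_index=True, right_index=True, how="outer")

print(inner.index.min(), inner.index.max())
print(outer.index.min(), outer.index.max())
```

In the outer result, the RG2 column is NaN for the years before that gauge started and the RG1 column is NaN for the years after the first gauge stopped.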
Upvotes: 1