An old man in the sea.
An old man in the sea.

Reputation: 1528

Concatenating a list of Multi-indices pandas dataframe

I have a list whose elements are MultiIndex objects like:

MultiIndex([(48, 39),
        (48, 40),
        (48, 41),
        (48, 42),
        (48, 43),
        (49, 39),
        (49, 40),
        (49, 41),
        (49, 42),
        (49, 43)],
       )
MultiIndex([(48, 48),
            (48, 49),
            (49, 48),
            (49, 49)],
           )

I want to concatenate both, vertically, such that I have:

MultiIndex([(48, 39),
            (48, 40),
            (48, 41),
            (48, 42),
            (48, 43),
            (49, 39),
            (49, 40),
            (49, 41),
            (49, 42),
            (49, 43),
            (48, 48),
            (48, 49),
            (49, 48),
            (49, 49)],
           )

If possible, I would also like it to:

  1. contain only unique pairs(but a,b is different from b,a )
  2. and be ordered (the above is not ordered since two pairs starting with 48 show up after pairs with 49).

Upvotes: 0

Views: 229

Answers (3)

constantstranger
constantstranger

Reputation: 9379

Contrary to the comment on the answer by @krmogi indicating that it raised an exception, the proposed code works fine in my environment.

Here is the full sample code I used, in case errors encountered by others were down to variable initialization issues:

import pandas as pd
mi1 = pd.MultiIndex.from_tuples([(48, 39),
        (48, 40),
        (48, 41),
        (48, 42),
        (48, 43),
        (49, 39),
        (49, 40),
        (49, 41),
        (49, 42),
        (49, 43)]
    )
mi2 = pd.MultiIndex.from_tuples([(48, 48),
        (48, 49),
        (49, 48),
        (49, 49)]
    )

res = mi1.union(mi2, sort=True)

Output:

MultiIndex([(48, 39),
            (48, 40),
            (48, 41),
            (48, 42),
            (48, 43),
            (48, 48),
            (48, 49),
            (49, 39),
            (49, 40),
            (49, 41),
            (49, 42),
            (49, 43),
            (49, 48),
            (49, 49)],
           )

Upvotes: 0

constantstranger
constantstranger

Reputation: 9379

Try this for two MultiIndex objects mi1 and mi2:

import pandas as pd
mi1 = pd.MultiIndex.from_tuples([(48, 39),
        (48, 40),
        (48, 41),
        (48, 42),
        (48, 43),
        (49, 39),
        (49, 40),
        (49, 41),
        (49, 42),
        (49, 43)]
    )
mi2 = pd.MultiIndex.from_tuples([(48, 48),
        (48, 49),
        (49, 48),
        (49, 49)]
    )

res = ( pd.MultiIndex.from_frame(pd.concat([
    mi1.to_frame(index=False), mi2.to_frame(index=False)], ignore_index=True)
    .drop_duplicates().pipe(lambda d: d.sort_values(list(d.columns))))
    .set_names([None, None]) )

print(res)

Output:

MultiIndex([(48, 39),
            (48, 40),
            (48, 41),
            (48, 42),
            (48, 43),
            (48, 48),
            (48, 49),
            (49, 39),
            (49, 40),
            (49, 41),
            (49, 42),
            (49, 43),
            (49, 48),
            (49, 49)],
           )

The calls to the to_frame() method convert each index to a DataFrame object with columns labels equal to the levels in the original calling MultiIndex and values equal to the original's labels.

After concatenating the DataFrame objects vertically with pd.concat(), eliminating duplicate rows and sorting the DataFrame on all columns, we use from_frame() to convert back to a MultiIndex and set_names() to revert to unnamed MultiIndex levels as in OP's question.

Upvotes: 0

Krishay R.
Krishay R.

Reputation: 2814

You could use the Union method to concatenate both of the MultiIndex Objects, and then set sort to True to sort it correctly:

#m1 is the first object and m2 is the second
result = m1.union(m2, sort=True)

This outputs:

MultiIndex([(48, 39),
            (48, 40),
            (48, 41),
            (48, 42),
            (48, 43),
            (48, 48),
            (48, 49),
            (49, 39),
            (49, 40),
            (49, 41),
            (49, 42),
            (49, 43),
            (49, 48),
            (49, 49)],
           )

Upvotes: 1

Related Questions