I am reading a bunch of tables from an Excel file, all of which have different indices and multiple columns, and combining them on the index column.
What I end up with is a dataframe like the following, where each column has one entry for each index value, and where the indices are duplicated:
import pandas as pd
import numpy as np
nan = np.nan
df = pd.DataFrame([[1,   nan, nan, nan],
                   [nan, 2,   nan, nan],
                   [nan, nan, 3,   nan],
                   [nan, nan, nan, 4],
                   [5,   nan, nan, nan],
                   [nan, 6,   nan, nan]],
                  index=[1, 1, 1, 1, 2, 2])
print(df)
#      0    1    2    3
# 1  1.0  NaN  NaN  NaN
# 1  NaN  2.0  NaN  NaN
# 1  NaN  NaN  3.0  NaN
# 1  NaN  NaN  NaN  4.0
# 2  5.0  NaN  NaN  NaN
# 2  NaN  6.0  NaN  NaN
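The exact read/combine step isn't important for the question, but for concreteness: a row-wise pd.concat of per-sheet tables like the sketch below produces this shape (the sheet contents here are made up):
import pandas as pd

# made-up per-sheet tables; concatenating them row-wise repeats the index
# and scatters each table's columns into NaN-padded positions
tables = [
    pd.DataFrame({0: [1.0, 5.0]}, index=[1, 2]),
    pd.DataFrame({1: [2.0, 6.0]}, index=[1, 2]),
    pd.DataFrame({2: [3.0]}, index=[1]),
    pd.DataFrame({3: [4.0]}, index=[1]),
]
combined = pd.concat(tables)  # same shape as df above, row order aside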
What I then do to get unique indices and get rid of all the NaN values is the following:
df = df.stack().unstack()
print(df)
#      0    1    2    3
# 1  1.0  2.0  3.0  4.0
# 2  5.0  6.0  NaN  NaN
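The intermediate df.stack() is a Series keyed by (index, column) with the NaN entries dropped (under the default dropna behaviour of the pandas version I'm using), which is what lets unstack() collapse the duplicated index:
print(df.stack())
# 1  0    1.0
#    1    2.0
#    2    3.0
#    3    4.0
# 2  0    5.0
#    1    6.0
# dtype: float64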
This seems a bit awkward/inefficient, so my question is:
Is there a better alternative to .stack().unstack() for accomplishing this?
Upvotes: 1
Views: 324
Reputation: 18426
Just group the dataframe by the index (level 0), then call first(): for the sample data you have, the result is the same as what you get from the consecutive stack() and unstack().
>>> df.groupby(level=0).first()
     0    1    2    3
1  1.0  2.0  3.0  4.0
2  5.0  6.0  NaN  NaN
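One caveat, with an illustration of my own rather than from the question: the two approaches coincide only when each (index, column) pair holds at most one non-NaN value, as in the sample data. If a group has two values in the same column, first() keeps the first non-NaN value per column, while stack().unstack() fails because the stacked index then contains duplicates:
import pandas as pd
import numpy as np
nan = np.nan

# hypothetical frame where index 1 has two values in column 0
clash = pd.DataFrame([[1,   nan],
                      [9,   nan],
                      [nan, 2]],
                     index=[1, 1, 2])

print(clash.groupby(level=0).first())
#      0    1
# 1  1.0  NaN
# 2  NaN  2.0

# clash.stack().unstack() would instead raise a ValueError, because the
# stacked index contains the duplicate entry (1, 0) and cannot be reshaped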
Upvotes: 1