Reputation: 431
I am trying to build a new dataframe (which I initially set to a blank DF) from a group of Series. This is the code I have to get the Series:
all_keys = list(dict_months.keys())
for i in all_keys:
    for j in range(len(dict_months[i])):
        temp_num = df_mth_return.loc['1992-'+str(i),dict_months[i][j]]
        blank_df = blank_df.append(temp_num) # append Series to blank_df
Here is a sample of the resulting output, with each temp_num being a pandas Series:
Date
1992-02-03 -2.174845
Name: IBM US Equity, dtype: float64
Date
1992-02-03 0.878127
Name: MMM US Equity, dtype: float64
Date
1992-03-02 -3.884848
Name: IBM US Equity, dtype: float64
This is the result I get:
en IBM US Equity MMM US Equity IBM US Equity MMM US Equity IBM US Equity IBM US Equity
2/3/1992 -2.17485 0.878127 NaN all the way across >> NaN
3/2/1992 NaN NaN -3.88485 -2.47076 NaN across >>
1/2/1992 NaN NaN NaN NaN 1.123077 NaN across >>>>
7/1/1992 NaN NaN NaN NaN NaN -3.19279 3.091772 NaN across >>>>
4/1/1992 ETC.... DOWN
But I want the final dataframe to look like the following, so that columns with the same name are shown only once. Can someone help out? This is a small sample of blank_df; the real one has many more columns and rows.
IBM US Equity MMM US Equity
2/3/1992 -2.17485 0.878127
3/2/1992 -3.88485 -2.47076
1/2/1992 1.123077 NaN
7/1/1992 -3.19279 3.091772
4/1/1992 NaN 5.63469
5/1/1992 1.312976 2.867628
Upvotes: 2
Views: 89
Reputation: 862511
I believe you need groupby by columns, then apply a lambda function that uses bfill to back-fill the NaNs along each row and iloc to select the first column:
df = df.groupby(axis=1, level=0).apply(lambda x: x.bfill(axis=1).iloc[:, 0])
print (df)
IBM US Equity MMM US Equity
en
2/3/1992 -2.174850 0.878127
3/2/1992 -3.884850 -2.470760
1/2/1992 1.123077 NaN
7/1/1992 -3.192790 3.091772
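To try the idea end to end, here is a minimal sketch on a small frame with duplicated column names, using a few values from the question. Note that it is a swapped-in variant rather than the code above: it transposes and calls groupby(level=0).first(), on the assumption that you are on a recent pandas release where groupby(axis=1) is deprecated. first() keeps the first non-null value in each group, which gives the same result as back-filling along the row and taking the first column. The name collapsed is only illustrative.

```python
import numpy as np
import pandas as pd

# Frame shaped like the question's result: duplicated column names,
# with each value sitting in a different copy of its column.
cols = ["IBM US Equity", "MMM US Equity", "IBM US Equity",
        "MMM US Equity", "IBM US Equity"]
idx = pd.Index(["2/3/1992", "3/2/1992", "1/2/1992"], name="en")
data = [
    [-2.17485, 0.878127, np.nan, np.nan, np.nan],
    [np.nan, np.nan, -3.88485, -2.47076, np.nan],
    [np.nan, np.nan, np.nan, np.nan, 1.123077],
]
df = pd.DataFrame(data, index=idx, columns=cols)

# Transpose so the duplicated names become row labels, group equal labels,
# keep the first non-null value per group, then transpose back.
collapsed = df.T.groupby(level=0).first().T
print(collapsed)
```

Each date ends up with one column per equity name, and a date with no value for a given name stays NaN.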
Another solution uses numpy and Divakar's justify function, selecting only the first value of each row of the justified 2D array with [:, 0]:
f = lambda x: pd.Series(justify(x.values, invalid_val=np.nan, axis=1, side='left')[:, 0])
df = df.groupby(axis=1, level=0).apply(f)
print (df)
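The justify helper is not defined in this post. The sketch below is a simplified stand-in written for illustration only: justify_left is a hypothetical name, not Divakar's original function, and it handles just the case used here (NaN as the invalid value, justification to the left along rows). The sample array borrows values from the question, arranged arbitrarily.

```python
import numpy as np

def justify_left(a):
    """Push the non-NaN values of each row of a 2D float array to the left."""
    mask = ~np.isnan(a)
    # Sorting booleans puts False first; flipping moves the True slots left.
    justified_mask = np.flip(np.sort(mask, axis=1), axis=1)
    out = np.full(a.shape, np.nan)
    # Boolean indexing walks both masks in row-major order, and every row has
    # the same number of True entries in each mask, so values stay in their row.
    out[justified_mask] = a[mask]
    return out

arr = np.array([
    [np.nan, -3.88485, np.nan],
    [1.123077, np.nan, 3.091772],
    [np.nan, np.nan, np.nan],
])
justified = justify_left(arr)
first_valid = justified[:, 0]  # first non-NaN value per row, NaN if none
print(first_valid)
```

Taking [:, 0] after justification is what collapses each group of same-named columns into a single column in the answer's lambda.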
Upvotes: 1