Reputation: 1049
I am again manipulating dataframes. Here I concatenate multiple dataframes, using the row index as the common reference. Then I want to reorder the columns by "pairing" the first columns of each df together, then the second columns, and so on. All for the sake of data readability.
Here is my code:
df_list=[df_1,df_2,df_3]
return_df=pd.concat(df_list,axis=1, join='outer')
dfcolumns_list=[df_1.columns,df_2.columns,df_3.columns]
print (return_df.columns)
print(dfcolumns_list)
list_columns=np.array(list(zip(*dfcolumns_list))).reshape(1,-1)[0]
print (list_columns)
list_columns=np.array([x for x in zip(*dfcolumns_list)]).reshape(1,-1)[0]
print (list_columns)
return_df=return_df[list_columns]
My question is related to:
list_columns=np.array(list(zip(*dfcolumns_list))).reshape(1,-1)[0]
or alternatively
list_columns=np.array([x for x in zip(*dfcolumns_list)]).reshape(1,-1)[0]
It takes the list of column indexes, unpacks it into zip, which groups the first element of each column index into a tuple (then the second elements, and so on), converts the resulting list of tuples into an array, then reshapes it to get rid of the nested tuples, which would cause
return_df=return_df[list_columns]
to break. Finally, indexing with [0]
retrieves the flat array from the 1-by-N array produced by the reshape.
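To make the intermediate shapes concrete, here is a small trace of that one-liner. The column labels (a1, b1, ...) are made up and simply stand in for df_1.columns, df_2.columns and df_3.columns; this is a sketch of the steps described above, not the actual data.

```python
import numpy as np

# Hypothetical column labels standing in for df_1.columns, df_2.columns, df_3.columns.
cols_1 = ["a1", "b1"]
cols_2 = ["a2", "b2"]
cols_3 = ["a3", "b3"]
dfcolumns_list = [cols_1, cols_2, cols_3]

# zip groups the i-th label of every frame into one tuple.
zipped = list(zip(*dfcolumns_list))

# A list of 2 tuples with 3 labels each becomes a 2-D array of shape (2, 3).
as_array = np.array(zipped)

# reshape(1, -1) yields a single row of shape (1, 6), still 2-D.
one_row = as_array.reshape(1, -1)

# Indexing with [0] pulls out that row as a 1-D array of shape (6,).
list_columns = one_row[0]

print(as_array.shape, one_row.shape, list_columns.shape)
```

Reshaping with reshape(-1) (or calling ravel()) returns the 1-D array directly, so the trailing [0] is only needed because of the (1, -1) target shape.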
My question is: is there a less ugly way to do this? I like zip
and similar functions, but I hate having no simple means/trick to flatten the generated tuples/sublists for reordering purposes.
(It also occurred to me while writing this that I could maybe build the df differently, so I would also give points for that, but my main question is still how to do what I am doing more elegantly/with more Pythonic syntax.)
The [0]
at the end is the dirtiest part of all...
Upvotes: 0
Views: 1119
Reputation: 1152
You can zip all the column lists and then flatten the resulting tuples with a nested list comprehension, with no NumPy needed:
list_columns = [ col for cols in zip( *dfcolumns_list ) for col in cols ]
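As a quick check, here is a minimal self-contained sketch of that flattening. The column labels are invented and plain lists stand in for df_1.columns etc.; itertools.chain.from_iterable is shown as an equivalent alternative, not something from the question.

```python
from itertools import chain

# Hypothetical column labels standing in for df_1.columns, df_2.columns, df_3.columns.
cols_1 = ["a1", "b1"]
cols_2 = ["a2", "b2"]
cols_3 = ["a3", "b3"]
dfcolumns_list = [cols_1, cols_2, cols_3]

# Nested comprehension: the outer loop walks the zipped tuples,
# the inner loop walks the labels inside each tuple.
list_columns = [col for cols in zip(*dfcolumns_list) for col in cols]

# Same flattening using the standard library.
same_result = list(chain.from_iterable(zip(*dfcolumns_list)))

print(list_columns)
print(same_result)
```

The result is a plain list, which return_df[list_columns] accepts directly. Keep in mind that zip stops at the shortest input, so if the dataframes have different numbers of columns, the extra columns of the wider ones are dropped from the reordered result.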
Upvotes: 1