Satyajit Pattnaik

Reputation: 135

Convert list of dataframes to a single dataframe based on some logic

I have a list of dataframes, for example:

ListofDataframes[0]
Out[26]: 
                     Column1
LastNotify
2016-11-28 00:37:07     1

ListofDataframes[1]
Out[27]: 
                     Column2
LastNotify
2016-11-28 04:25:44     1

ListofDataframes[2]
Out[28]: 
                     Column3
LastNotify
2016-12-02 11:32:49     1
2016-12-02 11:34:19     0

My final output should be a single dataframe built with some logic. For example, with the 3 dataframes in my list above, the final result should be:

LastNotify     Column1     Column2     Column3
1                1          0             0
2                0          1             0
3                0          0             1
4                0          0             0

Here LastNotify runs from 1 to 4 because the list of dataframes contains 4 LastNotify values in total; Column1 is 1 in the 1st row, Column2 is 1 in the 2nd row, and so on.

One more requirement: while building the final dataframe, it should check the existing column names. If Column1 is already present in the final dataframe, it shouldn't create another column with the same name.

Upvotes: 1

Views: 4742

Answers (1)

jezrael

Reputation: 862511

I believe you need concat, then replace the NaNs with fillna and cast to ints, then sort_index, and drop the DatetimeIndex with reset_index. Finally, add the new column with insert:

import pandas as pd

# sample data matching the question
df1 = pd.DataFrame({'Column1': [1]}, index=pd.to_datetime(['2016-11-28 00:37:07']))
df2 = pd.DataFrame({'Column2': [1]}, index=pd.to_datetime(['2016-11-28 04:25:44']))
df3 = pd.DataFrame({'Column3': [1, 0]},
                   index=pd.to_datetime(['2016-12-02 11:32:49', '2016-12-02 11:34:19']))

ListofDataframes = [df1, df2, df3]
# stack the frames, fill the gaps with 0, order by timestamp, then drop the timestamps
df = pd.concat(ListofDataframes).fillna(0).astype(int).sort_index().reset_index(drop=True)
# number the rows from 1 in a new first column
df.insert(0, 'LastNotify', range(1, len(df) + 1))
print(df)
   LastNotify  Column1  Column2  Column3
0           1        1        0        0
1           2        0        1        0
2           3        0        0        1
3           4        0        0        0
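
For the requirement about existing column names, concat should already do the right thing: it aligns frames on column labels, so a name that appears in more than one dataframe ends up as a single column. Below is a minimal sketch of that, assuming the third dataframe reuses Column1; the helper name combine_frames and the variables a, b, c are only for illustration and are not part of the question.

```python
import pandas as pd


def combine_frames(frames):
    """Stack frames on their index, fill gaps with 0, and number the rows from 1."""
    out = (
        pd.concat(frames, sort=False)  # same-named columns are aligned, not duplicated
        .fillna(0)                     # cells a frame did not provide become 0
        .astype(int)
        .sort_index()                  # order rows by timestamp
        .reset_index(drop=True)        # discard the DatetimeIndex
    )
    out.insert(0, "LastNotify", range(1, len(out) + 1))
    return out


# Hypothetical input: the first and third frames share the name Column1.
a = pd.DataFrame({"Column1": [1]}, index=pd.to_datetime(["2016-11-28 00:37:07"]))
b = pd.DataFrame({"Column2": [1]}, index=pd.to_datetime(["2016-11-28 04:25:44"]))
c = pd.DataFrame({"Column1": [1, 0]},
                 index=pd.to_datetime(["2016-12-02 11:32:49", "2016-12-02 11:34:19"]))

result = combine_frames([a, b, c])
print(result)
```

With this input the result has only the columns LastNotify, Column1 and Column2, and the values from the first and third dataframes share the one Column1 column.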

Upvotes: 1
