Reputation: 135
I have a list of dataframes, for example:
ListofDataframes[0]
Out[26]:
                     Column1
LastNotify
2016-11-28 00:37:07        1

ListofDataframes[1]
Out[27]:
                     Column2
LastNotify
2016-11-28 04:25:44        1

ListofDataframes[2]
Out[28]:
                     Column3
LastNotify
2016-12-02 11:32:49        1
2016-12-02 11:34:19        0
My final output should be a single dataframe, built with some logic. For example, with 3 dataframes in my list, the final result should be:
LastNotify  Column1  Column2  Column3
1           1        0        0
2           0        1        0
3           0        0        1
4           0        0        0
where LastNotify runs from 1 to 4, since there are 4 LastNotify values in total across the list of dataframes, and Column1 is 1 in the 1st entry, Column2 is 1 in the 2nd entry, and so on.
One more requirement: while building the final dataframe, it should check for existing column names. If Column1 is already present in the final dataframe, it shouldn't create another column with the same name.
Upvotes: 1
Views: 4742
Reputation: 862511
I believe you need concat, replace the NaNs with fillna and cast to ints, then sort_index and remove the DatetimeIndex with reset_index. Finally, add the new column with insert:
import pandas as pd

# Sample frames, each indexed by its LastNotify timestamp
df1 = pd.DataFrame({'Column1': [1]}, index=pd.to_datetime(['2016-11-28 00:37:07']))
df2 = pd.DataFrame({'Column2': [1]}, index=pd.to_datetime(['2016-11-28 04:25:44']))
df3 = pd.DataFrame({'Column3': [1, 0]},
                   index=pd.to_datetime(['2016-12-02 11:32:49', '2016-12-02 11:34:19']))
ListofDataframes = [df1, df2, df3]
# Stack the frames, fill the gaps with 0, order by timestamp, drop the DatetimeIndex
df = pd.concat(ListofDataframes).fillna(0).astype(int).sort_index().reset_index(drop=True)
# Number the rows 1..n in a new leading column
df.insert(0, 'LastNotify', range(1, len(df) + 1))
print(df)
   LastNotify  Column1  Column2  Column3
0           1        1        0        0
1           2        0        1        0
2           3        0        0        1
3           4        0        0        0
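On the requirement about existing column names: as far as I can tell, concat already covers it, because it aligns the frames on column labels. A minimal sketch, assuming a hypothetical third frame that reuses the name Column1 (the names a, b, c and combined are only for illustration, they are not from the question):

```python
import pandas as pd

# Hypothetical frames for illustration: the third one reuses 'Column1'.
a = pd.DataFrame({'Column1': [1]}, index=pd.to_datetime(['2016-11-28 00:37:07']))
b = pd.DataFrame({'Column2': [1]}, index=pd.to_datetime(['2016-11-28 04:25:44']))
c = pd.DataFrame({'Column1': [1, 0]},
                 index=pd.to_datetime(['2016-12-02 11:32:49', '2016-12-02 11:34:19']))

# concat aligns on column labels, so the values from c land in the
# existing 'Column1' instead of creating a second column with that name.
combined = (pd.concat([a, b, c])
              .fillna(0)
              .astype(int)
              .sort_index()
              .reset_index(drop=True))
combined.insert(0, 'LastNotify', range(1, len(combined) + 1))

print(combined.columns.tolist())
# ['LastNotify', 'Column1', 'Column2']
```

Here the repeated name is merged into one column rather than duplicated. This only holds when each frame has unique column names of its own, which is the case in the question.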
Upvotes: 1