Reputation: 1080
I have a DataFrame as described below:
import pandas as pd
AIDList = ['ID1','ID2','ID3','ID4','ID5']
BIDList = ['ID1','ID2','ID3']
# DataFrame.append was removed in pandas 2.0, so the one-row frame is built directly
Dt_Frame = pd.DataFrame([{'Country': 'USA', 'Schedule': 'Daily', 'Date': '2016-12-07', 'Status': 'Active', 'AListIDs': AIDList, 'BListIDs': BIDList}])
I have to add a column Difference that shows the difference between the two columns AListIDs and BListIDs (built from AIDList and BIDList), which in this case is 'ID4', 'ID5'. I think sets can be used here, but I am not sure how to do it. Both columns hold values of type list. Also, how can I add one more column, Numb_Items, that gives the number of items in the list AIDList?
Upvotes: 0
Views: 361
Reputation: 2361
To add a new column, you can assign to it: Dt_Frame["newColumnName"] = value.
Regarding the set difference, your intuition is right. First, you can use apply to convert the lists into sets:
A = Dt_Frame["AListIDs"].apply(set)
B = Dt_Frame["BListIDs"].apply(set)
Then subtracting one Series from the other gives, row by row, the elements of the left-hand set that are missing from the right-hand set. That is:
A - B
0 {ID4, ID5}
dtype: object
B - A
0 {}
dtype: object
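The question also asks for a Difference column and a Numb_Items column. A minimal sketch of both follows, assuming the column names from the question. Sorting the difference back into a list is my own choice for readable output, not something the question requires.

```python
import pandas as pd

# Rebuild the one-row frame from the question.
Dt_Frame = pd.DataFrame([{
    "Country": "USA", "Schedule": "Daily", "Date": "2016-12-07", "Status": "Active",
    "AListIDs": ["ID1", "ID2", "ID3", "ID4", "ID5"],
    "BListIDs": ["ID1", "ID2", "ID3"],
}])

# Row-wise set difference (items in AListIDs that are not in BListIDs),
# sorted into a list so the stored value has a stable order.
differences = [
    sorted(set(a) - set(b))
    for a, b in zip(Dt_Frame["AListIDs"], Dt_Frame["BListIDs"])
]
Dt_Frame["Difference"] = pd.Series(differences, index=Dt_Frame.index)

# Number of items in each AListIDs list.
Dt_Frame["Numb_Items"] = Dt_Frame["AListIDs"].apply(len)

print(Dt_Frame[["Difference", "Numb_Items"]])
```

Wrapping the results in a Series aligned to Dt_Frame.index keeps each list as a single cell value.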
For the symmetric difference, we'll need A and B to be in the same DataFrame (either for the symmetric_difference method or for the ^ operator):
# We add two new columns
Dt_Frame["ASetIDs"] = A
Dt_Frame["BSetIDs"] = B
# apply works on columns by default; axis=1 passes each row instead, so no transpose is needed
Dt_Frame[["ASetIDs", "BSetIDs"]].apply(lambda x: x.ASetIDs.symmetric_difference(x.BSetIDs), axis=1)
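If you would rather not add helper columns, one possible alternative is to pair the two Series with zip and use the ^ operator of Python sets. This is only a sketch: A and B are rebuilt here with the values from the question so that the snippet runs on its own, and sym_diff is a name introduced for illustration.

```python
import pandas as pd

# Series of sets, equivalent to applying set to the two list columns.
A = pd.Series([{"ID1", "ID2", "ID3", "ID4", "ID5"}])
B = pd.Series([{"ID1", "ID2", "ID3"}])

# a ^ b is the symmetric difference of two Python sets:
# the elements that appear in exactly one of them.
sym_diff = pd.Series([a ^ b for a, b in zip(A, B)], index=A.index)

print(sym_diff)
```

The comprehension runs plain Python set operations row by row, so it behaves the same as calling symmetric_difference on each pair.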
Upvotes: 1