Reputation: 11696
I have a pandas DataFrame that looks like this:
asset, cusip, information1, information2, ...., information_n
1x4, 43942, 45, , NaN, , , NaN
1x4, 43942, NaN, , "hello", , NaN
1x4, 43942, NaN, , NaN, , "goodbye"
...
What I want is:
asset, cusip, information1, information2, ...., information_n
1x4, 43942, 45, , "hello", , , "goodbye"
...
Essentially, I want to collapse all rows with matching "asset" and "cusip" values into a single row, regardless of the other fields. Within each such group, each of the columns information1...information_n will have only one entry that's not NaN.
Note that some columns might be int, some strings, others floats, etc.
Upvotes: 1
Views: 42
Reputation: 38415
You can use groupby and first(), which returns the first non-NaN value in each column for every group. In your case, that is also the only non-NaN value.
df = df.groupby(['asset', 'cusip']).first().reset_index()
asset cusip information1 information2 information_n
0 1x4 43942 45 "hello" "goodbye"
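For anyone who wants to try this end to end, here is a minimal sketch that rebuilds a small frame like the one in the question and applies the same groupby. The sample values and the name `collapsed` are assumptions made for illustration; only three of the information columns are included.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data modeled on the question: each information
# column has exactly one non-NaN entry per (asset, cusip) group.
df = pd.DataFrame({
    "asset": ["1x4", "1x4", "1x4"],
    "cusip": [43942, 43942, 43942],
    "information1": [45, np.nan, np.nan],
    "information2": [np.nan, "hello", np.nan],
    "information_n": [np.nan, np.nan, "goodbye"],
})

# first() takes the first non-null value per column within each group.
# as_index=False keeps asset and cusip as ordinary columns, which has
# the same effect here as calling reset_index() afterwards.
collapsed = df.groupby(["asset", "cusip"], as_index=False).first()
print(collapsed)
```

Two things to keep in mind: a numeric column that contains NaN is stored as float, so information1 may display as 45.0 rather than 45, and groupby drops rows whose key columns are NaN by default (pass dropna=False if you need to keep them).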
Upvotes: 3