Reputation: 41
I currently have the following dataframe
| Number | Number | Name | Name | Country | Country |
| ------ | ------- | ---- | ---- | ------- | ------- |
| 1 | Nan | Bob | Nan | USA | Nan |
| 4 | Nan | Nan | Nan | India | Nan |
| Nan | 12 | Joe | Nan | Nan | China |
| Nan | 6 | Nan | Matt | Nan | Australia|
My question is, how do I create new Number, Name, Country columns (etc.) that just finds the like column and then sees which column is populated and pulls that value. Note in this example I only have two duplicate columns but would ideally like to be able to handle many different duplicate columns not just two.
So far I have the following code to identify the duplicate columns and then create a dictionary a key which is the desired column name and then the values are just the numbered original columns. (e.g. {'country': ['country_2', 'country_1']})
How can I now traverse through this dictionary and create a new "country" column that parses through all of the values in the list dictionary to find which column is populated?
Upvotes: 0
Views: 34
Reputation: 23166
Try with stack
, dropna
and unstack
:
>>> df.stack().dropna().unstack()
Country Name Number
0 USA Bob 1.0
1 India NaN 4.0
2 China Joe 12.0
3 Australia Matt 6.0
Upvotes: 1