Reputation: 707
I hope someone can help me. I have a dataframe with two columns (CONFIRM_STATUS and OUTCOME) whose combination determines the meaning of a third column (VALUE).
CONFIRM_STATUS has 4 unique values
result1 = df['CONFIRM_STATUS'].unique()
result1
array(['CONFIRMED', 'PROBABLE', 'SUSPECTED', 'TOTAL'], dtype=object)
OUTCOME has 2 unique values
result2 = df['OUTCOME'].unique()
result2
array(['CASE', 'DEATH'], dtype=object)
As a result, there are 8 unique combinations that directly determine what the number in the VALUE column means. I need to turn these combinations into 8 columns, so that each column holds the values for one combination. Roughly speaking: death, recovery,...
How can this be done with pandas? I know this is not very detailed, so here is a sample of the relevant fields.
EVENT_NAME SOURCE DATE_LOW DATE_HIGH DATE_REPORT DATE_TYPE SPATIAL_RESOLUTION AL0_CODE AL0_NAME AL1_CODE AL1_NAME AL2_NAME AL3_NAME LOCALITY_NAME LOCATION_TYPE CONFIRM_STATUS OUTCOME CUMULATIVE_FLAG VALUE
2752 nCoV_2019 WHO COVID-19 Overview 2020-01-03 2020-01-03 2020-01-03 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought CONFIRMED CASE False 0
2753 nCoV_2019 WHO COVID-19 Overview 2020-01-03 2020-01-03 2020-01-03 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought CONFIRMED CASE True 0
2754 nCoV_2019 WHO COVID-19 Overview 2020-01-03 2020-01-03 2020-01-03 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought CONFIRMED DEATH False 0
2755 nCoV_2019 WHO COVID-19 Overview 2020-01-03 2020-01-03 2020-01-03 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought CONFIRMED DEATH True 0
2756 nCoV_2019 WHO COVID-19 Overview 2020-01-03 2020-01-03 2020-01-03 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought PROBABLE CASE False 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4494958 nCoV_2019 WHO COVID-19 Overview 2020-11-22 2020-11-22 2020-11-22 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought SUSPECTED DEATH False 0
4494959 nCoV_2019 WHO COVID-19 Overview 2020-11-22 2020-11-22 2020-11-22 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought TOTAL CASE False 24581
4494960 nCoV_2019 WHO COVID-19 Overview 2020-11-22 2020-11-22 2020-11-22 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought TOTAL CASE True 2089329
4494961 nCoV_2019 WHO COVID-19 Overview 2020-11-22 2020-11-22 2020-11-22 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought TOTAL DEATH False 401
4494962 nCoV_2019 WHO COVID-19 Overview 2020-11-22 2020-11-22 2020-11-22 Authority notification AL0 RU Russian Federation NaN NaN NaN NaN NaN Clinical care sought TOTAL DEATH True 36179
Upvotes: 1
Views: 65
Reputation: 5648
I didn't rebuild your dataframe, but you should be able to just create 8 new columns as in this example (I only show two). You can get fancier by generating the combinations and building the columns programmatically, but if it's only eight, just code it simply.
df[['CASE_CONFIRMED', 'CASE_PROBABLE']] = ''
Once you have the columns, filter on the two source columns and set the matching new column equal to VALUE for the rows that match.
df.loc[(df['CONFIRM_STATUS'] == 'CONFIRMED') & (df['OUTCOME'] == 'CASE'), 'CASE_CONFIRMED'] = df['VALUE']
df.loc[(df['CONFIRM_STATUS'] == 'PROBABLE') & (df['OUTCOME'] == 'CASE'), 'CASE_PROBABLE'] = df['VALUE']
If that doesn't work, paste part of the dataset using df.head(15).to_json().
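If you do want the "fancier" version, here is a rough sketch that loops over every OUTCOME / CONFIRM_STATUS pair instead of writing eight lines by hand. The small dataframe is a made-up stand-in with only the three relevant columns (I have not run this on your data), and the OUTCOME_STATUS column naming is just my assumption. Note that it fills non-matching rows with NaN rather than '', which is a deliberate change so the new columns stay numeric.

```python
from itertools import product

import pandas as pd

# Toy stand-in for the real data (assumption): only the three relevant columns.
df = pd.DataFrame({
    'CONFIRM_STATUS': ['CONFIRMED', 'CONFIRMED', 'PROBABLE', 'SUSPECTED', 'TOTAL', 'TOTAL'],
    'OUTCOME': ['CASE', 'DEATH', 'CASE', 'DEATH', 'CASE', 'DEATH'],
    'VALUE': [0, 0, 0, 0, 24581, 401],
})

statuses = df['CONFIRM_STATUS'].unique()
outcomes = df['OUTCOME'].unique()

# One new column per (OUTCOME, CONFIRM_STATUS) pair, e.g. 'CASE_CONFIRMED'.
for outcome, status in product(outcomes, statuses):
    mask = (df['CONFIRM_STATUS'] == status) & (df['OUTCOME'] == outcome)
    # Series.where keeps VALUE where the mask is True and puts NaN elsewhere.
    df[f'{outcome}_{status}'] = df['VALUE'].where(mask)

print(df.filter(regex='^(CASE|DEATH)_').columns.tolist())
```

Each row ends up with a number in exactly one of the new columns and NaN in the others. If you would rather have one row per date with all eight numbers side by side, that is a pivot instead, and how to do it depends on which of the other columns you need to keep.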
Upvotes: 1