Reputation: 153
I need to reformat some data. I've never used pandas before, and could use some help.
I have two DataFrames: df1 dfTarget
df1 is the un-formatted data, dfTarget is how I need the data to be formatted
Based on a condition, I need one group of columns in df1 to be copied to certain columns in dfTarget. If the condition is false, I need another group of columns in df1 to be copied to certain columns in dfTarget.
Simplified df1:
city state condition city2 state2
0
1
2
3
4
Simplified dfTarget:
mCity mState
0
1
2
3
4
Basically, if the condition is true, I need to move 'city' and 'state' into 'mCity' and 'mState' respectively. If the condition is false, I need to move 'city2' and 'state2' into 'mCity' and 'mState'.
dfTarget is starting off empty, and needs to be filled row by row based on a bunch of conditions in df1.
I've never used pandas, and tried to research this myself, but got lost quickly in all the different methods. Please, what's the best way to do this?
Upvotes: 0
Views: 965
Reputation: 402814
It should be simple enough to conditionally assign the columns, assuming the indices and/or number of rows is the same.
If the condition comes from a column, you can try np.where
:
dfTarget[['mCity', 'mState']] = np.where(
df1[['condition']], df1[['city', 'state']], df1[['city2', 'state2']])
Minimal Example
df1 = pd.DataFrame({
'city': list('abc'),
'state': list('def'),
'condition': [True, False, True],
'city2': list('hij'),
'state2': list('klm')})
dfTarget = pd.DataFrame(index=df1.index, columns=['mCity', 'mState'])
dfTarget[['mCity', 'mState']] = np.where(
df1[['condition']], df1[['city', 'state']], df1[['city2', 'state2']])
mCity mState
0 a d
1 i l # comes from second group of columns
2 c f
Upvotes: 3