Reputation: 392
I have a DataFrame
>> test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4], 'C': [np.nan, np.nan, np.nan, np.nan], 'D': [np.nan, np.nan, np.nan, np.nan]})
A B C D
0 a 1
1 b 2
2 b 3
3 b 4
I also have a dictionary, where the b in input_b signifies that I'm only modifying rows where row.A == 'b'.
>> input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
How do I populate the DataFrame with values from the dictionary to get the following?
A B C D
0 a 1
1 b 2 Moon Elephant
2 b 3
3 b 4 Sun Mouse
Upvotes: 1
Views: 3968
Reputation: 186
Using apply
test['C'] = test['B'].map(input_b).apply(lambda x: x[0] if isinstance(x, list) else x)
test['D'] = test['B'].map(input_b).apply(lambda x: x[1] if isinstance(x, list) else x)
yields
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
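The two lines above run the same lookup twice. A minimal sketch of the same idea with a single map call, assuming every dictionary value is a list with one entry per target column (the loop and the mapped name are illustrative, not part of the original answer):

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({
    'A': ['a', 'b', 'b', 'b'],
    'B': [1, 2, 3, 4],
})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}

# Look up each B value once; rows with no matching key come back as NaN.
mapped = test['B'].map(input_b)

# Pull element i out of each list, leaving NaN where there was no match.
for i, col in enumerate(['C', 'D']):
    test[col] = mapped.apply(
        lambda x, i=i: x[i] if isinstance(x, list) else np.nan
    )

print(test)
```

Binding i as a default argument keeps each lambda tied to its own column position.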
Upvotes: 1
Reputation: 164623
You can use loc indexing after setting your index to B:
test = test.set_index('B')
# Pass the keys as a list; recent pandas versions reject a dict as a .loc indexer
test.loc[list(input_b), ['C', 'D']] = list(input_b.values())
test = test.reset_index()
print(test)
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
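One caveat with label-based loc assignment: a key that does not appear in B raises a KeyError. A minimal sketch that skips such keys, assuming that silently ignoring them is acceptable (key 9 and the present name are hypothetical, added only for illustration; C and D are created as object columns so that string assignment does not depend on dtype upcasting):

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({
    'A': ['a', 'b', 'b', 'b'],
    'B': [1, 2, 3, 4],
    'C': pd.Series([np.nan] * 4, dtype=object),
    'D': pd.Series([np.nan] * 4, dtype=object),
})
# Key 9 is a hypothetical extra entry with no matching row in B.
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse'], 9: ['Star', 'Cat']}

test = test.set_index('B')

# Keep only the keys that exist in the index, preserving dictionary order.
present = [k for k in input_b if k in test.index]

# One row of values per selected label, one value per target column.
test.loc[present, ['C', 'D']] = [input_b[k] for k in present]

test = test.reset_index()
print(test)
```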
Upvotes: 1
Reputation: 323226
Using update
test = test.set_index('B')
test.update(pd.DataFrame(input_b, index=['C', 'D']).T)
test = test.reset_index()
test
B A C D
0 1 a NaN NaN
1 2 b Moon Elephant
2 3 b NaN NaN
3 4 b Sun Mouse
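The frame passed to update can also be built without the transpose. A sketch using DataFrame.from_dict with orient='index', assuming each dictionary value has exactly two entries (the updates name is illustrative; C and D are created as object columns here to sidestep dtype upcasting when strings are written):

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({
    'A': ['a', 'b', 'b', 'b'],
    'B': [1, 2, 3, 4],
    'C': pd.Series([np.nan] * 4, dtype=object),
    'D': pd.Series([np.nan] * 4, dtype=object),
})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}

# Dictionary keys become the index, list entries become columns C and D.
updates = pd.DataFrame.from_dict(input_b, orient='index', columns=['C', 'D'])

test = test.set_index('B')
# update aligns on index and column labels and modifies test in place.
test.update(updates)
test = test.reset_index()
print(test)
```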
Upvotes: 1
Reputation: 722
This may not be the most efficient solution, but as far as I can tell it gets the job done:
import pandas as pd
import numpy as np
test = pd.DataFrame({'A': ['a', 'b', 'b', 'b'], 'B': [1, 2, 3, 4],
                     'C': [np.nan, np.nan, np.nan, np.nan],
                     'D': [np.nan, np.nan, np.nan, np.nan]})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}
# For each key, write its values into C and D on every row where B equals that key
for key, value in input_b.items():
    test.loc[test['B'] == key, ['C', 'D']] = value
print(test)
Yields:
A B C D
0 a 1 NaN NaN
1 b 2 Moon Elephant
2 b 3 NaN NaN
3 b 4 Sun Mouse
This will get slower as the dictionary input_b grows (more rows to update means more iterations of the for loop), but it should be relatively fast for a small input_b, even with a large test DataFrame.
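If the dictionary does grow large, one possible alternative is to replace the loop with a single join. This is a different technique from the loop above, sketched under the assumption that C and D start out empty and can simply be rebuilt (the lookup and result names are illustrative):

```python
import pandas as pd

test = pd.DataFrame({
    'A': ['a', 'b', 'b', 'b'],
    'B': [1, 2, 3, 4],
})
input_b = {2: ['Moon', 'Elephant'], 4: ['Sun', 'Mouse']}

# One row per dictionary key, indexed by that key.
lookup = pd.DataFrame.from_dict(input_b, orient='index', columns=['C', 'D'])

# Left-join on column B against the lookup index; unmatched rows get NaN.
result = test.join(lookup, on='B')
print(result)
```

The join does the matching in one pass instead of one boolean mask per key.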
This answer also assumes that the keys in the input_b dictionary refer to values of the B column in the original DataFrame. If a value is repeated in the B column, the same C and D values are written to every matching row.
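The question also restricts the update to rows where A is 'b'. A sketch of the same loop with that condition added to the mask, using hypothetical data in which B repeats and one of the repeats sits in an 'a' row (C and D are created as object columns to avoid relying on dtype upcasting):

```python
import numpy as np
import pandas as pd

# Hypothetical data: B == 2 appears four times, once in a row where A is 'a'.
test = pd.DataFrame({
    'A': ['a', 'b', 'b', 'b', 'b'],
    'B': [2, 2, 3, 2, 2],
    'C': pd.Series([np.nan] * 5, dtype=object),
    'D': pd.Series([np.nan] * 5, dtype=object),
})
input_b = {2: ['Moon', 'Elephant']}

for key, value in input_b.items():
    # Match on both conditions so the 'a' row is left untouched.
    mask = (test['A'] == 'b') & (test['B'] == key)
    test.loc[mask, ['C', 'D']] = value

print(test)
```

Every 'b' row with B equal to 2 receives the same pair of values, which is the repetition described above.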
Upvotes: 3