Reputation: 49
I have a pandas DataFrame:
import pandas as pd

sample_data = {'Sample': ['A', 'B', 'A', 'B'],
               'Surface': ['Top', 'Bottom', 'Top', 'Bottom'],
               'Intensity': [21, 32, 14, 45]}
sample_dataframe = pd.DataFrame(data=sample_data)
I also have a function that asks the user for input and uses it to create a 'Condition' column for each 'Sample':
def get_choice(df, column):
    # Prompt once per row and collect the answers
    user_input = []
    for i in df[column]:
        print('\n', i)
        user_input.append(input('Condition= '))
    df['Condition'] = user_input
    return df

get_choice(sample_dataframe, 'Sample')
This works, but the user is prompted once for every row in which a 'Sample' appears. That is not a problem in this example, where each sample has only two rows, but it gets tedious when the DataFrame is larger and many samples each occupy multiple rows.
How do I write a function that asks for input only once per 'Sample' and fills the 'Condition' column for every row that sample occupies?
I tried having the function return a dictionary and then using .apply() to apply it to the DataFrame, but it still asks for input for each instance of the 'Sample'.
Upvotes: 1
Views: 1100
Reputation: 195553
If I understand your question correctly, you want to get user input only once for each unique value and then create the 'Condition' column:
import pandas as pd

sample_data = {'Sample': ['A', 'B', 'A', 'B'],
               'Surface': ['Top', 'Bottom', 'Top', 'Bottom'],
               'Intensity': [21, 32, 14, 45]}
sample_dataframe = pd.DataFrame(data=sample_data)

def get_choice(df, column):
    # Ask once per unique value and remember the answer
    m = {}
    for v in df[column].unique():
        m[v] = input('Condition for [{}] = '.format(v))
    # Broadcast each answer to every row with that value
    df['Condition'] = df[column].map(m)
    return df

print(get_choice(sample_dataframe, 'Sample'))
Prints (for example):
Condition for [A] = 1
Condition for [B] = 2
  Sample Surface  Intensity Condition
0      A     Top         21         1
1      B  Bottom         32         2
2      A     Top         14         1
3      B  Bottom         45         2
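Note that input() always returns strings, so the 'Condition' values above are the strings '1' and '2', not integers. If you want to try the same idea without typing at a prompt (for example in a test), here is a minimal sketch of a variation. The name add_condition and its ask parameter are my own assumptions for illustration, not part of pandas: ask is any callable that takes the prompt text and returns an answer, and it defaults to input.

```python
import pandas as pd

def add_condition(df, column, ask=input):
    """Return a copy of df with a 'Condition' column, asking once per unique value."""
    # `ask` is a hypothetical hook: any callable taking a prompt and returning the answer
    mapping = {value: ask('Condition for [{}] = '.format(value))
               for value in df[column].unique()}
    # assign() returns a new DataFrame, so the caller's df is left unchanged
    return df.assign(Condition=df[column].map(mapping))

# Non-interactive usage: look the answers up in a dict instead of prompting
answers = {'Condition for [A] = ': 'control',
           'Condition for [B] = ': 'treated'}
sample_dataframe = pd.DataFrame({'Sample': ['A', 'B', 'A', 'B'],
                                 'Intensity': [21, 32, 14, 45]})
result = add_condition(sample_dataframe, 'Sample', ask=answers.get)
```

Calling add_condition(sample_dataframe, 'Sample') with no third argument behaves like the interactive version, except that it returns a new DataFrame rather than modifying the one passed in.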
Upvotes: 1