Reputation: 105
I have a dataframe with headers 'Category', 'Factor1', 'Factor2', 'Factor3', 'Factor4', 'UseFactorA', 'UseFactorB'.
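The values of 'UseFactorA' and 'UseFactorB' are each one of the strings ['Factor1', 'Factor2', 'Factor3', 'Factor4'], chosen based on the value in 'Category'.
I want to generate a column, 'Result', which for each row equals the value in the column named by 'UseFactorA' divided by the value in the column named by 'UseFactorB', i.e. dataframe[UseFactorA]/dataframe[UseFactorB].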
Take the below dataframe as an example:
Category  Factor1  Factor2  Factor3  Factor4  UseFactorA  UseFactorB
A         1        2        5        8        'Factor1'   'Factor3'
B         2        7        4        2        'Factor3'   'Factor1'
The 'Result' series should be [0.2, 2] (1/5 for row A, 4/2 for row B).
However, I cannot figure out how to use the values of UseFactorA and UseFactorB as column lookups to make this happen. If the columns to use were fixed, I would just write
df['Result'] = df['Factor1']/df['Factor2']
However, when I try
df['Result'] = df[df['UseFactorA']]/df[df['UseFactorB']]
I get the error
ValueError: Wrong number of items passed 3842, placement implies 1
Is there a method for doing what I am trying here?
Upvotes: 0
Views: 55
Reputation: 2579
Here's a one-liner:
df['Result'] = [df[df['UseFactorA'][x]][x] / df[df['UseFactorB'][x]][x] for x in range(len(df))]
It assumes the default integer index (0 to len(df) - 1). How it works:
df[df['UseFactorA']]
returns a DataFrame (one column for each row of df),
df[df['UseFactorA'][x]]
returns a Series (the single column named in row x), and
df[df['UseFactorA'][x]][x]
pulls a single value from that Series.
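To make the three indexing steps concrete, here is a small sketch that rebuilds the two-row example from the question and inspects what each step returns. The frame construction and the variable names (wide, column, value) are my own assumptions for illustration, and it relies on the default integer index.

```python
import pandas as pd

# Example frame from the question (default integer index assumed).
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

# Indexing with a Series of column names selects one column per row of df,
# so the result is a DataFrame with len(df) columns, not a single column.
wide = df[df["UseFactorA"]]

# Indexing with a single column name gives a Series ...
column = df[df["UseFactorA"][0]]

# ... and indexing that Series by row label gives one value.
value = df[df["UseFactorA"][0]][0]

print(wide.shape)    # (2, 2)
print(type(column))  # <class 'pandas.core.series.Series'>
print(value)         # 1
```

On a 3842-row frame the first step would presumably yield 3842 columns, which would match the "Wrong number of items passed 3842" message in the question.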
Upvotes: 1
Reputation: 2007
Probably not the prettiest solution (because of the iterrows), but what comes to mind is to iterate through the sets of factors and set the 'Result' value at each index:
for i, factors in df[['UseFactorA', 'UseFactorB']].iterrows():
    # Look up row i's value in each named column, then divide the two scalars.
    df.loc[i, 'Result'] = df.loc[i, factors['UseFactorA']] / df.loc[i, factors['UseFactorB']]
Edit:
Another option:
def factor_calc_for_row(row):
    factorA = row['UseFactorA']
    factorB = row['UseFactorB']
    return row[factorA] / row[factorB]

df['Result'] = df.apply(factor_calc_for_row, axis=1)
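If the frame is large, a vectorized variant may be worth trying instead of a row-wise apply. The sketch below is not one of the approaches above: it converts the factor columns to a NumPy array and picks one element per row by position. The names factor_cols, values, rows, idx_a and idx_b are mine, and it assumes every entry in UseFactorA/UseFactorB is one of the four factor column names.

```python
import numpy as np
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

factor_cols = ["Factor1", "Factor2", "Factor3", "Factor4"]
values = df[factor_cols].to_numpy()   # shape (n_rows, 4)
rows = np.arange(len(df))             # 0, 1, ..., n_rows - 1

# Translate each row's column name into a position within factor_cols.
# get_indexer returns -1 for names it cannot find, so unexpected names
# should be checked for before trusting the result.
lookup = pd.Index(factor_cols)
idx_a = lookup.get_indexer(df["UseFactorA"])
idx_b = lookup.get_indexer(df["UseFactorB"])

# Pick one numerator and one denominator per row, then divide elementwise.
df["Result"] = values[rows, idx_a] / values[rows, idx_b]

print(df["Result"].tolist())  # [0.2, 2.0]
```

This avoids calling a Python function once per row, at the cost of being a little less readable than the apply version.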
Upvotes: 1