Reputation: 23
I'm trying to do a rather simple loop in Python 3.6.1 that involves a list of strings. Essentially, I have a dataframe that looks like this:
      X_out  Y_out  Z_out  X_in  Y_in  Z_in
Year
1969      4      3      4     4     3     3
1970      2      0      1     3     2     2
1971      3      1      1     0     1     2
1972      2      0      0     3     1     0
and I'd like to find the net change of X, Y, and Z, making them new columns in this dataframe.
In its simplest form, this could be
df['x_net'] = df['x_in'] - df['x_out']
df['y_net'] = df['y_in'] - df['y_out']
df['z_net'] = df['z_in'] - df['z_out']
but in actuality, there are about fifteen columns that need to be created in this way. Since it'll be a bear, I figure it's best to put it in a function, or at least a loop. I made a list of our initial "root" variables, without the suffixes, that looks like this:
root_vars = ['x', 'y', 'z']
And I think that my code might(?) look something like:
for i in root_vars:
    df['%s_net'] = df['%s_in'] - df['%s_out'] %(root_vars_[i])
but that's definitely not right. Could someone give me a hand on this one please?
Thank you so much!
Upvotes: 2
Views: 86
Reputation: 7840
You can use the relatively new (Python 3.6) formatted string literals:
for i in root_vars:
    df[f'{i}_net'] = df[f'{i}_in'] - df[f'{i}_out']
The f prefix before each string causes the {i} to be replaced with the value of the variable i. If you want the code to be usable in Python versions before 3.6, you can go with the more usual formatting:
for i in root_vars:
    df['{}_net'.format(i)] = df['{}_in'.format(i)] - df['{}_out'.format(i)]
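For anyone who wants to try this end to end, here is a minimal runnable sketch built from the sample data in the question. It rests on two assumptions that are not confirmed by the question: the column names are lowercase so that they match root_vars, and the third column of the sample table is the "out" column for Z. The loop variable is called root here purely for readability.

```python
import pandas as pd

# Sample data from the question. Assumptions: lowercase column names
# (to match root_vars) and the third column being z_out.
df = pd.DataFrame(
    {
        "x_out": [4, 2, 3, 2],
        "y_out": [3, 0, 1, 0],
        "z_out": [4, 1, 1, 0],
        "x_in": [4, 3, 0, 3],
        "y_in": [3, 2, 1, 1],
        "z_in": [3, 2, 2, 0],
    },
    index=pd.Index([1969, 1970, 1971, 1972], name="Year"),
)

root_vars = ["x", "y", "z"]

# Build each column name from the root, then subtract "out" from "in".
for root in root_vars:
    df[f"{root}_net"] = df[f"{root}_in"] - df[f"{root}_out"]

print(df[["x_net", "y_net", "z_net"]])
```

If your real column names are capitalized as in the printed table, either put capitalized roots in root_vars or normalize the names first, for example with df.columns = df.columns.str.lower().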
Upvotes: 1