Reputation: 147
I have a list, list = ['OUT', 'IN'], where each element is a variable-name prefix in the data frame, with the suffixes _3M, _6M, _9M, and _15M attached to it.
List:
list = ['OUT', 'IN']
Input_df:
ID  OUT_3M  OUT_6M  OUT_9M  OUT_15M  IN_3M  IN_6M  IN_9M  IN_15M
A   2       3       4       6        2      3      4      6
B   3       3       5       7        3      3      5      7
C   2       3       6       6        2      3      6      6
D   3       3       7       7        3      3      7      7
What I am trying to do is:
1. Subtract OUT_3M from OUT_6M and store the result in a separate column, Out_3M-6M.
2. Subtract OUT_6M from OUT_9M and store the result in a separate column, Out_6M-9M.
3. Subtract OUT_9M from OUT_15M and store the result in a separate column, Out_9M-15M.
The same is repeated for every element in the list, while keeping OUT_3M and IN_3M, as shown in the sample Output_df dataset.
Output_df:
ID  Out_3M  Out_3M-6M  Out_6M-9M  Out_9M-15M  IN_3M  IN_3M-6M  IN_6M-9M  IN_9M-15M
A   2       1          1          2           2      1         1         2
B   3       0          2          2           3      0         2         2
C   2       1          3          0           2      1         3         0
D   3       0          4          0           3      0         4         0
There are many elements in the list that I need to perform this operation on. Is there any way I could solve this by writing a function? Thanks!
Upvotes: 0
Views: 57
Reputation: 1879
I'm not sure what you mean by writing a function; aren't a couple of for loops enough for what you want to do? Something like:
postfixes = ['3M', '6M', '9M', '15M']
prefixes = ['IN', 'OUT']
# Allocate the space, while also copying the _3M columns
output_df = input_df.copy()
# Rename each later column to its difference name, e.g. OUT_6M -> OUT_3M-6M
output_df.rename(columns={'_'.join((prefix, postfixes[i])): '_'.join((prefix, postfixes[i-1] + '-' + postfixes[i]))
                          for prefix in prefixes for i in range(1, len(postfixes))}, inplace=True)
# Compute the differences (later period minus earlier period, matching Output_df)
for prefix in prefixes:
    for i in range(1, len(postfixes)):
        # Same name as the renamed column, so the values are overwritten in place
        postfix = postfixes[i-1] + '-' + postfixes[i]
        output_df['_'.join((prefix, postfix))] = input_df['_'.join((prefix, postfixes[i]))].values - input_df['_'.join((prefix, postfixes[i-1]))].values
The output_df starts out as a copy of input_df, both to avoid dealing with the _3M case separately and to pre-allocate the DataFrame instead of creating the columns one at a time (it doesn't matter for your data, but if you had thousands of columns it would otherwise waste time moving data around in memory).
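Since the question asks for a function, the same copy-then-overwrite idea can be wrapped in one. This is only a minimal sketch: the name add_period_differences and its parameters are hypothetical, the sample frame is typed in from the question's Input_df, and the direction of the subtraction (later period minus earlier period) is inferred from the sample Output_df. The result uses the upper-case prefix from the list (OUT_3M-6M), not the Out_ spelling in the sample.

```python
import pandas as pd


def add_period_differences(df, prefixes, postfixes):
    """Hypothetical helper: replace each later-period column with its
    difference from the preceding period, keeping the first period as is."""
    out = df.copy()  # keeps ID and the first-period columns untouched
    for prefix in prefixes:
        for earlier, later in zip(postfixes, postfixes[1:]):
            new_name = f"{prefix}_{earlier}-{later}"
            # Rename in place so the column order of the input is preserved
            out = out.rename(columns={f"{prefix}_{later}": new_name})
            # Always read from the original df, so earlier results are not reused
            out[new_name] = df[f"{prefix}_{later}"] - df[f"{prefix}_{earlier}"]
    return out


# Sample data copied from the question's Input_df
input_df = pd.DataFrame({
    "ID": ["A", "B", "C", "D"],
    "OUT_3M": [2, 3, 2, 3], "OUT_6M": [3, 3, 3, 3],
    "OUT_9M": [4, 5, 6, 7], "OUT_15M": [6, 7, 6, 7],
    "IN_3M": [2, 3, 2, 3], "IN_6M": [3, 3, 3, 3],
    "IN_9M": [4, 5, 6, 7], "IN_15M": [6, 7, 6, 7],
})

output_df = add_period_differences(input_df, ["OUT", "IN"], ["3M", "6M", "9M", "15M"])
```

Adding more prefixes to the list passed in is then enough to cover more column families, as long as every prefix has a column for every suffix.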
Also, you should avoid calling a list "list": the name shadows the built-in, and you're going to get some nasty-to-find bugs along the way, for example when you try to convert a tuple to a list!
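To make the naming problem concrete, here is a small sketch of what goes wrong. The function name shadowing_demo is made up for illustration; the shadowing is confined to that function so the rest of the script is unaffected.

```python
def shadowing_demo():
    # Inside this function the name "list" now refers to a list object,
    # not to the built-in list type
    list = ['OUT', 'IN']
    try:
        list(('3M', '6M'))  # tries to call a list object
    except TypeError as err:
        return str(err)
    return None


error_message = shadowing_demo()

# With a different variable name, the built-in keeps working
prefixes = ['OUT', 'IN']
suffixes = list(('3M', '6M'))
```

Here error_message holds the TypeError text complaining that a 'list' object is not callable, while the last line converts the tuple as expected.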
Upvotes: 1