Reputation: 3
I'm new to Python/pandas and I'm running into an issue when creating new columns within a loop. I want to create a new column in each iteration of the loop and populate it with 1 (for yes) or 0 (for no) based on whether three other values in the dataframe are all equal to 1. This should loop 15 times over a total of 45 columns and produce 15 new columns labeled 'newCol' plus a number from 0 to 14 taken from the loop.
I want to create a new column every time I iterate through the loop and label it with its order # (the value of x as it runs through the loop) so I can track which columns have been checked against each other.
x = 0
label = 'newColumn',x
while x < 15:
    label = 'newCol',x
    #creates a new column with label that includes x
    #populates with 1 or 0
    df.loc[label] = np.where((df.iloc[:,x] == 1) & (df.iloc[:,x] == df.iloc[:,x+15]) & (df.iloc[:,x] == df.iloc[:,x+30]), 1, 0)
    #increment x
    x = x+1
This ends up producing the columns if I view them with .info(), but I cannot access them through any indexing moving forward.
Any help is much appreciated!! Thanks
Upvotes: 0
Views: 86
Reputation: 3305
The .loc property is for indexing/slicing and possibly changing preexisting values using boolean indexing. If you want to create a new column, I suggest doing it like this:
x = 0
label = 'newColumn',x
while x < 15:
    label = 'newCol',x
    #creates a new column with label that includes x
    #populates with 1 or 0
    # do this instead
    df[label] = np.where((df.iloc[:,x] == 1) & (df.iloc[:,x] == df.iloc[:,x+15]) & (df.iloc[:,x] == df.iloc[:,x+30]), 1, 0)
    #increment x
    x = x+1
Your original method was actually adding a new label to the row index, rather than a new column. To avoid that you can use the above method.
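A minimal sketch of the difference, using a made-up two-column frame (the data and the names rows_added and cols_added are assumptions for illustration, not taken from the question). With a single key, .loc treats the key as a row label, while plain [] assignment treats it as a column label:

```python
import pandas as pd

# Hypothetical toy frame, only for illustration.
df = pd.DataFrame({"a": [1, 0], "b": [1, 1]})

# A single key passed to .loc is interpreted as a ROW label,
# so assigning to a label that does not exist appends a row.
rows_added = df.copy()
rows_added.loc["newCol0"] = 0

# Plain [] assignment interprets the key as a COLUMN label,
# so assigning to a label that does not exist appends a column.
cols_added = df.copy()
cols_added["newCol0"] = 0

print(rows_added.shape)  # one more row than df
print(cols_added.shape)  # one more column than df
```

Comparing the two shapes shows which axis grew in each case.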
The next step, as mentioned by @alex-chojnacki, is that your label is a tuple and not a string, which could make it difficult to reference in your code.
Upvotes: 1
Reputation: 591
I believe your issue is that label = 'newCol',x does not actually produce a string, but a tuple. So the value of label, which you are using to index into the df, is actually ("newCol", 0) on the first iteration.
If you create the label as a string instead of a tuple, you should be okay. One way to do this would be label = "newCol" + str(x).
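Putting both suggestions together (string labels and [] assignment), the loop might look like the sketch below. The 3-row, 45-column frame and its column names c0 to c44 are assumptions made up so the example runs on its own; only the loop body reflects the approach discussed in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 3 rows, 45 columns of 0/1 values named c0 .. c44.
data = np.zeros((3, 45), dtype=int)
data[0, :] = 1        # row 0: every column is 1
data[1, [0, 15]] = 1  # row 1: only two of the three columns in group 0 are 1
df = pd.DataFrame(data, columns=[f"c{i}" for i in range(45)])

for x in range(15):
    # Build a string label such as "newCol0"; 'newCol', x would be a tuple.
    label = "newCol" + str(x)
    # 1 where columns x, x+15 and x+30 are all equal to 1, otherwise 0.
    df[label] = np.where(
        (df.iloc[:, x] == 1)
        & (df.iloc[:, x + 15] == 1)
        & (df.iloc[:, x + 30] == 1),
        1,
        0,
    )

print(df["newCol0"].tolist())
```

The new columns are appended after the original 45, so the positional lookups with iloc for x, x+15 and x+30 still point at the original data on every iteration, and each result can be read back by name, for example df["newCol3"].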
Upvotes: 0