joster

Reputation: 3

Unable to access dataframe columns that were created in a while loop

I'm new to Python/pandas, and I'm running into an issue when creating new columns within a loop. I want to create a new column in each iteration of the loop, and populate it with 1 (for yes) or 0 (for no) based on whether or not three other values in the dataframe are all equal to 1. This should loop 15 times over a total of 45 columns, and produce 15 new columns labeled 'newCol' + a number from 0 to 14, taken from the loop.

I want to create a new column every time I iterate through the loop, and label it with its order number (the value of x as it runs through the loop) so I can track which columns have been checked against each other.

x = 0
label = 'newColumn',x
while x < 15:
    label = 'newCol',x
    #creates a new column with label that includes x
    #populates with 1 or 0
    df.loc[label] = np.where((df.iloc[:,x] == 1) & (df.iloc[:,x] == df.iloc[:,x+15]) & (df.iloc[:,x] == df.iloc[:,x+30]), 1, 0)
    #increment x
    x = x+1

This ends up producing the columns if I view them with .info(), but I cannot access them through any indexing moving forward.

Any help is much appreciated!! Thanks

Upvotes: 0

Views: 86

Answers (2)

Ian Thompson

Reputation: 3305

The .loc property is mainly for label-based indexing and slicing, and for changing existing values (for example with boolean indexing). If you want to create a new column, I suggest doing it like this:

x = 0
label = 'newColumn',x
while x < 15:
    label = 'newCol',x
    #creates a new column with label that includes x
    #populates with 1 or 0
    # do this instead: assign with df[label] rather than df.loc[label]
    df[label] = np.where((df.iloc[:,x] == 1) & (df.iloc[:,x] == df.iloc[:,x+15]) & (df.iloc[:,x] == df.iloc[:,x+30]), 1, 0)
    #increment x
    x = x+1

Your original method was actually adding a label to the row index rather than creating a column. Assigning with df[label], as in the code above, avoids that.
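To make the difference concrete, here is a minimal sketch on a tiny made-up dataframe (the column names "a" and "b" and the values are only assumptions for illustration). It uses a plain string label so that only the row-versus-column behaviour is being shown:

```python
import pandas as pd

# Hypothetical two-row dataframe, just for illustration
df = pd.DataFrame({"a": [1, 0], "b": [1, 1]})

# A single key passed to .loc is treated as a row label,
# so assigning to a label that does not exist yet appends a row.
rows_added = df.copy()
rows_added.loc["newCol0"] = [1, 1]

# Plain bracket assignment creates a column instead.
cols_added = df.copy()
cols_added["newCol0"] = [1, 0]

print(rows_added.shape)  # (3, 2)
print(cols_added.shape)  # (2, 3)
```

After the .loc assignment, "newCol0" shows up in rows_added.index, not in rows_added.columns, which is why it cannot be selected as a column afterwards.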

The next step, as mentioned by @alex-chojnacki, is that your label is a tuple and not a string, which could make it difficult to reference in your code.

Upvotes: 1

Alex Chojnacki

Reputation: 591

I believe your issue is that label = 'newCol',x does not actually produce a string, but a tuple. So the value of label, which you are using to index into the df, is actually ("newCol", 0) on the first iteration.
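You can check this quickly in plain Python, with no pandas involved. A small sketch:

```python
x = 0
label = 'newCol', x  # the comma builds a tuple, even without parentheses

print(type(label).__name__)  # tuple
print(label)                 # ('newCol', 0)
```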

If you create the label as a string rather than a tuple, you should be okay. One way to do this would be label = "newCol" + str(x).
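Putting that together, the loop might look roughly like the sketch below. The dataframe here is random made-up data (4 rows, 45 columns named c0 to c44), since I don't know what your real one looks like, and I've swapped the while loop for a for loop over range(15), which does the same thing without the manual counter:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 45 columns of 0/1 values
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.integers(0, 2, size=(4, 45)),
    columns=[f"c{i}" for i in range(45)],
)

for x in range(15):
    label = "newCol" + str(x)  # a string label, not a tuple
    # 1 where columns x, x+15 and x+30 are all equal to 1, otherwise 0
    df[label] = np.where(
        (df.iloc[:, x] == 1)
        & (df.iloc[:, x] == df.iloc[:, x + 15])
        & (df.iloc[:, x] == df.iloc[:, x + 30]),
        1,
        0,
    )

print(df["newCol0"])  # the new columns can now be selected by name
```

The new columns are appended after the original 45, so the positional lookups with iloc at x, x+15 and x+30 still point at the original columns on every iteration.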

Upvotes: 0
