Victor Wang

Reputation: 937

Pandas DataFrame: Why can't I change the value of one column based on the value of another through row iteration?

I want to change the value of one column based on the value of another. For example, given the following DF:

   Freq TOC
1    10  NA
2    20  NA
3    30  NA

for index, row in df.iterrows():
    if row["Freq"] == 20:
        row["TOC"] = True

I would expect:

   Freq TOC
1    10  NA
2    20  True
3    30  NA

But nothing is changed. What's wrong? Thanks.

Upvotes: 3

Views: 8849

Answers (2)

jpp

Reputation: 164773

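pd.DataFrame.iterrows yields each row as a separate Series inside a Python-level loop. That Series is generally a copy, not a live view of your dataframe, so assigning to it does not change df. It is more efficient to use column-wise vectorised methods instead of a row-wise loop (this assumes you are happy with 1 == True, since the result is a float column of 1.0 and NaN):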

df['TOC'] = np.where(df['Freq'] == 20, True, np.nan)

More idiomatic is to assign a Boolean series, i.e. True / False values only:

df['TOC'] = df['Freq'] == 20
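To make the difference between the two assignments concrete, here is a small sketch. It rebuilds the question's Freq column (the index labels 1, 2, 3 are taken from the question; the names toc_where and toc_bool are only for illustration) and compares the dtypes each approach produces:

```python
import numpy as np
import pandas as pd

# Rebuild the Freq column from the question, with index labels 1..3.
df = pd.DataFrame({"Freq": [10, 20, 30]}, index=[1, 2, 3])

# np.where mixes True with NaN, so NumPy promotes the result to float.
toc_where = pd.Series(np.where(df["Freq"] == 20, True, np.nan), index=df.index)

# A plain comparison stays Boolean: True / False only, no missing values.
toc_bool = df["Freq"] == 20

print(toc_where.dtype)  # float64
print(toc_bool.dtype)   # bool
```

So the np.where version keeps the "missing" marker for non-matching rows at the cost of storing True as 1.0, while the Boolean version has no missing values at all.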

What does work is using the index label inside your loop, although this will be inefficient:

for index, row in df.iterrows():
    if row['Freq'] == 20:
        df.loc[index, 'TOC'] = True
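For anyone who wants to see both behaviours side by side, here is a self-contained sketch. It assumes the TOC column is an object column filled with None (the question only shows "NA", so the exact dtype is a guess), and the names after_row_write and after_loc_write are just for illustration:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame; TOC is an object column of None.
df = pd.DataFrame(
    {"Freq": [10, 20, 30], "TOC": np.array([None, None, None], dtype=object)},
    index=[1, 2, 3],
)

# Attempt 1: write to the row object yielded by iterrows().
for index, row in df.iterrows():
    if row["Freq"] == 20:
        row["TOC"] = True  # changes only the temporary Series, not df
after_row_write = df["TOC"].tolist()

# Attempt 2: write through the DataFrame itself using the index label.
for index, row in df.iterrows():
    if row["Freq"] == 20:
        df.loc[index, "TOC"] = True
after_loc_write = df["TOC"].tolist()

print(after_row_write)  # [None, None, None]
print(after_loc_write)  # [None, True, None]
```

The first loop leaves df untouched because the row it writes to is built fresh for each iteration; the second loop addresses the cell in df directly.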

Upvotes: 8

DYZ

Reputation: 57085

You are modifying a "copy" of the row, not the row itself.

In general, one should not use loops to modify dataframes. Instead, use vectorized operations:

df.loc[df["Freq"]==20, "TOC"] = True
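As a runnable sketch of the one-liner above, split into two steps so the intermediate mask is visible. The frame is reconstructed from the question under the assumption that TOC is an object column of None, and mask is just an illustrative name:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frame; TOC is an object column of None.
df = pd.DataFrame(
    {"Freq": [10, 20, 30], "TOC": np.array([None, None, None], dtype=object)},
    index=[1, 2, 3],
)

mask = df["Freq"] == 20     # Boolean Series aligned on df's index
df.loc[mask, "TOC"] = True  # writes only to the rows where mask is True

print(mask.tolist())       # [False, True, False]
print(df["TOC"].tolist())  # [None, True, None]
```

Rows where the mask is False keep their original value, which matches the output the question expects.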

Upvotes: 6
