Reputation: 597
I want to loop through a dataframe, checking to see if the row name matches the column name. If they match, I want to set the value for the intersection to zero. I've tried several options but none of them works. Here is pseudocode that shows what I want to do:
for row_name in dataframe:
    if row_name == column_name:
        dataframe[row_name][column_name] = 0
This is what the data looks like:
       NAME1  NAME2  NAME3
NAME1      1     .9     .2
NAME2     .6      1     .7
NAME3     .5     .8      1
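For anyone who wants to reproduce this, the table above could be built roughly as follows. This is only a sketch: the variable name `df` is an assumption, chosen to match the name the answers use.

```python
import pandas as pd

# Labels shared by the rows and the columns, as in the table above
names = ["NAME1", "NAME2", "NAME3"]

# Rebuild the sample data; `df` is an assumed name, not from the question
df = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.6, 1.0, 0.7],
     [0.5, 0.8, 1.0]],
    index=names,
    columns=names,
)
print(df)
```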
Upvotes: 2
Views: 2572
Reputation: 394209
A more convoluted method than @jpp's: you could stack the df so that the column names form the second level of the index:
In[296]:
stack = df.stack()
stack
Out[296]:
NAME1  NAME1    1.0
       NAME2    0.9
       NAME3    0.2
NAME2  NAME1    0.6
       NAME2    1.0
       NAME3    0.7
NAME3  NAME1    0.5
       NAME2    0.8
       NAME3    1.0
dtype: float64
Then we can mask the stacked df and set the values to 0 where the index level values match:
In[297]:
stack.loc[stack.index.get_level_values(0) == stack.index.get_level_values(1)] = 0
stack
Out[297]:
NAME1  NAME1    0.0
       NAME2    0.9
       NAME3    0.2
NAME2  NAME1    0.6
       NAME2    0.0
       NAME3    0.7
NAME3  NAME1    0.5
       NAME2    0.8
       NAME3    0.0
dtype: float64
Then we call unstack to revert to the original df shape:
In[298]:
stack.unstack()
Out[298]:
       NAME1  NAME2  NAME3
NAME1    0.0    0.9    0.2
NAME2    0.6    0.0    0.7
NAME3    0.5    0.8    0.0
This carries more of a performance hit on a small df, because the calls to stack and unstack create temporary objects, but if the index and column values overlap heavily then it avoids the looping.
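The three steps above can be put together as one self-contained sketch. The sample frame is rebuilt from the question's data, and the names `stacked`, `same_label`, and `result` are assumptions introduced here for illustration.

```python
import pandas as pd

names = ["NAME1", "NAME2", "NAME3"]
df = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.6, 1.0, 0.7],
     [0.5, 0.8, 1.0]],
    index=names,
    columns=names,
)

# Step 1: stack so that (row label, column label) becomes a two-level index
stacked = df.stack()

# Step 2: boolean mask that is True where both index levels carry the same label
same_label = stacked.index.get_level_values(0) == stacked.index.get_level_values(1)
stacked.loc[same_label] = 0

# Step 3: unstack to get back a frame with the original layout
result = stacked.unstack()
print(result)
```

Note that `df` itself is left unchanged here; the zeroed values live in `result`.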
Upvotes: 0
Reputation: 164773
You can calculate the intersection of your index and columns, then iterate over the intersection and use pd.DataFrame.loc to set values.
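# Labels present in both the index and the columns
intersection = df.index.intersection(df.columns)
for item in intersection:
    # Zero the cell where the row label equals the column label
    df.loc[item, item] = 0
print(df)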
       NAME1  NAME2  NAME3
NAME1    0.0    0.9    0.2
NAME2    0.6    0.0    0.7
NAME3    0.5    0.8    0.0
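If the Python-level loop ever becomes a bottleneck, a possible loop-free variant of the same idea is to turn the shared labels into integer positions and assign through NumPy. This is a sketch rather than part of the method above; it assumes the index and column labels are unique, and the names `common`, `rows`, `cols`, `values`, and `result` are made up for illustration.

```python
import pandas as pd

names = ["NAME1", "NAME2", "NAME3"]
df = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.6, 1.0, 0.7],
     [0.5, 0.8, 1.0]],
    index=names,
    columns=names,
)

# Labels present in both the index and the columns
common = df.index.intersection(df.columns)

# Integer positions of those labels (assumes labels are unique)
rows = df.index.get_indexer(common)
cols = df.columns.get_indexer(common)

# Work on a copy of the underlying array so the original frame is untouched
values = df.to_numpy(copy=True)
values[rows, cols] = 0  # pairs rows[i] with cols[i], i.e. matching labels only

result = pd.DataFrame(values, index=df.index, columns=df.columns)
print(result)
```

Because the positions are looked up by label, this should also work when the frame is not square or the labels are in a different order on each axis.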
Upvotes: 1