zhr
zhr

Reputation: 429

put condition in pandas dataframe (multi-index)?

I Have a multi-index dataframe as:

df = 
    sta1                 sta2           ...             stan
    coef    e      p     coef    e       p        ...   coef    e    p
t    3     0.1   0.001    2     0.1    0.001              1    0.1    0.01
u    3     0.2   0.01     3     0.01   0.01               2    0.1    0.001
v    2     0.1   0.001    4     0.1    0.1                2    0.001  0.001

I want to put a condition if p>0.01 then coef = nan. My results looks like:

df = 
    sta1                 sta2           ...              stan
    coef    e      p     coef      e       p        ...    coef    e    p
t    3     0.1   0.001    nan     0.1    0.1                1    0.1    0.01
u    3     0.2   0.01     3       0.01   0.01              nan   0.1    0.1
v    2     0.1   0.001    nan     0.1    0.1                2    0.001  0.001

I appreciate that if someone guide me. Thank you.

Upvotes: 2

Views: 692

Answers (2)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210882

Try this:

Assuming you have the following DF:

In [117]: df
Out[117]:
  stat1             stat2              stat3
   coef    e      p  coef     e      p  coef      e      p
0
t     3  0.1  0.001     2  0.10  0.001     1  0.100  0.010
u     3  0.2  0.010     3  0.01  0.010     2  0.100  0.001
v     2  0.1  0.001     4  0.10  0.100     2  0.001  0.001

you can use pd.IndexSlice[...] method and @jezrael's idea of renaming column names in the mask:

In [118]: mask = df.loc[:, idx[:, ['p']]] > 0.01

In [119]: mask
Out[119]:
   stat1  stat2  stat3
       p      p      p
0
t  False  False  False
u  False  False  False
v  False   True  False

In [120]: df[mask.rename(columns={'p':'coef'})] = np.nan

In [121]: df
Out[121]:
  stat1             stat2              stat3
   coef    e      p  coef     e      p  coef      e      p
0
t     3  0.1  0.001   2.0  0.10  0.001     1  0.100  0.010
u     3  0.2  0.010   3.0  0.01  0.010     2  0.100  0.001
v     2  0.1  0.001   NaN  0.10  0.100     2  0.001  0.001

Explanation:

In [123]: mask
Out[123]:
   stat1  stat2  stat3
       p      p      p
0
t  False  False  False
u  False  False  False
v  False   True  False

In [124]: mask.rename(columns={'p':'coef'})
Out[124]:
   stat1  stat2  stat3
    coef   coef   coef
0
t  False  False  False
u  False  False  False
v  False   True  False

Upvotes: 1

zipa
zipa

Reputation: 27879

I tested this on my random data, please tell me if it works for your example:

for multindex in df.columns.values:
    if multindex[1] == 'p':
        df[(multindex[0], 'coef')] = np.where(df[multindex] > 0.01, np.nan, df[(multindex[0], 'coef')])

Upvotes: 1

Related Questions