K. Mather

Reputation: 93

Alternative to nested np.where in Pandas DataFrame

I have this working code: a series of nested np.where calls that set the values in the 'paragenesis1' row of a DataFrame (myOxides['cpx']), depending on the values in various other rows of the frame.

I'm very new to Python and programming in general. Nested np.where is the only way I have found to avoid the 'truth value of a Series is ambiguous' error. I am thinking that I should write a function to do this instead, but how would I then apply that function elementwise?

Any help greatly appreciated!

myOxides['cpx'].loc['paragenesis1'] = np.where(
            ((cpxCrOx>=0.5) & (cpxAlOx<=4)),
            "GtPeridA", 
            np.where(
                    ((cpxCrOx>=2.25) & (cpxAlOx<=5)), 
                    "GtPeridB", 
                    np.where(
                            ((cpxCrOx>=0.5)&
                             (cpxCrOx<=2.25)) &
                             ((cpxAlOx>=4) & (cpxAlOx<=6)),
                             "SpLhzA",
                             np.where(
                                     ((cpxCrOx>=0.5) &
                                      (cpxCrOx<=(5.53125 - 
                                                 0.546875 * cpxAlOx))) &
                                      ((cpxAlOx>=4) & 
                                       (cpxAlOx <= ((cpxCrOx - 
                                                     5.53125)/ -0.546875))),
                             "SpLhzB",
                             "Eclogite, Megacryst, Cognate"))))

Or, in generic form:

df.loc['a'] = np.where(
            (some_condition),
            "value", 
            np.where(
                    ((condition_1) & (condition_2)), 
                    "some_value", 
                    np.where(
                            ((condition_3)& (condition_4)),
                             "some_other_value",
                              np.where(
                                      (condition_5),
                                        "another_value",
                                        "other_value"))))

Upvotes: 7

Views: 5849

Answers (1)

jezrael

Reputation: 862581

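One possible solution is to use numpy.select. It takes a list of boolean masks and a matching list of values; where more than one mask is True, the first match wins, which is the same priority order as the nested np.where: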

m1 = (cpxCrOx>=0.5) & (cpxAlOx<=4)
m2 = (cpxCrOx>=2.25) & (cpxAlOx<=5)
m3 = ((cpxCrOx>=0.5) & (cpxCrOx<=2.25)) & ((cpxAlOx>=4) & (cpxAlOx<=6))
m4 = ((cpxCrOx>=0.5) & (cpxCrOx<=(5.53125 - 0.546875 * cpxAlOx))) & \
     ((cpxAlOx>=4) & (cpxAlOx <= ((cpxCrOx - 5.53125) / -0.546875)))

vals = [ "GtPeridA", "GtPeridB", "SpLhzA", "SpLhzB"]
default = 'Eclogite, Megacryst, Cognate'

myOxides['paragenesis1'] = np.select([m1,m2,m3,m4], vals, default=default)
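
To check that the two approaches agree, here is a minimal, self-contained sketch. The sample values and the "Other" default are made up for illustration (they are not from the question's data), and only the first three conditions are used to keep it short:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real cpxCrOx / cpxAlOx rows.
cpxCrOx = pd.Series([1.0, 3.0, 1.0, 0.1])
cpxAlOx = pd.Series([3.0, 4.5, 5.0, 7.0])

# Build each boolean mask once, using & (not `and`) so the comparison
# stays elementwise and avoids the "truth value ... is ambiguous" error.
m1 = (cpxCrOx >= 0.5) & (cpxAlOx <= 4)
m2 = (cpxCrOx >= 2.25) & (cpxAlOx <= 5)
m3 = (cpxCrOx >= 0.5) & (cpxCrOx <= 2.25) & (cpxAlOx >= 4) & (cpxAlOx <= 6)

# np.select checks the masks in order and takes the first one that is True.
labels = np.select([m1, m2, m3],
                   ["GtPeridA", "GtPeridB", "SpLhzA"],
                   default="Other")

# The equivalent nested np.where, for comparison.
nested = np.where(m1, "GtPeridA",
                  np.where(m2, "GtPeridB",
                           np.where(m3, "SpLhzA", "Other")))

print(list(labels))
```

Because the masks are evaluated in list order, put the most specific condition first if several of them can be True for the same element.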

Upvotes: 22
