babz

Reputation: 479

Create a dynamic function in python/pandas for repeated np.where statements

I am writing code that contains a lot of repeated np.where calls. About 200 fields each require some sort of transformation using np.where (essentially CASE WHEN / if-then statements).

I was hoping to clean up the code by writing a function instead of repeating the statements for each field. The problem is that some fields need only a basic np.where statement, whereas others need nested np.where statements (up to 10 levels deep). Given this, I am not sure how to make a function dynamic enough to handle both cases, or whether it is even worth attempting.

Sample

Case 1 - Simple np.where
TABLE['A'] = np.where(TABLE.FIELD1=='N', TABLE.FIELD2, TABLE.FIELD3)


Case 2 - Nested np.where
TABLE['B'] = np.where(TABLE.FIELD1=='N','ADBE',
             np.where(TABLE.FIELD1=='A','ADB ',
             np.where(TABLE.FIELD1=='D','CDB ',
             np.where(TABLE.FIELD1=='W','ODB ',
             np.where(TABLE.FIELD1=='T','TDB ',
             np.where(TABLE.FIELD1=='I','ODI ',
             np.where(TABLE.FIELD1=='S','GDB ',
             np.where(TABLE.FIELD1=='B','BVP ',
             # .str[0:4] takes the first four characters of each value; FIELD2[0:4] would slice the first four rows
             np.where(((TABLE.FIELD1=='G')&(TABLE.FIELD2.str[0:4]=='UXXX')),'EGIB',
             np.where(TABLE.FIELD1=='G','GIB ', 'null'))))))))))

Upvotes: 3

Views: 244

Answers (1)

jpp

Reputation: 164773

Would the below not work instead? It may even be more efficient, as fewer Boolean arrays need to be calculated and applied.

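# Map the single-character codes in FIELD1 to their output values
d = {'N': 'ADBE', 'A': 'ADB', 'D': 'CDB', 'W': 'ODB',
     'T': 'TDB', 'I': 'ODI', 'S': 'GDB', 'B': 'BVP'}

# Start from the default, then overwrite the rows that match a rule
TABLE['B'] = 'null'
TABLE.loc[TABLE.FIELD1 == 'G', 'B'] = 'GIB'
TABLE.loc[(TABLE.FIELD1 == 'G') & (TABLE.FIELD2.str[0:4] == 'UXXX'), 'B'] = 'EGIB'

# Look up FIELD1 (not B) in the dictionary for the remaining codes
in_d = TABLE.FIELD1.isin(d)
TABLE.loc[in_d, 'B'] = TABLE.loc[in_d, 'FIELD1'].map(d)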
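
If a single reusable function for both the simple and the nested cases is still wanted, one possible approach is np.select, which takes a list of conditions and a matching list of values and returns the value of the first condition that holds. The sketch below is only an illustration: the helper name case_when, the rule-list layout, and the toy data are assumptions, not something from the question.

```python
import numpy as np
import pandas as pd


def case_when(df, rules, default):
    """Return the value of the first rule whose condition holds, else default.

    rules is a list of (condition, value) pairs. Each condition is a callable
    that takes the DataFrame and returns a Boolean Series. (Hypothetical helper.)
    """
    conditions = [cond(df) for cond, _ in rules]
    choices = [value for _, value in rules]
    return np.select(conditions, choices, default=default)


# Toy data, made up for illustration only
TABLE = pd.DataFrame({
    'FIELD1': ['N', 'A', 'G', 'G', 'Z'],
    'FIELD2': ['UXXX1', 'ABCD2', 'UXXX3', 'ABCD4', 'UXXX5'],
})

# Order matters: the more specific 'G' rule must come before the general one
rules_b = [
    (lambda t: t.FIELD1 == 'N', 'ADBE'),
    (lambda t: t.FIELD1 == 'A', 'ADB '),
    (lambda t: (t.FIELD1 == 'G') & (t.FIELD2.str[0:4] == 'UXXX'), 'EGIB'),
    (lambda t: t.FIELD1 == 'G', 'GIB '),
]

TABLE['B'] = case_when(TABLE, rules_b, default='null')
```

Each of the 200 fields would then only need its own rule list, and a simple np.where case becomes a list with a single rule.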

Upvotes: 1
