Reputation: 2155
I'm porting a SAS script to Python using the pandas library. Currently the SAS code is very short and concise, but I can't find a good way of writing the Python equivalent. It is a series of IF statements, and it gets very messy in Python.
SAS code:
DATA Table1;
SET Table1;
IF FLAG = "GO" THEN OUTCOME = 0;
IF FLAG = "STOP" THEN OUTCOME = 100;
RUN;
I have tried both .loc and np.where, but the resulting code is very hard to follow.
Example using np.where:
Table1['OUTCOME'] = np.where(Table1['FLAG'] == 'GO', 0, Table1['OUTCOME'])
Table1['OUTCOME'] = np.where(Table1['FLAG'] == 'STOP', 100, Table1['OUTCOME'])
Is there a way to write these IF statements in Python so that they read like the SAS version?
Upvotes: 1
Views: 475
Reputation: 18466
In SAS, you can refer to column names directly as if they were variables, but in Python you have to access the column values explicitly through the DataFrame.
You can build almost the same code as in SAS with the help of a function:
def getOutcome(x):
    if x['FLAG'] == 'GO':
        return 0
    elif x['FLAG'] == 'STOP':
        return 100
    else:
        return x['OUTCOME']

# axis=1 passes each row of Table1 to getOutcome
Table1['OUTCOME'] = Table1.apply(getOutcome, axis=1)
The main difference is that in SAS, IF FLAG= "GO" THEN OUTCOME= 0 is evaluated for a single row at a time, whereas in pandas you are working with an entire Series at once, so you cannot write it the same way.
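To make that difference concrete, here is a minimal sketch on a made-up three-row frame (the sample values are assumptions for illustration, not data from the question). Comparing the column produces a boolean Series covering every row, and that Series can be used as a mask with .loc:

```python
import pandas as pd

# Hypothetical sample data, only for illustration
Table1 = pd.DataFrame({"FLAG": ["GO", "STOP", "WAIT"], "OUTCOME": [5, 5, 5]})

# The comparison runs over the whole column and returns a boolean Series
is_go = Table1["FLAG"] == "GO"

# Assign only to the rows where the mask is True
Table1.loc[is_go, "OUTCOME"] = 0
Table1.loc[Table1["FLAG"] == "STOP", "OUTCOME"] = 100
```

Rows that match neither condition keep their existing OUTCOME, which mirrors the two independent IF statements in the SAS step.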
If you write a function like the one above, which works on a single row at a time, migrating SAS code to Python becomes easy; however, this row-by-row approach may not be the most efficient one, and there may be faster ways of doing the same transformation.
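One possible vectorized alternative is np.select, which keeps all conditions and their outcomes in two parallel lists. This is only a sketch under the assumption that the conditions are simple column comparisons; the sample frame below is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, only for illustration
Table1 = pd.DataFrame({"FLAG": ["GO", "STOP", "WAIT"], "OUTCOME": [5, 5, 5]})

# Each condition lines up with the value at the same position in choices
conditions = [Table1["FLAG"] == "GO", Table1["FLAG"] == "STOP"]
choices = [0, 100]

# Rows matching no condition fall back to their current OUTCOME
Table1["OUTCOME"] = np.select(conditions, choices, default=Table1["OUTCOME"])
```

Adding another IF from the SAS script then means appending one entry to each list. Note that np.select takes the first matching condition, so if conditions can overlap, the order may need to be reversed to match SAS, where a later IF overwrites an earlier one.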
Upvotes: 1