Reputation: 25
I have a function that modifies a string under certain conditions and returns a pair: the (possibly modified) string and a boolean indicating whether a new string was obtained. I want to apply this function to a pandas DataFrame column and store the results in two newly created columns. I found an inelegant way to achieve this:
The main method is:
def alter_string(astring):
    ...
    return altered_string, boolean_check

def _perform_mod(astring):
    return alter_string(astring)[0]

def _check(astring):
    return alter_string(astring)[1]

df['modified'] = df['original'].apply(_perform_mod)
df['check'] = df['original'].apply(_check)
This achieves my goal, but I have to run a computationally heavy function twice. Is there a better way?
Some details to clarify my question:
I have a dataframe column 'original_string' containing strings that are molecular descriptors. I apply to each string a function that may or may not modify it, depending on the circumstances. The function returns either the modified string and True, or the same string and False. I need to add two new columns to the dataframe, modified_string and check.
Here is a short sample:
original ---> modified  check
AAAAAA   ---> AAAAAA    False
AAABCD   ---> AAAVCD    True
ACCBDE   ---> AACADE    True
Upvotes: 1
Views: 135
Reputation: 2533
Try this:
df['modified'], df['check'] = zip(*df['original'].apply(alter_string))
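This way you run the alter_string function only once per row.
apply returns a Series of (modified_string, check) tuples, and zip(*...) transposes it into two tuples: one holding all the modified strings and one holding all the boolean flags.
Tuple unpacking (df['modified'], df['check'] = ...) then assigns each of them to a brand new column in the dataframe.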
Based on this answer
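For a quick check, here is a minimal runnable sketch of the same idea. The alter_string below is a hypothetical stand-in (it just replaces "B" with "V"), since the real function is not shown in the question, so its output will not match the sample above exactly.

```python
import pandas as pd


def alter_string(astring):
    # Hypothetical stand-in for the real, expensive function:
    # replace every "B" with "V" and report whether anything changed.
    altered = astring.replace("B", "V")
    return altered, altered != astring


df = pd.DataFrame({"original": ["AAAAAA", "AAABCD", "ACCBDE"]})

# apply() yields a Series of (string, bool) tuples; zip(*...) transposes it
# into one tuple of strings and one tuple of booleans, which are then
# unpacked into the two new columns.
df["modified"], df["check"] = zip(*df["original"].apply(alter_string))

print(df)
```

If the result of apply is needed more than once, it may be worth storing it in a variable first and unpacking from that, so the heavy function is still evaluated only once.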
Upvotes: 1