Reputation: 119
I have a function that duplicates rows in a pandas DataFrame so that values which were stored together in a single cell each end up in their own row.
GRI_Code ADX_Code
102 S1.2\nS1.1\nS5.1\nS5.2
102\n405 S4.2\nS4.3\nS4.1
103 E7.2\nE7.3\nE7.1\n\n\nS9.2\nS9.1\nS10.2\nS10.1\n
302 E3.1\nE3.2\n\n
The method I use is:
def separate_code(self, df, column, delimiter):
    df = df.assign(GRI_Code=df[f'{column}'].str.split(delimiter)).explode(f'{column}')
    return df
When I call the function:
df = separate_code(df, column='GRI_Code', delimiter="\n")
My output is:
GRI_Code ADX_Code
102 \n\nS1.2\nS1.1\nS5.1\nS5.2
102 S4.2\nS4.3\nS4.1
405 S4.2\nS4.3\nS4.1
103 E7.2\nE7.3\nE7.1\n\n\nS9.2\nS9.1\nS10.2\nS10.1\n
302 E3.1\nE3.2\n\n
I will be using this method on other dataframes with different column names, so I would like to know how I can pass the column name to assign dynamically. If I write column= (the variable name) instead of GRI_Code=, it creates a new column and leaves the split values in the same row as a list, which I don't want.
Upvotes: 1
Views: 276
Reputation: 260640
Use a dictionary and keyword-argument unpacking (**):
def separate_code(self, df, column, delimiter):
    return df.assign(**{column: df[f'{column}']...})
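For illustration, here is a minimal runnable sketch of the same idea. It assumes a standalone function (so `self` is dropped) and that the elided part is the split-then-explode step from the question. The sample frame is a shortened, hypothetical version of the data shown above.

```python
import pandas as pd


def separate_code(df, column, delimiter):
    # assign(column=...) would create a column literally named "column".
    # Unpacking a dict uses the *value* of `column` as the keyword instead,
    # so the existing column is overwritten with the split lists.
    split_values = df[column].str.split(delimiter)
    # explode() then turns each list element into its own row.
    return df.assign(**{column: split_values}).explode(column)


# Hypothetical sample data, abbreviated from the question.
df = pd.DataFrame({
    "GRI_Code": ["102", "102\n405", "103"],
    "ADX_Code": ["S1.2\nS1.1", "S4.2\nS4.3", "E7.2\nE7.3"],
})

out = separate_code(df, column="GRI_Code", delimiter="\n")
print(out)
```

Because the column name is only ever passed as a string, the same function should work on other dataframes by changing the `column` argument, for example `column="ADX_Code"`.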
Upvotes: 1