Reputation: 1897
I have the following method-chaining code and want to create a new column, but I'm getting an error when running it:
(
    pd.pivot(test, index=['file_path'], columns='year', values='file')
    .fillna(0)
    .astype(int)
    .reset_index()
    .assign(hierarchy=file_path.str[1:-1].str.join(' > '))
)
Before the assign method the dataframe looks something like this:
file_path    2017  2018  2019  2020
S:\Test\A       0     0     1     2
S:\Test\A\B     1     0     1     3
S:\Test\A\C     3     1     1     0
S:\Test\B\A     1     0     0     1
S:\Test\B\B     1     0     0     1
The error is: name 'file_path' is not defined.
file_path exists in the dataframe, but I'm not referencing it correctly. What is the proper way to create a new column based on another one using assign?
Upvotes: 1
Views: 38
Reputation: 18306
You can pass a callable to assign that accepts the dataframe as it exists at that point in the chain:
.assign(hierarchy=lambda fr: fr["file_path"].str[1:-1].str.join(" > "))
so that fr will be the dataframe as modified so far (pivoted, index reset, etc.), from which you can access the "file_path" column.
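For reference, here is a minimal end-to-end sketch of the chain with the callable in place. The input frame test is made up for illustration (only its column names come from the question). The .str.split step is also an assumption on my part: it treats the paths as plain strings that need to be split on the backslash before slicing. If file_path already holds lists, drop that step.

```python
import pandas as pd

# Hypothetical long-format input; column names follow the question,
# the values are invented for illustration.
test = pd.DataFrame({
    "file_path": [r"S:\Test\A", r"S:\Test\A", r"S:\Test\A\B", r"S:\Test\A\B"],
    "year": [2019, 2020, 2017, 2020],
    "file": [1, 2, 1, 3],
})

result = (
    pd.pivot(test, index=["file_path"], columns="year", values="file")
    .fillna(0)
    .astype(int)
    .reset_index()
    # The lambda receives the frame as it exists at this step of the chain,
    # so "file_path" is already a regular column here.
    # Assumption: paths are plain strings, so split on the backslash first.
    .assign(
        hierarchy=lambda fr: fr["file_path"]
        .str.split("\\")
        .str[1:-1]
        .str.join(" > ")
    )
)
print(result)
```

A bare file_path in the original code is looked up as a Python variable when the assign call is evaluated, which is why the name is not found. The lambda defers the lookup until assign hands it the intermediate dataframe.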
Upvotes: 3