Reputation: 2220
I have dataframe which looks like below:
df:
Review_Text Noun Thumbups
Would be nice to be able to import files from ... [My, Tracks, app, phone, Google, Drive, import... 1.0
No Offline Maps! It used to have offline maps ... [Offline, Maps, menu, option, video, exchange,... 18.0
Great application. Designed with very well tho... [application, application] 16.0
Great App. Nice and simple but accurate. Wish ... [Great, App, Nice, Exported] 0.0
Save For Offline - This does not work. The rou... [Save, Offline, route, filesystem] 12.0
Since latest update app will not run. Subscrip... [update, app, Subscription, March, application] 9.0
Great app. Love it! And all the things it does... [Great, app, Thank, work] 1.0
I have paid for subscription but keeps telling... [subscription, trial, period] 0.0
Error: The route cannot be save for no locatio... [Error, route, i, GPS] 0.0
When try to restore my tracks it says "unable ... [try, file, locally-1] 0.0
Was a good app but since the update it only re... [app, update, metre] 2.0
based on 'Noun' Column values, I want to create other columns. For example, all values of noun column from first row become columns and those columns contain value of 'Thumbups' column value. If the column name already present in dataframe then it adds 'Thumbups' value into the existing value of the column.
I was trying to implement by using pivot_table :
pd.pivot_table(latest_review,columns='Noun',values='Thumbups')
But got following error:
TypeError: unhashable type: 'list'
Could anyone help me in fixing the issue?
Upvotes: 0
Views: 79
Reputation: 863711
Use Series.str.join
with Series.str.get_dummies
for dummies and then multiple by column Thumbups
by DataFrame.mul
:
df1 = df['Noun'].str.join('|').str.get_dummies().mul(df['Thumbups'], axis=0)
print (df1)
App Drive Error Exported GPS Google Great Maps March My Nice \
0 0.0 10.0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 10.0 0.0
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 180.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 90.0 0.0 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
10 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Offline Save Subscription Thank Tracks app application exchange \
0 0.0 0.0 0.0 0.0 10.0 10.0 0.0 0.0
1 180.0 0.0 0.0 0.0 0.0 0.0 0.0 180.0
2 0.0 0.0 0.0 0.0 0.0 0.0 160.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 120.0 120.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 90.0 0.0 0.0 90.0 90.0 0.0
6 0.0 0.0 0.0 10.0 0.0 10.0 0.0 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
10 NaN NaN NaN NaN NaN NaN NaN NaN
file filesystem i import locally-1 menu metre option period \
0 0.0 0.0 0.0 10.0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0 180.0 0.0 180.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 120.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
10 NaN NaN NaN NaN NaN NaN NaN NaN NaN
phone route subscription trial try update video work
0 10.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0 0.0 180.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 120.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0 90.0 0.0 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 10.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
10 NaN NaN NaN NaN NaN NaN NaN NaN
Upvotes: 1
Reputation: 41
rows = []
#_unpacking Noun column row list values and storing it in rows list
_ = df.apply(lambda row: [rows.append([row['Review_Text'],row['Thumbups'], nn])
for nn in row.Noun], axis=1)
#_creates new dataframe with unpacked values
df_new = pd.DataFrame(rows, columns=df.columns)
#_now doing pivot operation on df_new
pivot_df = df_new.pivot(index='Review_Text', columns='Noun')
Upvotes: 0