jeangelj

Reputation: 4498

Python/Pandas: create summary table

In a pandas DataFrame "df", I have the following columns:

user_id | song_id | song_duration | song_title | artist | listen_count

Many users might have listened to the same song, so song_id is not unique in this table. I would like to create a second dataframe with just the song information (one row per unique song_id):

song_id | song_title | artist

I managed to create a table with song_id and song_title:

song_df = df.groupby('song_id').song_title.first()

How can I add the column "artist" to this?

This doesn't work:

song_df = df.groupby('song_id').df['song_title','artist'].first()

AttributeError: 'DataFrameGroupBy' object has no attribute 'df'

Upvotes: 3

Views: 3357

Answers (2)

Rosa Alejandra

Reputation: 732

You could just select the columns you need and drop the duplicate rows:

song_df = df[['song_id','song_title','artist']].drop_duplicates()
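As a minimal sketch of this approach, assuming a small made-up dataframe with the question's column layout (the values below are hypothetical), the following also passes subset='song_id' so that each song_id appears only once even if a title or artist is spelled inconsistently across rows:

```python
import pandas as pd

# Hypothetical sample data using the column layout from the question.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 1],
    "song_id": ["s1", "s1", "s2", "s2"],
    "song_duration": [210, 210, 185, 185],
    "song_title": ["Song A", "Song A", "Song B", "Song B"],
    "artist": ["Artist X", "Artist X", "Artist Y", "Artist Y"],
    "listen_count": [5, 2, 7, 1],
})

# Keep only the song columns, then keep the first row seen for each song_id.
song_df = (
    df[["song_id", "song_title", "artist"]]
    .drop_duplicates(subset="song_id")
    .reset_index(drop=True)  # renumber rows 0..n-1 after dropping
)
print(song_df)
```

Without subset, drop_duplicates compares all three selected columns, so a song_id with two different spellings of the title would still appear twice.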

Upvotes: 0

jezrael

Reputation: 863501

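If I understand correctly, try omitting .df and selecting the columns with a list (note the double brackets, which recent pandas versions require):

df.groupby('song_id')[['song_title','artist']].first()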
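A minimal sketch of the groupby approach, again on hypothetical sample data, with the columns selected through a list and reset_index() added so that song_id comes back as a regular column rather than the index:

```python
import pandas as pd

# Hypothetical sample data using the column layout from the question.
df = pd.DataFrame({
    "user_id": [1, 2, 3, 1],
    "song_id": ["s1", "s1", "s2", "s2"],
    "song_duration": [210, 210, 185, 185],
    "song_title": ["Song A", "Song A", "Song B", "Song B"],
    "artist": ["Artist X", "Artist X", "Artist Y", "Artist Y"],
    "listen_count": [5, 2, 7, 1],
})

# Group by song_id, select both columns with a list, and take the
# first entry of each column within every group.
song_df = (
    df.groupby("song_id")[["song_title", "artist"]]
    .first()
    .reset_index()  # turn the song_id index back into a column
)
print(song_df)
```

Note that first() returns the first non-null entry of each column per group, so a missing title or artist in one row is filled from a later row of the same song_id.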

Upvotes: 1
