Pedro Jesus
Pedro Jesus

Reputation: 25

Create a dataframe with duplicate entries

I do a sql query, and I then with data = pd.read_sql(query, connection) I have the following table,

  ID        ITEM   TYPE_USER   Count
711757     item1     type1       1
711757     item2     type1       1
711757     item3     type1       1
711794     item1     type2       1
711794     item2     type2       1
711541     item2     type3       1
  .          .         .         .
  .          .         .         .

But I need create the following dataframe

   ID     item1    item2    item3      TYPE_USER
 711757      1        1        1         type1 
 711794      1        1        0         type2
 711541      0        1        0         type3   

So, my idea was take

`data.pivot(index='ID', columns = 'ITEM', values='Count')   

But, this give me, the follwoing dataframe

       ID     item1    item2    item3    
 0   711757      1        1        1         
 1   711794      1        1        0         
 2   711541      0        1        0        

In this point, I don't know how join the column 'TYPE_USER', any idea will be appreciated! Thanks!

Upvotes: 1

Views: 160

Answers (1)

pythonic833
pythonic833

Reputation: 3224

pd.pivot_table(df, index=['ID','TYPE_USER'], columns='ITEM', values='Count').fillna(0).reset_index()

result

ITEM    ID  TYPE_USER   item1   item2   item3
0   711541  type3        0.0    1.0      0.0
1   711757  type1        1.0    1.0      1.0
2   711794  type2        1.0    1.0      0.0

Upvotes: 2

Related Questions