Anonyme
Anonyme

Reputation: 31

Transform dataframe/pandas

enter image description here

I have a dataframe with tag as column and movieId as index. movieId represents movies. So there may be multiple tags for the same movieId.

I want to transform this dataframe and there are as many columns as there are tags and that moviesId become the lines but there must be one line per movieId. And that for each movieId if there was a tag there is a 1 in the tag column at otherwise 0

I can't do it when I try, the moviesID appear several times.

Would someone help me

Thank you so much.

Upvotes: 1

Views: 90

Answers (2)

Anshul Vyas
Anshul Vyas

Reputation: 663

We can use ".crosstab()" function to get the required output:

I have created a sample dataframe as 'df':

movieId    tag
260     Best movie ever
1240    scifi
2133    Best movie ever
1097    scifi
260     scifi
250     scifi

By using .crosstab() function:

pd.crosstab(df.movieId, df.tag, dropna = False)

The output will be like this:

tag        Bestmovie ever   scifi
movieId     
250                0          1
260                1          1
1097               0          1
1240               0          1
2133               1          0

Upvotes: 3

Pradeep Padmanaban C
Pradeep Padmanaban C

Reputation: 674

i hope this fixes the problem

import pandas as pd
import numpy as np

df = pd.DataFrame([[260, "best"],[520,"sci"],[260,"sci"]],columns=['movieId','tag'])
print("Dummy DataFrame: \n", df)


movieId, tags= list(df['movieId'].unique()), list(df['tag'].unique())
dfmatrix= pd.DataFrame(np.zeros((len(movieId),len(tags)+1),dtype=int), columns=['movieID']+tags)
# dfmatrix['movieID'][1]= 54
for i, movie in enumerate(movieId):
    listoftag = df.tag[df['movieId']==movie]
    dfmatrix.movieID[i]= movie
    for tag in listoftag:
        dfmatrix[tag][i]=1

print("\n \n dfmatrix \n",dfmatrix)

the output is :

Dummy DataFrame:
    movieId   tag
0      260  best
1      520   sci
2      260   sci


 dfmatrix
    movieID  best  sci
0      260     1    1
1      520     0    1

Upvotes: 1

Related Questions