Reputation: 121
I have a problem with splitting two columns into new rows.
My dataframe looks like this
"involved" and "team_player_formation" each have 15 strings, that needs to be assigned into each their row. With the first string of "involved" matching first string of "team_player_formation"
I have tried to follow this: Split (explode) pandas dataframe string entry to separate rows and searching for ways to split multiple columns, but without success
I have been able to split one of my columns with the following at the moment
df = pd.read_csv('Hello.csv', delimiter=';')
df = df.assign(involved=df['involved'].str.split(',')).explode('involved')
Which gives me something like this:
matchId contestantId periodId typeId involved team_formation team_player_formation
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 b2492j7qzdo7g3ysxz6gq4g5x 4 1
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 b2492j7qzdo7g3ysxz6gq4g5x 4 1
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 b2492j7qzdo7g3ysxz6gq4g5x 4 1
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 b2492j7qzdo7g3ysxz6gq4g5x 4 1
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 b2492j7qzdo7g3ysxz6gq4g5x 4 1
..................
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 am3509ake84cde1xhb9264i22 4 0
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 am3509ake84cde1xhb9264i22 4 0
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 am3509ake84cde1xhb9264i22 4 0
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 am3509ake84cde1xhb9264i22 4 0
0 d5321qxmnyf9004i049uf4pre 77tfx9me4aaqhzv78bmgsy9bg 2 40 am3509ake84cde1xhb9264i22 4 0
But that only split "involed" into a new column.
The output should be something like this, where I have only showed the first 3 rows.
Thank you! I hope you can help, and that I explained it well enough.
Upvotes: 0
Views: 192
Reputation: 153460
If I undstand your problem correctly, you can use this MCVE to help with your analysis.
df = pd.DataFrame({'ID':[1],'string1':['A,B,C'], 'string2':['X,Y,Z']}, index=[0])
df_joined = df.join([df['string1'].str.split(',', expand=True).add_prefix('s1_'),
df['string2'].str.split(',', expand=True).add_prefix('s2_')])
pd.wide_to_long(df_joined, ['s1','s2'], 'ID', 'No', sep='_').reset_index()
Output:
ID No string1 string2 s1 s2
0 1 0 A,B,C X,Y,Z A X
1 1 1 A,B,C X,Y,Z B Y
2 1 2 A,B,C X,Y,Z C Z
Upvotes: 2