Pandas separate column containing string with a semicolon to multiple columns

Question

I am unable to split a pandas Series whose values contain semicolons. Is it because I am using the column name ('Social_Media') as an index, or because Python won't recognise a semicolon as a split character? Or is something wrong with my script?

# Keep only the rows where Social_Media is not NaN
df2 = df[df['Social_Media'].notnull()]
# Splitter for semicolon
df2['Social_Media'].apply(lambda x: x.split(';')[0])

#This is my output after the split
Timestamp                             
2017-06-01 18:10:46          Twitter;Facebook;Instagram;WhatsApp;Google+
2017-06-01 19:24:04          Twitter;Facebook;Instagram;WhatsApp;Google+
2017-06-01 19:25:21          Twitter;Facebook;Instagram;WhatsApp;Google+

This is the output I need:

Timestamp                    name_a  name_b   name_c    name_d   name_e
2017-06-01 18:10:46          Twitter Facebook Instagram WhatsApp Google+
2017-06-01 19:24:04          Twitter Facebook Instagram WhatsApp Google+
2017-06-01 19:25:21          Twitter Facebook Instagram WhatsApp Google+

jezrael · Accepted Answer

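You can use str.split with expand=True, which returns a DataFrame with one column per piece, and add_prefix to name the columns. A semicolon is a valid separator and selecting by column name is fine; the apply call in the question returns a new Series that is never assigned, so df2 stays unchanged, and [0] would keep only the first piece anyway.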

df = df['Social_Media'].str.split(';', expand=True).add_prefix('name_')
print(df)
                      name_0    name_1     name_2    name_3   name_4
Timestamp                                                           
2017-06-01 18:10:46  Twitter  Facebook  Instagram  WhatsApp  Google+
2017-06-01 19:24:04  Twitter  Facebook  Instagram  WhatsApp  Google+
2017-06-01 19:25:21  Twitter  Facebook  Instagram  WhatsApp  Google+
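To try this without the original data, here is a minimal self-contained sketch. The sample frame is an assumption built to mirror the rows shown in the question, and the variable name parts is only illustrative. It assigns the result to a new name so the original df is kept.

```python
import pandas as pd

# Hypothetical sample data mirroring the frame in the question
idx = pd.to_datetime([
    "2017-06-01 18:10:46",
    "2017-06-01 19:24:04",
    "2017-06-01 19:25:21",
])
df = pd.DataFrame(
    {"Social_Media": ["Twitter;Facebook;Instagram;WhatsApp;Google+"] * 3},
    index=idx,
)
df.index.name = "Timestamp"

# expand=True returns a DataFrame with one column per piece;
# add_prefix turns the integer labels 0..4 into name_0..name_4
parts = df["Social_Media"].str.split(";", expand=True).add_prefix("name_")

# str.split does not modify df, so the result has to be assigned
print(parts)
```

If the new columns should sit next to the original ones, df.join(parts) should work, since both share the same index.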

And for column names lettered alphabetically (name_a, name_b, ...):

import string

# Map the integer column labels 0..25 to 'name_a'..'name_z'
L = list(string.ascii_lowercase)
names = dict(zip(range(len(L)), ['name_' + x for x in L]))

df = df['Social_Media'].str.split(';', expand=True).rename(columns=names)
print(df)
                      name_a    name_b     name_c    name_d   name_e
Timestamp                                                           
2017-06-01 18:10:46  Twitter  Facebook  Instagram  WhatsApp  Google+
2017-06-01 19:24:04  Twitter  Facebook  Instagram  WhatsApp  Google+
2017-06-01 19:25:21  Twitter  Facebook  Instagram  WhatsApp  Google+
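The lettered variant can also be wrapped in a small helper. This is only a sketch: the function name split_to_lettered_columns and the sample Series are hypothetical and not part of the original answer. It also shows what happens when rows have different numbers of pieces or are missing, in which case the unused cells come back as missing values.

```python
import string

import pandas as pd


def split_to_lettered_columns(series, sep=";", prefix="name_"):
    """Split a string Series on sep into columns prefix+a, prefix+b, ..."""
    parts = series.str.split(sep, expand=True)
    letters = string.ascii_lowercase
    # Only 26 letters are available, so fail loudly beyond that
    if parts.shape[1] > len(letters):
        raise ValueError("more pieces than letters available")
    # Columns are labelled 0, 1, 2, ...; map each to its letter
    return parts.rename(columns=lambda i: prefix + letters[i])


# Hypothetical data with uneven piece counts and one missing value
s = pd.Series(
    ["Twitter;Facebook", "Twitter;Facebook;Instagram", None],
    name="Social_Media",
)
out = split_to_lettered_columns(s)
print(out)
```

Because missing rows simply produce missing values, filtering with notnull() beforehand is optional with this approach.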
