Reputation: 169
I have a data frame of messages from a social network. In this data frame, I used a lambda function to create a new column with the stop words removed. As a result, each value in the new column is a list of words. What I need is to get the values out of each list, as a single comma-separated string per cell.
What I have:
import pandas as pd

# Each cell in MESSAGES holds a list of words of varying length
raw_data = {'CLASS': ['1', '2', '3', '1', '2', '3', '2'],
            'MESSAGES': [['mama', 'said', 'home'], ['dad', 'said', 'soccer', 'reality'], ['matrix', 'you'],
                         ['run', 'neo', 'free'], ['what', 'doing'], ['begnning', 'believe'],
                         ['choice', 'let', 'you', 'free', 'mind']]}
dfRaw = pd.DataFrame(raw_data, columns=['CLASS', 'MESSAGES'])
What I need:
clean_data = {'CLASS': ['1', '2', '3', '1', '2', '3', '2'],
              'MESSAGES': ['mama, said, home', 'dad, said, soccer, reality', 'matrix, you',
                           'run, neo, free', 'what, doing', 'begnning, believe',
                           'choice, let, you, free, mind']}
dfEndResult = pd.DataFrame(clean_data, columns=['CLASS', 'MESSAGES'])
I found a thread here on Stack Overflow that suggested this function:
dfRaw.applymap(lambda x: x if not isinstance(x, list) else x[0] if len(x) else '')
However, this function does not help me, because it keeps only the first element of each list, so it is only useful when every list has a single element. In my case, the list in each cell has a different length.
Thank you all for the help.
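For reference, here is a minimal sketch of one possible way to get the result described above. It assumes every list in MESSAGES contains only strings, and it uses a shortened version of the sample data, including an empty list as an extra edge case that is not in the original. Each list is joined with ', ' through Series.apply, and any non-list cell is left untouched:

```python
import pandas as pd

# Shortened sample data; the empty list is an added edge case for illustration
raw_data = {
    'CLASS': ['1', '2', '3'],
    'MESSAGES': [['mama', 'said', 'home'], ['matrix', 'you'], []],
}
dfRaw = pd.DataFrame(raw_data, columns=['CLASS', 'MESSAGES'])

# Work on a copy so the original list column stays available
dfEndResult = dfRaw.copy()

# Join each list into one comma-separated string; pass other values through
dfEndResult['MESSAGES'] = dfRaw['MESSAGES'].apply(
    lambda x: ', '.join(x) if isinstance(x, list) else x
)

print(dfEndResult)
```

Unlike the x[0] approach, str.join uses every element, so lists of different lengths are handled the same way, and an empty list becomes an empty string.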
Upvotes: 0
Views: 27