martian_rover

Reputation: 341

Convert Pandas column with list to string

I am trying to convert a pandas DataFrame column that holds a list in each row into a string in each row, but the conversion is not working. Here is what I have tried, based on other answers.

Code:

import pandas as pd
import numpy as np
data = pd.DataFrame.from_dict(d) # save the data given below in a variable d to run this line (the dict is keyed by column, so the default orient applies)

first approach:

data['tags'] = data['Tags'].apply(lambda x: ' '.join(map(str, x)))

second approach:

data['tags']=[''.join(map(str,item)) for item in data['Tags']]

But both of these give me the same list-like strings in the tags column, as shown below:

0    ['python', 'windows', 'pip', 'pygame', 'pycharm']
1                         ['converters', 'dxf', 'dwg']
2                           ['python', 'regex', 'nlp']
3          ['sql', 'join', 'dynamic', 'logic', 'case']
4                   ['r-markdown', 'hugo', 'blogdown']
Name: tags, dtype: object

and I want them in this form:

0    'python', 'windows', 'pip', 'pygame', 'pycharm'
1                         'converters', 'dxf', 'dwg'
2                           'python', 'regex', 'nlp'
3          'sql', 'join', 'dynamic', 'logic', 'case'
4                   'r-markdown', 'hugo', 'blogdown'
Name: tags, dtype: object

Here is the data (first 5 rows), from data.head(5).to_dict():

{'Tags': {0: "['python', 'windows', 'pip', 'pygame', 'pycharm']",
  1: "['converters', 'dxf', 'dwg']",
  2: "['python', 'regex', 'nlp']",
  3: "['sql', 'join', 'dynamic', 'logic', 'case']",
  4: "['r-markdown', 'hugo', 'blogdown']"},
 'cleaned_text': {0: 'matter pip version instal specific python version read round still stuck upgrade python 3 7 x python 3 8 1 windows 10 go cmd prompt check pip instal module',
  1: 'convert dwg dxf node php jave etc convert dwg file dxf node js python java already try ogr2ogr success thank advance',
  2: 'match text base string list extract subsection python try generate structure earning call text look like following sample operator lady gentleman thank stand welcome xyz fourth quarter',
  3: 'sql dynamically join table various column first time posting use case want join sale datum master agreement table determine applicable fee transactional level hard part agreement',
  4: 'ok update hugo run 2 year old version hugo academic theme blogdown late version hugo ubuntu 19 10 repos 0 58 new version 0 65 download hugo website'}}

Upvotes: 5

Views: 6531

Answers (3)

martian_rover

Reputation: 341

It turns out my problem was that each value is a string like "['a', 'b', 'd']" rather than a list, which is not visible when the dataframe is printed. Thanks to @Quang Hoang, who spotted this in the data I gave in the question. It can also be solved with ast.literal_eval():

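import ast
# literal_eval turns each string into a real list; joining the repr of each item gives the quoted form shown below
data['tags'] = [', '.join(map(repr, ast.literal_eval(item))) for item in data['Tags']]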

Output:

0    'python', 'windows', 'pip', 'pygame', 'pycharm'
1                         'converters', 'dxf', 'dwg'
2                           'python', 'regex', 'nlp'
3          'sql', 'join', 'dynamic', 'logic', 'case'
4                   'r-markdown', 'hugo', 'blogdown'
Name: tags, dtype: object

Here, literal_eval() evaluates an expression node or a string containing a Python literal or container display. To learn more about when and how to use it, refer to this question.
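A quick way to confirm this diagnosis is to check the type of one cell before and after parsing. The sketch below uses a hypothetical two-row sample shaped like the question's data (the name `sample` is mine, not from the question), so treat it as an illustration rather than the exact fix:

```python
import ast
import pandas as pd

# Hypothetical sample mirroring the question: each cell is a string, not a list
sample = pd.DataFrame({"Tags": ["['python', 'regex', 'nlp']",
                                "['converters', 'dxf', 'dwg']"]})

first_raw = sample["Tags"].iloc[0]
print(type(first_raw).__name__)     # a str, even though it looks like a list

# literal_eval safely parses the string form into a real Python list
parsed = sample["Tags"].apply(ast.literal_eval)
first_parsed = parsed.iloc[0]
print(type(first_parsed).__name__)  # now a list

# With real lists, join works element by element instead of character by character
joined = parsed.apply(", ".join)
print(joined.tolist())
```

Note that literal_eval raises an error on cells that are not valid Python literals (for example NaN), so a column with missing values would need those handled first.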

Upvotes: 0

Quang Hoang

Reputation: 150815

You can try:

data['Tags'] = data['Tags'].str[1:-1]

Output of data['Tags']:

0    'python', 'windows', 'pip', 'pygame', 'pycharm'
1                         'converters', 'dxf', 'dwg'
2                           'python', 'regex', 'nlp'
3          'sql', 'join', 'dynamic', 'logic', 'case'
4                   'r-markdown', 'hugo', 'blogdown'
Name: Tags, dtype: object
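This works because the cells are strings, so .str[1:-1] simply slices off the first and last character. As a rough sketch on a hypothetical two-row sample (the variable names are mine), here is the slice next to .str.strip('[]'), which only removes bracket characters and so may be a little safer if some cells have no brackets:

```python
import pandas as pd

# Hypothetical sample: cells are strings that look like lists
sample = pd.Series(["['python', 'regex', 'nlp']",
                    "['converters', 'dxf', 'dwg']"])

# Slicing drops the first and last character of every cell, whatever they are
sliced = sample.str[1:-1]

# strip('[]') removes only leading/trailing '[' or ']' characters
stripped = sample.str.strip("[]")

# Optionally drop the single quotes too, for plain comma-separated text
plain = stripped.str.replace("'", "", regex=False)

print(sliced.tolist())
print(plain.tolist())
```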

Upvotes: 1

Djib2011

Reputation: 7442

I think what you're looking for is:

data['tags'] = data['Tags'].apply(lambda x: ' '.join(x))

Example

ser = pd.Series([['python', 'windows', 'pip', 'pygame', 'pycharm'],
                 ['converters', 'dxf', 'dwg'],
                 ['python', 'regex', 'nlp'],
                 ['sql', 'join', 'dynamic', 'logic', 'case'],
                 ['r-markdown', 'hugo', 'blogdown']])

ser.apply(lambda x: ' '.join(x))

will produce

0    python windows pip pygame pycharm
1                   converters dxf dwg
2                     python regex nlp
3          sql join dynamic logic case
4             r-markdown hugo blogdown
dtype: object

If you want it exactly as you show, you can do the following:

ser.apply(lambda x: "'" + "', '".join(x) + "'")

which will produce

0    'python', 'windows', 'pip', 'pygame', 'pycharm'
1                         'converters', 'dxf', 'dwg'
2                           'python', 'regex', 'nlp'
3          'sql', 'join', 'dynamic', 'logic', 'case'
4                   'r-markdown', 'hugo', 'blogdown'
dtype: object
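Note that the examples above assume each cell is a real list. If the cells are strings that only look like lists (as the dict in the question suggests), join iterates over characters instead. A minimal sketch of a helper that handles both cases, where `quote_join` is a hypothetical name of mine and not a pandas function:

```python
import ast

def quote_join(value):
    """Hypothetical helper: return "'a', 'b'" for a list or for its string form."""
    # Parse string cells into lists first; leave real lists untouched
    items = ast.literal_eval(value) if isinstance(value, str) else value
    return "'" + "', '".join(items) + "'"

as_string = "['sql', 'join']"
as_list = ['sql', 'join']

# Joining a string directly puts the separator between single characters
spaced = ' '.join(as_string)

print(quote_join(as_string))
print(quote_join(as_list))
```

It could then be applied with data['Tags'].apply(quote_join), assuming every cell is either a list of strings or a valid list literal.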

Upvotes: 5
