Reputation: 17920
I have a pandas dataframe where one of the columns has an array of strings as each element.
So, something like this:
   col1            col2
0   120  ['abc', 'def']
1   130  ['ghi', 'klm']
Now when I store this to CSV using to_csv it seems fine, and when I read it back using from_csv it appears to read back correctly. But when I examine the value in each cell and iterate over it, I get '[', "'", 'a', 'b', 'c' and so on. So essentially it is not being read back as an array but as one long string of characters. Can somebody suggest how I can convert this string back into an array?
What I mean is that the array has been stored like a string:
'[\'abc\',\'def\']'
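To make the round trip concrete, here is a minimal sketch of what I am doing (the file name out.csv is just an example, and I am using read_csv here, which shows the same behaviour):
import pandas as pd

# Build the frame with a list of strings in each col2 cell
df = pd.DataFrame({'col1': [120, 130],
                   'col2': [['abc', 'def'], ['ghi', 'klm']]})

# Write and read back; the lists come back as plain strings
df.to_csv('out.csv', index=False)
df2 = pd.read_csv('out.csv')

print(type(df2.loc[0, 'col2']))  # <class 'str'>, not a list
print(df2.loc[0, 'col2'])        # "['abc', 'def']"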
Upvotes: 19
Views: 47274
Reputation: 375377
As mentioned in the other questions, you should use literal_eval here:
from ast import literal_eval
df['col2'] = df['col2'].apply(literal_eval)
In action:
In [11]: df = pd.DataFrame([[120, '[\'abc\',\'def\']'], [130, '[\'ghi\',\'klm\']']], columns=['A', 'B'])
In [12]: df
Out[12]:
     A              B
0  120  ['abc','def']
1  130  ['ghi','klm']
In [13]: df.loc[0, 'B'] # a string
Out[13]: "['abc','def']"
In [14]: df.B = df.B.apply(literal_eval)
In [15]: df.loc[0, 'B'] # now it's a list
Out[15]: ['abc', 'def']
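The same conversion can also be done while reading, by passing literal_eval through read_csv's converters argument; a minimal sketch, assuming the data lives in a file such as data.csv with a column named B:
import pandas as pd
from ast import literal_eval

# Parse column B back into real lists as the CSV is read
df = pd.read_csv('data.csv', converters={'B': literal_eval})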
Upvotes: 38
Reputation: 12092
Without pandas, this is one way to do it using the ast module's literal_eval():
>>> data = "['abc', 'def']"
>>> import ast
>>> a_list = ast.literal_eval(data)
>>> type(a_list)
<class 'list'>
>>> a_list[0]
'abc'
Upvotes: 2
Reputation: 17920
Never mind, got it.
All I had to do was
arr = s[1:-1].split(',')
This got rid of the square brackets and also split the string into an array like I wanted.
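A minimal sketch of applying this across the whole column (col2 is the column from my example above); note that each split piece still carries its surrounding quote characters and any spaces, so an extra strip may be needed:
# Per cell: drop the brackets and split on commas
df['col2'] = df['col2'].apply(lambda s: s[1:-1].split(','))

# Optional cleanup, since items come back like "'abc'" rather than "abc"
df['col2'] = df['col2'].apply(lambda lst: [x.strip().strip("'") for x in lst])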
Upvotes: 6
Reputation: 4874
Maybe try using a different separator value, like so:
DataFrame.to_csv(filepath, sep=';')
and then read it back with
DataFrame.from_csv(filepath, sep=';')
Upvotes: 0