Reputation: 421
I have a dataframe with a column that contains lists of numbers (positive and negative). When I write the dataframe to a CSV and read it back, each list comes back as a string. Converting it back to a list is awkward: Python complains about the square brackets and the minus sign. Is there a way to persist lists of numbers and read them back as lists of numbers?
import pandas as pd
data = [['tom', [10,-5,3]], ['dave', [15,-1,4]], ['al', [14,-1,-1]]]
df1 = pd.DataFrame(data, columns = ['Name', 'Points'])
df1.to_csv("points.csv")
df2 = pd.read_csv("points.csv")
The Points column in df2 is a string. How do I convert it to a list of numbers?
Upvotes: 0
Views: 553
Reputation: 33938
Don't store your data as Python lists inside a pandas dataframe. That's going to be a pain to write out as CSV and read back, because the types will get mangled. (You can avoid that with pickle or JSON, but why create unnecessary complications?)
It's easier to store the data as a native pandas dataframe, with one column per name:
df3 = pd.DataFrame({'tom': [10,-5,3], 'dave': [15,-1,4], 'al': [14,-1,-1]})
df3
tom dave al
0 10 15 14
1 -5 -1 -1
2 3 4 -1
df3.to_csv('my.csv', index=False)
# Now when we read it back in, the integer columns remain integer...
df3in = pd.read_csv('my.csv')
df3in
tom dave al
0 10 15 14
1 -5 -1 -1
2 3 4 -1
df3in.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
tom 3 non-null int64
dave 3 non-null int64
al 3 non-null int64
dtypes: int64(3)
memory usage: 152.0 bytes
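If the list-per-row layout has to stay, the JSON route mentioned above can still go through a CSV file. The following is a minimal sketch, not the only way to do it: it assumes each list is encoded with json.dumps before writing and decoded with json.loads through the converters argument of read_csv. The in-memory buffer and the names encoded, buffer, df_json and points_first are placeholders for illustration.

```python
import io
import json

import pandas as pd

data = [['tom', [10, -5, 3]], ['dave', [15, -1, 4]], ['al', [14, -1, -1]]]
df1 = pd.DataFrame(data, columns=['Name', 'Points'])

# Encode each list as a JSON string before writing the CSV.
encoded = df1.assign(Points=df1['Points'].apply(json.dumps))

# An in-memory buffer stands in for a file on disk (assumption for this sketch).
buffer = io.StringIO()
encoded.to_csv(buffer, index=False)
buffer.seek(0)

# Decode the JSON strings back into Python lists while reading.
df_json = pd.read_csv(buffer, converters={'Points': json.loads})
points_first = df_json.loc[0, 'Points']
print(type(points_first))
```

The file stays a plain CSV that other tools can open, at the cost of one extra encode and decode step for the list column.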
Upvotes: 1
Reputation: 862671
You can use pickle here, with DataFrame.to_pickle and read_pickle, because a CSV file stores every value as text, so the lists come back as strings:
data = [['tom', [10,-5,3]], ['dave', [15,-1,4]], ['al', [14,-1,-1]]]
df1 = pd.DataFrame(data, columns = ['Name', 'Points'])
df1.to_pickle("points.pkl")
df2 = pd.read_pickle("points.pkl")
print(type(df2.loc[0, 'Points']))
<class 'list'>
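If the CSV already exists and switching to pickle is not an option, the string column can also be parsed while reading. This is a rough sketch under the assumption that every cell holds a valid Python list literal such as "[10, -5, 3]"; it uses ast.literal_eval from the standard library via the converters argument of read_csv. The in-memory buffer stands in for points.csv.

```python
import ast
import io

import pandas as pd

data = [['tom', [10, -5, 3]], ['dave', [15, -1, 4]], ['al', [14, -1, -1]]]
df1 = pd.DataFrame(data, columns=['Name', 'Points'])

# Write the CSV as in the question; the buffer replaces "points.csv" here.
buffer = io.StringIO()
df1.to_csv(buffer)
buffer.seek(0)

# literal_eval safely parses the list text, including negative numbers.
df2 = pd.read_csv(buffer, index_col=0, converters={'Points': ast.literal_eval})
print(type(df2.loc[0, 'Points']))
```

Unlike eval, literal_eval only accepts Python literals, so it does not run arbitrary code found in the file.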
Upvotes: 2