Reputation: 679
I have the following DataFrame
index word decoded_Word language
0 potato [17, 24, 1, 21, 1, 24] english
1 animal [21, 13, 23, 18, 21, 25] english
2 שלום ... hebrew
and I want to convert it to a CSV file. I used the following line:
df.to_csv('dataset.csv',encoding='utf8',index=False)
and get the following file
potato,[17 24 1 21 1 24],english
animals,[21 13 23 18 21 25 4],english
שלום,[21 12 6 24],hebrew
But when I execute the following code:
data = pd.read_csv('dataset.csv')
print(type(data['decoded_Word'][0]))
the result is str.
I would like to know if there is a better way to save and load the NumPy array.
Thank you.
Upvotes: 2
Views: 4662
Reputation: 198
That is normal, because pandas does not store the format of the columns in a csv file, and there is only so much it can infer.
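To solve this, after loading the dataset (that is, after data = pd.read_csv('dataset.csv')), parse each string back into an array:
data['decoded_Word'] = data['decoded_Word'].map(lambda s: np.array(s.strip('[]').split(), dtype=int))
Note that data['decoded_Word'].astype(list) will not do this, because list is not a dtype pandas understands; the strings have to be parsed explicitly. The line above assumes the space-separated form shown in the question and gives numpy.ndarray values. Call .tolist() on each one if you need plain lists instead.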
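As a minimal sketch of that round trip, the snippet below rebuilds a two-row stand-in for the question's DataFrame, writes it to an in-memory buffer rather than 'dataset.csv', and parses the strings back. The helper name parse_array is hypothetical, and the parsing assumes the space-separated text that NumPy produces for short one-dimensional integer arrays.

```python
import io

import numpy as np
import pandas as pd

# Small stand-in for the DataFrame in the question.
df = pd.DataFrame({
    "word": ["potato", "animal"],
    "decoded_Word": [np.array([17, 24, 1, 21, 1, 24]),
                     np.array([21, 13, 23, 18, 21, 25])],
    "language": ["english", "english"],
})

# Use an in-memory buffer so the sketch does not touch the disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)

data = pd.read_csv(buffer)
raw = data["decoded_Word"][0]  # the array comes back as plain text


def parse_array(text):
    """Hypothetical helper: parse NumPy's space-separated text form."""
    return np.array(text.strip("[]").split(), dtype=int)


data["decoded_Word"] = data["decoded_Word"].map(parse_array)
```

This only covers short 1-D integer arrays. Very long arrays are abbreviated with "..." when NumPy turns them into text, so they cannot be recovered this way.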
An alternative, if this is possible for you, is to store the dataframe in another format, e.g., pickle:
data.to_pickle('dataset.pkl')
This should preserve the column types.
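A short sketch of the pickle round trip, assuming a stand-in DataFrame and a temporary directory in place of a real 'dataset.pkl' path. Keep in mind that pickle files should only be loaded from sources you trust.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Small stand-in for the DataFrame in the question.
df = pd.DataFrame({
    "word": ["potato", "animal"],
    "decoded_Word": [np.array([17, 24, 1, 21, 1, 24]),
                     np.array([21, 13, 23, 18, 21, 25])],
    "language": ["english", "english"],
})

# Write and read back inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "dataset.pkl")
    df.to_pickle(path)
    restored = pd.read_pickle(path)

# The cell is still a NumPy array, with no parsing step needed.
first = restored["decoded_Word"][0]
```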
Note: I see a comment indicating that you should use eval. This should work as well, but, as a rule, I prefer never to use eval for manipulating data unless it is the only way and you are very sure there is no security threat.
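If the column was saved as comma-separated list text, one lower-risk alternative to eval is ast.literal_eval from the standard library, which accepts only Python literals. The sketch below assumes strings such as "[17, 24, 1]"; it would not parse the space-separated text shown in the question's CSV.

```python
import ast

import pandas as pd

# Assumed input: list literals stored as text, one per row.
strings = pd.Series(["[17, 24, 1, 21, 1, 24]", "[21, 13, 23, 18, 21, 25]"])

# Each string becomes a real Python list.
lists = strings.map(ast.literal_eval)

# Anything that is not a plain literal is rejected instead of being executed.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
```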
Upvotes: 2
Reputation: 24322
Before saving, change the type of 'decoded_Word' from np.array to list, then save it to CSV:
# .tolist() yields plain Python ints, so each cell is written as e.g. [17, 24, 1]
df['decoded_Word'] = df['decoded_Word'].map(lambda a: a.tolist())
# Finally, save the CSV:
df.to_csv('dataset.csv', encoding='utf8', index=False)
Now load that file:
data = pd.read_csv('dataset.csv')
# 'decoded_Word' is read back as strings, so turn each one into a real list:
data['decoded_Word'] = pd.eval(data['decoded_Word'])
# Optional: convert the lists to arrays if you need them:
data['decoded_Word'] = data['decoded_Word'].map(np.array)
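The same save-as-list idea can be sketched end to end with the parsing done at read time through the converters parameter of pd.read_csv. This is a variation rather than the code above: it swaps pd.eval for ast.literal_eval, uses an in-memory buffer instead of 'dataset.csv', and the helper name to_array is hypothetical.

```python
import ast
import io

import numpy as np
import pandas as pd

# Small stand-in for the DataFrame in the question.
df = pd.DataFrame({
    "word": ["potato", "animal"],
    "decoded_Word": [np.array([17, 24, 1, 21, 1, 24]),
                     np.array([21, 13, 23, 18, 21, 25])],
    "language": ["english", "english"],
})

# Store plain Python lists so the CSV cell reads "[17, 24, 1, 21, 1, 24]".
df["decoded_Word"] = df["decoded_Word"].map(lambda a: a.tolist())

buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)


def to_array(text):
    """Hypothetical helper: parse a list literal and wrap it in an array."""
    return np.array(ast.literal_eval(text))


# converters applies to_array to every cell of the named column while reading.
data = pd.read_csv(buffer, converters={"decoded_Word": to_array})
```

Doing the conversion inside read_csv keeps loading to a single call, at the cost of having to remember the converter whenever the file is read.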
Upvotes: 1