Reputation: 4301
I have a DataFrame with a column that holds NumPy arrays.
I saved it to a CSV file.
After loading the CSV file, I found that the column that held the NumPy arrays is read back as strings.
How can I convert it back to NumPy arrays using read_csv?
import pandas as pd
import numpy as np
df = pd.DataFrame(columns = ['name', 'sex'])
df.loc[len(df), :] = ['Sam', 'M']
df.loc[len(df), :] = ['Mary', 'F']
df.loc[len(df), :] = ['Ann', 'F']
# insert np.array values as a new object column (avoids chained assignment)
df['data'] = [np.array([2, 5, 7]), np.array([6, 4, 8]), np.array([9, 2, 1])]
# save to csv file
df.to_csv('data.csv', index=False)
# load csv file
# the data column comes back as strings; how to change it to np.array?
df2 = pd.read_csv('data.csv')
Upvotes: 3
Views: 2533
Reputation: 7848
It's a workaround, and it assumes the numbers in each string are separated by single spaces:
In [114]: df2['data'] = df2.data.str.split(' ',expand=True).replace(r'\[|\]','',regex=True).astype(int).values.tolist()
In [115]: df2['data'] = [np.array(i) for i in df2.data]
In [116]: df2.loc[0,'data']
Out[116]: array([2, 5, 7])
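Since the question asks about read_csv itself, here is a minimal sketch of another option: passing a function through the converters argument of read_csv so the column is parsed while loading. It assumes the arrays are one-dimensional integer arrays written in NumPy's default text form, such as [2 5 7]. The helper name parse_array is made up for illustration, and the in-memory csv_text stands in for data.csv.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for data.csv as written by df.to_csv('data.csv', index=False)
csv_text = "name,sex,data\nSam,M,[2 5 7]\nMary,F,[6 4 8]\nAnn,F,[9 2 1]\n"


def parse_array(text):
    # Hypothetical helper: drop the brackets, split on any whitespace,
    # and convert each piece to int (assumes 1-D integer arrays).
    return np.array([int(piece) for piece in text.strip('[]').split()])


# converters maps a column name to a function applied to each raw cell string
df2 = pd.read_csv(io.StringIO(csv_text), converters={'data': parse_array})
first = df2.loc[0, 'data']
print(type(first), first)
```

Splitting on any whitespace rather than a single space also copes with the padding NumPy adds when the numbers have different widths.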
Upvotes: 1
Reputation: 3421
pandas has only a handful of column dtypes: object, float, int, bool, datetime, timedelta and category. Lists, strings, arrays and so on are all stored as object dtype. You can read more about it at http://pbpython.com/pandas_dtypes.html. The astype function only converts between these dtypes, so on its own it cannot parse the strings back into arrays.
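To illustrate the point about object dtype, here is a small sketch that round-trips a column of arrays through CSV in memory. It uses io.StringIO in place of a file on disk, and the two-row frame is only an illustration, not the asker's exact data.

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({'name': ['Sam', 'Mary']})
df['data'] = [np.array([2, 5, 7]), np.array([6, 4, 8])]
print(df['data'].dtype)  # arrays are held in an object column

# Write to an in-memory buffer instead of data.csv, then read it back
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
df2 = pd.read_csv(buffer)

# Each cell is now the text form of the array, not the array itself
cell = df2.loc[0, 'data']
print(type(cell), repr(cell))
```

The CSV file only stores the text that str() produces for each array, so the strings have to be parsed explicitly after (or while) loading.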
Upvotes: 0