Reputation: 401
I have a csv file that looks like this:
A, B
34, "1.0, 2.0"
24, "3.0, 4.0"
I'm reading the file using pandas:
import numpy as np
import pandas as pd
df = pd.read_csv('file.csv')
What I need to do is to replace the strings by numpy arrays:
for index, row in df.iterrows():
    df['B'][index] = np.fromstring(df['B'][index], sep=',')
However, it raises the error "A value is trying to be set on a copy of a slice from a DataFrame", even though the numpy arrays themselves are created correctly.
I need all values in B to be of type numpy.ndarray.
Edit: I tried replacing df with row in the code:
for index, row in df.iterrows():
    row['flux'] = np.fromstring(row['flux'][index][1:-1], sep=',')
No error is raised, but the type of the values doesn't change and the DataFrame still contains strings.
Upvotes: 1
Views: 1397
Reputation: 862641
Use the converters parameter of read_csv to convert the column to numpy arrays while reading:
import pandas as pd
import numpy as np
from io import StringIO
temp='''A,B
34,"1.0, 2.0"
24,"3.0, 4.0"'''
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), converters={'B':lambda x: np.fromstring(x, sep=',')})
print(df)
    A           B
0  34  [1.0, 2.0]
1  24  [3.0, 4.0]
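To confirm that the converter stores numpy.ndarray objects rather than strings, one could inspect the cells after loading. This is only a minimal sketch that reuses the sample data above; the variable name cell_types is my own and not part of the answer.

```python
import numpy as np
import pandas as pd
from io import StringIO

temp = '''A,B
34,"1.0, 2.0"
24,"3.0, 4.0"'''

# Same converter as above: parse each cell of B into a float array.
df = pd.read_csv(StringIO(temp),
                 converters={'B': lambda x: np.fromstring(x, sep=',')})

# Collect the Python type of every cell in B (illustrative check only).
cell_types = set(df['B'].map(type))
print(cell_types)      # expected to contain only numpy.ndarray
print(df['B'].dtype)   # the column itself has dtype object
```

The column dtype stays object because each cell holds a separate array; pandas does not turn the column into a 2-D numeric block.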
Upvotes: 2
Reputation: 4618
You can use apply to convert the column to that format after reading the file:
df['B'] = df['B'].apply(lambda x: np.fromstring(x, sep=','))
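For a frame that is already loaded, a self-contained sketch of this approach might look as follows. The sample frame is built by hand to mirror the question's data (an assumption, since the real file is not shown), and the final check is only there to confirm the cell type.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the loaded CSV (assumed to match the question's data).
df = pd.DataFrame({'A': [34, 24], 'B': ['1.0, 2.0', '3.0, 4.0']})

# Convert every string in B and assign the whole column back in one step.
df['B'] = df['B'].apply(lambda x: np.fromstring(x, sep=','))

print(type(df.loc[0, 'B']))  # <class 'numpy.ndarray'>
```

Assigning the whole column at once avoids the chained indexing df['B'][index] = ... that triggered the message in the question.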
Upvotes: 1