Reputation: 23
I have data like this in a CSV file:
x Y
[2,3,4] [3.4,2.5,3.1]
[4,5,2] [6.2,7.5,9.7]
[2,6,9] [4.6,2.5,2.4]
[1,3,6] [8.9,7.5,9.2]
I want to calculate the mean of each list in every row:
x Y
[2,3,4] < mean [3.4,2.5,3.1] < mean
[4,5,2] < mean [6.2,7.5,9.7] < mean
[2,6,9] < mean [4.6,2.5,2.4] < mean
[1,3,6] < mean [8.9,7.5,9.2] < mean
and output the mean values to a CSV file.
How can I achieve this using Python (pandas)?
EDIT
After some research, I found a solution to my issue above:
import pandas as pd
import numpy as np
from ast import literal_eval
# CSV file to import
filename = "xy.csv"
fields = ['X', 'Y']  # column names; these must match the CSV header exactly (case-sensitive)
df = pd.read_csv(filename, usecols=fields, quotechar='"', sep=',', low_memory=True)
df.X = df.X.apply(literal_eval)  # parse each string such as "[2,3,4]" into a real list
df.X = df.X.apply(np.mean)  # mean of the list in each row of 'X'
print(df.X)  # print result
df.Y = df.Y.apply(literal_eval)
df.Y = df.Y.apply(np.mean)  # mean of the list in each row of 'Y'
print(df.Y)
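The code above prints the means but does not write them back to a CSV file, which the question also asked for. A minimal sketch of that last step follows, assuming each list is quoted in the file so its commas are not read as separators. The helper name `list_means` and the in-memory `csv_text` are illustrative only and not part of the original post.

```python
import io
from ast import literal_eval

import numpy as np
import pandas as pd

# Assumed file layout: every list is wrapped in double quotes.
csv_text = '''X,Y
"[2,3,4]","[3.4,2.5,3.1]"
"[4,5,2]","[6.2,7.5,9.7]"
"[2,6,9]","[4.6,2.5,2.4]"
"[1,3,6]","[8.9,7.5,9.2]"
'''


def list_means(source, columns):
    """Hypothetical helper: read `columns` and replace each list with its mean."""
    df = pd.read_csv(source, usecols=columns)
    for col in columns:
        # Parse the string into a list, then reduce it to a plain float.
        df[col] = df[col].apply(literal_eval).apply(lambda v: float(np.mean(v)))
    return df


means = list_means(io.StringIO(csv_text), ["X", "Y"])

# Write the result as CSV; an in-memory buffer stands in for a real file here.
out = io.StringIO()
means.to_csv(out, index=False)
print(out.getvalue())
```

With a real file, passing a path to `to_csv` instead of the buffer should work the same way.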
Upvotes: 1
Views: 209
Reputation: 23217
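You can use .applymap() with np.mean() to apply the function to every element of the DataFrame. Note that in pandas 2.1 and later, .applymap() is deprecated in favor of DataFrame.map().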
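import numpy as np
from ast import literal_eval
df = df.applymap(literal_eval)  # optional step, only needed if the cells are strings that look like lists rather than real lists; literal_eval is used here as a safer stand-in for eval
df = df.applymap(np.mean)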
Result:
print(df)
x Y
0 3.000000 3.000000
1 3.666667 7.800000
2 5.666667 3.166667
3 3.333333 8.533333
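For reference, here is a self-contained sketch of the same element-wise approach that should run on both older and newer pandas releases. The helper `apply_elementwise` is hypothetical and not part of pandas; it assumes `DataFrame.map` is available from pandas 2.1 onward and falls back to `applymap` otherwise. The sample frame assumes the cells arrive as strings.

```python
from ast import literal_eval

import numpy as np
import pandas as pd


def apply_elementwise(frame, func):
    """Hypothetical helper: use DataFrame.map when present, else applymap."""
    if hasattr(frame, "map"):
        return frame.map(func)
    return frame.applymap(func)


# Sample data from the question, stored as strings (an assumption).
raw = pd.DataFrame({
    "x": ["[2,3,4]", "[4,5,2]", "[2,6,9]", "[1,3,6]"],
    "Y": ["[3.4,2.5,3.1]", "[6.2,7.5,9.7]", "[4.6,2.5,2.4]", "[8.9,7.5,9.2]"],
})

# Step 1: turn every string cell into a real Python list.
parsed = apply_elementwise(raw, literal_eval)
# Step 2: reduce every list cell to its mean.
means = apply_elementwise(parsed, lambda values: float(np.mean(values)))
print(means)
```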
Upvotes: 2