Reputation: 23

How to calculate the mean of a list in a row in csv?

I have data like this in a CSV file:

x         Y            
[2,3,4]   [3.4,2.5,3.1]
[4,5,2]   [6.2,7.5,9.7]
[2,6,9]   [4.6,2.5,2.4]
[1,3,6]   [8.9,7.5,9.2]

I want to calculate the mean of each list in every row:

x                Y            
[2,3,4] < mean   [3.4,2.5,3.1] < mean
[4,5,2] < mean   [6.2,7.5,9.7] < mean
[2,6,9] < mean   [4.6,2.5,2.4] < mean
[1,3,6] < mean   [8.9,7.5,9.2] < mean

and output the mean value to a CSV file.

How can I achieve this using Python (pandas)?

EDIT

After some research, I found a solution to my issue above:

import pandas as pd
import numpy as np
from ast import literal_eval

# CSV file to import
filename = "xy.csv"
fields = ['X', 'Y']  # column names to read

df = pd.read_csv(filename, usecols=fields, quotechar='"', sep=',', low_memory=True)

# Each cell is read as a string such as "[2,3,4]"; parse it into a list first
df.X = df.X.apply(literal_eval)
df.X = df.X.apply(np.mean)  # mean of each list in column 'X'
print(df.X)

df.Y = df.Y.apply(literal_eval)
df.Y = df.Y.apply(np.mean)  # mean of each list in column 'Y'
print(df.Y)
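
The snippet above prints the means but does not yet write them to a CSV file, which the question also asked for. A minimal sketch of the full round trip might look like the following. It assumes the list cells are quoted in the file (they contain commas), and it uses an in-memory `io.StringIO` as a hypothetical stand-in for `xy.csv` and for the output file, so the names `raw`, `out`, and `result_csv` are illustrative only.

```python
import io
from ast import literal_eval

import numpy as np
import pandas as pd

# Hypothetical stand-in for xy.csv; list cells are quoted because they contain commas.
raw = io.StringIO(
    'X,Y\n'
    '"[2,3,4]","[3.4,2.5,3.1]"\n'
    '"[4,5,2]","[6.2,7.5,9.7]"\n'
    '"[2,6,9]","[4.6,2.5,2.4]"\n'
    '"[1,3,6]","[8.9,7.5,9.2]"\n'
)

df = pd.read_csv(raw)

# Parse each string cell into a list, then replace it with the mean of that list.
for col in ["X", "Y"]:
    df[col] = df[col].apply(literal_eval).apply(np.mean)

# Write the means as CSV; pass a file path instead of `out` to write to disk.
out = io.StringIO()
df.to_csv(out, index=False)
result_csv = out.getvalue()
print(result_csv)
```

With a real file, `pd.read_csv("xy.csv")` and `df.to_csv("means.csv", index=False)` would take the place of the two `StringIO` objects (the output file name here is an assumption).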

Upvotes: 1

Answers (2)

SeaBean

Reputation: 23217

You can use .applymap() with np.mean() to map the dataframe element-wise.

import numpy as np

df = df.applymap(eval)     # only needed if the cells are strings that look like lists rather than actual lists
df = df.applymap(np.mean)

Result:

print(df)


          x         Y
0  3.000000  3.000000
1  3.666667  7.800000
2  5.666667  3.166667
3  3.333333  8.533333
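
For reference, here is a self-contained sketch of the same element-wise idea with two substitutions that are not part of the answer above: `ast.literal_eval` in place of `eval` (it only parses Python literals, so it is safer on untrusted CSV content), and `Series.map` applied per column in place of `applymap`, which has been deprecated since pandas 2.1 in favor of `DataFrame.map`. The helper name `cell_mean` is hypothetical.

```python
from ast import literal_eval

import numpy as np
import pandas as pd

# Sample data from the question, with each cell stored as a string.
df = pd.DataFrame({
    "x": ["[2,3,4]", "[4,5,2]", "[2,6,9]", "[1,3,6]"],
    "Y": ["[3.4,2.5,3.1]", "[6.2,7.5,9.7]", "[4.6,2.5,2.4]", "[8.9,7.5,9.2]"],
})

def cell_mean(cell):
    """Parse a string such as "[2,3,4]" into a list and return its mean."""
    return float(np.mean(literal_eval(cell)))

# Apply the helper to every cell, one column at a time.
means = df.apply(lambda col: col.map(cell_mean))
print(means)
```

Mapping per column avoids depending on which of `applymap` or `DataFrame.map` the installed pandas version provides.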

Upvotes: 2

Nk03

Reputation: 14949

Via applymap:

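import numpy as np
# If the cells are strings: df = df.applymap(lambda x: sum(eval(x)) / len(eval(x)))
df = df.applymap(np.mean)  # suggested by alex
# Alternative without NumPy (use instead of the line above, not after it):
# df = df.applymap(lambda x: sum(x) / len(x))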

OUTPUT:

          x         Y
0  3.000000  3.000000
1  3.666667  7.800000
2  5.666667  3.166667
3  3.333333  8.533333

Upvotes: 2
