kian

Reputation: 87

Calculating the mean of columns of a text file in Python

I have a text file with 13 columns and 10000 rows. I need to calculate the mean and standard deviation of the second and fifth columns, but in blocks of 200 rows at a time. Code:

with open('myfile.txt') as f:
    lis1 = [float(line.split()[1]) for line in f]
    lis2 = [float(line.split()[4]) for line in f]
    i = 0

    while (i < len(lis1)):
        g1 = sum(lis1[i:i+200])/200
        g2 = sum(lis2[i:i+200])/200
        i=i+200 

I cannot understand why g2 is empty. How can I fix it?

Upvotes: 2

Views: 1174

Answers (2)

Thane Plummer

Reputation: 10208

If possible you should only read through your file once. Otherwise you have to reset the file pointer to read it again. Note: code is untested.

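with open('myfile.txt') as f:
    # Convert every field to float so the columns can be summed
    # (assumes all 13 columns are numeric).
    lines = [[float(x) for x in line.split()] for line in f]

# Now file is closed - operate on the lines read in.
inc = 200
for i in range(0, len(lines), inc):
    chunk = lines[i:i+inc]
    # Get all column totals for this block of rows
    column_total = [sum(x) for x in zip(*chunk)]
    # Divide by the actual block length in case the last block is short
    g1 = column_total[1] / len(chunk)
    g2 = column_total[4] / len(chunk)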
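
The question also asks for the standard deviation, which the loop above does not compute. Here is a minimal sketch of one way to get both values per block using the standard-library `statistics` module. The helper name `block_stats` and the small in-memory `sample` list are illustrative assumptions, not part of the original code, and `statistics.stdev` returns the sample standard deviation (use `statistics.pstdev` if the population form is wanted).

```python
import statistics

def block_stats(rows, col, size=200):
    """Return a (mean, sample std dev) pair for each block of `size` rows.

    `rows` is any iterable of whitespace-separated text lines and
    `col` is the zero-based column index (hypothetical helper).
    """
    values = [float(row.split()[col]) for row in rows]
    results = []
    for start in range(0, len(values), size):
        block = values[start:start + size]
        mean = statistics.mean(block)
        # stdev needs at least two points; use 0.0 for a one-value block
        std = statistics.stdev(block) if len(block) > 1 else 0.0
        results.append((mean, std))
    return results

# Tiny stand-in for the file: 6 rows, 5 columns, blocks of 3 rows
sample = ["0 1 0 0 10", "0 2 0 0 20", "0 3 0 0 30",
          "0 4 0 0 40", "0 5 0 0 50", "0 6 0 0 60"]
second = block_stats(sample, col=1, size=3)
fifth = block_stats(sample, col=4, size=3)
```

With the real data, the lines could be read once with `f.readlines()` inside the `with` block and passed to the helper twice, once with `col=1` and once with `col=4`, keeping the default `size=200`.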

Upvotes: 1

Kamejoin

Reputation: 347

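This happens because building the list "lis1" iterates through the entire file, which leaves the file pointer at the end, so the second list comprehension has nothing left to read and lis2 comes out empty. Reset the pointer by calling f.seek(0) between the lis1 and lis2 comprehensions.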
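
A small sketch of the effect, using `io.StringIO` as a stand-in for the opened file so it runs without `myfile.txt` (the sample rows and the name `exhausted` are made up for illustration). A real file object returned by `open()` behaves the same way with respect to iteration and `seek(0)`.

```python
import io

# io.StringIO stands in for open('myfile.txt'); three rows, five columns
f = io.StringIO("a 1 b c 10\na 2 b c 20\na 3 b c 30\n")

# First pass consumes the whole file
lis1 = [float(line.split()[1]) for line in f]

# Second pass without rewinding: nothing left to read, so the list is empty
exhausted = [float(line.split()[4]) for line in f]

# Rewind to the start, then the second pass sees every line again
f.seek(0)
lis2 = [float(line.split()[4]) for line in f]
```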

Upvotes: 1
