Reputation: 69
I am trying to iterate through a .csv file and do some calculations on it. The file is 10001 lines long, but when my program runs it only seems to read 5001 of those lines. Am I doing something wrong when reading in my data, or am I running into a memory limit or some other limitation? The calculations themselves work, but in some instances they are off from the expected results, which leads me to believe that reading the missing half of the data would fix this.
fileName = 'normal.csv' #input("Enter a file name: ").strip()
file = open(fileName, 'r') #open the file for reading
header = file.readline().strip().split(',') #Get the header line
data = [] #Initialise the dataset
for index in range(len(header)):
    data.append([])
for yy in file:
    ln = file.readline().strip().split(',') #Store the line
    for xx in range(len(data)):
        data[xx].append(float(ln[xx]))
And here is some sample output. It is not completely formatted yet, but it will be eventually:
"""The file normal.csv contains 3 columns and 5000 records.
Column Heading | Mean | Std. Dev.
--------------------+--------------------+--------------------
Width [mm]|999.9797|2.5273
Height [mm]|499.9662|1.6889
Thickness [mm]|12.0000|0.1869"""
As this is homework, I would ask that you keep responses helpful without giving the outright solution. Thank you.
Upvotes: 0
Views: 2176
Reputation: 1121534
That's because you are asking Python to read lines in two different locations:
for yy in file:
and
ln = file.readline().strip().split(',') #Store the line
yy is already a line from the file, but you ignored it; iteration over a file object yields lines from the file. You then read another line using file.readline().
If you use iteration, don't use readline() as well, just use yy:
for yy in file:
    ln = yy.strip().split(',') #Store the line
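To see why only half of the rows arrive, here is a minimal sketch that runs both loop styles over the same input. It uses an in-memory io.StringIO as a hypothetical stand-in for normal.csv, and the helper names read_mixed and read_iter are made up for illustration; they are not part of the question's code.

```python
import io

# Hypothetical stand-in for the CSV file: one header line plus four data rows.
sample = "a,b\n1,2\n3,4\n5,6\n7,8\n"

def read_mixed(f):
    """Iterate AND call readline(), as in the question."""
    f.readline()                      # consume the header line
    rows = []
    for yy in f:                      # the loop consumes one line (then ignored)
        ln = f.readline()             # readline() consumes the following line
        rows.append(ln.strip().split(','))
    return rows

def read_iter(f):
    """Iterate only, using the line the loop already provides."""
    f.readline()                      # consume the header line
    return [yy.strip().split(',') for yy in f]

mixed = read_mixed(io.StringIO(sample))
fixed = read_iter(io.StringIO(sample))

print(len(mixed), len(fixed))         # half of the rows versus all of them
```

Each pass through the mixed loop uses up two lines but stores only one, which matches the 5000 records reported for a file with 10000 data rows.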
You are re-inventing the CSV-reading wheel, however. Just use the csv module instead.
You can read all data in a CSV file into a list per column with some zip() function trickery:
import csv
with open(fileName, 'r', newline='') as csvfile:
    reader = csv.reader(csvfile, quoting=csv.QUOTE_NONNUMERIC) # convert to float
    header = next(reader, None) # read one row, the header, or None
    data = list(zip(*reader)) # transpose rows to columns
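One caveat worth checking against your own file: csv.QUOTE_NONNUMERIC converts every unquoted field to float, so a header row whose names are not wrapped in quotes would raise a ValueError. The sketch below assumes an unquoted header and works around it by reading the header with a plain reader first; the sample values and the io.StringIO stand-in for the file are invented for illustration.

```python
import csv
import io

# Hypothetical sample data; the real file would be opened with
# open(fileName, 'r', newline='') instead of io.StringIO.
sample = (
    "Width [mm],Height [mm],Thickness [mm]\n"
    "1000.5,499.5,12.0\n"
    "998.5,501.0,11.5\n"
)

with io.StringIO(sample) as csvfile:
    # A plain reader returns the header row as strings.
    header = next(csv.reader(csvfile), None)
    # A second reader on the same file object converts the remaining
    # unquoted fields to float.
    reader = csv.reader(csvfile, quoting=csv.QUOTE_NONNUMERIC)
    data = list(zip(*reader))         # transpose rows to columns (tuples)

for name, column in zip(header, data):
    print(name, column)
```

Note that zip() yields one tuple per column rather than a list; wrap each in list() if you need to append to the columns later.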
Upvotes: 2