Reputation: 373
I've been trying to write some code to read a CSV file. Some of the lines in the CSV are not complete. I would like the code to skip a bad line if there is data missing in one of the fields. I'm using the following code.
def Test():
    dataFile = open('test.txt','r')
    readFile = dataFile.read()
    lineSplit = readFile.split('\n')
    for everyLine in lineSplit:
        dividedLine = everyLine.split(';')
        a = dividedLine[0]
        b = dividedLine[1]
        c = dividedLine[2]
        d = dividedLine[3]
        e = dividedLine[4]
        f = dividedLine[5]
        g = dividedLine[6]
        print (a,b,c,d,e,f,g)
Upvotes: 1
Views: 2278
Reputation: 123473
In my opinion, the Pythonic way to do this would be to use the included csv module in conjunction with a try/except block (while following PEP 8 - Style Guide for Python Code).
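import csv

def test():
    # Python 3: open in text mode with newline='', as the csv docs recommend
    with open('reading_test.txt', newline='') as data_file:
        # the question's data is semicolon-separated, so set the delimiter
        for line in csv.reader(data_file, delimiter=';'):
            try:
                a, b, c, d, e, f, g = line
            except ValueError:
                continue  # ignore the line (wrong number of fields)
            print(a, b, c, d, e, f, g)

test()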
This approach is called "It's Easier to Ask Forgiveness than Permission" (EAFP). The other more common style is referred to as "Look Before You Leap" (LBYL). You can read more about them in this snippet from a book by a very authoritative author.
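To make the two styles concrete, here is a minimal sketch that applies each of them to the same semicolon-separated sample. The function names `rows_eafp` and `rows_lbyl`, the in-memory sample text, and the seven-field requirement are assumptions made for illustration; they are not part of the original answer.

```python
import csv
import io

EXPECTED_FIELDS = 7  # assumed from the seven columns in the question


def rows_eafp(text):
    """EAFP: try to unpack the row and skip it if unpacking fails."""
    good = []
    for row in csv.reader(io.StringIO(text), delimiter=';'):
        try:
            a, b, c, d, e, f, g = row
        except ValueError:  # too few or too many fields
            continue
        good.append((a, b, c, d, e, f, g))
    return good


def rows_lbyl(text):
    """LBYL: check the field count first, then use the row."""
    good = []
    for row in csv.reader(io.StringIO(text), delimiter=';'):
        if len(row) != EXPECTED_FIELDS:
            continue
        good.append(tuple(row))
    return good


# Hypothetical sample: one short line and one blank line should be skipped.
sample = "1;2;3;4;5;6;7\n1;2;3\n\na;b;c;d;e;f;g\n"
print(rows_eafp(sample))
print(rows_lbyl(sample))
```

Both functions keep the same rows here; the difference is only whether the check happens before the unpacking or as a reaction to its failure.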
Upvotes: 2
Reputation: 221
This doesn't seem all that Python-related so much as conceptual. A line parsed from a CSV row will be invalid if:
1. It is shorter than the minimum required length (i.e. elements are missing).
2. One or more of the parsed entries come back empty or None (only if all elements are required).
3. The type of an element doesn't match the intended type of the column (not in the scope of what you requested, but good to keep in mind).
In Python, once you have split the line, you can check the first two conditions with
if len(dividedLine) < intended_length or "" in dividedLine:
    continue
The first part only needs the intended length of a row, which you can usually take from the header (index) row. In the second part the quotes could be replaced with None or something similar, but split returns an empty string for a missing field, so in this case use "".
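As a rough sketch, the three conditions listed above could be folded into one helper. The name `is_valid_row` and the `converters` parameter are hypothetical, introduced only for this example; the type check assumes that a failed conversion raises `ValueError`, as `int` and `float` do for bad strings.

```python
def is_valid_row(fields, expected_len, converters=None):
    """Return True if the row passes the three checks described above.

    `converters` is an optional list of callables (e.g. int, float, str),
    one per column; a ValueError from any of them marks the row invalid.
    """
    # 1. Too short: elements are missing.
    if len(fields) < expected_len:
        return False
    # 2. An empty entry, assuming every column is required.
    if "" in fields:
        return False
    # 3. Optional per-column type check.
    if converters is not None:
        for convert, value in zip(converters, fields):
            try:
                convert(value)
            except ValueError:
                return False
    return True


print(is_valid_row("1;x;2.5".split(";"), 3))
print(is_valid_row("1;;2.5".split(";"), 3))
print(is_valid_row("1;x;oops".split(";"), 3, [int, str, float]))
```

Inside the question's loop this would be used as `if not is_valid_row(dividedLine, 7): continue`.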
HTH
Upvotes: 0
Reputation: 6068
Given that you cannot know beforehand whether a given line is incomplete, you need to check whether it is and skip it if so. You can use continue for this, which makes the for loop move on to the next iteration:
def Test():
    dataFile = open('test.txt','r')
    readFile = dataFile.read()
    lineSplit = readFile.split('\n')
    for everyLine in lineSplit:
        dividedLine = everyLine.split(';')
        if len(dividedLine) != 7:
            continue
        a = dividedLine[0]
        b = dividedLine[1]
        c = dividedLine[2]
        d = dividedLine[3]
        e = dividedLine[4]
        f = dividedLine[5]
        g = dividedLine[6]
        print (a,b,c,d,e,f,g)
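The same length check can also be written as a small generator that works line by line instead of reading the whole file first. This is only a sketch: `complete_rows` and its parameters are hypothetical names, and the default of seven fields is taken from the code above.

```python
def complete_rows(lines, expected_fields=7, sep=';'):
    """Yield the split fields of every line that has exactly `expected_fields`."""
    for line in lines:
        fields = line.rstrip('\n').split(sep)
        if len(fields) != expected_fields:
            continue  # incomplete (or blank) line: skip it
        yield fields


# Usage with a file, assuming 'test.txt' exists as in the question:
# with open('test.txt') as data_file:
#     for a, b, c, d, e, f, g in complete_rows(data_file):
#         print(a, b, c, d, e, f, g)

# Hypothetical in-memory sample: only the first line is complete.
sample = ["1;2;3;4;5;6;7\n", "1;2;3\n", "\n"]
print(list(complete_rows(sample)))
```

Because a file object is itself an iterable of lines, the generator accepts it directly, and the `with` block closes the file when the loop finishes.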
Upvotes: 0