Reputation: 719
I have a CSV file that I am reading into an empty list, line by line, so the end result is a nested list with one inner list per line, e.g.:
[[1.1,2.6,3,0,4.8],[3.5,7.0,8.0]....and so on.....].
The problem is at the end of the file are empty strings which end up in the final list like:
[[1.1,2.6,3,0,4.8],[3.5,7.0,8.0],['','','','','','','','','']]
How do I get rid of these, or stop them being appended to the list? These are quite big CSV files, so I would prefer to stop them being appended to the initial list. I feel I am building an extra-large list when I probably don't need to, and this may cause memory issues. Here is the code so far:
csvfile = open(file_path, 'r')
reader = csv.reader(csvfile)
data_list = []
for row in reader:
    data_list.append(row)
csvfile.close()
i = 0
file_data = []
while i < len(data_list):
    j = 0
    while j < len(data_list[i]):
        try:
            data_list[i][j] = float(data_list[i][j])
        except ValueError:
            pass
        j += 1
    file_data.append(data_list[i])
    i += 1
print file_data
Upvotes: 2
Views: 1399
Reputation: 26335
import csv
csvfile = open('C:\\Users\\CBild\\Desktop\\test.txt', 'r')
reader = csv.reader(csvfile)
data_list = []
for row in reader:
    if any(field.strip() for field in row):
        data_list.append(row)
csvfile.close()
print(data_list)
gives
>>>
[['12 2 5'], ['1 5 4']]
Indeed, with the condition if any(field.strip() for field in row), rows whose fields contain only whitespace are also treated as empty rows and skipped.
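To make the difference concrete, here is a small sketch comparing the strip-based check with a plain any(row). The sample text and the names sample, kept_plain and kept_strip are made up for illustration, and io.StringIO stands in for a real file.

```python
import csv
import io

# Hypothetical sample: two data rows, a row of empty fields,
# and a row whose fields contain only spaces.
sample = "1.1,2.6,3.0\n3.5,7.0,8.0\n,,\n , , \n"

rows = list(csv.reader(io.StringIO(sample)))

# Plain truthiness: ' ' is a non-empty string, so the last row is kept.
kept_plain = [row for row in rows if any(row)]

# Stripping first: whitespace-only fields count as empty, so it is dropped.
kept_strip = [row for row in rows if any(field.strip() for field in row)]

print(len(kept_plain))  # 3
print(len(kept_strip))  # 2
```

If your trailing rows only ever contain truly empty strings, both checks give the same result; the strip version just costs a little more per row.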
Upvotes: 0
Reputation: 226171
The problem is at the end of the file are empty strings
You can just decide not to append them:
for row in reader:
    if any(row):  # Checks for at least one non-empty field
        data_list.append(row)
Here is how the any() function works:
>>> any(['132', '', '456'])
True
>>> any(['', '', ''])
False
Upvotes: 3
Reputation: 142098
Here's a simplified version of your code that makes it easier to see what you're attempting to do and is somewhat more Pythonic.
First, to open and read your file, we use the with statement so the file is automatically closed. We then build a generator that loops over the CSV file, keeps only rows that contain at least one non-blank column value, and converts each element to a float (via a helper function) if possible, otherwise leaving it as a string. Finally, data_list is built in one statement instead of by appending row by row:
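import csv

with open(file_path) as fin:
    csvin = csv.reader(fin)
    # list(...) makes each row a real list on Python 3 too, where map is lazy
    rows = (list(map(to_float_if_possible, row)) for row in csvin if any(row))
    # Consume the generator while the file is still open
    data_list = list(rows)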
And the helper function is defined as:
def to_float_if_possible(text):
    try:
        return float(text)
    except ValueError:
        return text
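A quick way to check the approach without a real file is to feed csv.reader an in-memory string. This is only a sketch: the sample data is invented, io.StringIO stands in for your file, and a nested list comprehension is used so that every row ends up as a plain list.

```python
import csv
import io

def to_float_if_possible(text):
    """Return text as a float if it parses, otherwise unchanged."""
    try:
        return float(text)
    except ValueError:
        return text

# Hypothetical sample: one non-numeric field and a trailing row of empty fields.
sample = "1.1,2.6,abc\n3.5,7.0,8.0\n,,\n"

with io.StringIO(sample) as fin:
    csvin = csv.reader(fin)
    # Skip rows with no non-empty field, convert the rest in the same pass.
    data_list = [
        [to_float_if_possible(field) for field in row]
        for row in csvin
        if any(row)
    ]

print(data_list)  # [[1.1, 2.6, 'abc'], [3.5, 7.0, 8.0]]
```

Because filtering and conversion happen in a single pass, only one list is ever built, which addresses the memory concern in the question.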
By the looks of it, you may wish to consider numpy or pandas when dealing with this type of data.
Upvotes: 1