ozzyzig

Reputation: 719

Remove empty lines from file or nested list

I have a CSV file that I am reading into an empty list, line by line, so the end result is a nested list with one inner list per line, e.g.:

[[1.1,2.6,3,0,4.8],[3.5,7.0,8.0]....and so on.....].

The problem is at the end of the file are empty strings which end up in the final list like:

[[1.1,2.6,3,0,4.8],[3.5,7.0,8.0],['','','','','','','','','']]

How do I get rid of these, or stop them from being appended to the list? The CSV files are quite big, so I would prefer to stop them from being appended to the initial list. I feel I am building an extra large list when I probably don't need to, and this may cause memory issues. Here is the code so far:

csvfile = open(file_path, 'r')
reader = csv.reader(csvfile)
data_list = []

for row in reader:
    data_list.append(row)
csvfile.close()
i = 0
file_data = []

while i < len(data_list):
    j = 0
    while j < len(data_list[i]):
        try:
            data_list[i][j] = float(data_list[i][j])
        except ValueError:
            pass            
        j += 1
    file_data.append(data_list[i])
    i += 1

print file_data

Upvotes: 2

Views: 1399

Answers (3)

kiriloff

Reputation: 26335

import csv
csvfile = open('C:\\Users\\CBild\\Desktop\\test.txt', 'r')

reader = csv.reader(csvfile)
data_list = []

for row in reader:
    if any(field.strip() for field in row):
        data_list.append(row)
csvfile.close()

print(data_list)

gives

>>> 
[['12 2 5'], ['1 5 4']]

With the condition if any(field.strip() for field in row), rows whose fields contain only whitespace are also treated as empty and skipped.
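
As a minimal sketch of the difference the strip() call makes (the sample row is made up for illustration, and Python 3 is assumed): a field holding only spaces is still a non-empty string, so any(row) on its own would keep the row, while the stripped check drops it.

```python
# Hypothetical row whose fields contain only whitespace or nothing at all.
row = [' ', '', '\t']

# ' ' and '\t' are non-empty strings, so they count as truthy.
keeps_row_plain = any(row)

# After strip() every field is '', so nothing is truthy.
keeps_row_stripped = any(field.strip() for field in row)

print(keeps_row_plain, keeps_row_stripped)
```

If the files never contain whitespace-only fields, the plain any(row) check is enough.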

Upvotes: 0

Raymond Hettinger

Reputation: 226171

The problem is at the end of the file are empty strings

You can just decide not to append them:

for row in reader:
    if any(row):              # Checks for at least one non-empty field
        data_list.append(row)

Here is how the any() function works:

>>> any(['132', '', '456'])
True

>>> any(['', '', ''])
False
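
To see the same check applied to a reader, here is a small sketch (Python 3 assumed; the sample text is invented and fed through io.StringIO in place of a real file). A completely blank line comes back from csv.reader as an empty list, and a line of bare commas comes back as a list of empty strings; any() is False for both, so neither is appended.

```python
import csv
import io

# Made-up sample: two data rows, a blank line, and a row of bare commas.
sample = io.StringIO("1.1,2.6\n3.5,7.0\n\n,,\n")

data_list = []
for row in csv.reader(sample):
    if any(row):          # [] and ['', '', ''] are both skipped
        data_list.append(row)

print(data_list)
```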

Upvotes: 3

Jon Clements

Reputation: 142098

Here's a simplified version of your code that makes it easier to see what you're attempting to do and is somewhat more Pythonic.

First, to open and read your file, we use the with statement so that the file is closed automatically. We then build a generator that loops over the CSV file, takes only the rows that contain at least one non-blank column value, and converts each element to a float (via a helper function) if possible, otherwise leaving it as a string. Finally, data_list is built in one statement instead of by appending row by row:

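import csv

with open(file_path) as fin:
    csvin = csv.reader(fin)
    # Skip rows whose fields are all empty; convert the rest field by field.
    # A list comprehension is used so each row is a real list on Python 3 too.
    rows = ([to_float_if_possible(field) for field in row] for row in csvin if any(row))
    data_list = list(rows)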

And the helper function is defined as:

def to_float_if_possible(text):
    try:
        return float(text)
    except ValueError:
        # Not a number (e.g. '' or a text label): leave it unchanged.
        return text
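
Putting the pieces together, the following is a self-contained sketch rather than a definitive implementation. It assumes Python 3; the function name load_rows and the sample data are hypothetical, and io.StringIO stands in for the real file. Filtering and conversion happen in a single pass, so only one list is ever built, which speaks to the memory concern in the question.

```python
import csv
import io

def to_float_if_possible(text):
    try:
        return float(text)
    except ValueError:
        return text

def load_rows(fileobj):
    """Return the non-empty rows, with numeric fields converted to float.

    load_rows is a hypothetical helper name used only for this sketch.
    """
    reader = csv.reader(fileobj)
    return [
        [to_float_if_possible(field) for field in row]
        for row in reader
        if any(row)       # drops rows such as ['', '', '']
    ]

# Made-up sample ending in a row of empty strings, as in the question.
sample = io.StringIO("1.1,2.6,abc\n3.5,7.0,8.0\n,,\n")
file_data = load_rows(sample)
print(file_data)
```

With a real file, the call would look like load_rows(fin) inside a with open(file_path, newline='') as fin: block.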

By the looks of it, you may wish to consider numpy or pandas when dealing with this type of data.

Upvotes: 1
