Filtering CSV rows by specific column data

Question

I'd like to filter a CSV file (without headers) containing hundreds of rows based on the value in column 12. The values I want to keep look like "00GG", "05FT", "66DM", plus 10 more.

With the code below I'm able to print rows based on one criterion:

import csv

def load_source(filename):
    with open(filename, "r") as f:
        reader = csv.reader(f, delimiter=";")
        return list(reader)

sourcecsv = load_source("data1.csv")

for row in sourcecsv:
    if row[12] == "00GG":
        print(row)

Since filtering data1.csv matters for every later query, I assume it would be wise to do it inside load_source already. I tried a similar "for row ... if row[12]" loop like the one above, with a list of criteria instead of a single string, appending the matching rows to a new list, but I got an empty list whenever I tried to print(sourcecsv) afterwards. Thanks for any help.

markiz · Accepted Answer

You could do:

import csv

def load_source(filename):
    with open(filename, "r") as f:
        reader = csv.reader(f, delimiter=";")
        # Keep only the rows whose value at index 12 is one of the wanted codes
        return list(filter(lambda x: x[12] in ("00GG", "05FT", "66DM"), reader))
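With 13 codes to match, it may be easier to pass them in rather than hard-code them. The following is only a sketch under a few assumptions: the `wanted` and `column` parameters are names made up for this example, and rows too short to have the column are skipped, which may or may not be what you want.

```python
import csv

def load_source(filename, wanted, column=12):
    """Return only the rows whose value at index `column` is in `wanted`."""
    wanted = set(wanted)  # set membership tests stay fast as the list of codes grows
    with open(filename, "r", newline="") as f:
        reader = csv.reader(f, delimiter=";")
        # Build the list while the file is still open; skip rows that are
        # too short to have the column (assumption: such rows are unwanted).
        return [row for row in reader
                if len(row) > column and row[column] in wanted]
```

Calling load_source("data1.csv", {"00GG", "05FT", "66DM"}) should then return a plain list of the matching rows, so later queries only ever see the filtered data.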

That said, pandas would probably be a better choice: it can load CSV files, filter them, and do much more with ease.

http://pandas.pydata.org/
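A rough pandas equivalent could look like the sketch below. It assumes the file is semicolon-delimited with no header row, as in the question; the function name `load_source_pandas` and its `wanted` and `column` parameters are invented for this example.

```python
import pandas as pd

def load_source_pandas(filename, wanted, column=12):
    # header=None: the file has no header row, so columns are labelled 0, 1, 2, ...
    # dtype=str: keep codes such as "00GG" as text instead of guessing types
    df = pd.read_csv(filename, sep=";", header=None, dtype=str)
    # isin() builds a boolean mask marking the rows whose code is wanted
    return df[df[column].isin(list(wanted))]
```

The result is a DataFrame rather than a list of lists, so later queries can be written as further boolean masks on its columns.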
