fightstarr20

Reputation: 12568

Split CSV file into 2 lists depending on value

I have a UTF-8 encoded CSV file like this:

"id","name","type","price"
"23","item1","t-shirt","37"
"56","item66","jumper","3"
"366","item7","jumper","55"
"745","item 9","t-shirt","45"
"3245","item 12","t-shirt","67"
"654","item 88","jumper","66"
"2","item 99","jumper","77"

Using Python, I am trying to split it into two lists based on type, to give me output like this:

File 1
id   | name    | type    | price
--------------------------------
23   | item1   | t-shirt | 37
745  | item 9  | t-shirt | 45
3245 | item 12 | t-shirt | 67

File 2
id   | name    | type    | price
--------------------------------
56   | item66  | jumper  | 3
366  | item7   | jumper  | 55
654  | item 88 | jumper  | 66
2    | item 99 | jumper  | 77

I have this so far:

list1 = []
list2 = []

with open(filename, encoding="utf8") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter='\t')
    for row in csv_reader:
        if row['type'] == 't-shirt':
            list1.append(row)
        elif row['type'] == 't-shirt':
            list2.append(row)

            

But this gives me the following error:

TypeError: list indices must be integers or slices, not str

Where am I going wrong?

Upvotes: 0

Views: 468

Answers (2)

Hermes Morales

Reputation: 637

Another alternative is to use pandas.

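import pandas as pd

def sort_by_type(file):
    # Read the CSV, then select the rows of each type with a boolean mask.
    table = pd.read_csv(file)
    File_1 = table[table["type"] == "jumper"]
    File_2 = table[table["type"] == "t-shirt"]
    return File_1, File_2

# `file` is the path to the CSV file.
File_1, File_2 = sort_by_type(file)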
File_1


     id     name    type  price
1    56   item66  jumper      3
2   366    item7  jumper     55
5   654  item 88  jumper     66
6     2  item 99  jumper     77

File_2


     id     name     type  price
0    23    item1  t-shirt     37
3   745   item 9  t-shirt     45
4  3245  item 12  t-shirt     67

Upvotes: 0

pho

Reputation: 25489

When you use csv.reader, each row is a list. List indices must be integers, so row['type'] isn't allowed. If you want each row to be a dict keyed by column name, use csv.DictReader:

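import csv

list1 = []
list2 = []

# The sample file is comma-separated, so the default delimiter is used here;
# pass delimiter='\t' only if the real file is tab-separated.
with open(filename, encoding="utf8", newline="") as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        if row['type'] == 't-shirt':
            list1.append(row)
        elif row['type'] == 'jumper':
            list2.append(row)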

Alternatively, if you want to stick with csv.reader, use integer indices. You can hardcode row[2], or look up "type" in the column headers, which you get as the first record of the CSV file:

# Default comma delimiter, matching the sample file shown in the question.
with open(filename, encoding="utf8", newline="") as csv_file:
    csv_reader = csv.reader(csv_file)
    columns = next(csv_reader)          # header row
    type_col = columns.index("type")    # position of the "type" column
    for row in csv_reader:
        if row[type_col] == 't-shirt':
            list1.append(row)
        elif row[type_col] == 'jumper':
            list2.append(row)
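
If you later end up with more than two types, one possible generalisation is to group the rows in a dict keyed by type instead of hardcoding one list per value. The following is only a sketch: it reads the sample data from the question out of a string via io.StringIO so that it runs on its own, and the names SAMPLE, split_by_type and rows_to_csv are made up for illustration. It also assumes the file is comma-separated, as shown in the question.

```python
import csv
import io
from collections import defaultdict

# Sample data copied from the question (comma-separated, quoted fields).
SAMPLE = '''"id","name","type","price"
"23","item1","t-shirt","37"
"56","item66","jumper","3"
"366","item7","jumper","55"
"745","item 9","t-shirt","45"
"3245","item 12","t-shirt","67"
"654","item 88","jumper","66"
"2","item 99","jumper","77"
'''

def split_by_type(csv_file, column="type"):
    """Group the rows of an open CSV file by the value in `column`."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[column]].append(row)
    return dict(groups)

def rows_to_csv(rows):
    """Turn a non-empty list of row dicts back into CSV text, header included."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# With a real file, pass the object returned by
# open(filename, encoding="utf8", newline="") instead of the StringIO.
groups = split_by_type(io.StringIO(SAMPLE))
list1 = groups["t-shirt"]
list2 = groups["jumper"]

print(rows_to_csv(list1))
print(rows_to_csv(list2))
```

Each entry in list1 and list2 is a dict, so row['type'] and row['price'] work as in the DictReader version above. To write real files, you could pass an opened output file to csv.DictWriter in place of the in-memory buffer.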

Upvotes: 1
