GG3

Reputation: 31

My for loop isn't working as intended

I want to fill list_of_occurences with the matching item from the list grundformen.

However, my for loop doesn't work as intended. The inner loop doesn't restart from the beginning of the file for each element; it only goes through the rows in the reader once, so the list isn't filled completely.

This is what it prints (the entries at the end are missing their third item, because the search doesn't start again from the beginning of the file):

# List_of_occurrences (1 line - wrapped for easier reading)
[['NN', 1328, ('Ziel',)], ['ART', 771, ('der',)], 
 ['$.', 732, ('_',)], ['VVFIN', 682, ('schlagen',)], 
 ['PPER', 592, ('sie',)], ['$,', 561, ('_',)], 
 ['ADV', 525, ('So',)], ['APPR', 507, ('in',)], 
 ['NE', 433, ('Johanna',)], ['$(', 363, ('_',)], 
 ['VAFIN', 334, ('haben',)], ['ADJA', 307, ('tragisch',)], 
 ['ADJD', 278, ('recht',)], ['KON', 228, ('Doch',)], 
 ['VVPP', 194, ('reichen',)], ['VVINF', 161, ('stören',)], 
 ['KOUS', 151, ('Während',)], ['PPOSAT', 120, ('ihr',)], 
 ['PTKVZ', 104, ('weiter',)], ['PRF', 98, ('sich',)], 
 ['APPRART', 90, ('zu',)], ['PTKNEG', 87, ('nicht',)], 
 ['VMFIN', 76, ('sollen',)], ['PIAT', 66, ('kein',)], 
 ['PIS', 65, ('etwas',)], ['PTKZU', 52, ('zu',)], 
 ['PRELS', 51, ('wer',)], ['PROAV', 42, ('dabei',)],  
 ['PDS', 38, ('jener',)], ['PDAT', 37, ('dieser',)], 
 ['PWAV', 30, ('wie',)], ['PWS', 26, ('Was',)], 
 ['CARD', 24, ('drei',)], ['KOKOM', 21, ('wie',)], 
 ['VAINF', 18, ('werden',)], ['KOUI', 15, ('um',)], 
 ['VMINF', 10, ('können',)], ['VVIZU', 10, ('aufklären',)], 
 ['VAPP', 10], ['PTKA', 6], ['PTKANT', 6], ['PWAT', 4], 
 ['VVIMP', 4], ['PRELAT', 4], ['APZR', 3], ['APPO', 2], 
 ['FM', 1]]

# Grundformen (1 line, wrapped for reading)
['Ziel', 'der', '_', 'schlagen', 'sie', '_', 'So', 'in', 'Johanna',
 '_', 'haben', 'tragisch', 'recht', 'Doch', 'reichen', 'stören', 
 'Während', 'ihr', 'weiter', 'sich', 'zu', 'nicht', 'sollen', 'kein', 
 'etwas', 'zu', 'wer', 'dabei', 'jener', 'dieser', 'wie', 'Was', 
 'drei', 'wie', 'werden', 'um', 'können', 'aufklären']  

import collections
import csv

occurences = collections.Counter()

with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
    reader = csv.reader(material, delimiter='\t', quotechar="\t")
    for line in reader:
        if line:
            occurences[line[5]] += 1
        else:
            pass

list_of_occurences = [list(elem) for elem in occurences.most_common()]

grundformen = []
with open('material-2.csv', mode='r', newline='', encoding="utf-8") as material:
    reader = csv.reader(material, delimiter='\t', quotechar="\t")
    for elem in list_of_occurences:
        for row in reader:
            if row != [] and row[5] == elem[0]:
                grundformen.append(row[2])
                break

iterator = 0
for elem in grundformen:
    list_of_occurences[iterator].insert(2, elem)
    iterator = iterator + 1
    pass

print(list_of_occurences)
print(grundformen)

Whole input file: https://www.dropbox.com/sh/xyktjk4ycm8x6v0/AACou438_eEWx-ZYmByBiqp_a/material-2.csv?dl=0

Part of my input file:

1 Als Als _ _ KOUS _ _ 6 6 CP CP _ _
2 es es _ _ PPER _ 3|Nom|Sg|Neut 6 6 SB SB _ _
3 zu zu _ _ PTKA _ _ 4 4 MO MO _ _
4 schneien schneien _ _ ADJD _ Comp|Dat|Sg|Fem 5 5 MO MO _ _
5 aufgehört aufhören _ _ VVPP _ Psp 6 6 OC OC _ _
6 hatte haben _ _ VAFIN _ 3|Sg|Past|Ind 8 8 MO MO _ _
7 , _ _ _ $, _ _ 8 8 PUNC PUNC _ _
8 verließ verlassen _ _ VVFIN _ 3|Sg|Past|Ind 0 0 ROOT ROOT _ _
9 Johanna Johanna _ _ NE _ Nom|Sg|Masc 8 8 SB SB _ _
10 von von _ _ APPR _ _ 5 5 SBP SBP _ _
11 Rotenhoff Rotenhoff _ _ NE _ Dat|Sg|Neut 10 10 NK NK _ _
12 , _ _ _ $, _ _ 8 8 PUNC PUNC _ _
13 ohne ohne _ _ KOUI _ _ 18 18 CP CP _ _
14 ein ein _ _ ART _ Nom|Sg|Neut 16 16 NK NK _ _
15 rechtes recht _ _ ADJA _ Pos|Nom|Sg|Neut 16 16 NK NK _ _
16 Ziel Ziel _ _ NN _ Nom|Sg|Neut 18 18 OA OA _ _
17 zu zu _ _ PTKZU _ _ 18 18 PM PM _ _
18 haben haben _ _ VAINF _ Inf 8 8 MO MO _ _
19 , _ _ _ $, _ _ 18 18 PUNC PUNC _ _
20 das der _ _ ART _ Nom|Sg|Neut 21 21 NK NK _ _
21 Gutshaus Gutshaus _ _ NN _ Nom|Sg|Neut 16 16 APP APP _ _
22 . _ _ _ $. _ _ 8 8 PUNC PUNC _ _

How can I change my loop so that it fills in everything?

Upvotes: 1

Views: 174

Answers (2)

tijko

Reputation: 8322

The problem is in how you read your CSV data.

Here the data is read into a list, which could be iterated again for a second loop without opening another file object. However, you don't even need to loop through the CSV data twice:

import csv
import collections

occurences = collections.Counter()
grundformen = collections.defaultdict(list)

with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
    # Materialize the non-empty rows as a list so they can be iterated more than once.
    reader = [ln for ln in csv.reader(material, delimiter='\t', quotechar="\t") if ln]
    for line in reader:
        occurences[line[5]] += 1                # count each tag (column 5)
        grundformen[line[5]].append(line[2])    # collect the base forms (column 2) per tag
    list_of_occurences = list(map(list, occurences.most_common()))
    for elem in list_of_occurences:
        # Append the first base form seen for this tag.
        elem.append(grundformen[elem[0]][0])

print(list_of_occurences)

Because the CSV data is in a list, you can break out of an inner loop and still start afresh at the head of the list on the next pass. A csv.reader, by contrast, is an iterator: after a break, the next loop resumes where the previous one left off, and once its data is exhausted it yields nothing more.
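
To make the difference concrete, here is a minimal sketch that runs the same nested search once over a csv.reader and once over a list. The two-row sample and the function names are made up for illustration and stand in for material-2.csv:

```python
import csv
import io

# Hypothetical tab-separated sample (word, tag); not the real input file.
SAMPLE = "a\tX\nb\tY\n"

def first_match_streaming(tags):
    """Search a csv.reader directly; the reader is never rewound."""
    reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t")
    found = []
    for tag in tags:
        for row in reader:      # resumes after the row where the last search stopped
            if row[1] == tag:
                found.append(row[0])
                break
    return found

def first_match_list(tags):
    """Search a list of rows; every inner loop starts at the first row."""
    rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))
    found = []
    for tag in tags:
        for row in rows:        # restarts at rows[0] for each tag
            if row[1] == tag:
                found.append(row[0])
                break
    return found

streaming = first_match_streaming(["Y", "X"])
listed = first_match_list(["Y", "X"])
print(streaming)  # ['b']       (the reader was exhausted before "X" was searched)
print(listed)     # ['b', 'a']
```

The streaming version loses the entry for "X" because its only row was already consumed while searching for "Y", which matches the missing entries at the end of the output in the question.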

Upvotes: 0

jcoppens

Reputation: 5440

reader = csv.reader(material, delimiter='\t', quotechar="\t")

Setting the quotechar to the same character as the delimiter looks rather strange. The CSV reader will probably get confused and either treat all tabs (\t) as delimiters or interpret them all as quote characters.
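
One way to sidestep the ambiguity, assuming the file never uses quoting to wrap fields, is to turn quote handling off with quoting=csv.QUOTE_NONE instead of picking a quotechar. This is only a sketch; the two sample rows below are made up and shortened to six columns:

```python
import csv
import io

# Hypothetical sample rows; the second one contains literal double quotes.
SAMPLE = '1\tAls\tAls\t_\t_\tKOUS\n2\t"\t"\t_\t_\t$(\n'

# QUOTE_NONE tells the reader not to treat any character as a quote,
# so only the tab acts as a field separator.
reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t", quoting=csv.QUOTE_NONE)
rows = [row for row in reader if row]
print(rows)
```

With this setting a double quote in the text is kept as an ordinary character, so each row keeps the same number of columns.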

Upvotes: 1
