Jane

Reputation: 1

How do I create a dict, with csv column contents as keys, and occurrence count as value?

I have a csv file with the following columns:

Item Name, Item Type, Manufacturer Name

I need to write a function that creates a dictionary where each key is a phrase from the Item Type column and each value is the number of times that phrase occurs. Then I need to print that dictionary.

As far as I can see, it adds the Item Type as the key, but runs into a problem storing the associated value.

Here are the csv contents:

Item Name, Item Type, Manufacturer Name
Elektra Clone,Fuzzstortion,ollieMAX! Pedals
Sputnik II,Fuzz,Spaceman Pedals
Pumpkin Pi,Fuzz,Green Carrot Pedals
Carcosa,Fuzz,DOD
Big Muff Pi (Black Russian),Fuzz,Electro-Harmonix
Octopuss,Passive Octave Up,Bigfoot Engineering
Small Stone,Phaser,Electro-Harmonix
Grand Orbiter,Phaser,Earthquaker Devices
Hummingbird,Tremolo,Earthquaker Devices
Echosystem,Digital Delay,Empress Effects
Freeze,Sound Retainer,Electro-Harmonix
Ditto,Looper,TC Electronic
Stamme[n],Glitch Delay,Drolo

Here is my code:

def countItemTypes(fileName):
    #create an empty dictionary as we need to store key/value pairs
    itemDic = {}
    # where fileName is the name of the csv file
    #first we must open the csv file and read it
    import csv
    with open(fileName, "r") as itemFile:
    #we are using itemFile as the handle
        csvReader = csv.reader(itemFile, delimiter=",", quotechar='"')
        #skip the header because we don't need to do anything with it
        next(csvReader)
        #now that we have skipped the header we need to iterate through the rows
        for row in csvReader:
            #troubleshooting diagnostic, for loop:
            #print(row)
            #now we need to take the second column entry of the csv and assign that as the key
            #and the total number of its instances as the value to that key
            #quite frankly I have no idea how to do that.
            if itemDic[row[1]] not in itemDic:
                itemDic[row[1]] = 1
            else:
                itemDic[row[1]] += 1
        #print the new dictionary
        print (itemDic)

It runs into KeyError: 'Fuzzstortion' when it hits:

if itemDic[row[1]] not in itemDic:
    itemDic[row[1]] = 1
else:
    itemDic[row[1]] += 1

Upvotes: 0

Views: 889

Answers (3)

Trenton McKinney

Reputation: 62413

Use collections.defaultdict:

  • defaultdict is from the collections module and is a subclass of dict. It behaves like a dict, except that it supplies a default value for a missing key the first time that key is accessed. This removes the need to check whether a key exists before setting or updating its value.
  • Replace itemDic = {} with itemDic = defaultdict(int)
  • Replace if-else section with itemDic[row[1]] += 1
  • Code implemented below
from collections import defaultdict, Counter  # pick which one you want to use
import csv


def countItemTypes(fileName: str) -> defaultdict:
    """
    Parse a csv file and return a dict with the word count
    of the second column, Item Type
    fileName: Name of csv file to parse
    """
    # create an empty defaultdict
    itemDic = defaultdict(int)  # or you can use itemDic = Counter()
    # open fileName; itemFile is the file handle
    with open(fileName, "r") as itemFile:
        csvReader = csv.reader(itemFile, delimiter=",", quotechar='"')
        # skip the header
        next(csvReader)
        # iterate through the rows
        for row in csvReader:
            # assign word from second column as a key and count occurrences
            itemDic[row[1]] += 1
        #return the new dictionary
        return itemDic

Usage:

word_count = countItemTypes('test.csv')
print(word_count)

>>>
defaultdict(int,
            {'Fuzzstortion': 1,
             'Fuzz': 4,
             'Passive Octave Up': 1,
             'Phaser': 2,
             'Tremolo': 1,
             'Digital Delay': 1,
             'Sound Retainer': 1,
             'Looper': 1,
             'Glitch Delay': 1})
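
The Counter alternative mentioned in the code comments above could look roughly like the following. This is a minimal sketch, not the code from this answer: the function name count_item_types and the column parameter are made up for illustration, and it assumes the file has a header row.

```python
from collections import Counter
import csv


def count_item_types(file_name: str, column: int = 1) -> Counter:
    """Count how often each value appears in one column of a csv file.

    count_item_types and column are hypothetical names used for illustration.
    """
    with open(file_name, "r", newline="") as item_file:
        csv_reader = csv.reader(item_file, delimiter=",", quotechar='"')
        # skip the header row; the None default avoids StopIteration on an empty file
        next(csv_reader, None)
        # Counter consumes the generator and tallies each value it sees;
        # rows too short to have the requested column are ignored
        return Counter(row[column] for row in csv_reader if len(row) > column)
```

A Counter is also a dict subclass, so it prints and indexes like a dictionary, and it adds helpers such as most_common() for listing the most frequent item types first.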

Upvotes: 0

Raulillo

Reputation: 316

You are asking the wrong question in your if statement:

itemDic[row[1]] not in itemDic

itemDic stores type/count pairs: each key is an item type and each value is the number of times that type has occurred.

  • What you are asking:

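    Is the count stored for the type in row[1] not present among the dictionary's keys? Looking up that count raises KeyError when the type has not been added yet, so the check fails before it can answer anything.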

  • What you are trying to ask:

    Is the type in row[1] not present in the dictionary?

    row[1] not in itemDic
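
To see why the original condition fails, note that Python evaluates the subscript itemDic[row[1]] before the not in test is applied. A small sketch, using a hand-written row in place of the csv reader (the variable names here are illustrative only):

```python
item_dic = {}
row = ["Elektra Clone", "Fuzzstortion", "ollieMAX! Pedals"]

# The lookup item_dic[row[1]] runs first and fails on an empty dict,
# so the membership test is never reached.
try:
    item_dic[row[1]] not in item_dic
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Testing the key itself never performs a lookup, so it is safe.
key_missing = row[1] not in item_dic

print(lookup_failed, key_missing)  # True True
```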

As a side note, try to put all imports at the top of the file; it is clearer and more readable.

Upvotes: 0

absolutelydevastated

Reputation: 1747

The problem is in your if condition. What you actually want to check is whether row[1] itself is a key in the dictionary:

# Check row[1] not in the dictionary
if row[1] not in itemDic:
    itemDic[row[1]] = 1
else:
    itemDic[row[1]] += 1
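
An equivalent way to write the same update, shown here only as an alternative sketch, is dict.get with a default of 0, which folds the membership test and the increment into one line. The sample list item_types below is made up and stands in for the row[1] values read from the file.

```python
item_types = ["Fuzzstortion", "Fuzz", "Fuzz", "Phaser"]  # stand-in for row[1] values

item_dic = {}
for item_type in item_types:
    # get() returns 0 when the key is absent, so no KeyError can occur
    item_dic[item_type] = item_dic.get(item_type, 0) + 1

print(item_dic)  # {'Fuzzstortion': 1, 'Fuzz': 2, 'Phaser': 1}
```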

Upvotes: 1
