Reputation: 266
I am going through a large CSV file line by line, and I want to count occurrences of the strings in a certain column. Where I am running into trouble is that I would like the counter to be nested inside a dictionary, where the keys of the outer dictionary are the values from another column. I need to do this because there are duplicates, and otherwise the data will be processed incorrectly.
Imagine my CSV:
outerDictKey CounterKey
apple purple
apple blue
pear purple
So basically I want:
dictionary = {'apple': Counter({'blue': 1, 'purple': 1}),
              'pear': Counter({'purple': 1})}
I wasn't sure how to do this. My first attempt was:
myCounter = Counter()
myKey = 'barbara'
counterKey = 'streisand'
largeDict = defaultdict(dict)
largeDict[myKey] = {myCounter[counterKey] += 1}
Intuitively this looks like it wouldn't work, and of course it gives a syntax error.
I also tried
largeDict[myKey][myCounter][counterKey]+=1
Which throws a "TypeError: unhashable type: 'Counter'" error.
Finally
>>> largeDict[myKey]=Counter()
>>> largeDict[myKey][myCounter][counterKey]+=1
Which still gives a type error. So how do I increment a Counter nested in a dictionary?
Upvotes: 2
Views: 3583
Reputation: 49826
This will work:
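from collections import Counter

myKey, counterKey = 'barbara', 'streisand'  # key names taken from the question
myCounter = Counter()
largedict = {myKey:
                {counterKey: myCounter,
                 'anotherKey': 'Value2'}  # placeholder for any other entry
            }
largedict[myKey][counterKey]['somethingyouwanttocount'] += 1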
Counter is just a dict with some extra functionality. However, as a dict, it cannot be a key in a dict, nor an entry in a set, which explains the unhashable exception.
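To illustrate the point about hashability, here is a minimal sketch that reuses the key names from the question (the snake_case variable names are only for this example). It shows a Counter working as a dict value, and the same TypeError appearing as soon as a Counter is used as a key.

```python
from collections import Counter

my_counter = Counter()
large_dict = {}

# A Counter is fine as a *value* in a dict.
large_dict['barbara'] = my_counter
large_dict['barbara']['streisand'] += 1

# Using a Counter as a *key* fails: the lookup has to hash the key,
# and dicts (including dict subclasses such as Counter) are unhashable.
try:
    large_dict['barbara'][my_counter]
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(large_dict['barbara']['streisand'])
print(error_message)
```

The failing lookup mirrors `largeDict[myKey][myCounter]` from the question: the error comes from indexing with the Counter object, not from the increment itself.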
Alternatively, if you're keeping track of information about coherent entities, rather than using nested dicts, you could store the information (including the counter) in objects, and put the objects in a dict as necessary.
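One way the object-based approach might look is sketched below. The class name `FruitStats`, its fields, and the sample rows are hypothetical and only serve the illustration; the idea is simply that each outer key maps to an object that owns its own Counter.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FruitStats:  # hypothetical class, not from the original answer
    colours: Counter = field(default_factory=Counter)  # one Counter per object
    rows_seen: int = 0

    def record(self, colour):
        """Count one occurrence of a colour for this entity."""
        self.colours[colour] += 1
        self.rows_seen += 1


# Sample rows modelled on the CSV in the question.
rows = [('apple', 'purple'), ('apple', 'blue'), ('pear', 'purple')]

stats_by_fruit = {}
for fruit, colour in rows:
    # Create the object the first time a key is seen, then reuse it.
    stats = stats_by_fruit.setdefault(fruit, FruitStats())
    stats.record(colour)

print(stats_by_fruit['apple'].colours)
print(stats_by_fruit['apple'].rows_seen)
```

This is probably only worth the extra code when each entity carries more than just the counter.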
If every value is a counter, then just use defaultdict:
from collections import defaultdict, Counter
largedict = defaultdict(Counter)
largedict['apple']['purple']+=1
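Applied to the CSV from the question, the defaultdict approach could look roughly like this. The sketch assumes a comma-separated file with a header row containing `outerDictKey` and `CounterKey`; an in-memory `io.StringIO` stands in for the real file so the example runs on its own.

```python
import csv
import io
from collections import Counter, defaultdict

# Stand-in for the real file; in practice something like
# open('data.csv', newline='') would go here (the file name is an assumption).
csv_file = io.StringIO(
    "outerDictKey,CounterKey\n"
    "apple,purple\n"
    "apple,blue\n"
    "pear,purple\n"
)

large_dict = defaultdict(Counter)

# DictReader yields one row at a time, so the file is processed line by line.
for row in csv.DictReader(csv_file):
    # A missing outer key gets a fresh Counter automatically,
    # and a missing inner key starts at 0.
    large_dict[row['outerDictKey']][row['CounterKey']] += 1

for outer_key, counter in large_dict.items():
    print(outer_key, dict(counter))
```

Because both levels supply defaults, no existence checks are needed before incrementing.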
Upvotes: 7
Reputation: 239473
If you just want to count occurrences of the strings in a certain column, wouldn't this be enough?
import collections
data = "Welcome to stack overflow. To give is to get."
# Split on whitespace and count each resulting token
print(collections.Counter(data.split()))
Output
Counter({'to': 2, 'Welcome': 1, 'stack': 1, 'overflow.': 1, 'To': 1, 'give': 1, 'is': 1, 'get.': 1})
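For comparison, here is a small sketch of the same flat count run on the question's sample rows (the `rows` list is a stand-in for the parsed CSV). It counts the column as a whole, so occurrences from different outer keys are added together.

```python
from collections import Counter

# Sample rows modelled on the CSV in the question: (outerDictKey, CounterKey).
rows = [('apple', 'purple'), ('apple', 'blue'), ('pear', 'purple')]

# Count the second column only, ignoring which outer key each row belongs to.
flat_counts = Counter(colour for _fruit, colour in rows)

print(flat_counts['purple'])
print(flat_counts['blue'])
```

Here 'purple' is counted twice because the apple and pear rows are merged, so this fits only when the per-key grouping is not needed.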
Upvotes: 1