Reputation: 125
I'm trying to print keys and values from two different dictionaries in a single loop so that I can do some calculations with them, but I can't get it to work correctly in one loop. My code:
def code(file):
    dict1 = {}  # unigram counts, keyed by the second word
    dict2 = {}  # bigram counts, keyed by "first second"
    with open(file, 'r') as f:
        for line in f:
            parts = line.strip().split(" ")
            try:
                w1, w2 = parts[0], parts[1]
            except IndexError:
                # the line has fewer than two words
                print('no word')
                continue
            word2 = w1 + ' ' + w2
            # dict2: bigram counts
            if word2 in dict2:
                dict2[word2] += 1
            else:
                dict2[word2] = 1
            # dict1: unigram counts
            if w2 in dict1:
                dict1[w2] += 1
            else:
                dict1[w2] = 1
    return dict1, dict2
dict1, dict2 = code('text.txt')
My file looks like this:
car1 BMW
car2 Benz
Car3 Kia
car1 BMW
car4 BMW
This code counts how often two words occur together (the bigram) and how often the second word occurs on its own (the unigram), storing the counts in two different dicts, like this:
dic2            dic1
Car1 BMW   2    BMW   3
Car2 Benz  1    Benz  1
Car3 kia   1    kia   2
Car4 BMW   1    BMW   3
(The bigram car1 BMW occurs twice, and the unigram BMW occurs 3 times in the whole corpus.)
I managed to print them correctly separately, but I could not print them together to do a calculation on them. The code below runs without errors, but it does not give the combined output I want:
for k, v in sorted(dict1.items()):
    print(k, v)
for k1, v1 in sorted(dict2.items()):
    print(k1, v1)
My question is: how can I print the keys and values of both dicts at the same time, in the same loop and in sorted order, to get this result?
dic2            dic1       result
Car1 BMW   2    BMW   3    2 * 3
Car2 Benz  1    Benz  1    1 * 1
Car3 kia   1    kia   2    1 * 2
Car4 BMW   1    BMW   3    1 * 3
Upvotes: 0
Views: 922
Reputation: 790
As Slam mentioned, you can use defaultdict. It can be done in the following way:
from collections import defaultdict
def code(file):
    dictionary1 = defaultdict(list)  # first word -> [second word, pair count]
    dictionary2 = defaultdict(int)   # second word -> unigram count
    partsarray = []
    with open(file, 'r') as f:
        for line in f:
            # split() with no argument splits on any whitespace and drops empty strings
            parts = line.split()
            if len(parts) == 2:
                partsarray.append(parts)
    try:
        for part, partforadding in partsarray:
            if part in dictionary1:
                # first word seen before: bump its pair count
                dictionary1[part][1] += 1
            else:
                dictionary1[part].append(partforadding)
                dictionary1[part].append(1)
            dictionary2[partforadding] += 1
        print(dictionary1)
        print(dictionary2)
    except Exception as error:
        print("The error is")
        print(error)
        print('no word')
code("text.txt")
The output is
defaultdict(<class 'list'>, {'car1': ['BMW', 2], 'car2': ['Benz', 1], 'Car3': ['Kia', 1], 'Car2': ['Kia', 1], 'car4': ['BMW', 1]})
defaultdict(<class 'int'>, {'BMW': 3, 'Benz': 1, 'Kia': 2})
In the file that you mentioned, car2 has both Benz and Kia, but in your output Car2 has only Benz. Is the data correct, or am I missing something?
Upvotes: 1
Reputation: 8572
There's no "simple" way to do this.
You need to apply the same logic you used when you built the bigrams: iterate over dict2 and, for every key, split it, take the unigram (the last word), and look up its count in dict1. I.e.:
for bigram, bigram_count in dict2.items():
    unigram = bigram.split(' ')[-1]
    unigram_count = dict1[unigram]
    print(bigram, bigram_count, unigram, unigram_count, bigram_count * unigram_count)
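To get the rows in sorted order with the product in the last column, as the question asks, the loop above could be wrapped roughly like this. It is only a sketch: the helper name `joined_rows` is made up for illustration, the sample counts are copied by hand from the question's table rather than computed, and `dict1.get(unigram, 0)` is used so that a missing unigram gives 0 instead of raising KeyError.

```python
def joined_rows(dict2, dict1):
    """Return (bigram, bigram_count, unigram, unigram_count, product) tuples,
    sorted by bigram. `joined_rows` is a hypothetical helper name."""
    rows = []
    for bigram, bigram_count in sorted(dict2.items()):
        unigram = bigram.split(' ')[-1]        # last word of the bigram
        unigram_count = dict1.get(unigram, 0)  # 0 if the unigram was never counted
        rows.append((bigram, bigram_count, unigram, unigram_count,
                     bigram_count * unigram_count))
    return rows


# Sample counts copied from the question's table (assumed, not computed here).
dict2 = {'car1 BMW': 2, 'car2 Benz': 1, 'Car3 Kia': 1, 'car4 BMW': 1}
dict1 = {'BMW': 3, 'Benz': 1, 'Kia': 2}

for row in joined_rows(dict2, dict1):
    print(*row)
```

Plain `sorted` is case-sensitive, so `Car3 Kia` comes before `car1 BMW`; if that matters, sorting with `key=lambda kv: kv[0].lower()` is one option.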
Offtopic: you can simplify your code with defaultdict. Initialize dict1 and dict2 as defaultdict(int) and you can skip the if w in dict: ... else: ... routine.
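A minimal sketch of that simplification, assuming each useful line holds at least two whitespace-separated words. The function name `count_ngrams` and the inline sample lines are illustrative and not part of the question's code.

```python
from collections import defaultdict


def count_ngrams(lines):
    """Count second-word unigrams (dict1) and two-word bigrams (dict2).
    `count_ngrams` is a hypothetical name used only for this sketch."""
    dict1 = defaultdict(int)
    dict2 = defaultdict(int)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or one-word lines
        w1, w2 = parts[0], parts[1]
        dict2[w1 + ' ' + w2] += 1  # missing keys start at 0, no if/else needed
        dict1[w2] += 1
    return dict1, dict2


# Sample lines mirroring the file shown in the question.
sample = ["car1 BMW", "car2 Benz", "Car3 Kia", "car1 BMW", "car4 BMW"]
dict1, dict2 = count_ngrams(sample)
print(dict(dict1))
print(dict(dict2))
```

Passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.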
Upvotes: 2