Reputation: 607
Hey guys, so I have set up two dictionaries which have the same keys but different values. I am trying to get the code to print out like this:
Digit Count %
1
2
3
4
5
6
7
8
9
The Count column holds the countList values and the % column holds the num_Freq values, with each digit's numbers listed down the Count and % columns respectively.
Okay, so the data file looks like this (only showing some of the rows because the file is pretty big):
Census Data
Alabama Winfield 4534
Alabama Woodland 208
Alabama Woodstock 1081
Alabama Woodville 743
Alabama Yellow Bluff 175
Alabama York 2477
Alaska Adak 361
The count is the number of occurrences of each first digit of the population number. I basically turned each line into a list and appended the last value of the list (the number) to a new list. Then I counted how many times 1, 2, 3, 4, 5, 6, 7, 8 and 9 appear as the first digit. That's what countList represents. I stored that in a dictionary with the digits as keys and the counts as values. The % is the relative frequency of the count. I set up a new list and calculated the relative frequency, which is basically the count divided by the sum of all the counts, times 100, rounded to one decimal place. I put that into a dictionary as well, where the keys are the digits 1 to 9. So now I just need to print these numbers in the 3 columns.
Here is my code so far:
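To make the arithmetic concrete, here is a minimal sketch run on just the seven sample rows quoted above. The names `sample_rows` and `first_digit_stats` are made up for illustration and are not part of the code in this post; only digits that actually occur end up in the dictionaries.

```python
# The seven sample rows from the data file excerpt (header line left out).
sample_rows = [
    "Alabama Winfield 4534",
    "Alabama Woodland 208",
    "Alabama Woodstock 1081",
    "Alabama Woodville 743",
    "Alabama Yellow Bluff 175",
    "Alabama York 2477",
    "Alaska Adak 361",
]


def first_digit_stats(rows):
    """Count first digits of the last field and compute their share in percent."""
    counts = {}
    for row in rows:
        population = row.split()[-1]   # last field, even for "Yellow Bluff"
        digit = int(population[0])     # first digit of the number
        counts[digit] = counts.get(digit, 0) + 1
    total = sum(counts.values())
    # relative frequency = count / total * 100, rounded to one decimal place
    freqs = {d: round(n / total * 100.0, 1) for d, n in counts.items()}
    return counts, freqs


counts, freqs = first_digit_stats(sample_rows)
for d in sorted(counts):
    print(d, counts[d], freqs[d])
```

For instance, the digit 1 leads two of the seven numbers (1081 and 175), so its relative frequency is 2 / 7 * 100, which rounds to 28.6.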
def main():
    num_freq = {}
    pop_num = []
    inFile = open ("Census__2008.txt", "r")
    count = 0
    # Skip the header line, then collect the last field of every row
    for line in inFile:
        if (count == 0):
            count += 1
            continue
        else:
            count += 1
            line = line.strip()
            word_list = line.split()
            pop_num.append (word_list[-1])
    # Count how often each first digit occurs
    counts = {}
    for x in pop_num:
        k = str(x)[0]
        counts.setdefault(k, 0)
        counts[k] += 1
    countList = [counts[str(i)] for i in range(1,10)]
    sumList = sum(countList)
    dictCount = {}
    dictCount[1] = countList[0]
    dictCount[2] = countList[1]
    dictCount[3] = countList[2]
    dictCount[4] = countList[3]
    dictCount[5] = countList[4]
    dictCount[6] = countList[5]
    dictCount[7] = countList[6]
    dictCount[8] = countList[7]
    dictCount[9] = countList[8]
    # Relative frequency of each digit in percent, one decimal place
    num_Freq = []
    for elm in countList:
        rel_Freq = 0
        rel_Freq = rel_Freq + ((elm / sumList) * 100.0)
        rel_Freq = round(rel_Freq, 1)
        num_Freq.append(rel_Freq)
    freqCount = {}
    freqCount[1] = num_Freq[0]
    freqCount[2] = num_Freq[1]
    freqCount[3] = num_Freq[2]
    freqCount[4] = num_Freq[3]
    freqCount[5] = num_Freq[4]
    freqCount[6] = num_Freq[5]
    freqCount[7] = num_Freq[6]
    freqCount[8] = num_Freq[7]
    freqCount[9] = num_Freq[8]
    print ("Digit" " ", "Count", " ", "%")
    # TODO: print each digit with its count and % in three columns (this is where I am stuck)

main()
Upvotes: 1
Views: 188
Reputation: 8061
Using your code, you just need to do:
for i in range(1, 10):
    print (i, dictCount[i], freqCount[i])
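If the columns should also line up, `str.format` width specifiers can pad each field. This is a minimal sketch, assuming `dictCount` and `freqCount` are keyed by the integers 1 to 9 as in the question. `format_table` is a made-up helper name, and the numbers below are just what the seven sample rows from the question would give, not results for the full file.

```python
def format_table(dict_count, freq_count):
    """Return the header and one padded row per digit as a list of strings."""
    lines = ["{:<5} {:>5} {:>5}".format("Digit", "Count", "%")]
    for digit in range(1, 10):
        # left-align the digit, right-align count and percentage (1 decimal)
        lines.append("{:<5} {:>5} {:>5.1f}".format(
            digit, dict_count[digit], freq_count[digit]))
    return lines


# Illustrative values only (derived from the seven sample rows).
dictCount = {1: 2, 2: 2, 3: 1, 4: 1, 5: 0, 6: 0, 7: 1, 8: 0, 9: 0}
freqCount = {1: 28.6, 2: 28.6, 3: 14.3, 4: 14.3, 5: 0.0,
             6: 0.0, 7: 14.3, 8: 0.0, 9: 0.0}

for line in format_table(dictCount, freqCount):
    print(line)
```

The widths (5 characters per column) are an arbitrary choice; larger counts from the full file may need a wider Count column.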
But you can simplify it a lot:
import collections

data = []
with open("Census__2008.txt") as fh:
    fh.readline()  # skip first line
    for line in fh:
        value = line.split()[-1]  # last field is the population number
        data.append(value)

# count the first character (digit) of every value
c = collections.Counter([x[0] for x in data])
total = sum(c.values())
print("Digit", "Count", "%")
for k, v in sorted(c.items()):  # items() on Python 3; iteritems() exists only on Python 2
    freq = v / total * 100
    round_freq = round(freq, 1)
    print(k, v, round_freq)
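One caveat with looping over `sorted(c.items())`: a digit that never occurs as a first digit is simply absent from the `Counter`, so its row is skipped. A minimal sketch that always produces nine rows follows; `digit_rows` is a hypothetical helper name, and it assumes `values` is a non-empty list of digit strings.

```python
import collections


def digit_rows(values):
    """Return (digit, count, percent) for every digit 1-9, including zeros."""
    c = collections.Counter(v[0] for v in values)
    total = sum(c.values())  # assumes at least one value, otherwise this is 0
    rows = []
    for digit in "123456789":
        count = c[digit]  # a Counter returns 0 for a key it has never seen
        rows.append((digit, count, round(count / total * 100, 1)))
    return rows


# Three of the sample values from the question; digits 3 and 5-9 do not occur.
for row in digit_rows(["4534", "208", "1081"]):
    print(*row)
```

Looking up `c[digit]` works here because `Counter` returns 0 for missing keys, whereas a plain `dict` would raise `KeyError`.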
Upvotes: 1