Reputation: 45
I have been trying to fix this problem for quite a while and have done some research into why my code won't work, but I simply can't get the dictionary to print with all the key:value pairs I need.
Here's the story. I am reading a .csv file in which the first column holds text abbreviations and the second column holds their full English meanings. I have tried multiple ways of opening this file, reading it, and storing it in a dictionary that we create. The file does get read, and when I print the separated pieces I believe it goes through the whole file (I can't be sure, because the output gets cut off around line 1007 but continues through to 4600). The problem comes when I try to put all of that into key:value pairs inside a dictionary: the only pair that gets stored is the very first line of the file.
Here is the code:
def createDictionary(filename):
    f = open(filename, 'r')
    dic = {}
    for line in f:
        #line = line.strip()
        data = line.split(',')
        print data
        dic[data[0]] = data[1]
        print dic
What I assumed was the issue was:
print dic
That statement is inside the loop, but because it is inside the loop it should just print the dictionary every time the loop runs. I am confused about what I am doing wrong. I also tried json, but I don't know much about how to use it, and I read up on the csv module, but I don't think our professor wants us to use that, so I was hoping someone could spot my error. Thanks in advance!
EDIT
This is the output of my program
going to be late\rg2cu', 'glad to see you\rg2e', 'got to eat\rg2g', 'got to go\rg2g2tb', 'got to go to the bathroom\rg2g2w', 'got to go to work\rg2g4aw', 'got to go for a while\rg2gb', 'got to go bye\rg2gb2wn', 'got to go back to work now\rg2ge', 'got to go eat\rg2gn', 'got to go now\rg2gp', 'got to go pee\rg2gpc', 'got 2 go parents coming\rg2gpp', 'got to go pee pee\rg2gs', 'got to go sorry\rg2k', 'good to know\rg2p', 'got to pee\rg2t2s', 'got to talk to someone\rg4u', 'good for you\rg4y', 'good for you\rg8', 'gate\rg9', 'good night\rga', 'go ahead\rgaalma', 'go away and leave me alone\rgafi', 'get away from it\rgafm', 'Get away from me\rgagp', 'go and get pissed\rgaj'
This goes on for a while until the end of the file, and after that it's supposed to print the entire dictionary, for which I get this:
{'$$': 'money\r/.'}
Along with a line reading:
None
EDIT 2
Here is the full code:
def createDictionary(filename):
    f = open(filename, 'r')
    dic = {}
    for line in f:
        line = line.strip()
        data = line.split(',')
        print data
        dic[data[0]] = data[1]
        print dic

if __name__ == "__main__":
    x = createDictionary("textToEnglish.csv")
    print x
EDIT 3
Here is the file I am trying to make into a dictionary
https://1drv.ms/u/s!AqnudQBXpxTGiC9vQEopu1dOciIS
Upvotes: 0
Views: 99
Reputation: 319
Although there is nothing wrong with the other solutions presented, you could simplify your solution and make it scale much better by using Python's excellent pandas library.
Pandas is a library for handling data in Python, preferred by many data scientists.
Pandas has a simple CSV interface for reading and parsing files, which can be used to return a list of dictionaries, each holding one row of the file. The keys are the column names, and the values are the contents of each cell.
In your case:
import pandas

def createDictionary(filename):
    # read_csv replaces DataFrame.from_csv, which was deprecated and later removed from pandas
    my_data = pandas.read_csv(filename, sep=',', index_col=False)
    # One dictionary per row, keyed by column name
    list_of_dicts = my_data.to_dict(orient='records')
    return list_of_dicts

if __name__ == "__main__":
    x = createDictionary("textToEnglish.csv")
    print(type(x))
    # <class 'list'>
    print(len(x))
    # 4255
    print(type(x[0]))
    # <class 'dict'>
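For comparison, the same list-of-dictionaries shape can be sketched with the standard library's csv.DictReader, which also keys each row by column name. This is only an illustration written for Python 3: the in-memory sample and the column names "abbreviation" and "meaning" are assumptions, since the real file's header row is not shown in the question.

```python
import csv
import io

# Hypothetical sample with a header row; the column names are assumptions.
sample = io.StringIO("abbreviation,meaning\ng2g,got to go\ng2k,good to know\n")

# DictReader uses the first row as field names and yields one mapping per row.
reader = csv.DictReader(sample)
list_of_dicts = [dict(row) for row in reader]

# Collapse the per-row records into a single abbreviation -> meaning lookup.
lookup = {row["abbreviation"]: row["meaning"] for row in list_of_dicts}

print(list_of_dicts[0])
print(lookup["g2k"])
```

If a single lookup dictionary is what you need in the end, the last step shows one way to get there from a list of per-row records.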
Upvotes: 0
Reputation: 107587
Simply add a return statement to your function. Also, you will see that the dictionary length is not the same as the number of CSV rows, because some values are repeated in the first column of the CSV. Dictionary keys must be unique, so when a key is reused, the later value replaces the earlier one.
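def createDictionary(filename):
    dic = {}
    # The with block closes the file automatically
    with open(filename, 'r') as f:
        for line in f:
            # Drop the trailing newline so it does not end up in the value
            line = line.strip()
            data = line.split(',')
            print(data)
            dic[data[0]] = data[1]
    # Hand the finished dictionary back to the caller
    return dic

if __name__ == "__main__":
    x = createDictionary("textToEnglish.csv")
    print(type(x))
    # <class 'dict'>
    print(len(x))
    # 4255
    for k, v in x.items():
        print(k, v)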
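The point about repeated keys can be seen in a small sketch. The rows below are made-up sample data, not taken from the actual file, and the defaultdict variant is just one possible way to keep every meaning if overwriting is not what you want.

```python
from collections import defaultdict

# Made-up rows for illustration; "g2g" appears twice on purpose.
rows = [("g4u", "good for you"), ("g2g", "got to go"), ("g2g", "good to go")]

# Plain assignment: the later value for a repeated key replaces the earlier one.
last_wins = {}
for key, value in rows:
    last_wins[key] = value

# Alternative: collect every meaning for each key in a list.
all_meanings = defaultdict(list)
for key, value in rows:
    all_meanings[key].append(value)

print(len(rows), len(last_wins))
print(all_meanings["g2g"])
```

Three rows produce only two keys in the plain dictionary, which is why the dictionary length can be smaller than the row count.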
And try not to print the dictionary all at once, especially with so many values, since that puts a heavy load on memory. Instead, iterate through the keys and values with a for loop, as shown above.
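One more thing that may be worth checking: the output posted in the question shows \r embedded inside the values (for example 'glad to see you\rg2e'), which suggests the file separates rows with bare carriage returns and is being read as one long line. If that is the case, opening the file with mode 'rU' in Python 2 (Python 3 text mode translates newlines by default) or splitting the text with splitlines() should separate the rows. Below is a minimal sketch under that assumption; the helper name create_dictionary_from_text is hypothetical, and the sample string reuses a few pairs visible in the posted output.

```python
def create_dictionary_from_text(text):
    """Build an abbreviation -> meaning dict from CSV-like text."""
    dic = {}
    # str.splitlines() splits on '\n', '\r\n' and bare '\r' alike.
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first comma only, so a meaning may contain commas.
        parts = line.split(',', 1)
        if len(parts) != 2:
            continue  # skip rows without a comma
        dic[parts[0]] = parts[1]
    return dic

# Rows separated by bare carriage returns, as in the posted output.
sample = "g2cu,glad to see you\rg2e,got to eat\rg2g,got to go"
result = create_dictionary_from_text(sample)
print(len(result))
```

With a real file you would pass in the result of f.read() instead of the sample string.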
Upvotes: 1