Reputation: 241
Hello. I wrote this program to reformat a file so that its fields are separated by commas instead of the original delimiter, '|'. Now I want to write another function that takes the reformatted data and stores certain fields, selected by index, in a dictionary. My problem is actually doing it: I always try to use a for loop and it's not working. I am also having trouble understanding how to use the dictionary. It seemed simple, but how do I access the information that is put into a dictionary? Is it created automatically, or do I have to create an output file where that information goes?
    def dicList():
        dictList = csv.reader(open('C:/Python/data.txt', 'rb'))
        for row in dictList:
            newRow = ' '.join(row)
            listOne = newRow.replace('|', ',')
Another minor thing: this function outputs the values like this, "hash,version,product,os" (without the quotes), so the output is a string rather than a list, which is what I would like. I can't figure out how to make that happen either.
What I am trying to do overall is build the dictionary so I can match values from another file against it. I am using this method because the files are HUGE, so I couldn't just run them against each other for matches. What I was hoping to do is use this dictionary to check the values in my other file and write the matches to a third file. I can clarify if this doesn't make sense.
Let me clarify a bit more. The information I have is in a file, in the form "data,data,data". I now have it in a list through this function:
    def dicList():
        dictList = csv.reader(open('C:/Python/hashsetsdotcom_data.txt', 'rb'), delimiter = '|')
        for row in dictList:
            print row[0], row[2]
The two values I print here are the ones I want in the dictionary as key and value, but I want to iterate through the whole file, which is something like 8 million lines. I then want to use this data against another file that is related to this one: pull values from that file, match them against the dictionary, and write the matched values to another file. So in the end I will have
"Key,Value" ---- with "Match" from another file.
I should have been more clear, but I didn't realize how specific I needed to be.
Here's where my code is now. I am having trouble matching values in another text file against the values in the dictionary. This is possible, correct? I want to iterate through the file that has those values, check whether they match the dictionary values, and then output all three, as I try to do in my last function:
    def dicList():
        dictList = csv.reader(open('C:/data.txt', 'rb'), delimiter = '|')
        for row in dictList:
            print row[0], row[2]

    def dictAppend():
        output = []
        fhand = open('C:/Python/lex.txt', 'w')
        for row in dicList():
            one_entity = {row[0]:row[2]}
            output.append(one_entity)

    def findMatch():
        fhand = open('C:/Python/search.sql', 'r')
        fig = open('C:/Python/lex.txt', 'w')
        for line in fhand:
            if line[1] == dictAppend()[0]:
                fig.write(dictAppend()[0], dictAppend[1], line[13])
Upvotes: 0
Views: 295
Reputation: 10257
Per the comments, I will include two solutions: one in response to the comment, and the other assuming the presence of headers, as in the Excel dialect of CSV.
What's wrong with your solution is that you are not setting the delimiter to reflect the data:
    def dicList():
        dictList = csv.reader(open('C:/Python/data.txt', 'rb'), delimiter="|")
        for row in dictList:
            # the data should now be pre-separated into a list
            print row
This will split the fields on pipes rather than commas, no dictionary necessary. Each row will be a list, just as with any other CSV file. You could join the fields with commas and write them back out if need be.
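As a rough illustration of writing the rows back out with commas, here is a minimal sketch. It is written for Python 3 (the snippets in this thread are Python 2), the function name pipes_to_commas is made up for this example, and in-memory io.StringIO objects with invented sample values stand in for the real files.

```python
import csv
import io

def pipes_to_commas(source, target):
    """Copy pipe-delimited rows from source to target as comma-delimited rows."""
    reader = csv.reader(source, delimiter="|")
    writer = csv.writer(target, lineterminator="\n")
    for row in reader:
        # each row is already a list of fields; the writer adds the commas
        writer.writerow(row)

# In-memory stand-ins for real files (sample values are invented).
source = io.StringIO("hash|version|product|os\nabc123|1.0|widget|linux\n")
target = io.StringIO()
pipes_to_commas(source, target)
print(target.getvalue())
```

With real files on Python 3, the same function should work on objects returned by open(path, newline=''). Letting csv.writer produce the commas, rather than calling ','.join(row), means fields that themselves contain commas get quoted correctly.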
To get the dictionary format you seem to want, you need to access the values by index and convert manually:
    output = []
    for row in dictList:
        one_entity = {row[0]:row[1],row[2]:row[3]}
        output.append(one_entity)
This assumes, of course, that the data is normalized as you said in the comments, in an alternating key-value format:
    key1|val1|key2|val2
A solution for CSV with headers, with each field keyed in a dictionary:
    import csv

    line_no = 0
    fields = []
    output = []
    # csv.reader needs an open file object, not a path string
    csv_data = csv.reader(open(r'C:\filepath', 'rb'))
    for line in csv_data:
        if line_no == 0:
            # read the first line as the keys for the final dicts
            fields = line
            line_no += 1
            continue
        one_entity = {}
        # pair each header name with the value in the same column
        for field_index, answer in enumerate(line):
            one_entity[fields[field_index]] = answer.strip()
        output.append(one_entity)
        line_no += 1
A combination of these solutions should get you where you need to be.
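For the matching step described in the question, one possible sketch follows. It assumes Python 3, that the key is column 0 and the value column 2 of the pipe-delimited file (as in the question's print row[0], row[2]), and that the second file is comma-separated with the key in its first column. That last layout is a guess, since the real format of the search file is not shown. The names build_lookup and find_matches are hypothetical, and the sample data is invented.

```python
import csv
import io

def build_lookup(lines, key_index=0, value_index=2):
    """Map one column to another for every pipe-delimited row."""
    lookup = {}
    for row in csv.reader(lines, delimiter="|"):
        # skip rows that are too short to hold both columns
        if len(row) > max(key_index, value_index):
            lookup[row[key_index]] = row[value_index]
    return lookup

def find_matches(lines, lookup, key_index=0):
    """Yield (key, value, row) for rows whose key column is in the lookup."""
    for row in csv.reader(lines):
        if len(row) > key_index and row[key_index] in lookup:
            yield row[key_index], lookup[row[key_index]], row

# In-memory stand-ins for the two files (sample values are invented).
data = io.StringIO("abc123|1.0|widget|linux\ndef456|2.0|gadget|windows\n")
search = io.StringIO("def456,report.doc\nzzz999,other.doc\n")

lookup = build_lookup(data)
matches = list(find_matches(search, lookup))
print(matches)
```

A single dictionary is used here instead of a list of one-entry dictionaries because a membership test on a dictionary takes roughly constant time on average, so the second file can be streamed line by line without rescanning the first.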
EDIT
I was not aware of it until he pointed it out, but J.F. Sebastian mentions csv.DictReader, which accomplishes my example above. By default it uses the first row of the CSV file as the field names if no value is passed for the fieldnames parameter.
http://docs.python.org/library/csv.html#csv.DictReader
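A short sketch of csv.DictReader on pipe-delimited data with a header row, assuming Python 3. The header names are taken from the sample line in the question, and the data row is invented for illustration.

```python
import csv
import io

# In-memory stand-in for a real file; the first line supplies the field names.
sample = io.StringIO("hash|version|product|os\nabc123|1.0|widget|linux\n")

reader = csv.DictReader(sample, delimiter="|")
# convert each row to a plain dict so the result is easy to inspect
rows = [dict(row) for row in reader]

print(rows[0]["product"])
```

Each row comes back keyed by header name, so fields can be read as row["hash"] rather than by numeric index.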
Upvotes: 3
Reputation: 2071
Dictionaries store key-value pairs, so
    diclist = {}
makes an empty dictionary,
    diclist["hello"] = 5
makes an entry with a key of "hello" and a value of 5,
    diclist["hello"] = [5,6,7,8,9]
overwrites that entry with a list,
    print diclist["hello"]
will print that list, and
    for x in diclist:
loops over the dictionary; x will be each key in diclist in turn (use diclist[x] or diclist.values() to get the values).
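The operations above can be put together in a small runnable sketch, assuming Python 3. Note that iterating a dictionary directly yields its keys; values() and items() give the values and the key-value pairs. The "world" entry is added only for illustration.

```python
diclist = {}
diclist["hello"] = 5                  # new entry
diclist["hello"] = [5, 6, 7, 8, 9]    # same key, so the value is overwritten
diclist["world"] = 1                  # second entry, added for illustration

keys = [key for key in diclist]       # iterating a dict yields its keys
values = list(diclist.values())       # the values, in insertion order
pairs = list(diclist.items())         # (key, value) tuples
found = "hello" in diclist            # membership tests check keys

for key, value in diclist.items():
    print(key, value)
```

The membership test is the operation that matters for matching against another file: it checks keys only and does not scan the whole dictionary.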
Upvotes: 0