Reputation: 363
I have a file containing words and dictionary containing each word as key with the score as value:
data.xlsx
1. one neg_three
2. one two
dictionary :
num_dic = {'one':'1','two':'2','neg_three':'-3',....}
What I want is compare each key with each row in the excel file and sum the value of all words using the values in the dictionary and save the result in the last column in the excel file. here is my code
import xlrd
loc = ("data.xlsx")
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)
list1 = []
for i in range(1, sheet.nrows): # start at 1 skips header row
list1.append(sheet.cell_value(i, 0))
# save the data to a list and then converted to a string for spliting
str1 = ''.join(list1)
list2 = str1.spilt()
# sum the values
total = 0
for i in list2:
for key, value in num_dic.items():
if key in i:
total =+ int(value)
# save the result in excel file
expData = pd.DataFrame(total, columns = ['total'])
writer = ExcelWriter('data.xlsx')
expData.to_excel(writer,'Sheet1',index=False,startcol=1)
writer.save()
the output should be like this:
data.xlsx:
data total
1. one neg_three -2
2. one two 3
can anyone advise me in my code? because it doesn't work. Thanks in advance
Upvotes: 0
Views: 687
Reputation: 589
Hope this what you are searching for,
import xlrd
import pandas as pd
num_dic = {'one':'1','two':'2','neg_three':'-3'}
loc = ("data.xlsx")
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)
list1 = []
total = []
for i in range(1, sheet.nrows): # start at 1 skips header row
result = sheet.cell_value(i, 0).split(" ")
num1 = 0
for j in result:
if j:
num1 += int(num_dic[j])
total.append(num1)
sheet1 = pd.read_excel('data.xlsx')
sheet1['total']=total
writer = pd.ExcelWriter('data.xlsx')
sheet1.to_excel(writer,'Sheet1',index=False,startcol=0)
writer.save()
Upvotes: 1
Reputation: 2488
I don't have xlrd
installed, so the following is in theory only, I did not test it.
Lets look at your code block by block:
import xlrd
loc = ("data.xlsx")
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
sheet.cell_value(0, 0)
list1 = []
for i in range(1, sheet.nrows): # start at 1 skips header row
list1.append(sheet.cell_value(i, 0))
This should do roughly what you want, and appends the read input data to list1
. Except it will not read the last row of your data: you may have to consider to change your range()
to range(1, sheet.nrows + 1)
Next you .join()
it all together, and then do a .split()
on it:
# save the data to a list and then converted to a string for spliting
str1 = ''.join(list1)
list2 = str1.spilt()
I would not do this at all, just leave list1
as is. You can split the values while computing.
Then you use two for
loops to do some sort of calculation, and store them all into a variable called total
:
# sum the values
total = 0
for i in list2:
for key, value in num_dic.items():
if key in i:
total =+ int(value)
This is not really doing what you want, you want to store a total for each row of input data, which then can be written to your result file. Something like this:
# use actual integers in your lookup table
num_dic = {'one':1,'two':2,'neg_three':-3}
list2 = [] ## this is your list containing the totals for each row
for in_row in list1:
row, string_value1, string_value2 = in_row.split()
if string_value1 and string_value2:
# values are not empty continue with calculation
# use dict.get() to ensure a default value of 0 if not found
total = num_dic.get(string_value1, 0) + num_dic.get(string_value2, 0)
# append the total to the list
list2.append(str(total))
else:
# one or both of the values is empty
# appending an empty string to the result list2
list2.append('')
# at this point list2 contains all your results for each row of input
Your last block is writing the results to your file:
# save the result in excel file
expData = pd.DataFrame(total, columns = ['total'])
writer = ExcelWriter('data.xlsx')
expData.to_excel(writer,'Sheet1',index=False,startcol=1)
writer.save()
As I said I do not have xlrd, so I cannot comment on writing the file, so you now need to write the list list2
to your "total" column in your XL file. How you do that may need some experimentation.
Upvotes: 0