Reputation: 63
I'm building a web scraping script in Python that writes its output to Excel, with PyQt5 for the user interface. I'm having trouble executing the function more than once.
It works great, but as I said, only once. The next time I click the start button, it says:
xlsxwriter.exceptions.DuplicateWorksheetName: Sheetname 'japanese katana', with case ignored, is already in use.
I don't understand why this is happening. I'm literally DELETING the file at the start. Do I need to clear the memory or something?
Code is here:
def searchKeywords(self):
    try:
        os.remove(dir_path + r"\Sonuclar.xlsx")
    except FileNotFoundError:
        print("File not found")
    workbook = xlsxwriter.Workbook(dir_path + r'\Sonuclar.xlsx')
    with open(dir_path + r"\results.txt", "w", encoding="utf-8") as f:  # write the results also in txt
        for query in search_list:
            try:
                worksheet = workbook.add_worksheet(query)
                row = 0
                column = 0
                for url in search(query + " buy", tld='com', num=20, stop=int(self.lineEditSearchCount.text()), pause=1):
                    if not any(bad_word in url for bad_word in ban_list):
                        worksheet.write(row, column, url)
                        row += 1
                        f.write(url + "\n")
                        print(url + " writed to file")
            except HTTPError:
                print("Too many searches")
                break
    workbook.close()
Upvotes: 1
Views: 383
Reputation: 41644
You are creating a new xlsxwriter Workbook object on every call, so you get a clean instance each time. That is not what is causing the issue you are seeing.
The issue is more likely that search_list isn't cleared between calls. This could result in the same search query, such as 'japanese katana', appearing more than once in the list. That in turn would lead to trying to create two worksheets with the same name, which Excel doesn't allow, so xlsxwriter raises an exception. So I'd suggest looking at where search_list is initialised and populated.
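As a rough sketch of one way to guard against this, assuming search_list really does pick up repeated entries between clicks (I can't see where it is populated, so that part is a guess), you could drop case-insensitive duplicates before creating the worksheets. The helper name unique_queries and the sample queries are made up for illustration:

```python
def unique_queries(queries):
    """Return queries with case-insensitive duplicates removed, order preserved."""
    seen = set()
    unique = []
    for query in queries:
        key = query.casefold()  # the error says names are compared with case ignored
        if key not in seen:
            seen.add(key)
            unique.append(query)
    return unique


# Hypothetical state of search_list after the start button was clicked twice
# without the list being cleared in between.
search_list = ["japanese katana", "samurai armor"]
search_list += ["Japanese Katana", "samurai armor"]

deduped = unique_queries(search_list)
print(deduped)
```

In searchKeywords you would then loop over unique_queries(search_list) instead of search_list. If the list is refilled on every click, calling search_list.clear() before refilling it is probably the cleaner fix, since it removes the cause rather than hiding it.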
Upvotes: 2