malcom32

Reputation: 63

How do I make XlsxWriter clear its memory?

I'm building a web scraping script in Python that writes its output to Excel, with PyQt5 for the user interface. I'm having trouble running the function more than once. The function basically does the following:

  1. Delete the xlsx file if it already exists
  2. Create a new workbook
  3. Loop over the keywords to search for (example: japanese katana, iron sword ...)
  4. For each keyword, create a new worksheet and write the result rows to the xlsx file

It works great, but as I said, only once. The next time I click the start button, it says:

xlsxwriter.exceptions.DuplicateWorksheetName: Sheetname 'japanese katana', with case ignored, is already in use.

I don't understand why this is happening. I'm literally DELETING the file at the start. Do I need to clear the memory or something?

Code is here:

def searchKeywords(self):

    try:
        os.remove(dir_path + r"\Sonuclar.xlsx")
    except FileNotFoundError:
        print("File not found")

    workbook = xlsxwriter.Workbook(dir_path + r'\Sonuclar.xlsx')

    with open(dir_path + r"\results.txt", "w", encoding="utf-8") as f: #write the results also in txt            
        
        for query in search_list:
            try:
                worksheet = workbook.add_worksheet(query)
                row = 0
                column = 0
                for url in search(query + " buy", tld='com', num=20, stop=int(self.lineEditSearchCount.text()), pause=1):
                    if not any(bad_word in url for bad_word in ban_list):
                        worksheet.write(row, column, url)
                        row += 1
                        f.write(url + "\n")
                        print(url + " written to file")
            except HTTPError:
                print("Too many searches")
                break

    workbook.close()

Upvotes: 1

Views: 383

Answers (1)

jmcnamara

Reputation: 41644

You create a new xlsxwriter Workbook object on every call, so each run starts with a clean instance. That is not what causes the error you are seeing.

The more likely cause is that search_list isn't cleared between calls. The same search query, such as 'japanese katana', could then appear more than once in the list. That in turn leads to an attempt to create two worksheets with the same name, which Excel doesn't allow, so xlsxwriter raises an exception. I'd suggest looking at where search_list is initialised and populated.
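
As a rough illustration of that failure mode, here is a minimal sketch that uses no xlsxwriter at all. It assumes search_list is a module-level list that gets appended to on every click; load_keywords and unique_sheet_names are hypothetical helpers I made up for the example, since the question doesn't show how the list is filled. The case-insensitive comparison is meant to approximate the "with case ignored" check mentioned in the error message.

```python
# Assumption: search_list lives at module level and is never reset.
search_list = []


def load_keywords(keywords):
    """Hypothetical loader: appends without clearing, so entries pile up."""
    search_list.extend(keywords)


def unique_sheet_names(queries):
    """Drop case-insensitive duplicates while keeping the original order."""
    seen = set()
    unique = []
    for query in queries:
        key = query.casefold()  # compare names with case ignored
        if key not in seen:
            seen.add(key)
            unique.append(query)
    return unique


# First click on the start button
load_keywords(["japanese katana", "iron sword"])
# Second click: the same keywords are appended again
load_keywords(["japanese katana", "iron sword"])

print(search_list)                      # each keyword now appears twice
print(unique_sheet_names(search_list))  # safe to pass to add_worksheet()
```

If that matches what your code does, either call search_list.clear() before repopulating it, or loop over the de-duplicated list when you create the worksheets.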

Upvotes: 2
