Reputation:
I got this warning WARNING: C:\Users\diodi\AppData\Local\Programs\Python\Python37-32\lib\site-packages\xlsxwriter\worksheet.py:923: UserWarning: Ignoring URL 'https://www.google.com/search?q=hello+world' since it exceeds Excel's limit of 65,530 URLS per worksheet.
"65,530 URLS per worksheet." % force_unicode(url))
I use to write th output of scraped data
#spider.found_items is a list of dictionary
df = pd.DataFrame(spider.found_items)[ordered_list] #ordered_list is the order of dictiornary outputed
df.to_excel("{}.xlsx".format(file_name),sheet_name='All Products')
I checked this Number of URLS is over Excel's limit of 65,530 URLS per worksheet. but this formats the links as strings (not clickable). is there a way to keep URL (as a link) if I can write to multiple sheets or any other suggestion?
Upvotes: 4
Views: 2927
Reputation: 143
If possible, use the EXCEL HYPERLINK function instead. You should not encounter this limitation with the HYPERLINK function.
Upvotes: 1
Reputation: 41644
Fundamentally this is an Excel limitation. It only allows 65,530 unique* urls per worksheet. There isn't any workaround for that.
(*) Excel does allow more than 65,530 non-unique urls if they are grouped contiguously. For example the same url in cells A1 to A100 would only be counted as one url against the 65,530 limit. However, grouping of urls to take advantage of this isn't supported by XlsxWriter (and even if it was it doesn't help if all the urls are unique or non-contiguous).
Upvotes: 2