Reputation: 492
I am wanting to search through a directory of workbooks for certain words within cells and get Python to return the workbook names that contain those words. So far I have been able to loop through each workbook, print each workbook name and the sheet names using the code below, but I don't know the right syntax to loop through each cell and search for a string.
import os
from openpyxl import load_workbook
path = r"T:\DHFCS SSC\17.0 ILS\Jacob WIP\Cable schedules\Cable schedules"
sheets = os.listdir(path)
sheets = [_ for _ in sheets if not _.startswith('~')]
for sheet in sheets:
print(sheet)
wb2 = load_workbook(os.path.join(path, sheet))
print(wb2.get_sheet_names())
##wb2._archive.close()
What do I need to add to this to search for a word in all of the spreadheets and print the spreadsheet names that contain it?
Upvotes: 1
Views: 1877
Reputation: 492
For anyone interested I have worked out how to do this. The file will prompt a user for a directory of the Excel files and then ask for a search term, then return the names of the spreadsheets containing that term.
import os
from openpyxl import load_workbook
path = input("Paste the directory to search (right click on icon in top left > edit > paste: ")
string = input("Enter the search term in lower case: ")
print('Searching...')
workbooks = os.listdir(path)
workbooks = [_ for _ in workbooks if not _.startswith('~')]
for workbook in workbooks:
wb2 = load_workbook(os.path.join(path, workbook))
for sheet_name in (wb2.get_sheet_names()):
sheet = wb2.get_sheet_by_name(sheet_name)
for value in sheet.values:
if string in " ".join([str(_).lower() for _ in value if _ ]):
print(workbook)
Upvotes: 2