Bidstrup
Bidstrup

Reputation: 1617

Search for specific string in multiple excel files

I'm new to python, so bear with me on this one. I want to make a script that searches within all excel files in a specific folder, to see if they contain an '@' sign. If it finds a match it should write to a log.txt files with the name of the excel file. I don't have the name of the excel files in advance.

So far I'm stuck at just read in one file, and return true if it finds the string. What I got is

import pandas as pd

df = pd.read_excel(open('test.xlsx','rb'), sheetname=0)
print(df)

if '@' in df:
    print("true")
else:
    print("false")

This returns the content of sheet1 in the excel file correctly, but the search to find a match, doesn't seem to work. Any ideas? Maybe Im doing it all wrong.

Upvotes: 0

Views: 3789

Answers (2)

Ninjasoup
Ninjasoup

Reputation: 173

This an answer without using pandas

import logging

LOG_FILENAME = r'file_output_location_path\log.txt'
logging.basicConfig(filename=LOG_FILENAME,
                    level=logging.WARNING,
                    format='%(asctime)s %(message)s'
                    )


source_folder = r'excel_file_folder_path'

def containsString():
    for somefile in os.listdir(source_folder):
        if'@' in somefile:
            logging.warning(somefile)

Upvotes: 1

Grr
Grr

Reputation: 16079

You should read through Pandas - Working with Text Data specifically the section on strings that match or contain a pattern.

Unfortunately you cannot search a DataFrame with the string methods common to core Python. Instead you would use a StringMethod, like so:

out = False
for col in df.columns:
    if df[col].str.contains('@').any():
        out = 'True'
        break
print(out)

Also, I don't believe it is necessary to use open when using pd.read_excel. E.g

df = pd.read_excel(open('test.xlsx','rb'), sheetname=0)

Can be rewritten:

df = pd.read_excel('test.xlsx', sheetname=0)

As far as iterating over files in a folder take a look at the builtin module glob.

Upvotes: 3

Related Questions