Reputation: 225
I'm reading an Excel file into a pandas DataFrame, but one of the column headers contains a lot of comment text. Amongst all this text is the keyword 'Measure', which appears only in this one header. How can I select whichever header has the keyword 'Measure' somewhere in it, rather than matching the full header string?
The following code filters my data frame on three conditions, but for the third one I just want it to identify the column that includes the text 'measure', as opposed to having to write the full name 'hereisalltherandomtextmeasure':
filtered = df[(df['Mode'].isin(mode_filter)) & (df['Level'].isin(level_filter)) & (df['hereisalltherandomtextmeasure'].isin(measure_filter))]
The reason I'm trying to do this is because I'm running the same code on multiple files but the 'measure' column changes for each file.
First file:
Mode | Level | hereisalltherandomtextmeasure
Second file:
Mode | Level | hereismorerandomtextmeasure
The only static thing about them is that they contain the word 'measure', so ideally I'd like to identify the column by that word alone rather than by its full name.
Thanks.
Upvotes: 2
Views: 3080
Reputation: 394129
IIUC, you can use str.contains to check whether your search string appears anywhere in the column names:
In [7]:
df = pd.DataFrame(columns=['hereisall the random textMeasure', 'Measurement', 'asdasds'])
df.columns[df.columns.str.contains('Measure')]
Out[7]:
Index(['hereisall the random textMeasure', 'Measurement'], dtype='object')
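To plug this back into the original filter, one option is to resolve the column name first and then use it in place of the hard-coded header. The sketch below is only an illustration: the helper name `find_measure_column` and the sample data are made up for this example, and it assumes exactly one header per file contains the keyword. Note that str.contains is case-sensitive by default, so `case=False` is passed to match both 'Measure' and 'measure'.

```python
import pandas as pd


def find_measure_column(df, keyword="measure"):
    """Return the single column name containing `keyword` (hypothetical helper)."""
    # Case-insensitive, literal substring match; na=False guards against
    # non-string or missing headers.
    mask = df.columns.str.contains(keyword, case=False, regex=False, na=False)
    matches = df.columns[mask]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one column containing {keyword!r}, found {list(matches)}"
        )
    return matches[0]


# Made-up sample data standing in for one of the Excel files.
df = pd.DataFrame({
    "Mode": ["car", "bus", "car"],
    "Level": [1, 2, 1],
    "hereisalltherandomtextmeasure": ["a", "b", "c"],
})
mode_filter = ["car"]
level_filter = [1]
measure_filter = ["a", "b"]

# Look up the real header once, then reuse it in the filter.
measure_col = find_measure_column(df)
filtered = df[
    df["Mode"].isin(mode_filter)
    & df["Level"].isin(level_filter)
    & df[measure_col].isin(measure_filter)
]
```

Raising when zero or several headers match is a design choice for this sketch: a keyword like 'Measure' would also match a column called 'Measurement', as the output above shows, so failing loudly seems safer than silently picking the first hit.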
Upvotes: 1