Reputation: 33638
I am trying to learn pyparsing. It looks promising and like something that would be fun to use for text processing. Anyhow, here is my question:
I have a list of course names. For example,
courselist = ["Project Based CALC",
              "CALCULUS I",
              "Calculus II",
              "Intermediate MICRO",
              "Intermediate CALCULUS advance",
              "UNIVERSITY PHYSICS"]
I want to extract the courses that have to do with calculus from a list such as the one above. These are courses whose names contain either the full word CALCULUS or the abbreviation CALC. First, suppose that these words appear only in uppercase (there is one with lowercase in the above example; let us ignore that for the moment).
I have written the following code:
import pyparsing as pp

calc = pp.Literal("CALC")
for entry in courselist:
    # searchString returns every match found anywhere in the string
    if len(calc.searchString(entry)) >= 1:
        print(entry)
    else:
        pass
My first question is: is there a better way of doing this using pyparsing?
Now the above misses Calculus II. I know I can catch that by defining calc as:
calc = pp.Literal("CALC") | pp.Literal("Calc")
But this will miss cAlc. Is there a way to specify the grammar so that CALC is matched in any combination of lowercase and uppercase letters?
Thank you for your help.
Upvotes: 1
Views: 895
Reputation: 414685
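# assumes `import pyparsing as pp` and `courselist` from the question
calc = pp.CaselessLiteral('calc')
for entry in courselist:
    # the second argument (maxMatches=1) stops at the first match
    if calc.searchString(entry, 1):
        print(entry)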
The effect is similar to:
for entry in courselist:
    # plain case-insensitive substring test, no pyparsing needed
    if 'calc' in entry.lower():
        print(entry)
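For reference, the substring check can be wrapped in a small helper and run against the list from the question. This is only a sketch using the standard library; the name filter_calc_courses and its needle parameter are made up for illustration and are not part of pyparsing.

```python
# Course list copied from the question.
courselist = ["Project Based CALC",
              "CALCULUS I",
              "Calculus II",
              "Intermediate MICRO",
              "Intermediate CALCULUS advance",
              "UNIVERSITY PHYSICS"]


def filter_calc_courses(courses, needle="calc"):
    """Return the entries that contain `needle`, ignoring case.

    Hypothetical helper: a plain substring test on lowercased text.
    """
    needle = needle.lower()
    return [course for course in courses if needle in course.lower()]


matches = filter_calc_courses(courselist)
for entry in matches:
    print(entry)
```

Since this is a substring test, it would also accept a name such as "CALCIUM CHEMISTRY", so it is only appropriate if that kind of false positive cannot occur in the data.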
Upvotes: 2