Craig

Reputation: 58

Extracting a variable length sub-string using pyparsing

I'm trying to get pyparsing to extract a sub-string consisting of a variable number of words from a string.

The following almost works but loses the last word of the sub-string:

from pyparsing import Literal, OneOrMore, Word, alphas

text = "Joe F Bloggs is the author of this book."
author = OneOrMore(Word(alphas) + ~Literal("is the"))

print(author.parseString(text))

Output:

['Joe', 'F']

What am I missing?

PS: I know I can do this with a regular expression but specifically want to do it with pyparsing because it needs to fit into a large effort already written using pyparsing.

Upvotes: 1

Views: 219

Answers (1)

PaulMcG

Reputation: 63762

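Your negative lookahead has to come before the actual author word. With the lookahead placed after Word(alphas), "Bloggs" is matched but is immediately followed by "is the", so that repetition fails as a whole and the word is dropped: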

>>> author = OneOrMore(~Literal("is the") + Word(alphas))
>>> print(author.parseString(text))
['Joe', 'F', 'Bloggs']
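
For comparison, here is a minimal self-contained sketch that runs both orderings side by side on the question's text. The variable names (lookahead_after, lookahead_before, author_name) are only illustrative, and joining the tokens with a single space is an assumption about how the name should be reassembled, not something the question specifies.

```python
from pyparsing import Literal, OneOrMore, Word, alphas

text = "Joe F Bloggs is the author of this book."

# Lookahead AFTER the word: "Bloggs" is followed by "is the",
# so the Word + lookahead pair fails and "Bloggs" is not kept.
lookahead_after = OneOrMore(Word(alphas) + ~Literal("is the"))

# Lookahead BEFORE the word: each repetition first checks that
# "is the" is not next, then consumes one word.
lookahead_before = OneOrMore(~Literal("is the") + Word(alphas))

after_words = lookahead_after.parseString(text).asList()
before_words = lookahead_before.parseString(text).asList()

# Hypothetical post-processing step: rebuild the name as one string.
author_name = " ".join(before_words)

print(after_words)   # ['Joe', 'F']
print(before_words)  # ['Joe', 'F', 'Bloggs']
print(author_name)   # Joe F Bloggs
```

The second form stops cleanly at the delimiter because the check happens before any word is consumed, so no already-matched word has to be thrown away.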

Upvotes: 1
