Reputation: 83
My Python code goes through an HTML document, and while it does, I need it to find a specific word and collect every line that contains that word.
For example, if the HTML document looks like this:
htmlDocument = '''
word 023-213103-2402131025901238923213
bla bla bla
bla bla bla
word 2512-521-096-07464325
bla bla bla
bla bla bla
word 123123-0293231
'''
After parsing, I need desirableList to look like this:
desirableList = [
"word 023-213103-2402131025901238923213",
"word 2512-521-096-07464325",
"word 123123-0293231"
]
Upvotes: 1
Views: 233
Reputation: 1479
Here's one way:
>>> desirableList = [s for s in htmlDocument.split("\n") if "word" in s]
>>> desirableList
['word 023-213103-2402131025901238923213', 'word 2512-521-096-07464325', 'word 123123-0293231']
Update the conditional as needed to get other kinds of results, such as "line starts with":
[s for s in htmlDocument.split("\n") if s.startswith("word")]
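If the input might have Windows line endings, indented lines, or words that merely contain "word" (such as "sword"), a slightly more defensive variant could look like the sketch below. The helper name lines_with_word and its parameters are made up for illustration, and it assumes the search term starts and ends with a letter, digit, or underscore so that the \b word boundaries behave as expected.

```python
import re

def lines_with_word(text, word="word"):
    """Return the stripped lines of `text` that contain `word` as a whole word."""
    # \b matches at word boundaries, so "word" does not match inside "sword".
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    # splitlines() handles both "\n" and "\r\n" line endings.
    return [line.strip() for line in text.splitlines() if pattern.search(line)]

htmlDocument = '''
word 023-213103-2402131025901238923213
bla bla bla
bla bla bla
word 2512-521-096-07464325
bla bla bla
bla bla bla
word 123123-0293231
'''

desirableList = lines_with_word(htmlDocument)
print(desirableList)
```

For the sample document in the question this should give the same three lines as the list comprehensions above; the difference only shows up on messier input.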
Upvotes: 1