Reema Q Khan

Reputation: 878

How to extract a certain sentence in a paragraph? Python

I want to extract certain sentences from a paragraph based on the label that precedes them, for example "Object C Statement:". The paragraph is as follows:

Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat.

I want to extract the text that follows "Object C Statement:", like this:

There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.

All the examples I have come across only split on a single specific word.

I tried this, but it doesn't work for me:

word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
a, b, c, d, e = re.split(r"\B\s(?=[^\s:]+:)", word)
regex = re.compile(r"""Object A Statement\s(.*?)Object B Statement\s(.*?)Object C Statement\s(.*?)Object D Statement\s(.*?)Object E Statement\s(.*)""", re.S|re.X)
a, b, c, d, e = regex.match(word).groups()

Upvotes: 0

Views: 928

Answers (1)

Yang Yushi

Reputation: 765

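You can split the string on the pattern "\s*Object . Statement:\s*", which matches each label together with the whitespace around it. The pieces left over are the statements themselves.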

import re

word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
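# Split on each "Object <letter> Statement:" label; "." matches any single character.
result = re.split(r"\s*Object . Statement:\s*", word)
# The text starts with a label, so the first piece is an empty string; drop it.
result = [r for r in result if r]
print("\n".join(result))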

I get the following result.

There was a cat with a bag full of meat. It was a red cat with a blue hat.
There was a dog with a bag full of toys. It was a blue dog with a green hat.
There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
There was a zebra with a bag full of grass. It was a white zebra with a blue hat.
There was a bear with a bag full of wood. It was a brown bear with a black hat.
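
If you only need the statement for one label, such as Object C, one possible variation is to put a capturing group around the label so that re.split keeps it in the result, then pair each label with the text that follows it. This is only a sketch: the function name statements_by_label and the (\w+) group are my own additions, and it assumes every label has the form "Object <name> Statement:".

```python
import re

word = (
    "Object A Statement: There was a cat with a bag full of meat. "
    "It was a red cat with a blue hat. "
    "Object B Statement: There was a dog with a bag full of toys. "
    "It was a blue dog with a green hat. "
    "Object C Statement: There was a dolphin with a bag full of bubbles. "
    "It was a purple dolphin with an orange hat. "
    "Object D Statement: There was a zebra with a bag full of grass. "
    "It was a white zebra with a blue hat. "
    "Object E Statement: There was a bear with a bag full of wood. "
    "It was a brown bear with a black hat."
)


def statements_by_label(text):
    """Map each label ("A", "B", ...) to the statement that follows it.

    Hypothetical helper, assuming labels look like "Object <name> Statement:".
    """
    # Because the pattern has a capturing group, re.split returns the
    # captured labels interleaved with the text between the matches:
    # [text before first label, label, statement, label, statement, ...]
    parts = re.split(r"\s*Object (\w+) Statement:\s*", text)
    labels = parts[1::2]   # every captured label
    bodies = parts[2::2]   # the text following each label
    return dict(zip(labels, bodies))


statements = statements_by_label(word)
print(statements["C"])
# There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
```

Looking statements up by label avoids relying on the position of Object C in the list, which would change if a statement were added or removed.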

Upvotes: 2
