Asit Singh

Reputation: 17

Splitting and saving parts of an input text

I was trying to figure out how to split a text into separate sentences in Python and save them, splitting on punctuation marks like , . ? !. But some text contains decimal points, and re.split treats those as periods. How can I get around that? Any help would be appreciated!

Eg text:

A 0.75-in-diameter steel tension rod is 4.8 ft long and carries a load of 13.5 kip. Find the tensile stress, the total deformation, the unit strains, and the change in the rod diameter.

Upvotes: 1

Views: 45

Answers (1)

Will Da Silva

Reputation: 7040

This will depend on your input, but if you can assume that every period you want to split at is followed by a space, then you can simply do:

>>> s = 'A 0.75-in-diameter steel tension rod is 4.8 ft long and carries a load of 13.5 kip. Find the tensile stress, the total deformation, the unit strains, and the change in the rod diameter.'
>>> s.split('. ')
['A 0.75-in-diameter steel tension rod is 4.8 ft long and carries a load of 13.5 kip', 'Find the tensile stress, the total deformation, the unit strains, and the change in the rod diameter.']

For anything more complicated than that, you'll probably want to use a regex like so:

import re
# Split on '.', '!' or '?' only when followed by whitespace, so the '.' in 0.75 is left alone
re.split(r'[.!?]\s', s)
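
Note that both approaches above discard the punctuation they split on. If you want each sentence to keep its terminator, one option is a lookbehind, sketched below. The helper name `split_sentences` is just an illustration, and the sketch assumes that sentence-ending punctuation is always followed by whitespace (or the end of the text), so it will still split after abbreviations such as "Dr. " or "e.g. ".

```python
import re

def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    # (?<=[.!?]) is a lookbehind: it requires the punctuation to be there but
    # does not consume it, so each sentence keeps its terminator.
    # \s+ consumes the whitespace between sentences.
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    # Drop empty strings (e.g. when the input is empty).
    return [p for p in parts if p]

s = ('A 0.75-in-diameter steel tension rod is 4.8 ft long and carries a load '
     'of 13.5 kip. Find the tensile stress, the total deformation, the unit '
     'strains, and the change in the rod diameter.')

for sentence in split_sentences(s):
    print(sentence)
```

The decimal points in 0.75, 4.8 and 13.5 are followed by a digit rather than whitespace, so they never match.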

Upvotes: 4
