norman

Reputation: 5576

Counting number of words between punctuation characters in Python

I want to use Python to count the number of words that occur between certain punctuation characters in a block of text input. For example, such an analysis of everything written up to this point might be represented as:

[23, 2, 14]

...because the first sentence, which has no punctuation except the period at the end, has 23 words, the "For example" phrase that comes next has two, and the rest, ending with the colon, has 14.

This probably wouldn't be too hard to make, but (to go along with the "don't reinvent the wheel" philosophy that seems especially Pythonic) is there anything already out there that would be especially suitable for the task?

Upvotes: 1

Views: 1289

Answers (2)

roippi

Reputation: 25964

Joran beat me to it, but I'll add my approach:

from string import punctuation
import re

s = 'I want to use Python to count the numbers of words that occur between certain punctuation characters in a block of text input. For example, such an analysis of everything written up to this point might be represented as'

# re.escape keeps ']' and '\' in punctuation literal inside the character class
gen = (x.split() for x in re.split('[' + re.escape(punctuation) + ']', s))

list(map(len, gen))
Out[32]: [23, 2, 14]

(I love map)
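The same idea can be wrapped in a small reusable function. This is only a sketch: the name `count_words_between` and its `delimiters` parameter are hypothetical, not part of the answer above. It escapes the delimiters with `re.escape` and drops segments that contain no words, which is an assumption about the desired behavior.

```python
import re
from string import punctuation


def count_words_between(text, delimiters=punctuation):
    """Return the word count of each non-empty segment between delimiters."""
    # re.escape keeps characters such as ']' and '\' literal inside the class.
    pattern = "[" + re.escape(delimiters) + "]"
    segments = re.split(pattern, text)
    # Skip segments with no words (for example, after a trailing period).
    return [len(seg.split()) for seg in segments if seg.split()]


print(count_words_between("One two three. Four, five six seven: eight."))
# prints [3, 1, 3, 1]
```

Note that with the full `string.punctuation` set, apostrophes and hyphens also act as delimiters, so a word like "don't" is split in two; passing a narrower `delimiters` string such as `".,:"` avoids that.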

Upvotes: 4

Joran Beasley

Reputation: 113998

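import re

punctuation_i_care_about = "?.!"
# Split the input string at each delimiter, then count the whitespace-separated words in each piece
split_by_punc = re.split("[%s]" % punctuation_i_care_about, some_big_block_of_text)
words_by_punc = [len(x.split()) for x in split_by_punc]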
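As a quick check of what this approach produces, here is a sketch run on a made-up sample string (the text and the name `sample_text` are assumptions for illustration only). One thing it shows: when the text ends with one of the delimiters, `re.split` returns a trailing empty string, so the last count is 0.

```python
import re

punctuation_i_care_about = "?.!"
sample_text = "Is this working? Yes it is. Great!"  # made-up sample input

# Split at each delimiter; the final piece is empty because the text ends with '!'.
split_by_punc = re.split("[%s]" % punctuation_i_care_about, sample_text)
word_counts = [len(piece.split()) for piece in split_by_punc]
print(word_counts)
# prints [3, 3, 1, 0]
```

If the trailing 0 is unwanted, filtering out empty pieces before counting is one option.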

Upvotes: 4
