Ravi Stu

Reputation: 1

Split string with whitespace and then do a count

The sample below strips punctuation from the text of a ranbo.txt file and converts it to lower case...

Help me split this on whitespace.

infile = open('ranbo.txt', 'r')
lowercased = infile.read().lower() 
for c in string.punctuation:
    lowercased = lowercased.replace(c,"")
white_space_words = lowercased.split(?????????)
print white_space_words

Now, after this split, how can I find how many words are in this list?

Should I use the count or the len function?

Upvotes: 0

Views: 1780

Answers (1)

eumiro

Reputation: 212955

white_space_words = lowercased.split()

splits on runs of whitespace characters of any length.

'a b \t cd\n  ef'.split()

returns

['a', 'b', 'cd', 'ef']
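
To see why the argument is left out, here is a small sketch comparing split() with split(' ') on the same sample string. The variable names are only for illustration, and print() is used with a single argument so it should behave the same on Python 2 and 3.

```python
sample = 'a b \t cd\n  ef'

# No argument: any run of whitespace (spaces, tabs, newlines) is one
# separator, and leading/trailing whitespace is ignored.
no_arg = sample.split()

# Explicit single space: only ' ' separates, so the tab and newline stay
# inside the pieces and two adjacent spaces produce an empty string.
single_space = sample.split(' ')

print(no_arg)
print(single_space)
```

With the explicit ' ' separator the result contains '\t', 'cd\n' and an empty string, which is usually not what you want when counting words.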

But you could also do it the other way round:

import re
words = re.findall(r'\w+', text)

returns a list of all "words" from text.

Get its length using len():

len(words)

and if you want to join them into a new string with newlines:

text = '\n'.join(words)

As a whole:

import re

with open('ranbo.txt', 'r') as f:
    lowercased = f.read().lower()
# \w+ matches runs of letters, digits and underscores, so punctuation is skipped
words = re.findall(r'\w+', lowercased)
number_of_words = len(words)
text = '\n'.join(words)
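
One thing to be aware of: the regex approach and the "strip punctuation, then split()" approach from the question do not always give the same list, so the counts can differ. The sketch below uses a made-up sample sentence instead of the file, and also shows the difference between len() and the list's count() method, since the question asked about both.

```python
import re
import string

# Made-up sample text standing in for the contents of ranbo.txt.
sample_text = "Don't stop, it's a well-known trick!"
lowercased = sample_text.lower()

# Approach from the question: delete punctuation, then split on whitespace.
stripped = lowercased
for c in string.punctuation:
    stripped = stripped.replace(c, "")
split_words = stripped.split()

# Regex approach: apostrophes and hyphens end a word instead of vanishing.
regex_words = re.findall(r'\w+', lowercased)

print(split_words)       # "don't" becomes 'dont'
print(regex_words)       # "don't" becomes 'don' and 't'

# len() gives the total number of items in the list;
# count(x) gives how many items are equal to x.
print(len(split_words))
print(len(regex_words))
print(split_words.count('a'))
```

So len() is the one to use for the total number of words; count() only answers "how many times does this particular word occur".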

Upvotes: 1
