Reputation: 865
I want to find how many ' ' (whitespace) characters there are in each of these sentences, which are elements of a list. So, for:
['this is a sentence', 'this is one more sentence']
Checking element 0 would give a value of 3, and checking element 1 would give a value of 4. I am having trouble both counting the whitespace and looping through every element to find the one with the most whitespace.
Upvotes: 0
Views: 215
Reputation: 87134
You say "whitespace"; normally that would include the characters '\t\n\x0b\x0c\r ', plus any Unicode whitespace characters, e.g. u'\u3000' (IDEOGRAPHIC SPACE).
A regex solution is one of the better ones, because it easily supports any Unicode whitespace code point in addition to the usual ASCII ones. Just use re.findall() and set the re.UNICODE flag:
import re

def count_whitespace(s):
    return len(re.findall(r'\s', s, re.UNICODE))

l = ['this is a sentence',
     'this is one more sentence',
     '',
     u'\u3000\u2029 abcd\t\tefghi\0xb \n\r\nj k l\tm \n\n',
     'nowhitespaceinthisstring']

for s in l:
    print count_whitespace(s)
Output
3 4 0 23 0
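An easy non-regex way to do this is with str.split(), which, when called with no arguments, splits on any whitespace character. Joining the pieces back together is therefore an effective way of removing all whitespace from a string, and the difference in length is the whitespace count. This also works with Unicode whitespace characters: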
def count_whitespace(s):
    return len(s) - len(''.join(s.split()))

for s in l:
    print count_whitespace(s)
Output
3 4 0 23 0
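A third way to express the same idea, shown here only as a sketch and assuming Python 3 (where every str is Unicode), is to test each character with str.isspace(). The function name count_whitespace_isspace is made up for this illustration:

```python
def count_whitespace_isspace(s):
    # Count the characters for which str.isspace() is true.
    return sum(1 for ch in s if ch.isspace())

sentences = ["this is a sentence", "this is one more sentence", ""]
for sentence in sentences:
    print(count_whitespace_isspace(sentence))
```

This avoids building a second string, at the cost of a Python-level loop over every character.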
Finally, picking out the sentence with the most whitespace characters:
>>> max((count_whitespace(s), s) for s in l)[1]
u'\u3000\u2029 abcd\t\tefghi\x00xb \n\r\nj k l\tm \n\n'
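The tuple trick above works, but when two strings have the same count it falls back to comparing the strings themselves. A possible alternative, sketched here under the assumption of Python 3, is to pass the counting function as the key argument of max(). The helper name most_whitespace and the shortened sample list are illustrative only:

```python
import re

def count_whitespace(s):
    # In Python 3, \s in a str pattern already matches Unicode whitespace.
    return len(re.findall(r"\s", s))

def most_whitespace(strings):
    # max() with a key returns the first element that has the largest count.
    return max(strings, key=count_whitespace)

sentences = ["this is a sentence",
             "this is one more sentence",
             "",
             "nowhitespaceinthisstring"]
print(most_whitespace(sentences))
```

With key=, ties are resolved in favour of the earliest element in the list rather than by string ordering.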
Upvotes: 1
Reputation: 20359
You can use Counter. I don't know whether it is more time-consuming than .count().
>>> from collections import Counter
>>> lst = ['this is a sentence', 'this is one more sentence']
>>> [Counter(i)[' '] for i in lst]
[3, 4]
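One way to settle the timing question is to measure it. The following is a rough sketch using the standard timeit module, assuming Python 3. The function names and the repetition count of 2000 are arbitrary choices for this example, and the numbers will vary by machine, so no result is claimed here:

```python
from collections import Counter
import timeit

lst = ["this is a sentence", "this is one more sentence"]

def with_counter(strings):
    # Counter tallies every character, then we look up the space count.
    return [Counter(s)[" "] for s in strings]

def with_count(strings):
    # str.count scans each string for a single character.
    return [s.count(" ") for s in strings]

counter_seconds = timeit.timeit(lambda: with_counter(lst), number=2000)
count_seconds = timeit.timeit(lambda: with_count(lst), number=2000)
print("Counter: %.4fs, str.count: %.4fs" % (counter_seconds, count_seconds))
```

Both functions return the same counts, so the comparison is only about speed.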
Upvotes: 1
Reputation: 52171
Use a simple list comprehension with count
>>> lst = ['this is a sentence', 'this is one more sentence']
>>> [i.count(' ') for i in lst]
[3, 4]
Other ways include using map
>>> map(lambda x:x.count(' '),lst)
[3, 4]
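The list output shown for map is what Python 2 prints. As a small sketch assuming Python 3, where map returns a lazy iterator instead of a list, the result can be wrapped in list() to get the same output (the variable name space_counts is just for illustration):

```python
lst = ["this is a sentence", "this is one more sentence"]

# In Python 3, map() is lazy, so materialise it with list().
space_counts = list(map(lambda s: s.count(" "), lst))
print(space_counts)
```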
If you want a callable (a function that you can apply to each element of your list, as you mentioned), it can be implemented as
>>> def countspace(x):
...     return x.count(' ')
...
and executed as
>>> for i in lst:
...     print countspace(i)
...
3
4
This can also be solved with regexes, using the re module, as mentioned by Grijesh. Note that \s matches any whitespace character, not only ' '.
>>> import re
>>> [len(re.findall(r"\s", i)) for i in lst]
[3, 4]
Post edit
Since you say you also need to find the element with the most spaces, you can do
>>> vals = [i.count(' ') for i in lst]
>>> lst[vals.index(max(vals))]
'this is one more sentence'
This can be implemented as a callable using
>>> def getmax(lst):
...     vals = [i.count(' ') for i in lst]
...     maxel = lst[vals.index(max(vals))]
...     return (vals, maxel)
and use it as
>>> getmax(lst)
([3, 4], 'this is one more sentence')
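Note that getmax as written raises a ValueError on an empty list, because max() has nothing to compare. A possible variant that returns None in that case is sketched below, assuming Python 3. The name get_max_safe and the choice of None as the fallback are assumptions made for this example, not part of the original answer:

```python
def get_max_safe(strings):
    counts = [s.count(" ") for s in strings]
    if not counts:
        # Nothing to compare: return the empty counts and no winner.
        return counts, None
    # Index of the first largest count, matching vals.index(max(vals)).
    best_index = max(range(len(counts)), key=counts.__getitem__)
    return counts, strings[best_index]

lst = ["this is a sentence", "this is one more sentence"]
print(get_max_safe(lst))
print(get_max_safe([]))
```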
Post comment edit
>>> s = 'this is a sentence. this is one more sentence'
>>> lst = s.split('. ')
>>> [i.count(' ') for i in lst]
[3, 4]
Upvotes: 3