Stelian

Reputation: 21

Explanation about split in Python

I have this task:

st = 'print only the words that sstart with an s in the sstatement'

and the solution would be:

for word in st.split():
    if word[0] == 's':
        print(word)

Why won't it work with:

for word in st.split():
    if word[1] == 's':
        print(word)

I kind of understand what that zero stands for, but how can I print the words whose second letter is 's'?

Upvotes: 1

Views: 74

Answers (1)

willeM_ Van Onsem

Reputation: 476574

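The problem is that a word is not guaranteed to be long enough to have a second character. Here the one-character word 's' is in the word list, so word[1] raises an IndexError (string index out of range) as soon as the loop reaches it. Note that st.split() without arguments never produces empty strings, but st.split(' ') can when the text contains consecutive spaces, and an empty string would fail in the same way.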
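
To see the failure in isolation, here is a small sketch using the sentence from the question. The names too_short and error_message are only illustrative, and print() is called with a single argument so the snippet behaves the same on Python 2.7 and Python 3:

```python
st = 'print only the words that sstart with an s in the sstatement'

# Collect the words that are too short to have a character at index 1.
too_short = [word for word in st.split() if len(word) < 2]
print(too_short)  # ['s']

# Indexing past the end of a string raises IndexError.
try:
    's'[1]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)  # string index out of range
```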

A quick fix is to use a length check:

for word in st.split():
    # Only index the word if it actually has a second character.
    if len(word) > 1 and word[1] == 's':
        print(word)

Or, as @idjaw says, you can use slicing, which returns an empty string instead of raising an error when the index is out of range:

for word in st.split():
    # word[1:2] is '' for words shorter than two characters.
    if word[1:2] == 's':
        print(word)

If you have a string st, you can obtain a substring with st[i:j], where i is the first index (inclusive) and j is the last index (exclusive). If the indices are out of range, that is not a problem: you simply obtain the empty string. So we construct a slice that starts at index 1 and stops before index 2, which covers only index 1. If the string has no character at that index, we obtain the empty string (which is not equal to 's'); otherwise we obtain a string with exactly one character: the one at index 1.
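
A short sketch of that slicing behaviour, assuming the sentence from the question. The helper name second_char is made up for illustration, and print() is called with a single argument so it runs on both Python 2.7 and Python 3:

```python
def second_char(word):
    """Return the character at index 1, or '' if the word is too short."""
    return word[1:2]

print(repr(second_char('sstart')))  # 's'
print(repr(second_char('s')))       # ''
print(repr(second_char('')))        # ''

st = 'print only the words that sstart with an s in the sstatement'
# Keep the words whose second character is 's'; short words are skipped safely.
matches = [word for word in st.split() if second_char(word) == 's']
print(matches)  # ['sstart', 'sstatement']
```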

If you need to check against more complicated patterns, you can use a regex:

import re

rgx = re.compile(r'\b\ws\w*\b')
rgx.findall('print only the words that sstart with an s in the sstatement')

Here we match, between word boundaries (\b), any word that starts with one word character (\w), has an s as its second character, and continues with zero or more word characters (\w*):

>>> rgx.findall('print only the words that sstart with an s in the sstatement')
['sstart', 'sstatement']
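
If the position or the letter needs to vary, the same pattern can be built from parameters. This is only a sketch: the function words_with_letter_at and its arguments are hypothetical names, not part of the re module.

```python
import re

def words_with_letter_at(text, letter, index):
    """Return the words in text that have `letter` at position `index`."""
    # \w{index} skips exactly `index` word characters after the word boundary.
    pattern = r'\b\w{%d}%s\w*\b' % (index, re.escape(letter))
    return re.findall(pattern, text)

st = 'print only the words that sstart with an s in the sstatement'
print(words_with_letter_at(st, 's', 1))  # ['sstart', 'sstatement']
print(words_with_letter_at(st, 's', 0))  # ['sstart', 's', 'sstatement']
```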

Upvotes: 2
