Johnny
Johnny

Reputation: 105

How to get all substrings in a list of characters (python)

I want to iterate over a list of characters

temp = ['h', 'e', 'l', 'l', 'o', '#', 'w', 'o', 'r', 'l', 'd']

so that I can obtain two strings, "hello" and "world"

My current way to do this is:

#temp is the name of the list
#temp2 is the starting index of the first alphabetical character found
for j in range(len(temp)):
    if temp[j].isalpha() and temp[j-1] != '#':
            temp2 = j
            while (temp[temp2].isalpha() and temp2 < len(temp)-1:
                temp2 += 1
            print(temp[j:temp2+1])
            j = temp2

The issue is that this prints out

['h', 'e', 'l', 'l', 'o']
['e', 'l', 'l', 'o']
['l', 'l', 'o']
['l', 'o']
['o']

etc. How can I print out only the full valid string?

Edit: I should have been more specific about what constitutes a "valid" string. A string is valid as long as all characters within it are either alphabetical or numerical. I didn't include the "isnumerical()" method within my check conditions because it isn't particularly relevant to the question.

Upvotes: 1

Views: 490

Answers (5)

Kracekumar
Kracekumar

Reputation: 20429

List has the method index which returns position of an element. You can use slicing to join the characters.

In [10]: temp = ['h', 'e', 'l', 'l', 'o', '#', 'w', 'o', 'r', 'l', 'd']
In [11]: pos = temp.index('#')
In [14]: ''.join(temp[:pos])
Out[14]: 'hello'
In [17]: ''.join(temp[pos+1:])
Out[17]: 'world'

Upvotes: 1

Padraic Cunningham
Padraic Cunningham

Reputation: 180540

If you just want alphas just use isalpha() replacing the # and any other non letters with a space and then split of you want a list of words:

print("".join(x  if x.isalpha() else " " for x in temp).split())

If you want both words in a single string replace the # with a space and join using the conditional expression :

print("".join(x if x.isalpha() else " " for x in temp))
hello world

To do it using a loop like you own code just iterate over items and add to the output string is isalpha else add a space to the output:

out = ""
for s in temp:
    if s.isalpha():
        out += s
    else:
        out += " "

Using a loop to get a list of words:

words  = []
out = ""
for s in temp:
    if s.isalpha():
        out += s
    else:
        words.append(out)
        out = ""

Upvotes: 0

Sylvain Leroux
Sylvain Leroux

Reputation: 52070

An alternate, itertools-based solution:

>>> temp = ['h', 'e', 'l', 'l', 'o', '#', 'w', 'o', 'r', 'l', 'd']
>>> import itertools
>>> ["".join(str)
     for isstr, str in itertools.groupby(temp, lambda c: c != '#') 
     if isstr]
['hello', 'world']

itertools.groupby is used to ... well ... group consecutive items depending if they are of not equal to #. The comprehension list will discard the sub-lists containing only # and join the non-# sub-lists.

The only advantage is that way, you don't have to build the full-string just to split it afterward. Probably only relevant if the string in really long.

Upvotes: 0

Bhargav Rao
Bhargav Rao

Reputation: 52191

If you want only hello and world and your words are always # seperated, you can easily do it by using join and split

>>> temp = ['h', 'e', 'l', 'l', 'o', '#', 'w', 'o', 'r', 'l', 'd']
>>> "".join(temp).split('#')
['hello', 'world']

Further more if you need to print the full valid string you need to

>>> t = "".join(temp).split('#')
>>> print(' '.join(t))
hello world

Upvotes: 6

mkrieger1
mkrieger1

Reputation: 23313

You can do it like this:

''.join(temp).split('#')

Upvotes: 1

Related Questions