diimension

Reputation: 195

How do you filter a string to only contain letters?

How do I write a function that filters out all the non-letters from a string? For example, letters("jajk24me") should return "jajkme". (It needs to use a for loop.) Will the string.isalpha() function help me with this?

My attempt:

def letters(input):
    valids = []
    for character in input:
        if character in letters:
            valids.append( character)
    return (valids)

Upvotes: 8

Views: 77229

Answers (6)

ted

Reputation: 4975

See re.sub. For performance, consider using re.compile so the pattern is compiled only once.
Below is a short version that matches every character outside the range A to Z and replaces it with the empty string. The re.I flag makes the match case-insensitive, so lowercase letters (a-z) are kept as well.

import re

def charFilter(myString):
    # Remove every run of characters that are not letters (case-insensitive).
    return re.sub('[^A-Z]+', '', myString, flags=re.I)
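
The re.compile suggestion could look roughly like the sketch below. The names NON_LETTERS and char_filter are made up for illustration and are not part of the answer; the point is only that the pattern is compiled once at import time and reused on every call.

```python
import re

# Compiled once at module load; re.IGNORECASE lets [^A-Z] also spare a-z.
NON_LETTERS = re.compile(r'[^A-Z]+', re.IGNORECASE)

def char_filter(my_string):
    # Replace every run of non-letters with the empty string.
    return NON_LETTERS.sub('', my_string)

print(char_filter("jajk24me"))  # jajkme
```

For a one-off call the difference is likely negligible, since the re module caches recently used patterns; precompiling mainly helps readability and tight loops.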

If you really need that loop, there are many answers explaining that specifically. However, you might want to explain why you need a loop.

If you want to operate on the number sequences, and that is the reason for the loop, consider passing a function instead of the replacement string, like:

import re

def numberPrinter(match):
    # re.sub calls this with a match object for each removed run.
    print(match.group())
    return ''

def charFilter(myString):
    return re.sub('[^A-Z]+', numberPrinter, myString, flags=re.I)

Upvotes: 4

Noah Cardoza

Reputation: 197

This doesn't use a for loop, but that's already been thoroughly covered.

Might be a little late, and I'm not sure about performance, but I just thought of this solution which seems pretty nifty:

set(x).intersection(y)

You could use it like:

from string import ascii_letters

def letters(string):
    return ''.join(set(string).intersection(ascii_letters))

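NOTE: This will not preserve the original order, and because a set holds each element only once, repeated letters are collapsed to a single occurrence. In my use case that is fine, but be warned.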
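
If order matters, one possible variation keeps the set only for membership tests and walks the string left to right, so repeated letters survive too. ALLOWED and letters_in_order are names invented for this sketch:

```python
from string import ascii_letters

# A frozenset gives constant-time membership tests.
ALLOWED = frozenset(ascii_letters)

def letters_in_order(string):
    # Walk the string left to right, so order and duplicates survive.
    return ''.join(c for c in string if c in ALLOWED)

print(letters_in_order("jajk24me"))  # jajkme
```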

Upvotes: 0

Prasanth

Reputation: 5258

Of course you can use isalpha. Also, valids can be a string.

Here you go:

def letters(input):
    valids = ""
    for character in input:
        if character.isalpha():
            valids += character
    return valids

Upvotes: 0

skunkfrukt

Reputation: 1570

import re
valids = re.sub(r"[^A-Za-z]+", '', my_string)

EDIT: If it needs to be a for loop, something like this should work:

output = ''
for character in input:
    if character.isalpha():
        output += character
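
One difference between the two versions is worth knowing: the regular expression keeps only the ASCII letters A-Z and a-z, while str.isalpha() also accepts non-ASCII letters such as é. A small sketch of the difference, using hypothetical function names and a made-up input:

```python
import re

def letters_regex(text):
    # Keeps only the ASCII letters A-Z and a-z.
    return re.sub(r"[^A-Za-z]+", '', text)

def letters_loop(text):
    # Keeps anything Unicode classifies as a letter.
    output = ''
    for character in text:
        if character.isalpha():
            output += character
    return output

sample = "caf\u00e9 24"  # "café 24" with a precomposed é
print(letters_regex(sample))  # caf
print(letters_loop(sample))   # café
```

For plain ASCII input such as "jajk24me", both return the same result.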

Upvotes: 10

Ian Clelland

Reputation: 44132

If it needs to be in that for loop, and a regular expression won't do, then this small modification of your loop will work:

def letters(input):
    valids = []
    for character in input:
        if character.isalpha():
            valids.append(character)
    return ''.join(valids)

(The ''.join(valids) at the end takes all of the characters that you have collected in a list and joins them together into a string. Your original function returned that list of characters instead.)

You can also filter out characters from a string:

def letters(input):
    return ''.join(filter(str.isalpha, input))

or with a list comprehension:

def letters(input):
    return ''.join([c for c in input if c.isalpha()])

or you could use a regular expression, as others have suggested.

Upvotes: 24

Askr

Reputation: 57

The method string.isalpha() checks whether the string consists of alphabetic characters only. You can use it to check whether any modification is needed. As to the other part of the question, pst is right. You can read about regular expressions in the Python docs (http://docs.python.org/library/re.html). They might seem daunting, but they are really useful once you get the hang of them.
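
As a rough illustration of using isalpha() on the whole string to decide whether filtering is needed (needs_filtering is a hypothetical helper name, not something from the question):

```python
def needs_filtering(text):
    # str.isalpha() is True only if every character is a letter
    # and the string is non-empty.
    return not text.isalpha()

print(needs_filtering("jajkme"))    # False
print(needs_filtering("jajk24me"))  # True
print(needs_filtering(""))          # True, because "".isalpha() is False
```

Note the empty-string case: isalpha() returns False for "", so this check reports that filtering is needed even though there is nothing to remove.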

Upvotes: 0
