papezjustin

Reputation: 2355

How can I check if a string contains ANY letters from the alphabet?

What is the best pure Python implementation to check if a string contains ANY letters from the alphabet?

string_1 = "(555).555-5555"
string_2 = "(555) 555 - 5555 ext. 5555"

Here string_1 would return False because it contains no letters of the alphabet, and string_2 would return True because it contains letters.

Upvotes: 127

Views: 350097

Answers (8)

NeuroMorphing

Reputation: 149

A simple, multilingual solution to the OP's question is the Alphabetic library, which can be installed with: pip install alphabetic.

After installing, import the library, create a WritingSystem instance and select the alphabet of the desired language (Alphabetic covers a wide range of languages):

from alphabetic import WritingSystem

ws = WritingSystem()

# Retrieve the English alphabet list  
en_alphabet = ws.by_language(ws.Language.English, as_list=True)

Using this list, we can write a handy function that accomplishes this task:

def contains_alphabet_strings(string: str, language_alphabet: list[str]) -> bool:
    # True as soon as one character of the string is found in the alphabet list
    return any(char in language_alphabet for char in string)

and call it as follows:

string_1 = "(555).555-5555"
string_2 = "(555) 555 - 5555 ext. 5555"

# Result:
contains_alphabet_strings(string_1, en_alphabet) # False
contains_alphabet_strings(string_2, en_alphabet) # True
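
If only ASCII letters matter, the same membership idea can be sketched with the standard library alone. This is not the Alphabetic API: the constant ASCII_LETTERS and the function name contains_any are hypothetical names chosen for this illustration, and string.ascii_letters stands in for the alphabet list above.

```python
import string

# Stand-in for the alphabet list above: ASCII letters only, both cases.
# A frozenset gives constant-time membership tests.
ASCII_LETTERS = frozenset(string.ascii_letters)

def contains_any(text: str, alphabet=ASCII_LETTERS) -> bool:
    """Return True as soon as one character of text is in alphabet."""
    return any(char in alphabet for char in text)

print(contains_any("(555).555-5555"))              # False
print(contains_any("(555) 555 - 5555 ext. 5555"))  # True
```

Passing a different collection as alphabet keeps the function usable with other alphabets.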

Upvotes: 0

Mihir Verma

Reputation: 340

I tested each of the above methods for checking whether a string contains any letters, and measured the average processing time per string on a standard computer.

~250 ns for

import re

~3 µs for

re.search('[a-zA-Z]', string)

~6 µs for

any(c.isalpha() for c in string)

~850 ns for

string.upper().isupper()


Contrary to what was claimed, importing re takes negligible time, and searching with re takes about half the time of iterating with isalpha(), even for a relatively small string.
Hence for larger strings and greater counts, re would be significantly more efficient.

But converting the string to a single case and then checking that case (i.e. either upper().isupper() or lower().islower()) wins here. In every loop it is significantly faster than re.search(), and it doesn't require any additional imports.
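
For anyone who wants to repeat the measurement, here is a minimal sketch using timeit. It is not the exact script behind the numbers above: the helper name time_candidates, the sample string, and the repetition count are assumptions for illustration, and the absolute timings will differ by machine and Python version.

```python
import re
import timeit

# Sample input taken from the question.
sample = "(555) 555 - 5555 ext. 5555"

# The three approaches compared in this answer, each returning a bool.
candidates = {
    "re.search": lambda s: re.search("[a-zA-Z]", s) is not None,
    "any/isalpha": lambda s: any(c.isalpha() for c in s),
    "upper/isupper": lambda s: s.upper().isupper(),
}

def time_candidates(text, number=100_000):
    """Return the average time per call, in nanoseconds, for each candidate."""
    results = {}
    for name, func in candidates.items():
        total_seconds = timeit.timeit(lambda: func(text), number=number)
        results[name] = total_seconds / number * 1e9
    return results

if __name__ == "__main__":
    for name, ns in time_candidates(sample).items():
        print(f"{name:>14}: {ns:8.0f} ns per call")
```

The lambda wrappers add a small constant overhead to every candidate, so the relative ordering is more meaningful than the raw numbers.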

Upvotes: 13

JBernardo

Reputation: 33407

Regex should be a fast approach:

re.search('[a-zA-Z]', the_string)
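
re.search returns a match object or None, so to get the True/False the question asks for, the result can be compared against None. A minimal sketch, where the name has_ascii_letter is hypothetical and the pattern is precompiled on the assumption that the check runs many times:

```python
import re

# Precompile once so repeated checks skip the pattern-cache lookup.
_ASCII_LETTER = re.compile(r"[a-zA-Z]")

def has_ascii_letter(text: str) -> bool:
    """Return True if text contains at least one of a-z or A-Z."""
    return _ASCII_LETTER.search(text) is not None

print(has_ascii_letter("(555).555-5555"))              # False
print(has_ascii_letter("(555) 555 - 5555 ext. 5555"))  # True
```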

Upvotes: 167

Barm

Reputation: 403

I liked the answer provided by @jean-françois-fabre, but it is incomplete.
His approach will work, but only if the text contains purely lower- or uppercase letters:

>>> text = "(555).555-5555 extA. 5555"
>>> text.islower()
False
>>> text.isupper()
False

The better approach is to first upper- or lowercase your string and then check.

>>> string1 = "(555).555-5555 extA. 5555"
>>> string2 = '555 (234) - 123.32   21'

>>> string1.upper().isupper()
True
>>> string2.upper().isupper()
False

Upvotes: 20

Jean-François Fabre

Reputation: 140316

You can use islower() on your string to see if it contains some lowercase letters (amongst other characters). Combine it with isupper() using or to also check whether it contains some uppercase letters:

Below, the string contains letters, so the test yields True:

>>> z = "(555) 555 - 5555 ext. 5555"
>>> z.isupper() or z.islower()
True

Below, the string contains no letters, so the test yields False:

>>> z= "(555).555-5555"
>>> z.isupper() or z.islower()
False
>>> 

Not to be confused with isalpha(), which returns True only if all characters are letters; that isn't what you want.

Note that Barm's answer completes mine nicely, since mine doesn't handle the mixed case well.

Upvotes: 29

Ronald Saunfe

Reputation: 651

You can also do this:

import re
string = '24234ww'
val = re.search('[a-zA-Z]+', string)
if val is not None:  # re.search returns None when nothing matches
    print(val[0].isalpha())  # True: the matched text consists only of letters
    print(val[0])  # prints the first run of letters found, here 'ww'

Note that if val is None, the search did not find a match.

Upvotes: 1

cola

Reputation: 12486

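You can use a regular expression like this:

import re

string = "(555) 555 - 5555 ext. 5555"
# Prints a match object if the string contains a letter, otherwise None
print(re.search('[a-zA-Z]+', string))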

Upvotes: 11

DSM

Reputation: 353604

How about:

>>> string_1 = "(555).555-5555"
>>> string_2 = "(555) 555 - 5555 ext. 5555"
>>> any(c.isalpha() for c in string_1)
False
>>> any(c.isalpha() for c in string_2)
True
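
One thing worth knowing when choosing between the approaches in this thread: they agree on the strings from the question but can disagree on non-ASCII text. A small sketch of the difference, where the function names by_regex, by_isalpha and by_case are hypothetical labels for the three techniques:

```python
import re

def by_regex(s: str) -> bool:
    # Matches ASCII letters only.
    return re.search("[a-zA-Z]", s) is not None

def by_isalpha(s: str) -> bool:
    # Matches any character that Unicode classifies as a letter.
    return any(c.isalpha() for c in s)

def by_case(s: str) -> bool:
    # Needs at least one character that has upper/lower case forms.
    return s.upper().isupper()

samples = [
    "(555).555-5555",  # no letters
    "ext. 5555",       # ASCII letters
    "555 \u00e9",      # accented Latin letter (e with acute)
    "555 \u4e2d",      # CJK character, a letter without case
]
for sample in samples:
    print(repr(sample), by_regex(sample), by_isalpha(sample), by_case(sample))
```

All three agree on the first two samples. For the accented letter only the regex says False, and for the CJK character only isalpha() says True, so the right choice depends on what counts as a letter for your data.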

Upvotes: 116
