user3541631

Reputation: 4008

Python, check how much of a string is in uppercase?

I have a text, and I want to know whether all of it, or at least more than 50% of it, is in uppercase.

DOFLAMINGO WITH TOUCH SCREEN lorem ipsum

I tried using a regex (I found a solution here):

rx = re.compile(r"^([A-Z ':]+$)", re.M)
upp = rx.findall(string)

But this only finds lines that are entirely in caps. It doesn't tell me whether more than 50 percent of the text (which includes the all-caps case) is uppercase.

I want to count only letters (so no digits, spaces, newlines, etc.).

Upvotes: 3

Views: 1764

Answers (7)

Patrick Artner

Reputation: 51623

Generic solution that works with any boolean function and iterable (see below for version that only looks at str.isalpha()):

def percentage(data, boolfunc):
    """Returns how many % of the 'data' returns 'True' for the given boolfunc."""
    return (sum(1 for x in data if boolfunc(x)) / len(data))*100

text = "DOFLAMINGO WITH TOUCH SCREEN lorem ipsum"

print( percentage( text, str.isupper ))
print( percentage( text, str.islower ))
print( percentage( text, str.isdigit ))
print( percentage( text, lambda x: x == " " ))

Output:

62.5  # isupper
25.0  # islower
0.0   # isdigit
12.5  # lambda for spaces

Even more concise is schwobaseggl's version:

return sum(map(boolfunc,data)) / len(data)*100

Like the generator expression above, it does not build an intermediate list: map yields the results lazily and sum adds up each True as 1.
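As a quick check that the two forms agree, here is a minimal sketch. The names percentage_genexp and percentage_map are made up for this comparison and do not come from either answer.

```python
def percentage_genexp(data, boolfunc):
    """Percentage of items for which boolfunc is true, counted with a generator expression."""
    return sum(1 for x in data if boolfunc(x)) / len(data) * 100


def percentage_map(data, boolfunc):
    """Same percentage, summing the True/False results of map directly."""
    return sum(map(boolfunc, data)) / len(data) * 100


text = "DOFLAMINGO WITH TOUCH SCREEN lorem ipsum"
print(percentage_genexp(text, str.isupper))  # 62.5
print(percentage_map(text, str.isupper))     # 62.5
```

The map form assumes boolfunc returns a real bool (or 0/1). The generator form works with any truthy return value, since it only tests the result and adds 1.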


Edit: 2nd version that only uses str.isalpha characters and allows multiple boolfuncs:

def percentage2(data, *boolfuncs):
    """Returns how many % of the letters in 'data' return 'True' for all given boolfuncs.

    Only uses str.isalpha() characters and ignores all others."""
    letters = [c for c in data if c.isalpha()]  # keep letters only
    if not letters:
        return 0.0  # no letters at all: avoid dividing by zero
    return sum(1 for x in letters if all(f(x) for f in boolfuncs)) / len(letters) * 100

text = "DOFLAMINGO WITH TOUCH SCREEN lorem ipsum"

print( percentage2( text, str.isupper, str.isalpha ))
print( percentage2( text, str.islower, str.isalpha ))
 

Output:

71.42857142857143
28.57142857142857

Upvotes: 2

jpp

Reputation: 164613

Regex seems overkill here. You can use sum with a generator expression:

x = 'DOFLAMINGO WITH TOUCH SCREEN lorem ipsum'

x_chars = ''.join(x.split())  # remove all whitespace
x_upper = sum(i.isupper() for i in x_chars) > (len(x_chars) / 2)

Or functionally via map:

x_upper = sum(map(str.isupper, x_chars)) > (len(x_chars) / 2)

Alternatively, via statistics.mean:

from statistics import mean

x_upper = mean(i.isupper() for i in x if not i.isspace()) > 0.5
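
The question asks to count letters only, while the snippets above strip just whitespace, so digits and punctuation still count toward the total. A possible letters-only variant, as a sketch: mostly_upper and its threshold parameter are hypothetical names, and returning False for text without any letters is an assumption (statistics.mean raises StatisticsError on empty input).

```python
from statistics import mean


def mostly_upper(text, threshold=0.5):
    """Return True if more than `threshold` of the letters in `text` are uppercase.

    Hypothetical helper: non-letters are ignored, and text without any
    letters is treated as "not mostly uppercase" (an assumption).
    """
    flags = [c.isupper() for c in text if c.isalpha()]
    if not flags:
        return False  # mean() would raise StatisticsError on empty data
    return mean(flags) > threshold


print(mostly_upper("DOFLAMINGO WITH TOUCH SCREEN lorem ipsum"))  # True
print(mostly_upper("1234 5678 Ab"))  # False: one of two letters is exactly 50%
print(mostly_upper("1234 !!!"))      # False: no letters at all
```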

Upvotes: 5

Pablo Paglilla

Reputation: 366

Using regular expressions, this is one way you can do it (given that s is the string in question):

import re

upper = re.findall(r'[A-Z]', s)  # all uppercase ASCII letters
lower = re.findall(r'[a-z]', s)  # all lowercase ASCII letters
percentage = ( len(upper) / (len(upper) + len(lower)) ) * 100

It finds the lists of uppercase and lowercase characters and computes the percentage from their lengths.
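
One thing to keep in mind: the classes [A-Z] and [a-z] match ASCII letters only, so accented or non-Latin letters are left out of both counts, whereas str.isupper and str.islower follow Unicode case. The sketch below illustrates the difference; the sample string is made up and is written with escapes so that the accented letters are single code points.

```python
import re

# "ÉCOLE élève", with the accented letters as precomposed code points.
s = "\u00c9COLE \u00e9l\u00e8ve"

# ASCII-only character classes skip the accented letters.
upper_ascii = re.findall(r"[A-Z]", s)
lower_ascii = re.findall(r"[a-z]", s)
print(len(upper_ascii), len(lower_ascii))  # 4 3

# The str methods are Unicode-aware and count them too.
upper_unicode = [c for c in s if c.isupper()]
lower_unicode = [c for c in s if c.islower()]
print(len(upper_unicode), len(lower_unicode))  # 5 5
```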

Upvotes: 1

user2390182

Reputation: 73450

You can use filter and str.isalpha to clean out non-alphabetic chars and str.isupper to count uppercase chars and calculate the ratio:

s = 'DOFLAMINGO WITH TOUCH SCREEN lorem ipsum'

alph = list(filter(str.isalpha, s))  # ['D', ..., 'O', 'W', ..., 'N', 'l', 'o', ...]
sum(map(str.isupper, alph)) / len(alph)
# 0.7142857142857143

Also see the docs on sum and map, which you might find yourself using regularly. Note that this relies on bool being a subclass of int, so each True counts as 1 in the summation, which might be too implicit for some tastes.
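
The bool-as-int behaviour can be seen directly. A small illustration, with an explicit count shown alongside for those who find the implicit version unclear (the names implicit and explicit are just for this sketch):

```python
# bool is a subclass of int, so True behaves as 1 and False as 0.
print(issubclass(bool, int))     # True
print(True + True)               # 2
print(sum([True, False, True]))  # 2

s = 'DOFLAMINGO WITH TOUCH SCREEN lorem ipsum'
alph = [c for c in s if c.isalpha()]

implicit = sum(map(str.isupper, alph))          # adds up True/False values
explicit = sum(1 for c in alph if c.isupper())  # adds an explicit 1 per match
print(implicit, explicit, len(alph))            # 25 25 35
```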

Upvotes: 7

Manrique

Reputation: 2221

Try this, it's short and does the job:

text = "DOFLAMINGO WITH TOUCH SCREEN lorem ipsum"
print("Percent in Capital Letters:", sum(1 for c in text if c.isupper())/len(text)*100)
# Percent in Capital Letters: 62.5

Upvotes: 0

Est

Reputation: 420

Something like the following should work.

import re

string = 'DOFLAMINGO WITH TOUCH SCREEN lorem ipsum'
rx = re.sub('[^A-Z]', '', string)
print(len(rx)/len(string))

Upvotes: 0

gold_cy

Reputation: 14216

Here is one way to do it:

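f = 'DOFLAMINGO WITH TOUCH SCREEN lorem ipsum'  # sample string from the question
ratio = sum(map(lambda c: c.isupper(), f)) / len(f)  # fraction of all characters that are uppercase
ratio > .50  # True if more than half are uppercase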

Upvotes: 0
