Reginald

Reputation: 439

A value in a list, python

Every letter in the English language has a frequency of occurrence. These are the frequencies, expressed as fractions:

A       B       C       D       E       F       G       H       I
.0817   .0149   .0278   .0425   .1270   .0223   .0202   .0609   .0697
J       K       L       M       N       O       P       Q       R
.0015   .0077   .0402   .0241   .0675   .0751   .0193   .0009   .0599
S       T       U       V       W       X       Y       Z   
.0633   .0906   .0276   .0098   .0236   .0015   .0197   .0007

A list called letterGoodness is predefined as:

letterGoodness = [.0817,.0149,.0278,.0425,.1270,.0223,.0202,...

I need to find the "goodness" of a string. For example, the goodness of 'I EAT' is .0697 + .1270 + .0817 + .0906 = .369. This is part of a bigger problem, but I need to solve this piece first. I started like this:

def goodness(message):
   for i in L:
     for j in i:

So it will be enough to find out how to get the occurrence percentage of any character. Can you help me? The string contains only uppercase letters and spaces.

Upvotes: 4

Views: 306

Answers (2)

chucksmash

Reputation: 6017

You would be better off using a dictionary data structure.

EDIT: This is not my original code but instead the code updated along the lines DSM suggested.

import string

num_vals = [.0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697 , .0015, .0077,
            .0402, .0241, .0675, .0751, .0193, .0009, .0599, .0633, .0906, .0276, .0098,
            .0236, .0015, .0197, .0007]

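# Pair each uppercase letter with its frequency (dict comprehension, Python 2.7+).
letterGoodness = {letter: value for letter, value in zip(string.ascii_uppercase, num_vals)}

def goodness(message):
    string_goodness = 0
    for letter in message:
        letter = letter.upper()
        # Spaces and any other characters missing from the dictionary are skipped.
        if letter in letterGoodness:
            string_goodness += letterGoodness[letter]
    return string_goodness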

print goodness("I eat")

Using the test case you provided:

print goodness("I eat")

yields the output:

0.369

One thing to note: the dictionary comprehension used here requires Python 2.7+. The same thing can be accomplished in Python 2.6+ with the dict() constructor.
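
As a rough sketch of that alternative: dict() accepts an iterable of (key, value) pairs, so zip can feed it directly. The three-letter sample and the names sample_letters, sample_vals and sample_goodness are only illustrative; the real table would use all 26 values.

```python
# Illustrative sample only: the first three letters and their frequencies.
sample_letters = "ABC"
sample_vals = [.0817, .0149, .0278]

# dict() accepts an iterable of (key, value) pairs, so no comprehension is needed.
sample_goodness = dict(zip(sample_letters, sample_vals))

print(sample_goodness["B"])
```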

Upvotes: 2

mgilson

Reputation: 310227

letterGoodness is better as a dictionary; then you can just do:

# .upper() is there for defensive programming
sum(letterGoodness.get(c,0) for c in yourstring.upper())

To convert letterGoodness from your list to a dictionary, you can do:

import string
letterGoodness = dict(zip(string.ascii_uppercase,letterGoodness))

If you're guaranteed to only have uppercase letters and spaces, you can do:

letterGoodness = dict(zip(string.ascii_uppercase,letterGoodness))
letterGoodness[' '] = 0
sum(letterGoodness[c] for c in yourstring)

but the performance gains here are probably pretty minimal so I would favor the more robust version above.
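
Put together, the robust dictionary version might look like the sketch below. The values are copied from the table in the question; the names letter_values and LETTER_GOODNESS are my own choices here, not anything predefined for you.

```python
import string

# Frequencies for A..Z, copied from the table in the question.
letter_values = [.0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697,
                 .0015, .0077, .0402, .0241, .0675, .0751, .0193, .0009, .0599,
                 .0633, .0906, .0276, .0098, .0236, .0015, .0197, .0007]

# Map each uppercase letter to its frequency.
LETTER_GOODNESS = dict(zip(string.ascii_uppercase, letter_values))

def goodness(message):
    """Sum the frequency of every letter; any other character counts as 0."""
    return sum(LETTER_GOODNESS.get(c, 0) for c in message.upper())

# Rounded for display, since the float sum may carry tiny rounding noise.
print(round(goodness('I EAT'), 4))
```

Because .get(c, 0) falls back to 0, spaces, digits and punctuation are simply ignored instead of raising a KeyError.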


If you insist on keeping letterGoodness as a list (and I don't advise that), you can use the builtin ord to get the index (pointed out by cwallenpoole):

ordA = ord('A')
sum(letterGoodness[ord(c)-ordA] for c in yourstring if c in string.ascii_uppercase)
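
To make the index arithmetic explicit, here is a sketch of the list-based version as a full function. It uses a range check in place of the membership test above, which should be equivalent for A to Z; letter_values, ORD_A and goodness_from_list are illustrative names. The filter matters: without it a space would give index -33 and raise an IndexError, and a digit would give a small negative index that silently reads from the end of the list.

```python
# Frequencies for A..Z, copied from the table in the question.
letter_values = [.0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697,
                 .0015, .0077, .0402, .0241, .0675, .0751, .0193, .0009, .0599,
                 .0633, .0906, .0276, .0098, .0236, .0015, .0197, .0007]

ORD_A = ord('A')

def goodness_from_list(message):
    """Sum letter frequencies by turning each letter into a list index."""
    total = 0.0
    for c in message.upper():
        index = ord(c) - ORD_A  # 'A' -> 0, 'B' -> 1, ..., 'Z' -> 25
        # Skip anything outside A..Z so we never index with a bad offset.
        if 0 <= index < len(letter_values):
            total += letter_values[index]
    return total

print(round(goodness_from_list('I EAT'), 4))
```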

I'm too lazy to timeit right now, but you may also want to define a temporary set to hold string.ascii_uppercase. It might make your function run a little faster (depending on how optimized str.__contains__ is compared to set.__contains__):

ordA = ord('A')
big_letters = set(string.ascii_uppercase)
sum(letterGoodness[ord(c)-ordA] for c in yourstring.upper() if c in big_letters)
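
If you want to check that yourself, a timing sketch along these lines should do. The input string and repeat count are arbitrary assumptions, the function names are made up for the example, and the numbers will vary by machine and Python version, so treat the result as a hint rather than a verdict.

```python
import string
import timeit

text = 'I EAT ' * 1000  # hypothetical test input
big_letters = set(string.ascii_uppercase)

def count_with_str():
    # Membership test against the 26-character string.
    return sum(1 for c in text if c in string.ascii_uppercase)

def count_with_set():
    # Membership test against a prebuilt set of the same letters.
    return sum(1 for c in text if c in big_letters)

str_time = timeit.timeit(count_with_str, number=100)
set_time = timeit.timeit(count_with_set, number=100)
print('str membership: %.4f s' % str_time)
print('set membership: %.4f s' % set_time)
```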

Upvotes: 11
