Reputation: 961
I want to create binary values for words based on their content of vowels and consonants, where vowels receive a value of '0' and consonants get a value of '1'.
For example, 'haha' would be represented as 1010 and 'hahaha' as 101010.
common_words = ['haha', 'hahaha', 'aardvark', etc...]
dictify = {}
binary_value = []
#doesn't work
for word in common_words:
    for x in word:
        if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
            binary_value.append(0)
            dictify[word]=binary_value
        else:
            binary_value.append(1)
            dictify[word]=binary_value
With this I am getting too many binary digits in the resulting dictionary:
>>>dictify
{'aardvark': [0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,...}
desired output:
>>>dictify
{'haha': 1010,'hahaha': 101010, 'aardvark': 00111011}
I am thinking of a solution that doesn't involve a loop within a loop...
Upvotes: 0
Views: 375
Reputation: 77089
This seems like a job for translation tables. Assuming your input strings are all ASCII (which seems likely; otherwise the definition of a vowel gets fuzzy), you can define a translation table this way*:
# For simplicity's sake, I'm only using lowercase letters
from string import lowercase, maketrans
tt = maketrans(lowercase, '01110111011111011111011111')
With the above table, the problem becomes trivial:
>>> 'haha'.translate(tt)
'1010'
>>> 'hahaha'.translate(tt)
'101010'
>>> 'aardvark'.translate(tt)
'00111011'
Given this solution, you can build dictify very simply with a comprehension:
dictify = {word:word.translate(tt) for word in common_words} #python2.7
dictify = dict((word, word.translate(tt)) for word in common_words) # python 2.6 and earlier
*This can also be done with Python 3. One option is to use bytes instead of strings (str.maketrans also works directly on strings there):
from string import ascii_lowercase
tt = b''.maketrans(bytes(ascii_lowercase, 'ascii'), b'01110111011111011111011111')
b'haha'.translate(tt)
...
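For comparison, here is a minimal Python 3 sketch that stays with ordinary strings by using str.maketrans. It assumes lowercase ASCII input and reuses the three sample words from the question; any character outside the table (uppercase letters, digits, punctuation) would pass through unchanged.

```python
from string import ascii_lowercase

# Same 26-character table as above: '0' for a, e, i, o, u and '1' otherwise.
tt = str.maketrans(ascii_lowercase, '01110111011111011111011111')

# Sample words taken from the question.
common_words = ['haha', 'hahaha', 'aardvark']

# One translate() call per word, no explicit inner loop.
dictify = {word: word.translate(tt) for word in common_words}
print(dictify)
```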
Upvotes: 1
Reputation: 18008
user2357112 explains your code. Here is just another way:
>>> common_words = ['haha', 'hahaha', 'aardvark']
>>> def binfy(w):
...     return "".join('0' if c in 'aeiouAEIOU' else '1' for c in w)
...
>>> dictify = {w:binfy(w) for w in common_words}
>>> dictify
{'aardvark': '00111011', 'haha': '1010', 'hahaha': '101010'}
Upvotes: 2
Reputation: 280301
The code you've posted doesn't work because all words share the same binary_value list. (It also doesn't work because number_value and each are never defined, but we'll pretend those variables said binary_value and word instead.) Define a new list for each word:
for word in common_words:
    binary_value = []
    for x in word:
        if x=='a' or x=='e' or x=='i' or x=='o' or x=='u':
            binary_value.append(0)
            dictify[word]=binary_value
        else:
            binary_value.append(1)
            dictify[word]=binary_value
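To see the shared-list problem in isolation, here is a minimal sketch. The names shared, aliased, fresh, and bits and the two sample words are made up for illustration; they are not from the question.

```python
words = ['ha', 'ah']  # hypothetical sample input

# One list created once, outside the loop: every key ends up
# pointing at the same list object, which keeps growing.
shared = []
aliased = {}
for word in words:
    for ch in word:
        shared.append(0 if ch in 'aeiou' else 1)
    aliased[word] = shared

print(aliased['ha'] is aliased['ah'])  # True
print(aliased['ha'])                   # [1, 0, 0, 1]

# A new list per word: each key gets its own digits.
fresh = {}
for word in words:
    bits = []
    for ch in word:
        bits.append(0 if ch in 'aeiou' else 1)
    fresh[word] = bits

print(fresh)  # {'ha': [1, 0], 'ah': [0, 1]}
```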
If you want the output to look like 00111011 rather than a list, you'll need to make a string. (You could make an int, but then it would look like 59 instead of 00111011. Python doesn't distinguish "this int is base 2" or "this int has 2 leading zeros".)
for word in common_words:
    binary_value = []
    for x in word:
        if x.lower() in 'aeiou':
            binary_value.append('0')
        else:
            binary_value.append('1')
    dictify[word] = ''.join(binary_value)
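The int-versus-string point can be checked with a short sketch. The variable names here are illustrative only; the idea is that an int carries no record of leading zeros, so the original width has to be supplied again when formatting it back.

```python
bits = '00111011'       # the string form for 'aardvark'
as_int = int(bits, 2)   # parse the digits as a base-2 number

print(as_int)                 # 59
print(format(as_int, 'b'))    # 111011 (leading zeros are gone)

# Padding back to the original width needs the length from the string.
padded = format(as_int, '0{}b'.format(len(bits)))
print(padded)                 # 00111011
```

If the digits are only ever displayed or compared, keeping them as a string avoids this round trip.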
Upvotes: 2