Reputation: 91
How could I create a Python function that takes a word and returns an array like
[0, 0, ... 0]
(one zero per letter of the alphabet), with a 1 in the spot for each letter the word contains?
So if a word has an 'A' or 'a' and the first spot in the array corresponds to 'a', then the output array would have a 1 in its first spot:
[1, ...]
If a word has a 'B' or 'b', then the output array would have a 1 in its second spot. If a word has an 'a' and a 'b', then the output array would have a 1 in the first and second spots:
[1, 1, ...]
And so on. So the string "abba" would result in something like this:
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Ideally, I would also be able to search for characters not in the alphabet, like ! and ?, and just add other bits to the array to represent those characters.
Any help would be welcome! Thanks a ton.
Upvotes: 2
Views: 124
Reputation: 8273
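Why not create a simple mapping dictionary?
import string

# Map each lowercase letter to its position: 'a' -> 0, ..., 'z' -> 25
alphabet = string.ascii_lowercase
d = dict(zip(alphabet, range(26)))
# Start with one zero per letter
a = [0] * 26
The dictionary will look like this: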
{'a': 0,
'b': 1,
'c': 2,
'd': 3,
'e': 4,
'f': 5,
'g': 6,
'h': 7,
'i': 8,
'j': 9,
'k': 10,
'l': 11,
'm': 12,
'n': 13,
'o': 14,
'p': 15,
'q': 16,
'r': 17,
's': 18,
't': 19,
'u': 20,
'v': 21,
'w': 22,
'x': 23,
'y': 24,
'z': 25}
Logic for lookup and updating the list:
for i in set('aabbc?'):
    # Characters missing from the dictionary (like '?') return None and are skipped
    index_to_update = d.get(i, None)
    if index_to_update is not None:
        a[index_to_update] = 1
print(a)  # [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
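The dictionary lookup above could be wrapped into a function roughly as follows. This is only a sketch: the name letters_to_bits and the extra parameter are made up for illustration and are not part of the answer. It lowercases the word first so that 'A' and 'a' land in the same spot, and appends one slot per character in extra for symbols such as ! and ?.

```python
import string

def letters_to_bits(word, extra=""):
    """Return a 0/1 list: one slot per lowercase letter, then one per character in `extra`.

    `letters_to_bits` and `extra` are hypothetical names used for this sketch.
    """
    symbols = string.ascii_lowercase + extra
    # Map each tracked character to its slot, e.g. 'a' -> 0, 'b' -> 1
    index_of = {ch: i for i, ch in enumerate(symbols)}
    bits = [0] * len(symbols)
    # Lowercase so 'A' and 'a' share a slot; set() drops repeated characters
    for ch in set(word.lower()):
        idx = index_of.get(ch)
        if idx is not None:  # characters we do not track are ignored
            bits[idx] = 1
    return bits

print(letters_to_bits("Abba"))
print(letters_to_bits("ab!", extra="!?"))
```

With extra="!?" the result has 28 slots, and the last two stand for ! and ? in that order.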
Upvotes: 2
Reputation: 11282
A very simple way to create such a list is:
def string_to_bit_array(text):
    # We don't care if upper or lower case
    text = text.lower()
    # Remove duplicate alphabet characters
    text = set(text)
    # Define alphabet characters
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    # Create list with zeros
    matches = [0] * len(alphabet)
    # Loop over every character of the text
    for character in text:
        # Skip this character if not in alphabet
        if character not in alphabet:
            continue
        # Find index of character in alphabet
        index = alphabet.find(character)
        # Set match index to one instead of zero
        matches[index] = 1
    # Return result
    return matches

print(string_to_bit_array("abba"))
This prints:
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
You can just add further characters to alphabet if you need them:
alphabet = "abcdefghijklmnopqrstuvwxyz!?"
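For comparison, the same idea can be written more compactly by walking the alphabet instead of the text. This is a sketch rather than a replacement for the function above: the name string_to_bit_array_compact and the default alphabet argument are assumptions made for the example.

```python
def string_to_bit_array_compact(text, alphabet="abcdefghijklmnopqrstuvwxyz!?"):
    # Hypothetical helper name; `alphabet` lists the tracked characters in slot order
    present = set(text.lower())
    # One slot per alphabet character: 1 if it occurs in the text, else 0
    return [1 if character in present else 0 for character in alphabet]

print(string_to_bit_array_compact("abba?"))
```

Because the loop runs over alphabet, characters in the text that are not tracked are simply never looked at, so no explicit skip is needed.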
Upvotes: 2