Reputation: 43
I have a string as follows:
theatre = 'Regal Crown Center Stadium 14'
I would like to break this into an acronym based on the first letter in each word but also include both numbers:
desired output = 'RCCS14'
My code attempts below:
acronym = "".join(word[0] for word in theatre.lower().split())
acronym = "".join(word[0].lower() for word in re.findall("(\w+)", theatre))
acronym = "".join(word[0].lower() for word in re.findall("(\w+ | \d{1,2})", theatre))
acronym = re.search(r"\b(\w+ | \d{1,2})", theatre)
In which I wind up with something like: rccs1
but can't seem to capture that last number. There could be instances when the number is in the middle of the name as well: 'Regal Crown Center 14 Stadium'
as well. TIA!
Upvotes: 3
Views: 2579
Reputation: 22837
(?:(?<=\s)|^)(?:[a-z]|\d+)
(?:(?<=\s)|^)
Ensure what precedes is either a space or the start of the line(?:[a-z]|\d+)
Match either a single letter or one or more digitsThe i
flag (re.I
in python) allows [a-z]
to match its uppercase variants.
import re
r = re.compile(r"(?:(?<=\s)|^)(?:[a-z]|\d+)", re.I)
s = 'Regal Crown Center Stadium 14'
print(''.join(r.findall(s)))
The code above finds all instances where the regex matches and joins the list items into a single string.
Result: RCCS14
Upvotes: 2
Reputation: 436
import re
theatre = 'Regal Crown Center Stadium 14'
r = re.findall("\s(\d+|\S)", ' '+theatre)
print(''.join(r))
Gives me RCCS14
Upvotes: 0
Reputation: 1701
I can't comment since I don't have enough reputation, but S. Jovan answer isn't satisfying since it assumes that each word starts with a capital letter and that each word has one and only one capital letter.
re.sub(r'[a-z ]+', '', "Regal Crown Center Stadium YB FIEUBFB DBUUFG FUEH 14")
will returns 'RCCSYBFIEUBFBDBUUFGFUEH14'
However ctwheels answers will be able to work in this case :
r = re.compile(r"\b(?:[a-z]|\d+)", re.I)
s = 'Regal Crown Center Stadium YB FIEUBFB DBUUFG FUEH 14'
print(''.join(r.findall(s)))
will print
RCCSYFDF14
Upvotes: 0