Gabriel Hisatugu

Reputation: 61

Python regex: adding a char before any word followed by the special char ':'

I'm trying to find the correct regex lookaround for this type of string:

cat: monkey, ab4 / 1997 / little: cat, 1954/ afgt22 /dog: monkey, 173 / pine-apple: duer, 129378s. / 12

What the regex should do:

Insert the character '|' before any word that is followed by ':', where a word consists only of letters, with no digits.

The issue:

I can't find a way to handle words at the beginning of the string, words containing '-', or words preceded by a special character such as '/' rather than a space, as in this example:

https://regex101.com/r/gX7wY0/5

As you can see, only one of them has worked so far, and even there the '|' is followed by a space before the word and its ':'.

What I'm trying to do is:

|cat: monkey, ab4 / 1997 / |little: cat, 1954/ afgt22 /|dog: monkey, 173 / |pine-apple: duer, 129378s. / 12

So far, I've only managed to get the special character '-' treated as part of a word before ':'.

Thanks in advance, I'm still learning how to use regex with Python. Any tips are welcome!

Upvotes: 0

Views: 57

Answers (1)

James

Reputation: 36691

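You can use r'\b' to match word boundaries. A word boundary also occurs at the start of the string and right after a character such as '/', so the word does not need a space before it. In your case you are looking for:

  • substrings that match: [A-Za-z\-]+
  • and are surrounded by word boundaries: \b[A-Za-z\-]+\b
  • and are followed by a colon: \b[A-Za-z\-]+\b:
  • You can capture the word using parentheses: \b([A-Za-z\-]+)\b: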
  • and recover it in the substitution using \1

import re

s = 'cat: monkey, ab4 / 1997 / little: cat, 1954/ afgt22 /dog: monkey, 173 / pine-apple: duer, 129378s. / 12'

# Capture each letters-and-hyphens word that is followed by ':' and put '|' in front of it.
re.sub(r'(\b[A-Za-z\-]+\b):', r'|\1:', s)
# returns:
'|cat: monkey, ab4 / 1997 / |little: cat, 1954/ afgt22 /|dog: monkey, 173 / |pine-apple: duer, 129378s. / 12'
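Since the question asks about lookarounds, here is a minimal sketch of a lookaround-only variant. It assumes a hyphen should only count when it sits between letters, which is my reading of the question and not something it states. The names LABEL_START and mark_labels are hypothetical, chosen just for this example. The pattern matches an empty position, so nothing has to be captured and re-inserted.

```python
import re

# Hypothetical names, used only for this sketch.
# (?<![\w-])  : the position is not preceded by a word character or a hyphen
# (?=...:)    : what follows is letters, optionally joined by single hyphens, then ':'
LABEL_START = re.compile(r'(?<![\w-])(?=[A-Za-z]+(?:-[A-Za-z]+)*:)')

def mark_labels(text):
    """Insert '|' before each letters-only (optionally hyphenated) word followed by ':'."""
    return LABEL_START.sub('|', text)

s = 'cat: monkey, ab4 / 1997 / little: cat, 1954/ afgt22 /dog: monkey, 173 / pine-apple: duer, 129378s. / 12'
print(mark_labels(s))
```

On the sample string this should give the same result as the \b version. The two differ on edge cases: for '-foo:' the \b pattern puts the '|' after the leading hyphen, while this sketch leaves the string unchanged.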

Upvotes: 1
