Me All

Reputation: 269

Separating a string of words into characters following certain conditions

I have a string of words and I want to separate them into individual characters. However, if a group of characters is part of what I've called "special consonant pairs", they need to remain together.

These are some of my "special consonant pairs":

consonant_pairs = ["ng", "ld", "dr", "bl", "nd", "th" ...]

This is one of the sample strings I want to separate into characters:

sentence_1 = "We were drinking beer outside and we could hear the wind blowing"

And this would be my desired output (I have already deleted spaces and punctuation):

sentence_1_char = ['w', 'e', 'w', 'e', 'r', 'e', 'dr', 'i', 'n', 'k', 'i', 'ng', 'b', 'e', 'e', 'r', 'o', 'u', 't', 's', 'i', 'd', 'e', 'a', 'n', 'd', 'w', 'e', 'c', 'o', 'u', 'ld', 'h', 'e', 'a', 'r', 'th', 'e', 'w', 'i', 'nd', 'bl', 'o', 'w', 'i', 'ng']

I thought of using list(), but I don't know how to handle the consonant pairs. Could anyone help me?

Upvotes: 1

Views: 90

Answers (1)

Mateen Ulhaq

Reputation: 27281

A quick (not necessarily performant) answer:

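import re
# The capturing group makes re.split keep the matched pairs in the result;
# the text between pairs is returned as multi-character chunks.
charred = re.split('(' + '|'.join(consonant_pairs) + ')', sentence_1)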
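To see what the split produces, here is a small sketch on a fragment of the sample sentence. It assumes only the six pairs spelled out in the question (the original list is longer):

```python
import re

# Only the pairs shown in the question; the real list continues past "th".
consonant_pairs = ["ng", "ld", "dr", "bl", "nd", "th"]

# The capturing group keeps each matched pair as its own list item.
chunks = re.split("(" + "|".join(consonant_pairs) + ")", "We were drinking")
print(chunks)  # ['We were ', 'dr', 'inki', 'ng', '']
```

The pairs are isolated, but everything between them stays glued together, and a trailing empty string appears when the input ends with a pair.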

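EDIT: To get the output the OP expects (one list item per pair or single character):

import re
# Assumes sentence_1 is already lowercased, with spaces and punctuation removed.
# The trailing '.' alternative matches any single character that does not start a pair.
matches = re.finditer('|'.join(consonant_pairs) + '|.', sentence_1)
charred = [m.group() for m in matches]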
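A fuller sketch, putting the cleanup step and the regex together. The helper name `split_keeping_pairs` is made up for illustration, the pair list is only the six entries visible in the question, and the cleanup assumes plain ASCII letters:

```python
import re

# Only the pairs shown in the question; the real list continues past "th".
consonant_pairs = ["ng", "ld", "dr", "bl", "nd", "th"]

def split_keeping_pairs(sentence, pairs):
    """Split into single characters, keeping the listed pairs together."""
    # Lowercase, then drop anything that is not an ASCII letter
    # (spaces, punctuation, digits).
    cleaned = re.sub(r"[^a-z]", "", sentence.lower())
    # Escape each pair in case it contains regex metacharacters, and try
    # longer entries first so a longer cluster is not shadowed by a shorter one.
    escaped = [re.escape(p) for p in sorted(pairs, key=len, reverse=True)]
    # "." is the fallback: any single character that does not start a pair.
    pattern = re.compile("|".join(escaped + ["."]))
    # No capturing groups, so findall returns the full text of each match.
    return pattern.findall(cleaned)

sentence_1 = "We were drinking beer outside and we could hear the wind blowing"
sentence_1_char = split_keeping_pairs(sentence_1, consonant_pairs)
print(sentence_1_char)
```

Applied literally, the rule also keeps the "nd" in "and" together, so that word comes out as 'a', 'nd' rather than the 'a', 'n', 'd' shown in the question. Because spaces are removed before matching, a pair could also form across a word boundary; splitting word by word first would avoid that if it matters.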

Upvotes: 4
