Mert I.

Reputation: 23

Splitting a string at the points where the alphabet changes

I'm trying to create a list in which each item contains text from only one alphabet, such as the Latin alphabet or Hangul. One of the alphabets will always be Latin; the other may vary. I also don't want blank items in my list caused by the spaces between words.

I want to split the string at the points where the alphabet changes.

To give an example, my string is:

"형 older brother 누나 older sister 언니 older sister 오빠 older brother"

I want to create the list:

["형", "older brother", "누나", "older sister", "언니", "older sister", "오빠", "older brother"]

Can someone help?

Upvotes: 2

Views: 80

Answers (1)

Rakesh

Reputation: 82765

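Using regex: split on runs of characters that are neither Latin letters nor whitespace, and keep those runs with a capture group.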

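import re

s = "형 older brother 누나 older sister 언니 older sister 오빠 older brother"
# Split on runs of non-Latin, non-space characters; the capture group keeps them in the result.
# flags must be passed by keyword: the third positional argument of re.split is maxsplit.
parts = re.split(r"([^a-z\s]+)", s, flags=re.IGNORECASE)
# Strip the spaces left around the Latin parts and drop empty items.
print([p.strip() for p in parts if p.strip()])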
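
One caveat: the character class [^a-z\s] also treats digits and punctuation as non-Latin, so text such as "older brother's" would be split at the apostrophe. If that matters, a possible alternative is to classify whole words and group consecutive words of the same kind. The following is only a sketch, not part of the original answer; the helper names is_latin_word and split_by_script are hypothetical, and it assumes Python 3.7+ (for str.isascii) and that a word counts as Latin when all of its letters are ASCII.

```python
from itertools import groupby

def is_latin_word(word):
    # Assumption: a word is "Latin" if every letter in it is an ASCII letter.
    # Words with no letters at all (e.g. "123") also count as Latin here.
    return all(ch.isascii() for ch in word if ch.isalpha())

def split_by_script(text):
    # text.split() discards whitespace, so no blank items are produced.
    # groupby merges consecutive words that share the same classification.
    return [" ".join(group) for _, group in groupby(text.split(), key=is_latin_word)]

s = "형 older brother 누나 older sister 언니 older sister 오빠 older brother"
print(split_by_script(s))
# ['형', 'older brother', '누나', 'older sister', '언니', 'older sister', '오빠', 'older brother']
```

Because the words are only sorted into Latin and non-Latin, two different non-Latin alphabets that appear next to each other would end up in the same item. This fits the question's premise that one of the two alphabets is always Latin.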

Upvotes: 2
