Reputation: 23
I'm trying to build a list in which each item contains characters from only one alphabet, such as the Latin alphabet or Hangul. One of the alphabets will always be Latin; the other may vary. I also don't want blank items in the list caused by the spaces between words.
I want to split the string at the points where the alphabet changes.
To give an example, my string is:
"형 older brother 누나 older sister 언니 older sister 오빠 older brother"
I want to create the list:
["형", "older brother", "누나", "older sister", "언니", "older sister", "오빠", "older brother"]
Can someone help?
Upvotes: 2
Views: 80
Reputation: 82765
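You can do this with re.split. Split on runs of characters that are neither Latin letters nor whitespace, and put the pattern in a capturing group so those runs are kept in the result. Pass re.IGNORECASE as flags=... (the third positional argument of re.split is maxsplit, not flags), then strip the leftover spaces and drop the empty strings.
import re
s = "형 older brother 누나 older sister 언니 older sister 오빠 older brother"
# The capturing group keeps each non-Latin run as its own item in the split result.
parts = re.split(r"([^a-z\s]+)", s, flags=re.IGNORECASE)
# Strip the spaces left around the Latin chunks and drop empty strings.
print([p.strip() for p in parts if p.strip()])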
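If an item in the other alphabet can itself be more than one word, a split on single non-Latin runs breaks it apart at the space. A possible variation, sketched here with re.findall, matches whole runs of same-script words instead. The helper name split_by_script and the pattern names are made up for illustration, and the sketch assumes "Latin" means only the ASCII letters A-Z and a-z.

```python
import re

# Assumption: "Latin" means the ASCII letters A-Z and a-z only.
LATIN_RUN = r"[A-Za-z]+(?:\s+[A-Za-z]+)*"
# Anything that is neither an ASCII letter nor whitespace counts as "other".
OTHER_RUN = r"[^A-Za-z\s]+(?:\s+[^A-Za-z\s]+)*"
SCRIPT_RUNS = re.compile(f"{LATIN_RUN}|{OTHER_RUN}")


def split_by_script(text):
    """Return maximal runs of words that share a script (hypothetical helper)."""
    return SCRIPT_RUNS.findall(text)


s = "형 older brother 누나 older sister 언니 older sister 오빠 older brother"
print(split_by_script(s))
# Multi-word items in the other alphabet stay together:
print(split_by_script("큰 형 big brother"))
```

Because findall only returns what the patterns match, no empty strings or stray spaces need to be filtered out afterwards. Note that digits, punctuation, and accented letters all fall into the "other" group under this assumption, so the character classes would need adjusting for input that contains them.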
Upvotes: 2