Split a string that includes Korean characters

Question

I have a string containing Korean characters:

s = '굿모닝, today is 촉촉'

I want to split it as:

t = ['굿모닝', 'today', 'is', '촉촉']

Note that all the Korean characters are put together instead of separated, that is, it is '굿모닝', not '굿', '모', '닝'.

Savir · Accepted Answer

I don't think Korean has any relevance here. The only issue I can think of is that pesky comma right after the first 3 characters, which prevents you from using a plain s.split(), since that would leave the comma attached to '굿모닝'. But regular expressions are mighty!

import re
s = '굿모닝, Today is 촉촉'
re.split(r',?\s', s)  # raw string so \s reaches the regex engine as-is

Outputs ['굿모닝', 'Today', 'is', '촉촉']

Just split your string on an optional comma (,?) followed by a mandatory whitespace character (\s).
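
For comparison, here is a small sketch of what a plain split does to the same input, plus one alternative that is not part of the answer above: collecting runs of word characters with re.findall instead of splitting on separators. It assumes Python 3, where \w is Unicode-aware for str patterns, so Hangul syllables count as word characters. The variable names plain and words are just illustrative.

```python
import re

s = '굿모닝, today is 촉촉'

# Plain whitespace splitting keeps the comma glued to the first token.
plain = s.split()

# Alternative (an assumption, not the answer's method): grab runs of word
# characters. In Python 3, \w matches Unicode letters, including Hangul.
words = re.findall(r'\w+', s)

print(plain)  # ['굿모닝,', 'today', 'is', '촉촉']
print(words)  # ['굿모닝', 'today', 'is', '촉촉']
```

The findall variant also tolerates repeated spaces or a comma with no space after it, at the cost of dropping every punctuation character, so it only fits if the punctuation is not needed.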
