Chan

Reputation: 4301

Split a string that includes Korean characters

I have a string containing Korean characters:

s = '굿모닝, today is 촉촉'

I want to split it as:

t = ['굿모닝', 'today', 'is', '촉촉']

Note that all the Korean characters are put together instead of separated, that is, it is '굿모닝', not '굿', '모', '닝'.


Upvotes: 2

Views: 777

Answers (1)

Savir

Reputation: 18428

I don't think Korean has any relevance here. The only issue I can see is the comma right after the first three characters, which prevents you from using a plain s.split(). Regular expressions handle that easily:

import re
s = '굿모닝, Today is 촉촉'
# Raw string, so \s reaches the regex engine without an invalid-escape warning
re.split(r',?\s', s)

Outputs ['굿모닝', 'Today', 'is', '촉촉']

Just split your string on an optional comma (,?) followed by a mandatory whitespace character (\s).
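If the input might contain several separators in a row (say, a comma followed by two spaces, or a trailing comma), a slightly more tolerant variant could look like the sketch below. The helper name split_words is made up for illustration and is not part of the original answer; it assumes Python 3, where str patterns are Unicode-aware by default.

```python
import re

def split_words(text):
    # Hypothetical helper: split on runs of commas and/or whitespace,
    # then drop any empty strings left at the edges.
    return [tok for tok in re.split(r'[,\s]+', text) if tok]

s = '굿모닝, today is 촉촉'
print(split_words(s))

# Alternative: collect the words instead of splitting on separators.
# In Python 3, \w matches Unicode word characters, Hangul included.
print(re.findall(r'\w+', s))
```

Both calls keep each run of Korean characters together as a single token. The findall version also discards any other punctuation, which may or may not be what you want.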

Upvotes: 4
