Racy Gomes

Reputation: 58

How to Match AND Split REGEX

How can I split a string into groups of ASCII and non-ASCII characters with a regex (Android/Java)?

Input String
"আমি আছি i am ইংরেজি থেকে বাংলা"

Expected Output
আমি আছি
i am
ইংরেজি থেকে বাংলা

Upvotes: 2

Views: 140

Answers (1)

tenub

Reputation: 3446

You could always split on the following:

(?<=[\u0021-\u007E])\s+(?=[^\u0021-\u007E])|(?<=[^\u0021-\u007E])\s+(?=[\u0021-\u007E])

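This splits on whitespace that is either preceded by a printable ASCII character (\u0021-\u007E) and followed by a character outside that range, or preceded by a character outside that range and followed by a printable ASCII character. Whitespace between two characters of the same group is left alone, so "i am" stays together. You can of course change the Unicode ranges to suit your input, using a Unicode character table as a reference.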
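
As a minimal sketch of how this might be used from Java, the pattern can be passed straight to String.split. The class name ScriptSplitter and the method splitByScript are hypothetical names chosen for illustration. Each backslash in the regex is doubled inside the Java string literal.

```java
class ScriptSplitter {
    // The pattern from above: whitespace sitting on a boundary between
    // printable ASCII and anything outside that range.
    static final String BOUNDARY =
        "(?<=[\\u0021-\\u007E])\\s+(?=[^\\u0021-\\u007E])"
        + "|(?<=[^\\u0021-\\u007E])\\s+(?=[\\u0021-\\u007E])";

    // Hypothetical helper: splits the text at every such boundary.
    static String[] splitByScript(String text) {
        return text.split(BOUNDARY);
    }

    public static void main(String[] args) {
        String input = "আমি আছি i am ইংরেজি থেকে বাংলা";
        for (String part : splitByScript(input)) {
            System.out.println(part);
        }
    }
}
```

For the input in the question this should print the three expected lines. One caveat: the negated class also matches whitespace, so a run of two or more spaces between two ASCII words can trigger an unintended split. If your input may contain such runs, collapsing them first (for example with replaceAll("\\s+", " ")) is one way around it.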

Upvotes: 2
