Reputation: 127
Input: """aaaabb\\\\\cc"""
Pattern: ["""aaa""", """\\""", """\"""]
Output: [aaa, abb, \\, \\, \, cc]
How can I split Input to Output using patterns in Pattern in Kotlin?
I found that Regex("(?<=cha)|(?=cha)") helps patterns to remain after spliting, so I tried to use looping, but some of the patterns like '\' and '[' require escape backslash, so I'm not able to use loop for spliting.
EDIT:
val temp = mutableListOf<String>()
for (e in Input.split(Regex("(?<=\\)|(?=\\)"))) temp.add(e)
This is what I've been doing, but this does not work for multiple regex, and this add extra "" at the end of temp if Input ends with "\"
Upvotes: 3
Views: 2010
Reputation: 626738
You may use the function I wrote for some previous question that splits by a pattern keeping all matched and non-matched substrings:
private fun splitKeepDelims(s: String, rx: Regex, keep_empty: Boolean = true) : MutableList<String> {
var res = mutableListOf<String>() // Declare the mutable list var
var start = 0 // Define var for substring start pos
rx.findAll(s).forEach { // Looking for matches
val substr_before = s.substring(start, it.range.first()) // // Substring before match start
if (substr_before.length > 0 || keep_empty) {
res.add(substr_before) // Adding substring before match start
}
res.add(it.value) // Adding match
start = it.range.last()+1 // Updating start pos of next substring before match
}
if ( start != s.length ) res.add(s.substring(start)) // Adding text after last match if any
return res
}
You just need a dynamic pattern from yoyur Pattern
list items by joining them with a |
, an alternation operator while remembering to escape all the items:
val Pattern = listOf("aaa", """\\""", "\\") // Define the list of literal patterns
val rx = Pattern.map{Regex.escape(it)}.joinToString("|").toRegex() // Build a pattern, \Qaaa\E|\Q\\\E|\Q\\E
val text = """aaaabb\\\\\cc"""
println(splitKeepDelims(text, rx, false))
// => [aaa, abb, \\, \\, \, cc]
See the Kotlin demo
Note that between \Q
and \E
, all chars in the pattern are considered literal chars, not special regex metacharacters.
Upvotes: 3