J Gee

Reputation: 127

Kotlin .split() with multiple regex

  Input: """aaaabb\\\\\cc"""
  Pattern: ["""aaa""", """\\""", """\"""]
  Output: [aaa, abb, \\, \\, \, cc]

How can I split Input into Output using the patterns in Pattern in Kotlin?

I found that Regex("(?<=cha)|(?=cha)") keeps the delimiters in the result after splitting, so I tried applying it in a loop over the patterns. However, some patterns, such as '\' and '[', need to be escaped with a backslash, so I can't use a loop for the splitting.

EDIT:

  val temp = mutableListOf<String>()
  // In a regular Kotlin string, a literal backslash in the regex needs four backslashes
  for (e in Input.split(Regex("(?<=\\\\)|(?=\\\\)"))) temp.add(e)

This is what I've been doing, but it does not work for multiple patterns, and it adds an extra "" at the end of temp if Input ends with "\".

Upvotes: 3

Views: 2010

Answers (1)

Wiktor Stribiżew

Reputation: 626738

You may use the function I wrote for some previous question that splits by a pattern keeping all matched and non-matched substrings:

private fun splitKeepDelims(s: String, rx: Regex, keep_empty: Boolean = true): MutableList<String> {
    val res = mutableListOf<String>()  // Result: matched and non-matched substrings, in order
    var start = 0                      // Start position of the next non-matched substring
    rx.findAll(s).forEach {            // Iterate over all matches
        val substr_before = s.substring(start, it.range.first) // Substring before the match start
        if (substr_before.isNotEmpty() || keep_empty) {
            res.add(substr_before)     // Add the substring before the match
        }
        res.add(it.value)              // Add the match itself
        start = it.range.last + 1      // Move the start position past the match
    }
    if (start != s.length) res.add(s.substring(start)) // Add the text after the last match, if any
    return res
}
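
To make the effect of keep_empty concrete, here is a minimal self-contained sketch. splitKeepingDelims is a hypothetical compact restatement of the function above (the name and the "-" delimiter are only for illustration); it should behave the same way, but treat it as an example rather than a replacement:

```kotlin
// Hypothetical restatement of the function above, for illustration only.
fun splitKeepingDelims(s: String, rx: Regex, keepEmpty: Boolean = true): List<String> {
    val res = mutableListOf<String>()
    var start = 0
    for (m in rx.findAll(s)) {
        val before = s.substring(start, m.range.first)
        // An empty "before" appears when two matches are adjacent
        // or when the text starts with a match.
        if (before.isNotEmpty() || keepEmpty) res.add(before)
        res.add(m.value)
        start = m.range.last + 1
    }
    if (start != s.length) res.add(s.substring(start))
    return res
}

fun main() {
    val rx = Regex("-")
    // Two adjacent delimiters: an empty string sits between them.
    println(splitKeepingDelims("a--b", rx, keepEmpty = true))   // [a, -, , -, b]
    println(splitKeepingDelims("a--b", rx, keepEmpty = false))  // [a, -, -, b]
}
```

With keepEmpty = false the empty strings between adjacent matches are dropped, which is what the question's expected output needs for the consecutive backslashes.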

You just need to build a dynamic pattern from your Pattern list items by joining them with |, the alternation operator, while remembering to escape each item:

val Pattern = listOf("aaa", """\\""", "\\") // Define the list of literal patterns
val rx = Pattern.map{Regex.escape(it)}.joinToString("|").toRegex() // Build a pattern, \Qaaa\E|\Q\\\E|\Q\\E
val text = """aaaabb\\\\\cc"""
println(splitKeepDelims(text, rx, false))
// => [aaa, abb, \\, \\, \, cc]


Note that between \Q and \E, all chars in the pattern are considered literal chars, not special regex metacharacters.
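
One thing to watch when the list is built dynamically: alternation is tried left to right, so a shorter literal listed before a longer one that starts with it (a single backslash before a double one) will always win. The list above already has the longer item first. If the order is not under your control, sorting by length first is one way to handle it. A minimal sketch, where literalsToRegex is a hypothetical helper name and not part of any library:

```kotlin
// Hypothetical helper: escapes each literal and joins them longest-first,
// so that a double backslash is tried before a single one.
fun literalsToRegex(literals: List<String>): Regex {
    require(literals.isNotEmpty() && literals.none { it.isEmpty() }) {
        "need at least one non-empty literal"
    }
    return literals
        .sortedByDescending { it.length }
        .joinToString("|") { Regex.escape(it) }
        .toRegex()
}

fun main() {
    val text = """aaaabb\\\\\cc"""
    // Same literals as above, but with the single backslash listed first.
    val unordered = listOf("\\", """\\""", "aaa")

    // Joined as-is: the single backslash matches first every time.
    val naive = unordered.joinToString("|") { Regex.escape(it) }.toRegex()
    println(naive.findAll(text).map { it.value }.toList())
    // => [aaa, \, \, \, \, \]

    // Longest-first: the double backslash is preferred where it fits.
    println(literalsToRegex(unordered).findAll(text).map { it.value }.toList())
    // => [aaa, \\, \\, \]
}
```

The resulting Regex can be passed to splitKeepDelims in the same way as rx above.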

Upvotes: 3
