Reputation: 2349
I have a string:
Hi there, "Bananas are, by nature, evil.", Hey there.
I want to split the string with commas as the delimiter. How do I get the .split method to ignore the comma inside the quotes, so that it returns 3 strings and not 5.
Upvotes: 4
Views: 3334
Reputation: 1682
You can use regex
in split method
According to this answer the following regex only matches ,
outside of the "
mark
,(?=(?:[^\"]\"[^\"]\")[^\"]$)
so try this code:
str.split(",(?=(?:[^\\\"]*\\\"[^\\\"]*\\\")*[^\\\"]*\$)".toRegex())
Upvotes: 9
Reputation:
If you don't want regular expressions:
val s = "Hi there, \"Bananas are, by nature, evil.\", Hey there."
val hold = s.substringAfter("\"").substringBefore("\"")
val temp = s.split("\"")
val splitted: MutableList<String> = (temp[0] + "\"" + temp[2]).split(",").toMutableList()
splitted[1] = "\"" + hold + "\""
splitted
is the List you want
Upvotes: 2
Reputation: 5300
You have to use a regular expression capable of handling quoted values. See Java: splitting a comma-separated string but ignoring commas in quotes and C#, regular expressions : how to parse comma-separated values, where some values might be quoted strings themselves containing commas
The following code shows a very simple version of such a regular expression.
fun main(args: Array<String>) {
"Hi there, \"Bananas are, by nature, evil.\", Hey there."
.split(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)".toRegex())
.forEach { println("> $it") }
}
outputs
> Hi there
> "Bananas are, by nature, evil."
> Hey there.
Be aware of the regex backtracking problem: https://www.regular-expressions.info/catastrophic.html. You might be better off writing a parser.
Upvotes: 2
Reputation: 41638
You can use split overload that accepts regular expressions for that:
val text = """Hi there, "Bananas are, by nature, evil.", Hey there."""
val matchCommaNotInQuotes = Regex("""\,(?=([^"]*"[^"]*")*[^"]*$)""")
println(text.split(matchCommaNotInQuotes))
Would print:
[Hi there, "Bananas are, by nature, evil.", Hey there.]
Consider reading this answer on how the regular expression works in this case.
Upvotes: 2