KonaKona

Reputation: 109

How to remove a substring between two specific characters in Scala

I have this List in Scala:

List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])

I want to obtain the same List with the substring between | and ] removed, along with the | itself.

So the result would be:

List[String] = List([[aaa]], [[ccc]], [[ooo]])

I tried converting the List to a String and using replaceAll, but I want to keep the result as a List.

Thanks.

Upvotes: 4

Views: 2539

Answers (3)

Mikel San Vicente

Reputation: 3863

Here is a simple solution that should also perform quite well:

val list = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
list.map(str => str.takeWhile(_ != '|') + "]]" )

It assumes that the format of the strings is:

  • Two left square brackets [ at the beginning,
  • then the word we want to extract,
  • and then a pipe |.
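If some strings might not follow that format (for example, an entry with no pipe at all), the takeWhile version would still append "]]" and produce "[[ccc]]]]". A minimal sketch of a guarded variant follows; the object name StripAfterPipe and the helper stripAlias are hypothetical names chosen for illustration, not part of the original answer.

```scala
object StripAfterPipe {
  // Hypothetical helper: drops everything from the first '|' onward and
  // restores the closing "]]". Strings without a '|' are returned unchanged.
  def stripAlias(s: String): String = {
    val pipe = s.indexOf('|')
    if (pipe >= 0) s.substring(0, pipe) + "]]" else s
  }

  def main(args: Array[String]): Unit = {
    // The second entry has no pipe and should pass through untouched.
    val list = List("[[aaa|bbb]]", "[[ccc]]", "[[ooo|sss]]")
    println(list.map(stripAlias)) // List([[aaa]], [[ccc]], [[ooo]])
  }
}
```

This still assumes that every string containing a pipe ends with "]]".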

Upvotes: 5

Wiktor Stribiżew

Reputation: 626826

You can use a simple \|.*?]] regex to match these substrings you need to remove.

Here is a way to perform the replacement in Scala code:

val l = List[String]("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
println(l.map(x => x.replaceAll("""\|.*?(]])""", "$1"))) 


I added a capturing group around ]] and used a $1 backreference in the replacement pattern to insert the ]] back into the result.

Details:

  • \| - a literal | pipe symbol (since it is a special character outside of a character class, it must be escaped)
  • .*? - any zero or more symbols other than line break symbols, as few as possible (a lazy match)
  • (]]) - Group 1, capturing the ]] substring (note that ] outside of a character class does not need escaping, unlike |).
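The lazy quantifier only matters when a single string contains more than one [[...|...]] item, which is not the case in the question's list. As a rough illustration under that assumption, the sketch below compares the lazy pattern with a greedy one; the object name LazyVsGreedy and the sample text are made up for this example.

```scala
object LazyVsGreedy {
  // Hypothetical input: two bracketed items in one string.
  val text = "[[aaa|bbb]] and [[ccc|ddd]]"

  // Lazy .*? stops at the first "]]" after each pipe.
  val lazyResult: String = text.replaceAll("""\|.*?(]])""", "$1")

  // Greedy .* runs from the first pipe to the last "]]" in the string.
  val greedyResult: String = text.replaceAll("""\|.*(]])""", "$1")

  def main(args: Array[String]): Unit = {
    println(lazyResult)   // [[aaa]] and [[ccc]]
    println(greedyResult) // [[aaa]]
  }
}
```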

Upvotes: 4

Nagarjuna Pamu

Reputation: 14825

Replace the |, the 3 characters after it, and the following ] with a single ].

The regex is "\\|(.{3})\\]" (do not forget to escape | and ]).

scala> val list = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
list: List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])

scala> list.map(_.replaceAll("\\|(.{3})\\]", "]"))
res16: List[String] = List([[aaa]], [[ccc]], [[ooo]])
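Note that .{3} only matches when exactly three characters sit between | and ]. If the part after the pipe can have any length, one possible variation (a swapped-in pattern, not the one from this answer) is a negated character class that removes the pipe and everything up to, but not including, the next ]. The object name VariableLength and its two helpers are hypothetical.

```scala
object VariableLength {
  // The pattern from the answer above: requires exactly 3 characters after '|'.
  def fixedWidth(s: String): String = s.replaceAll("\\|(.{3})\\]", "]")

  // Assumed alternative: remove '|' plus any run of characters that are not ']'.
  def anyWidth(s: String): String = s.replaceAll("""\|[^\]]*""", "")

  def main(args: Array[String]): Unit = {
    val list = List("[[aaa|bbb]]", "[[aaaa|bbbbb]]")
    println(list.map(fixedWidth)) // List([[aaa]], [[aaaa|bbbbb]])
    println(list.map(anyWidth))   // List([[aaa]], [[aaaa]])
  }
}
```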

Upvotes: 0
