Reputation: 97
I have a field whose value is a concatenated set of fields delimited by | (pipe).
Note: the escape character is also a pipe.
Given:
AB|||1|BC||DE
Required:
["AB|","1","BC|DE"]
How can I split the given string into an array or list to get the required result without iterating character by character (e.g. using a regex or some other method)?
Upvotes: 0
Views: 148
Reputation: 96385
If there's an unused character you can substitute for the doubled pipe, you could do this:
groovy:000> s = "AB|||1|BC||DE"
===> AB|||1|BC||DE
groovy:000> Arrays.asList(s.replaceAll('\\|\\|', '@').split('\\|'))*.replaceAll('@', '|')
===> [AB|, 1, BC|DE]
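Cleaned up with a magic character sequence and tokenize, it would look like:
pipeChars = 'ZZ' // or whatever sequence never occurs in the data
// tokenize takes a set of delimiter characters, not a regex, so the pipe is not escaped
s.replaceAll('\\|\\|', pipeChars).tokenize('|')*.replaceAll(pipeChars, '|')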
Of course this assumes that it's valid to go left-to-right across the string grouping the pipes into pairs, so each pair becomes a single pipe in the output, and the left-over pipes become the delimiters. When you start with something like
['AB|', '|1', 'BC|DE']
which gets encoded as
AB|||||1|BC||DE
then the whole encoding scheme falls apart: it's entirely unclear how to group the pairs of pipes in order to recover the original values. 'X|||||Y' could have been generated by ['X|', '|Y'], ['X||', 'Y'], or ['X', '||Y'], and there is no way to know which it was.
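The left-to-right pairing and the ambiguity described above can be checked with a short sketch. The helper names encode and decode are hypothetical, not from the question. The decoder uses a regex instead of a placeholder character, on the assumption that empty fields can be dropped (as tokenize does).

```groovy
// Hypothetical helpers for illustration only.

// Encode: double every pipe inside a field, then join the fields with a single pipe.
def encode(List<String> fields) {
    fields.collect { it.replace('|', '||') }.join('|')
}

// Decode: each match is a maximal run of non-pipe characters and pipe pairs,
// read left to right; every pair then collapses back to one pipe.
// Empty fields are dropped.
def decode(String s) {
    s.findAll(/(?:[^|]|\|\|)+/)*.replace('||', '|')
}

// The example from the question decodes as required.
assert decode('AB|||1|BC||DE') == ['AB|', '1', 'BC|DE']

// Three different inputs produce the same encoded string.
assert encode(['X|', '|Y']) == 'X|||||Y'
assert encode(['X||', 'Y']) == 'X|||||Y'
assert encode(['X', '||Y']) == 'X|||||Y'

// Left-to-right pairing always settles on the second reading.
assert decode('X|||||Y') == ['X||', 'Y']
```

The last assertion shows the practical consequence: the decoder returns a valid-looking answer, but only one of the three possible originals, so a field that ends or begins with a pipe is not recovered reliably.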
Upvotes: 1
Reputation: 1241
How about using the split method? Note that split takes a regex, so the pipe has to be escaped: split('\\|'). From what you provided, though, it looks like the '|' character can also appear in a field value. Any chance you can change the delimiter character to something that does not occur in the values?
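If changing the delimiter is an option, the round trip reduces to a plain join and split. A minimal sketch, assuming the tab character never occurs inside a field value (that choice of delimiter is an assumption, not something stated in the question):

```groovy
// Assumption: the tab character never occurs inside a field value.
def fields = ['AB|', '1', 'BC|DE']

// Join with the new delimiter; pipes in the values need no escaping.
def encoded = fields.join('\t')

// split takes a regex; a limit of -1 keeps trailing empty fields.
def decoded = encoded.split('\t', -1) as List

assert decoded == fields
```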
Upvotes: 0