Jacek

Reputation: 194

Split long string on ">

I'm trying to use the split() method to split a long string (the content of a text document containing CFML code) made up of repeated tags, each of which ends with the two characters "> followed by a line break.

I can't figure out how to do this; I have tried multiple regexes with no luck. The tags can contain other nested tags (please don't ask why :-) ), and the split breaks on those nested tags even though they do not contain ">.

Example:

<cfset code = "Text text text <table style='width:538px; [... more text stripped ...] </table>">
<cfset another_code = "Text text text">
...

Any clues would be greatly appreciated!

Upvotes: 1

Views: 169

Answers (2)

Pshemo

Reputation: 124275

I am not sure what you are trying to do, but if you want to split on "> followed by a line break, you could use split("\">\r?\n"). Or maybe you want to split only on the line break that has "> before it? In that case you can use a look-behind, as in split("(?<=\">)\r?\n"), which leaves the "> at the end of each piece.
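
To make the difference between the two patterns concrete, here is a minimal sketch. The class name SplitDemo, the helper method names, and the sample input are made up for illustration; the input only imitates the shape of the question's example, with a nested tag that ends in '> rather than ">.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits on "> followed by a line break; the delimiter itself is consumed.
    static String[] splitConsuming(String input) {
        return input.split("\">\r?\n");
    }

    // Splits only on the line break that follows "> (look-behind),
    // so each piece still ends with ">.
    static String[] splitKeeping(String input) {
        return input.split("(?<=\">)\r?\n");
    }

    public static void main(String[] args) {
        // Hypothetical sample: the nested <table ...> tag ends with '> plus a
        // line break, which neither pattern treats as a delimiter.
        String input =
              "<cfset code = \"Text <table style='width:538px;'>\n</table>\">\n"
            + "<cfset another_code = \"Text text text\">\n";

        System.out.println(Arrays.toString(splitConsuming(input)));
        System.out.println(Arrays.toString(splitKeeping(input)));
    }
}
```

With this sample both calls return two pieces. The line break after the nested '> is left alone because it is not preceded by ">, and split() drops the trailing empty string after the last delimiter.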

Upvotes: 1

wchargin

Reputation: 16047

To do it with pure regex, I would use str.split(Pattern.quote("\">")).
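
A small sketch of what that call does, assuming a made-up class name (QuoteSplitDemo), helper name, and input. Pattern.quote turns "> into a literal pattern, so the split happens on every "> and the line break that follows it stays at the start of the next piece.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class QuoteSplitDemo {
    // Splits on the literal two-character sequence ">.
    // Pattern.quote escapes the text so nothing in it is read as regex syntax.
    static String[] splitOnLiteral(String input) {
        return input.split(Pattern.quote("\">"));
    }

    public static void main(String[] args) {
        // Hypothetical input: two short tags, each ending with "> and a line break.
        String input = "<cfset a = \"one\">\n<cfset b = \"two\">\n";
        String[] pieces = splitOnLiteral(input);

        // Later pieces begin with the line break left over from the delimiter,
        // so trimming may be needed afterwards.
        System.out.println(Arrays.toString(pieces));
    }
}
```

Because the line break is not part of the delimiter here, the last piece is a lone line break rather than an empty string, so split() does not drop it.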

However, you should consider using an XML parser such as SAX, StAX, or a DOM parser. There's no need to reinvent the wheel.

Upvotes: 1
