user2652936

Divide a string with RegExp in Groovy

I have an XML response containing a lot of characters, like this:

def xmlString = "<TAG1>1239071ABCDEFGH</TAG1><TAG2>1239071ABCDEFGH</TAG2>"

I am using XmlSlurper to try to keep only the digits:

def node = new XmlSlurper().parseText(xmlString)
def nodelist = [node.tag1.tag2]

After this, "node" has a value like "1239071123907112390711239071", and I try to use a Java regular expression to split the digits into groups of 7:

System.out.println(java.util.Arrays.toString( nodelist.node.split("(?<=\G.{7})") ))

What did I do wrong? It doesn't work.

Upvotes: 1

Views: 289

Answers (1)

tim_yates

Reputation: 171084

Assuming you have some valid XML like:

def xmlString = """<document>
                  |    <TAG1>1239071ABCDEFGH</TAG1>
                  |    <TAG2>1239071ABCDEFGH</TAG2>
                  |</document>""".stripMargin()

Then you can get all elements starting with TAG, and for each of these trim off the end chars which aren't digits:

def nodeList = new XmlSlurper().parseText( xmlString )
                               .'**'
                               .findAll { node ->
                                   node.name().startsWith( 'TAG' )
                               }
                               .collect { node ->
                                   node.text().takeWhile { ch ->
                                       Character.isDigit( ch )
                                   }
                               }

nodeList in this example would then equal:

assert nodeList == ['1239071', '1239071']

If you want to keep these numbers associated with the TAG that contained them (assuming TAGn tags are unique), then you can change to collectEntries:

def nodeList = new XmlSlurper().parseText( xmlString )
                               .'**'
                               .findAll { node ->
                                   node.name().startsWith( 'TAG' )
                               }    
                               .collectEntries { node ->
                                   [ node.name(), node.text().takeWhile { Character.isDigit( it ) } ]
                               }


assert nodeList == [TAG1:'1239071', TAG2:'1239071']
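
If the goal is still to split one long run of digits into groups of 7, as in the question, the following is a minimal sketch rather than a definitive fix. The variable names and the sample input (copied from the value quoted in the question) are assumptions for illustration. It uses a slashy string so the backslash in \G does not have to be doubled, as it would in a double-quoted string, and shows collate as a regex-free alternative:

```groovy
// Hypothetical input: the concatenated digits described in the question
def digits = '1239071123907112390711239071'

// Split after every 7th character; the slashy string keeps \G as written
def byRegex = digits.split(/(?<=\G.{7})/) as List

// Regex-free alternative: break the characters into sublists of 7, then rejoin each
def byCollate = digits.toList().collate(7)*.join()

assert byRegex == ['1239071', '1239071', '1239071', '1239071']
assert byCollate == byRegex
```

The collate version does not depend on lookbehind support for \G, which may make it the easier one to reason about.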

Upvotes: 1
