Reputation: 95
I've got an XML file full of Facebook messages with elements that need to be rearranged and closed. The structure looks like this:
<john>
<timestamp>Tuesday, August 7, 2012 at 3:53pm EDT</timestamp>
<message>Cats or dogs?</message>
<hillary>
<timestamp>Sunday, August 8, 2012 at 1:54am EST</timestamp>
<message>Ugh, definitely dogs.</message>
The <john> and <hillary> tags need to be closed, and the <timestamp> and <message> elements need to be swapped:
<john>
<message>Cats or dogs?</message>
<timestamp>Tuesday, August 7, 2012 at 3:53pm EDT</timestamp>
</john>
<hillary>
<message>Ugh, definitely dogs.</message>
<timestamp>Sunday, August 8, 2012 at 1:54am EST</timestamp>
</hillary>
I'm new to regular expressions and am having such a hard time with this. Any help would be greatly appreciated!
Upvotes: 0
Views: 842
Reputation: 89584
You can try this:
search : (<([^>]+)>(?:\s+|<([^>]+)>[^<]*</\3>)+)(?=(\r?\n)|$)
replace : $1$4</$2>
If needed, you can be more explicit:
search : (<([^>]+)>(?:\s+|<(timestamp|message)>[^<]*</\3>)+)(?=(\r?\n)|$)
replace : $1$4</$2>
To place the message tags before the timestamp tags:
search : (<timestamp>[^<]*</timestamp>)(\s*)(<message>[^<]*</message>)
replace : $3$2$1
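If you would rather script the two steps than run them in an editor, here is a rough Python sketch of the same idea. The function name fix_messages and the variable raw are made up for illustration, and the sample text is just the snippet from the question. The swap pattern lists the timestamp first, since that is the order in the question's input. It assumes the element text never contains a "<" character.

```python
import re

# Sample input copied from the question (hypothetical variable name).
raw = (
    "<john>\n"
    "<timestamp>Tuesday, August 7, 2012 at 3:53pm EDT</timestamp>\n"
    "<message>Cats or dogs?</message>\n"
    "<hillary>\n"
    "<timestamp>Sunday, August 8, 2012 at 1:54am EST</timestamp>\n"
    "<message>Ugh, definitely dogs.</message>\n"
)

# Step 1: an opening tag followed by whitespace and complete child elements,
# ending just before a line break or at the end of the text.
CLOSE = re.compile(r"(<([^>]+)>(?:\s+|<([^>]+)>[^<]*</\3>)+)(?=(\r?\n)|$)")

# Step 2: a timestamp element, optional whitespace, then a message element.
SWAP = re.compile(r"(<timestamp>[^<]*</timestamp>)(\s*)(<message>[^<]*</message>)")


def fix_messages(text):
    """Close each person's tag, then put the message before the timestamp."""
    closed = CLOSE.sub(
        # Group 4 is unset when the block ends at the end of the text.
        lambda m: m.group(1) + (m.group(4) or "") + "</" + m.group(2) + ">",
        text,
    )
    return SWAP.sub(r"\3\2\1", closed)


if __name__ == "__main__":
    print(fix_messages(raw))
```

On the sample this should give the layout asked for in the question. For real data, an XML-aware approach may be more robust once the tags are closed.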
Upvotes: 1
Reputation: 46
I've done an example for you here that will work for any name.
Search: /<(.*?>)(.*?</message>)/gs
Replace: <$1$2\n</$1
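For anyone who wants to try this pattern outside a regex tester, here is a small Python sketch. The name close_tags is hypothetical, re.DOTALL stands in for the s flag, and re.sub already replaces every match like the g flag does. Note that this pattern only adds the closing tags and relies on </message> being the last child, so the timestamp/message swap would still be a separate step.

```python
import re

# Sample input copied from the question (hypothetical variable name).
raw = (
    "<john>\n"
    "<timestamp>Tuesday, August 7, 2012 at 3:53pm EDT</timestamp>\n"
    "<message>Cats or dogs?</message>\n"
    "<hillary>\n"
    "<timestamp>Sunday, August 8, 2012 at 1:54am EST</timestamp>\n"
    "<message>Ugh, definitely dogs.</message>\n"
)

# Group 1 is the tag name plus ">", group 2 is everything up to </message>.
PATTERN = re.compile(r"<(.*?>)(.*?</message>)", re.DOTALL)


def close_tags(text):
    """Append a closing tag on a new line after each </message>."""
    return PATTERN.sub(r"<\1\2\n</\1", text)


if __name__ == "__main__":
    print(close_tags(raw))
```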
Upvotes: 0