johni

Reputation: 5568

Regex parse content of HTTP request

I'm trying to match these kinds of character sequences:

sender=11&receiver=2&subject=3&message=4
sender=AOFJOIA&receiver=p2308u48302rf0&subject=(@#UROJ)(J#OFN:&message=aoefhoa348!!!

The delimiter between (key, value) pairs is the '&' character. I'd like to capture them in a way that gives me access to the key and the value of each pair.

I tried something like:

([[:alnum:]]+)=([[:alnum:]]+)

But then I miss the:

subject=(@#UROJ)(J#OFN:

I couldn't find a way to allow these types of characters. To be more specific, if there are n key-value pairs, I would like to have n matches, each consisting of 2 groups: one for the key and one for the value.

I'd be glad if you helped me out with this.

Thanks

Upvotes: 0

Views: 1836

Answers (3)

Dmitry JJ

Reputation: 178

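    // Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String req = "sender=AOFJOIA&receiver=p2308u48302rf0&subject=(@#UROJ)(J#OFN:&message=aoefhoa348!!!";
    // Group 1: key made of word characters; group 2: value, everything up to the next '&'
    Pattern p = Pattern.compile("(\\w+)=([^&]+)");
    Matcher m = p.matcher(req);

    // Each call to find() advances to the next key=value pair
    while (m.find()) {
        System.out.println("key = " + m.group(1));
        System.out.println("value = " + m.group(2));
    }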

You should define your own character class for the "value" group of each key/value pair. For instance, it could be [\w!"#$%'()*+,\-./:;<=>?@\[\]^_`{|}~] (note the escaped '-', '[' and ']'), or [\w@()#:!], or as simple as [^&]. I think the [^&] character class is the most appropriate, since you don't know in advance every character that can appear in the "value" part.
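As a minimal sketch of the [^&] approach, the pairs can be collected into a map instead of printed. The class name QueryPairs and the method parse are hypothetical names chosen for illustration. The sketch also assumes that keys may contain anything except '&' and '=', that empty values are acceptable, and that no URL decoding is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class QueryPairs {
    // Key: one or more characters that are neither '&' nor '='.
    // Value: everything up to the next '&' (may be empty).
    private static final Pattern PAIR = Pattern.compile("([^&=]+)=([^&]*)");

    // Returns the pairs in the order they appear; a repeated key keeps its last value.
    static Map<String, String> parse(String body) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(body);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String req = "sender=AOFJOIA&receiver=p2308u48302rf0"
                + "&subject=(@#UROJ)(J#OFN:&message=aoefhoa348!!!";
        parse(req).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

If the body is form-encoded, each key and value would probably still need to go through java.net.URLDecoder.decode afterwards.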

Upvotes: 0

Alvaro Silvino

Reputation: 9753

https://regex101.com/r/hN7qG9/1

I guess that will solve your problem:

/([^?=&]+)(=([^&]*))?/ig

output:

sender=11
receiver=2
subject=3
message=4
sender=AOFJOIA
receiver=p2308u48302rf0
subject=(@#UROJ)(J#OFN:
message=aoefhoa348!!!

and you can access each captured group:

 $1 - first group: the key (sender)
 $2 - second group: the value including '=' (=11)
 $3 - third group: the value without '=' (11)

reference

var string = 'sender=11&receiver=2&subject=3&message=4'
var string2 = 'sender=AOFJOIA&receiver=p2308u48302rf0&subject=(@#UROJ)(J#OFN:&message=aoefhoa348!!!';

var regex = /([^?=&]+)(=([^&]*))?/ig;
var eachMatche = string.match(regex);

for (var i = 0; i < eachMatche.length; i++) {
  snippet.log(eachMatche[i]);
  snippet.log('First : '+eachMatche[i].replace(regex,'$1'));
  snippet.log('Second : '+eachMatche[i].replace(regex,'$3'));
}
var eachMatche = string2.match(regex);
for (var i = 0; i < eachMatche.length; i++) {
  snippet.log(eachMatche[i]);
  snippet.log('First : '+eachMatche[i].replace(regex,'$1'));
  snippet.log('Second : '+eachMatche[i].replace(regex,'$3'));
}
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>
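Since the other answers use Java, here is a rough Java equivalent of the same pattern. The class name OptionalValuePairs and the method describe are hypothetical. The point of the sketch is that the "=value" part is optional, so group 3 is null for a key that has no '=' at all and an empty string for a key followed by a bare '='.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class OptionalValuePairs {
    // Same pattern as the JavaScript version; no flags are needed because
    // Matcher.find() already iterates over all matches.
    private static final Pattern PAIR = Pattern.compile("([^?=&]+)(=([^&]*))?");

    static List<String> describe(String body) {
        List<String> out = new ArrayList<>();
        Matcher m = PAIR.matcher(body);
        while (m.find()) {
            String key = m.group(1);
            String value = m.group(3); // null when the key has no '=' part
            out.add(key + " -> " + (value == null ? "<no value>" : value));
        }
        return out;
    }

    public static void main(String[] args) {
        describe("sender=11&receiver=2&subject=3&message=4")
                .forEach(System.out::println);
    }
}
```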

Upvotes: 1

Christian W

Reputation: 1496

All the special characters in your example fall under the "punctuation" group (\p{Punct}), see:

https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html

If that still isn't enough, you could define your own character class, like [@# etc...]. Keep in mind that regex metacharacters have to be escaped with a backslash, and that the backslash itself must be doubled inside a Java string literal (for example "\\[").
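A minimal sketch of the punctuation-class idea follows, with PunctPairs and parse as hypothetical names. Because \p{Punct} itself contains '&', the sketch uses Java's character class intersection (&&) to exclude the delimiter. It assumes values are ASCII letters, digits, and punctuation only, so spaces or non-ASCII characters would end a value early.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class PunctPairs {
    // Key: ASCII letters and digits.
    // Value: letters, digits, or punctuation, intersected with "anything but '&'".
    private static final Pattern PAIR =
            Pattern.compile("(\\p{Alnum}+)=([\\p{Alnum}\\p{Punct}&&[^&]]+)");

    // Returns one {key, value} array per match, in input order.
    static List<String[]> parse(String body) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(body);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String req = "sender=AOFJOIA&receiver=p2308u48302rf0"
                + "&subject=(@#UROJ)(J#OFN:&message=aoefhoa348!!!";
        for (String[] pair : parse(req)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```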

Upvotes: 0
