MNY

Reputation: 1536

Failing to extract a value using regex

I have a string that looks like: "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla"

and I want to extract the PersonId, so basically I need everything between PersonId: and the ;. I did something like:

val personIdRegex: Regex = """PersonId:\+s;""".r
val personIdExtracted = personIdRegex.findAllIn(str).matchData.take(1).map(m => m.group(1)).mkString

It's not working, though. I'm pretty weak in regex and would love some help :)

Thanks!

Upvotes: 0

Views: 93

Answers (5)

The fourth bird

Reputation: 163207

You could update your regex to

PersonId:([^;]+)

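This captures one or more characters other than a semicolon in the first capturing group, ([^;]+). Your original pattern, PersonId:\+s;, looks for a literal + followed by s, so it never matches the input, and it defines no capturing group for group(1) to read.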

Then using your code it would look like:

val personIdRegex: Regex = """PersonId:([^;]+)""".r
val str = "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla"
val personIdExtracted = personIdRegex.findAllIn(str).matchData.take(1).map(m => m.group(1)).mkString
println(personIdExtracted)

That will give you:

fruhdHH$skdjJIFROfUB3djeggG$tt

Upvotes: 4

Wiktor Stribiżew

Reputation: 626689

If you want to get the first match (as there will always be one match in the string), it makes more sense to use findFirstIn:

"""(?<=PersonId:)[^;]+""".r.findFirstIn(str).get

The (?<=PersonId:)[^;]+ regex means:

  • (?<=PersonId:) - assert there is PersonId: text immediately to the left of the current position
  • [^;]+ - 1+ chars other than ;
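
As a hedged illustration of the lookbehind described above, here is a minimal Java sketch (Scala's Regex is backed by java.util.regex, so the pattern syntax is the same). The class name LookbehindDemo and the helper extractPersonId are hypothetical names chosen for this example. Because the lookbehind consumes no characters, the whole match returned by group() is already the ID and no capturing group is needed.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookbehindDemo {
    // (?<=PersonId:) asserts the prefix without consuming it; [^;]+ takes the ID.
    private static final Pattern PERSON_ID = Pattern.compile("(?<=PersonId:)[^;]+");

    // Hypothetical helper: returns the first ID, or empty if the input has none.
    static Optional<String> extractPersonId(String input) {
        Matcher m = PERSON_ID.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String str = "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla";
        System.out.println(extractPersonId(str).orElse("<no match>"));
        System.out.println(extractPersonId("no id here").orElse("<no match>"));
    }
}
```

Returning an Optional here is a design choice for the sketch: it avoids the exception that an unconditional .get would raise when the input contains no PersonId.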

See the Scala demo:

val str = "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla"
val personIdRegex = """(?<=PersonId:)[^;]+""".r
val personIdExtracted = personIdRegex.findFirstIn(str).get
println(personIdExtracted)
// => fruhdHH$skdjJIFROfUB3djeggG$tt

Or, more idiomatically, use a match block with an unanchored regex (here, \s* also allows optional whitespace between PersonId: and the ID itself):

val personIdRegex = """PersonId:\s*([^;]+)""".r.unanchored
val personIdExtracted = str match {
  case personIdRegex(person_id) => person_id
  case _ => ""
}

Here, .unanchored lets the pattern match a substring anywhere inside the string instead of requiring it to match the whole string, and ([^;]+) forms a capturing group that can be bound to any name inside the match block (I chose person_id).
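
For readers more familiar with Java, the anchored versus unanchored distinction corresponds roughly to Matcher.matches() versus Matcher.find(). The sketch below is only an illustration of that difference; the class name AnchoringDemo and both helper methods are hypothetical names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchoringDemo {
    private static final Pattern PERSON_ID = Pattern.compile("PersonId:\\s*([^;]+)");

    // Anchored: the pattern must cover the entire input.
    static boolean matchesWholeInput(String input) {
        return PERSON_ID.matcher(input).matches();
    }

    // Unanchored: the pattern may match anywhere; returns group 1, or "" when absent.
    static String findAnywhere(String input) {
        Matcher m = PERSON_ID.matcher(input);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String str = "bla bla bla PersonId: fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla";
        // The surrounding text prevents a full-string match.
        System.out.println(matchesWholeInput(str));
        // The substring search still finds the ID, skipping the optional space.
        System.out.println(findAnywhere(str));
    }
}
```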

Upvotes: 2

virion

Reputation: 1

You can use the following:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla";
// (.*?) lazily captures everything up to the first ';'
Pattern pattern = Pattern.compile("PersonId:(.*?);");
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}

Upvotes: 0

WarCrow

Reputation: 119

If you want to capture fruhdHH$skdjJIFROfUB3djeggG$tt from "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla", you can use this pattern: ".*PersonId:(.*);". This will capture the desired value in group 1.

This pattern can be dissected as follows:

.*PersonId: : matches any characters up to and including "PersonId:"

(.*); : captures any series of characters in the first group, up to a ;. Because .* is greedy, that is the last ; in the string, so the pattern assumes the input contains only one ; after PersonId:.
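
To see how a greedy (.*) differs from a lazy (.*?) or a negated class ([^;]+), it helps to try an input with more than one semicolon. The Java sketch below does that. The input string, the class name GreedyVsLazyDemo, and the helper firstGroup are assumptions made for this illustration and do not come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyVsLazyDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if there is none.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed input with a second ';' after the ID.
        String str = "bla PersonId:abc$123; more text; end";
        // Greedy: runs to the last ';' and prints "abc$123; more text".
        System.out.println(firstGroup(".*PersonId:(.*);", str));
        // Lazy: stops at the first ';' and prints "abc$123".
        System.out.println(firstGroup("PersonId:(.*?);", str));
        // Negated class: cannot cross a ';' and prints "abc$123".
        System.out.println(firstGroup("PersonId:([^;]+)", str));
    }
}
```

With the single-semicolon string from the question, all three patterns return the same ID; they only diverge once a later ; appears in the input.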

Upvotes: 0

Veselin Davidov

Reputation: 7071

You can use this regex:

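import java.util.regex.Matcher;
import java.util.regex.Pattern;

String test = "bla bla bla PersonId:fruhdHH$skdjJIFROfUB3djeggG$tt; bla bla bla";
// ([^;]+) captures one or more characters that are not ';'
Pattern p = Pattern.compile("PersonId:([^;]+)");
Matcher m = p.matcher(test);
if (m.find()) {
    System.out.println(m.group(1));
}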

This searches for PersonId: and captures everything up to the first ; in a group.

Upvotes: 5
