LiviuIosim

Reputation: 21

How to extract the end of a string after a certain character with regex/Java?

I need to parse multiple lines of text that look, for example, like this:

{"Name":"pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
{"Name":"/opt/pathology/bin/pathology[876]", "cpu":"0.58","mem":"18.39", "vm":"1542.14"}
{"Name":"/usr/sbin/ofonod[760]", "cpu":"0.00","mem":"0.00", "vm":"0.00"}
{"Name":"/opt/networking/bin/network_manager[370]", "cpu":"0.20","mem":"53.43", "vm":"4225.69"}
{"Name":"/usr/bin/dmrouterd[913]", "cpu":"0.00","mem":"0.00", "vm":"0.00"}

I have to extract every process name, but some names appear alone and others are prefixed with their path, which I have to ignore. For example, pathology[876] is the same thing as /opt/pathology/bin/pathology[876]. I need to generalize this so that the process name is extracted regardless of the path. How can I take the string between the last / and the end of the value?

So far I have come up with the following regex, which handles paths like /opt/<anything>/bin/<anything> by extracting the part after bin/. However, it breaks when the path is longer: for /opt/<anything>/bin/pat/pathology[876] I get pat/pathology[876], while I want only pathology[876].

"(Name)":("\/opt\/(.*?)\/bin\/(.*?)"|"(.*?)")

Upvotes: 1

Views: 263

Answers (2)

Felix

Reputation: 56

My steps to create such a regex are:

  1. Which characters may (not) appear in the target string? In this case every character is allowed except " and /: ([^/\"]+)
  2. What comes before the target string? In this case an optional path like /.../.../ that always starts and ends with /. To match each .../ segment we can write ([^"\/]+\/)*, and to match the leading / and make the whole prefix optional we extend it to (\/([^"\/]+\/)*)?
  3. What comes after the target string? The closing "

The final regex could be:

"Name":"(?:\/(?:[^"\/]+\/)*)?([^/\"]+)"

(Note: the syntax (?:X) groups the expression X without capturing it as a "result group".)

I've tested and saved this regex here: https://regex101.com/r/WnSNNk/2
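
Since the question mentions Java, here is a minimal sketch of how the regex above might be used there. The class name `ProcessNameExtractor` and the method `extract` are made up for illustration; the only real API used is `java.util.regex`. In a Java string literal the `/` needs no escaping, and `\"` is only the escape for the literal itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class ProcessNameExtractor {
    // Same pattern as above: optional /segment/segment/ prefix (non-capturing),
    // then group 1 = everything up to the closing quote that is not '/' or '"'.
    private static final Pattern NAME =
            Pattern.compile("\"Name\":\"(?:/(?:[^\"/]+/)*)?([^/\"]+)\"");

    // Returns the process name without its path, or null if the line does not match.
    static String extract(String line) {
        Matcher m = NAME.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] lines = {
            "{\"Name\":\"pathology[876]\", \"cpu\":\"0.58\",\"mem\":\"18.39\", \"vm\":\"1542.14\"}",
            "{\"Name\":\"/opt/pathology/bin/pathology[876]\", \"cpu\":\"0.58\",\"mem\":\"18.39\", \"vm\":\"1542.14\"}",
            "{\"Name\":\"/usr/sbin/ofonod[760]\", \"cpu\":\"0.00\",\"mem\":\"0.00\", \"vm\":\"0.00\"}"
        };
        for (String line : lines) {
            System.out.println(extract(line));
        }
    }
}
```

With the sample lines from the question, both the bare and the path-prefixed pathology entries should yield pathology[876].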

Upvotes: 2

MonkeyZeus

Reputation: 20737

This would do it for you:

[^\/"]+(?=", "cpu")

In English:

Per line, find everything that is neither a forward slash nor a double quote, leading up to ", "cpu"

https://regex101.com/r/u3rhUf/1/
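
A rough Java sketch of the same idea, assuming every line contains the exact text `", "cpu"` (including the single space) right after the name, as in the sample data. The class name `LookaheadExtractor` and the method `extract` are hypothetical; here the whole match, not a capture group, is the process name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class LookaheadExtractor {
    // One or more characters that are neither '/' nor '"', which must be
    // followed by the literal text  ", "cpu"  (lookahead, not consumed).
    private static final Pattern NAME =
            Pattern.compile("[^/\"]+(?=\", \"cpu\")");

    // Returns the matched process name, or null if the line does not match.
    static String extract(String line) {
        Matcher m = NAME.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "{\"Name\":\"/usr/bin/dmrouterd[913]\", \"cpu\":\"0.00\",\"mem\":\"0.00\", \"vm\":\"0.00\"}";
        System.out.println(extract(line));
    }
}
```

This version is shorter but depends on the key order and spacing of the input; if "cpu" does not directly follow "Name", or the space after the comma is missing, nothing is matched.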

Upvotes: 2
