user1224096

Reputation: 121

How do I deal with this regex issue?

I need to extract a value from a hidden HTML field. I have partly figured it out, but I'm currently stuck.

My regex looks like:

<input type="hidden" name="form_id" value=".*"

But this matches the whole string from the HTML, not just the value.

The string looks like:

<input type="hidden" name="form_id" value="123"/>

I need to extract the "value" from the string. It is always changing, but the "name" is always the same. Is there a way to extract it without running a second expression? I appreciate any help.

Upvotes: 2

Views: 171

Answers (3)

fhulprogrammer

Reputation: 669

<[a-zA-Z"= _^>]*value="(\d*)"/>
I have tested this against your example.
If you want to match only input tags, you can write:

<input[a-zA-Z"= _^>]*value="(\d*)"/>
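A quick way to check the pattern is to run it against the sample tag. The sketch below uses Python's `re` module purely for illustration (the question does not say which language is in use); `sample` and `pattern` are hypothetical names.

```python
import re

# Sample tag from the question.
sample = '<input type="hidden" name="form_id" value="123"/>'

# Pattern from this answer, restricted to <input> tags.
pattern = re.compile(r'<input[a-zA-Z"= _^>]*value="(\d*)"/>')

match = pattern.search(sample)
value = match.group(1) if match else None
print(value)
```

Note that the character class only admits letters, spaces, and a few symbols, so the pattern should stop matching if an earlier attribute contains a digit or a hyphen.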

Upvotes: 0

humble_coder

Reputation: 2787

I just threw this together. Basically, you want a negated character class so the match cannot run past the closing >. So you'd likely want something like this:

<[^>]*hidden[^>]*value="([^"]*)"[^>]*>

And then read the first capture group (Delphi instructions). This keeps it reasonably generic, although it does assume that "hidden" appears before "value" in the tag.

To find the value regardless of attribute order, you could use a slightly cleaner lookahead, as was suggested:

 ^(?=.*name="form_id").*value="([^"]*)".*$
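As a rough illustration of reading the first capture group, here is a sketch using Python's `re` module (an assumption on my part; the pattern itself should carry over to other Perl-style engines). The helper name `extract_form_id` is hypothetical. It uses the lookahead pattern, so the attribute order does not matter.

```python
import re

# Lookahead pattern from above: require name="form_id" somewhere on the
# line, then capture whatever sits inside the value attribute.
PATTERN = re.compile(r'^(?=.*name="form_id").*value="([^"]*)".*$')

def extract_form_id(line):
    """Return the value attribute of the form_id field, or None if absent."""
    match = PATTERN.search(line)
    return match.group(1) if match else None

print(extract_form_id('<input type="hidden" name="form_id" value="123"/>'))
print(extract_form_id('<input value="456" type="hidden" name="form_id"/>'))
```

Because of the `^` and `$` anchors, this sketch assumes each tag sits on its own line and is checked one line at a time.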

Upvotes: 1

burning_LEGION

Reputation: 13450

(?<=<[^<>]+?name="form_id"[^<>]+value=")(.*)(?=")
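This relies on a variable-length lookbehind, which not every regex engine accepts. Python's built-in `re` module, for example, rejects it, so the sketch below (Python chosen only for illustration; `find_form_id` is a hypothetical name) moves the lookbehind text into the match and reads a capture group instead.

```python
import re

# The pattern from this answer, kept verbatim.
LOOKBEHIND = r'(?<=<[^<>]+?name="form_id"[^<>]+value=")(.*)(?=")'

try:
    re.compile(LOOKBEHIND)
    lookbehind_supported = True
except re.error:
    # Python's re module requires fixed-width lookbehind patterns.
    lookbehind_supported = False

# Same idea, with the prefix matched normally and the value captured.
CAPTURE = re.compile(r'<[^<>]+?name="form_id"[^<>]+value="([^"]*)"')

def find_form_id(html):
    """Return the captured value, or None when no form_id field is found."""
    match = CAPTURE.search(html)
    return match.group(1) if match else None

print(lookbehind_supported)
print(find_form_id('<input type="hidden" name="form_id" value="123"/>'))
```

The capture version also swaps the greedy `(.*)` for `[^"]*`, so the value stops at its own closing quote.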

Upvotes: 3
