Mojimi

Reputation: 3171

How to capture an HTML attribute value using a Lua pattern

This is what the text I'm trying to extract from looks like: http://pastebin.com/VD0K3ZcN

lines:match([[title="(value here)">]])

How can I get the "value here" part? It never contains digits or the ">" symbol, only letters, spaces, ', - and .

I have tried

lines:match([[title="(.+)">]])

but the capture simply ran on to the end of the line instead of stopping at the first closing quote.

Upvotes: 1

Views: 1316

Answers (1)

Advert

Reputation: 653

The problem with your pattern is this:

title="    -- This is fine, but you probably want to find out what tag title is in.
(.+)       -- Problem: Greedy match. I'll illustrate this later.
">         -- Will match a closing tag with a double quote.

Now, if I have this HTML:

<html>
 <head title="Foobar">
 </head>
 <body onload="somejs();">
 </body>
</html>

Your pattern will capture (line breaks and indentation omitted):

Foobar"></head><body onload="somejs();

You can fix this by using (.-). This is the non-greedy version, and it will match the least amount possible, stopping once it finds the next "> instead of the last ">.
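
Here is a small sketch comparing the two, using the HTML above collapsed onto a single line (that input string is my own assumption, not your pastebin data). It also shows a negated character class, [^"]+, as an alternative I'd consider, since it can never run past the closing quote:

```lua
-- Sample input: the HTML from this answer on one line (assumed for illustration).
local html = [[<html><head title="Foobar"></head><body onload="somejs();"></body></html>]]

-- Greedy: .+ grabs as much as it can, so it runs to the LAST "> in the string.
local greedy = html:match([[title="(.+)">]])

-- Non-greedy: .- grabs as little as it can, so it stops at the FIRST "> after title=".
local lazy = html:match([[title="(.-)">]])

-- Alternative: match everything that is not a double quote.
local class = html:match([[title="([^"]+)"]])

print(greedy)
print(lazy)
print(class)
```

The first print shows the overlong capture from above; the other two should both print just Foobar. In your code you would call the same patterns on lines instead of html.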

Upvotes: 4
