nando

Reputation: 55

Regex match for URL parameter

I am trying to extract the name of the magnet link. Here is my code:

use strict;
use warnings;
use v5.18.0;

my $link = 'magnet:?xt=urn:btih:CD46E14A7D62A85607D0F38F0CEE6EE7FEA34209&dn=inherent+vice+2014+dvdscr&tr=udp%3A%2F%2Fexplodie.org%3A6969%2Fannounce&tr=udp%3A%2F%2Fopen.demonii.com%3A1337';

$link =~ m/dn=(.*)&/;
my $link_text = $1;
say $link_text;

The resulting $link_text includes characters past the end of the name: inherent+vice+2014+dvdscr&tr=udp%3A%2F%2Fexplodie.org%3A6969%2Fannounce

I can strip off the trailing parameters with a subsequent substitution:

$link_text =~ s/&.*//;
say $link_text;

That leaves what I'm looking for: inherent+vice+2014+dvdscr

What am I doing wrong in the first regex?

Upvotes: 1

Views: 223

Answers (1)

hwnd

Reputation: 70750

* is a greedy quantifier: it matches as much as it can while still allowing the remainder of the regular expression to match. In your pattern, .* therefore runs all the way to the last & in the string rather than the first. Use *? for a non-greedy match, meaning "zero or more, preferably as few as possible".

dn=(.*?)&
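
As a rough sketch of how this might look in a full script, the snippet below wraps the match in a small helper and checks that it succeeded before using the capture. The name extract_dn and the shortened sample link are made up for illustration. It also swaps in a negated character class, [^&]*, in place of .*?& since that variant still matches when dn is the last parameter and no & follows it.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for illustration): returns the raw
# dn value from a magnet link, or undef if there is no dn parameter.
sub extract_dn {
    my ($link) = @_;

    # [?&] anchors on the start of a parameter; [^&]* takes everything
    # up to, but not including, the next '&' (or the end of the string).
    my ($name) = $link =~ /[?&]dn=([^&]*)/;
    return $name;
}

# Shortened, made-up link used only for this example.
my $link = 'magnet:?xt=urn:btih:ABC123&dn=inherent+vice+2014+dvdscr&tr=udp%3A%2F%2Ftracker.example%3A6969';

my $name = extract_dn($link);
if (defined $name) {
    print "$name\n";
}
else {
    print "no dn parameter found\n";
}
```

Capturing in list context, as in my ($name) = ..., avoids reading a stale $1 left over from an earlier successful match when the current one fails.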

Upvotes: 2
