Tarlen

Reputation: 3797

Extracting a URL from a string using regex

I want to extract the first valid URL from a string. The URL can appear anywhere in the string, surrounded by other characters and whitespace.

I have tried the following:

...
urlRegex: /^(http[s]?:\/\/.*?\/[a-zA-Z-_]+.*)$/,

...
var input = event.target.value; // <--- some string
var url   = input.match(this.urlRegex);

The problem is that url contains the whole string when a URL is found, instead of just the part of the string that matches the URL.

Example: the string

https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd

returns

["https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd", "https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd", index: 0, input: "https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd"]

How can this be achieved?

Upvotes: 3

Views: 23557

Answers (3)

Adriano Nico Verona

Reputation: 14

That's because the match result holds the whole matched string first, followed by the capture groups. I guess you want the group, so you can use this:

url[1]

Here's a fiddle: http://jsfiddle.net/jgt8u6pc/1/

var urlRegex = /^http[s]?:\/\/.*?\/([a-zA-Z-_]+).*$/;
var input = 'http://stackoverflow.com/questions/31760030/extracting-for-url-from-string-using-regex'; // <--- some string
var url = input.match(urlRegex);

$('#one').text(url[0]);
$('#two').text(url[1]);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="one"></div>
<div id="two"></div>
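
To make the shape of the match result concrete, here is a minimal sketch without jQuery. The sample string and the two-group regex are assumptions chosen for illustration, not the regex from the question.

```javascript
// Sample input and regex are illustrative assumptions.
const sample = 'see https://example.com/docs/page for details';
const result = sample.match(/(https?):\/\/(\S+)/);

// Index 0 holds the full matched text; indexes 1 and up hold the capture groups.
const fullMatch = result[0];      // the whole URL
const scheme = result[1];         // first capture group
const rest = result[2];           // second capture group
const matchIndex = result.index;  // position of the match in the input

console.log(fullMatch, scheme, rest, matchIndex);
```

Note that result[0] is only the matched portion, not the whole input. In the question it equals the whole input because the regex itself is written to match from the start of the string to the end.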

Upvotes: -1

Braj

Reputation: 46871

  • You haven't included digits in your regex as part of the URL.
  • This answer assumes the URL starts at the beginning of the string.

Live demo, with an explanation of the regex on the left side.


var regex = /^(https?:\/\/[^/]+(\/[\w-]+)+)/;
var str = 'https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd';

var url = str.match(regex)[0];
document.write(url);
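
Since the question says the URL can be anywhere in the string, it may help to see what the ^ anchor does. The sketch below is an assumption-laden illustration: the unanchored variant and the "read " prefix on the input are mine, not part of the answer above.

```javascript
// Same pattern as above, with and without the leading ^ anchor.
const anchored = /^(https?:\/\/[^/]+(\/[\w-]+)+)/;
const unanchored = /(https?:\/\/[^/]+(\/[\w-]+)+)/;

// The "read " prefix is added here so the URL is no longer at index 0.
const text = 'read https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd';

// With ^ the match must begin at the start of the string, so this is null.
const anchoredResult = text.match(anchored);

// Without ^ the engine scans forward until the pattern fits.
const unanchoredResult = text.match(unanchored);
const found = unanchoredResult ? unanchoredResult[0] : null;

console.log(anchoredResult, found);
```

One caveat with this pattern: [^/]+ accepts any character other than a slash, including spaces, so it is a loose check on the host part rather than a validation of it.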

Upvotes: 3

Shrinivas Shukla

Reputation: 4463

Your regex is incorrect: it is anchored with ^ and $ and ends in .*, so the capture group always spans the whole string.

A regex that extracts just the URL: /(https?:\/\/[^ ]*)/

Check out this fiddle.

Here is the snippet.

var urlRegex = /(https?:\/\/[^ ]*)/;

var input = "https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd";
var url = input.match(urlRegex)[1];
alert(url);
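
One thing to watch for: match returns null when the string contains no URL, so indexing the result directly throws a TypeError. A minimal sketch of a null-safe wrapper follows. The name extractFirstUrl is hypothetical, and using \S instead of [^ ] is my own variation so that tabs and newlines also end the URL.

```javascript
// extractFirstUrl is a hypothetical helper name, not part of the answer above.
function extractFirstUrl(text) {
  // \S+ consumes non-whitespace characters, so the match stops at the
  // first space, tab, or newline after the URL.
  const match = text.match(/https?:\/\/\S+/);
  // match is null when no URL is present; return null instead of throwing.
  return match ? match[0] : null;
}

const first = extractFirstUrl(
  'prefix https://medium.com/aspen-ideas/there-s-no-blueprint-26f6a2fbb99c random stuff sd'
);
const none = extractFirstUrl('no link here');

console.log(first, none);
```

This only finds the first run of non-whitespace text starting with http:// or https://. It does not check that the result is a well-formed URL, and trailing punctuation such as a closing parenthesis or full stop will be included in the match.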

Upvotes: 15
