Reputation: 4272

Regex: Match, but don't include part of matched

This is the text:

https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT

I want to get the actual link:

https://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/

The pattern /=3Dhttps.*\//g matches the link, but the match includes the leading =3D, and I want the link without it. How can I do this?

Here's the regex.

Upvotes: 3

Answers (5)

agruwell

Reputation: 150

For those using regex with a language that supports only a limited subset of features (like CMake), none of the other answers may work. In this case, one option is to capture the preceding string (=3D in the OP's case) as part of the match, and then use a string operation to remove it from the match after the fact. It's not elegant, but it works.
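As a rough illustration of that approach, here is a minimal JavaScript sketch that uses no groups or lookarounds. The variable names (text, prefix, link) are made up for the example, and String.raw is used only so that the backslashes in the question's text stay literal.

```javascript
// The text from the question; String.raw keeps "\u0026" as literal characters.
const text = String.raw`https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT`;

// The unwanted part that the pattern has to include in order to anchor the match.
const prefix = '=3D';

// Match the prefix together with the link, using only basic regex features.
const whole = text.match(/=3Dhttps.*\//);

// Strip the prefix afterwards with a plain string operation.
const link = whole === null ? null : whole[0].slice(prefix.length);

console.log(link);
```

The same two-step idea should carry over to any environment that has basic matching plus a substring operation.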

Upvotes: 0

Aleks H. Lamarche

Reputation: 25

I have never used regex in JavaScript, but I have used it extensively in bash, sh, ps and C#, and from what I understand this is what you are looking for:

/=3D(http.*\/)\\/

https://regex101.com/r/bupG3W/1

And for capturing the group inside the match:

var myString = "something format_abc";
var myRegexp = /(?:^|\s)format_(.*?)(?:\s|$)/g;
var match = myRegexp.exec(myString);
console.log(match[1]); // abc

Upvotes: 0

Josh Crozier

Reputation: 240868

One option is to prevent the first http.* substring from being matched by using a negative lookahead with a ^ anchor:

Example Here

(?!^)https:.*\/

This essentially matches https:.*\/ as long as it isn't at the beginning of the string.

Snippet:

var string = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';

console.log(string.match(/(?!^)https:.*\//)[0]);

However, the expression above won't cover all edge cases, so the better option is to just use a capturing group:

Updated Example

=3D(https.*\/)

Snippet:

var string = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';

console.log(string.match(/=3D(https.*\/)/)[1]);
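To make the edge-case concern concrete, here is a small sketch comparing the two expressions on an input where the outer URL does not start at the beginning of the string. The input below is invented for illustration (example.com and the "Link: " prefix are not from the question).

```javascript
// Hypothetical input: a redirect URL preceded by other text,
// so the outer URL no longer starts at index 0.
const input = 'Link: https://www.google.com/url?url=3Dhttps://example.com/page/&ct=3Dga';

// (?!^) only rules out a match at index 0, so the outer URL is matched here.
const viaLookahead = input.match(/(?!^)https:.*\//)[0];

// The capturing group is anchored on =3D, so it still finds the inner URL.
const viaGroup = input.match(/=3D(https.*\/)/)[1];

console.log(viaLookahead); // https://www.google.com/url?url=3Dhttps://example.com/page/
console.log(viaGroup);     // https://example.com/page/
```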

You can also use a negated character class, such as [^\\]+, to match one or more non-\ characters, so the match stops at the first backslash after the link:

Updated Example

=3D(https[^\\]+)
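A sketch of how this pattern behaves, assuming the input really contains literal backslashes (as in the raw text of the question). Note that a normal JavaScript string literal turns \u0026 into &, in which case there is no backslash to stop at. The variable names here are illustrative only.

```javascript
const pattern = /=3D(https[^\\]+)/;

// String.raw keeps "\u0026" as six literal characters, backslash included.
const rawText = String.raw`https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT`;

// In an ordinary string literal, "\u0026" is an escape sequence for "&".
const escapedText = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';

// Stops at the first backslash, right after ".../48453/".
const rawLink = rawText.match(pattern)[1];

// No backslash in this string, so the group runs to the end of the input.
const escapedLink = escapedText.match(pattern)[1];

console.log(rawLink);
console.log(escapedLink);
```

If the input may arrive in either form, the =3D(https.*\/) version is the safer choice, since it stops at the last forward slash in both cases.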

Upvotes: 2

aelor

Reputation: 11116

Make =3D a positive lookbehind:

(?<==3D)https.*\/

Demo here: https://regex101.com/r/sHvRMA/2

Update:

For JavaScript-specific code, use capture groups:

var str = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';
var reg = /=3D(https.*\/)/;
console.log(str.match(reg)[1]);
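For completeness: lookbehind assertions were added to JavaScript in ES2018, so in an engine that implements them (an assumption about your runtime, for example a recent Node.js or browser) the lookbehind pattern can be used directly. A minimal sketch, with illustrative variable names:

```javascript
// String.raw keeps the backslashes from the question's text literal.
const source = String.raw`https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT`;

// (?<==3D) requires "=3D" immediately before the match without consuming it,
// so the whole match (index 0) is already the bare link.
const found = source.match(/(?<==3D)https.*\//);
const bareLink = found ? found[0] : null;

console.log(bareLink);
```

In older engines the regex literal itself is a syntax error, which is why the capture-group version remains the more portable option.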

Upvotes: 1

Gaston Gonzalez

Reputation: 427

This is a great resource for figuring out regex matches:

http://regexr.com/

Upvotes: 0
