Reputation: 4272
This is the text:
https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT
I want to get the actual link:
https://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/
The regex /=3Dhttps.*\//g matches the link including the leading =3D, but I want the match without it. How can I do this?
Here's the regex.
Upvotes: 3
Views: 16230
Reputation: 150
For those using regex in a language that supports only a limited subset of features (like CMake), none of the other answers may work. In that case, one option is to capture the preceding string (=3D in the OP's case) as part of the match, and then use a string operation to remove it from the match after the fact. It's not elegant, but it works.
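As a rough sketch of that idea, written in JavaScript only to stay consistent with the rest of the thread (the helper name extractAfterPrefix is hypothetical, and the same two steps would be done with the host language's own string functions): match the prefix together with the link, then slice the prefix off.

```javascript
// Hypothetical helper: match the prefix together with the URL, then
// remove the prefix with a plain string operation (no groups or lookarounds).
function extractAfterPrefix(text, prefix) {
  // Assumes `prefix` contains no regex metacharacters ("=3D" does not).
  const match = text.match(new RegExp(prefix + 'https.*/'));
  if (match === null) return null;
  return match[0].slice(prefix.length); // drop the leading prefix
}

const text = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';
console.log(extractAfterPrefix(text, '=3D'));
// https://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/
```

The helper returns null when nothing matches, so callers can tell "no link found" apart from an empty result.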
Upvotes: 0
Reputation: 25
I have never used regex in JavaScript, but I have used it extensively in bash, sh, ps and C#, and from what I understand this is what you are looking for:
/=3D(http.*\/)\\
https://regex101.com/r/bupG3W/1
And to read the captured group from a match (generic example):
var myString = "something format_abc";
var myRegexp = /(?:^|\s)format_(.*?)(?:\s|$)/g;
var match = myRegexp.exec(myString);
console.log(match[1]); // abc
Upvotes: 0
Reputation: 240868
One option is to prevent the first http.* substring from being matched by using a negative lookahead with a ^ anchor:
(?!^)https:.*\/
This essentially matches https:.*\/ as long as the match doesn't start at the beginning of the string.
Snippet:
var string = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';
console.log(string.match(/(?!^)https:.*\//)[0]);
However, the expression above won't cover all edge cases, so the better option is to just use a capturing group:
=3D(https.*\/)
Snippet:
var string = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';
console.log(string.match(/=3D(https.*\/)/)[1]);
You can also use a negated character class, such as [^\\]+, to match one or more non-\ characters:
=3D(https[^\\]+)
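A possible snippet for this variant follows. One assumption to flag: the negated class only helps when the input really contains a backslash. In an ordinary quoted JavaScript string, \u0026 is turned into & when the literal is parsed, so the sketch uses String.raw to keep the backslash, as it would appear in text read from a file or a request body.

```javascript
// String.raw keeps the backslash in "\u0026" literal; in a normal quoted
// string JavaScript would decode it to "&" and there would be no "\" to stop at.
const rawText = String.raw`https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT`;

// [^\\]+ consumes characters up to, but not including, the first backslash.
const negatedMatch = rawText.match(/=3D(https[^\\]+)/);
console.log(negatedMatch[1]);
// https://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/
```

Unlike https.*\/, this version does not depend on the link ending with a slash; it simply stops at the first backslash.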
Upvotes: 2
Reputation: 11116
Make =3D a positive lookbehind:
(?<==3D)https.*\/
Demo: https://regex101.com/r/sHvRMA/2
Update: for JavaScript-specific code, use a capture group:
var str = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';
var reg = /=3D(https.*\/)/;
console.log(str.match(reg)[1]);
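For comparison, here is a sketch of the lookbehind pattern used directly in JavaScript. It assumes an engine that implements ES2018 lookbehind assertions; on older engines the regex literal is a syntax error, and the capture-group version above is the safer choice there.

```javascript
// Assumes ES2018 lookbehind support; older engines reject (?<=...) at parse time.
const input = 'https://www.google.com/url?rct=3Dj\u0026sa=3Dt\u0026url=3Dhttps://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/\u0026ct=3Dga\u0026cd=3DCAEYASoTOT';

// (?<==3D) requires "=3D" immediately before the match without including it.
const lookbehind = /(?<==3D)https.*\//;
const lookbehindMatch = input.match(lookbehind);
console.log(lookbehindMatch[0]);
// https://rivesjournal.com/inside-track-trading-focus-on-shares-of-adobe-systems-inc-adbe/48453/
```

Here the whole match (index 0) is already the bare link, so no group index is needed.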
Upvotes: 1