Mukesh Addon

Reputation: 17

Regex that skips URLs inside Markdown links

The pattern is: /(?:https?:\/\/)?(?:[^.]+\.)?momento360\.com\/e\/(.*)?/i

This pattern captures the part of the URL that follows momento360.com/e/. It currently matches all three of these inputs:

[click here](https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be?utm_campaign=embed&utm_source=other&size=medium)


https://momento360.com/e/u/a9b53aa8f8b0403ba7a4e18243aabc66


https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be?upload-key=e84e1fb3567546a885a2a223bde6ef32

Now I want the pattern to ignore any URL that appears inside a Markdown link such as [click here](...).
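To make the problem concrete, here is a minimal Python sketch of the current behaviour. It assumes the pattern is meant to have its dots escaped and uses Python's `re` module purely for illustration; the variable names are hypothetical.

```python
import re

# The pattern from the question, with the dots escaped (an assumption about
# the intended pattern; "/" needs no escaping in Python).
PATTERN = re.compile(
    r"(?:https?://)?(?:[^.]+\.)?momento360\.com/e/(.*)",
    re.IGNORECASE,
)

# The three sample inputs from the question, one per line.
sample = (
    "[click here](https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be"
    "?utm_campaign=embed&utm_source=other&size=medium)\n"
    "https://momento360.com/e/u/a9b53aa8f8b0403ba7a4e18243aabc66\n"
    "https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be"
    "?upload-key=e84e1fb3567546a885a2a223bde6ef32"
)

# Group 1 holds whatever follows "momento360.com/e/" up to the end of the line.
captures = [m.group(1) for m in PATTERN.finditer(sample)]

for path in captures:
    print(path)
```

All three lines produce a capture, and the first one comes from inside the Markdown link (it even drags the closing parenthesis along), which is the match that should be ignored.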

Upvotes: 1

Views: 95

Answers (2)

Wiktor Stribiżew

Reputation: 626758

You can use

\[click here]\(http[^()]*\)(*SKIP)(*F)|(?:https?:\/\/)?(?:[^.]+\.)?momento360\.com\/e\/(.*)

See the regex demo.

Here,

  • \[click here]\(http[^()]*\)(*SKIP)(*F) - matches [click here](...) substring and skips this match, starting the search for a new match at the location where the failure occurred
  • (?:https?:\/\/)?(?:[^.]+\.)?momento360\.com\/e\/(.*) - matches an optional http:// or https://, then matches an optional sequence of any 1+ chars other than a dot and then a dot, then momento360.com/e/ string and then captures into Group 1 any zero or more chars other than line break chars, as many as possible.
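The (*SKIP)(*F) verbs are a PCRE feature. Python's built-in `re` module does not support them, so the sketch below swaps in a different but related technique: the Markdown link is matched by a first alternative that captures nothing, and matches whose Group 1 is empty are discarded afterwards. The function name `extract_paths` and the variable names are hypothetical.

```python
import re

# Branch 1 consumes a whole "[click here](...)" link without capturing anything.
# Branch 2 is the URL pattern; only this branch fills Group 1.
PATTERN = re.compile(
    r"\[click here\]\(http[^()]*\)"
    r"|(?:https?://)?(?:[^.]+\.)?momento360\.com/e/(.*)",
    re.IGNORECASE,
)

def extract_paths(text):
    """Return Group 1 for every URL that is not inside a [click here](...) link."""
    # When branch 1 matched, Group 1 is None, so that match is dropped here.
    # This filtering step stands in for what (*SKIP)(*F) does inside PCRE.
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]

sample = (
    "[click here](https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be"
    "?utm_campaign=embed&utm_source=other&size=medium)\n"
    "https://momento360.com/e/u/a9b53aa8f8b0403ba7a4e18243aabc66\n"
    "https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be"
    "?upload-key=e84e1fb3567546a885a2a223bde6ef32"
)

paths = extract_paths(sample)
for path in paths:
    print(path)
```

Because the link alternative is tried first and consumes the entire link, the URL inside it is never offered to the second alternative. In an engine that supports the verbs, the original pattern can be used unchanged and no filtering step is needed.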

Upvotes: 1

mohammad asghari

Reputation: 1894

If all the URLs you want to skip are wrapped in parentheses and the others are not, you can use this:

[^(]https?:\/\/(?:[^.]+\.)?momento360\.com\/e\/(.*)
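One thing to watch with this approach: `[^(]` has to consume a real character, so a URL at the very start of the input is not matched. The Python sketch below shows that, alongside a variation using a negative lookbehind. The lookbehind form is an assumption added for illustration, not part of the answer, and both patterns assume the scheme (http:// or https://) is always present.

```python
import re

# The answer's idea: the URL must be preceded by some character other than "(".
CONSUMING = re.compile(
    r"[^(]https?://(?:[^.]+\.)?momento360\.com/e/(.*)", re.IGNORECASE
)

# Variation (not from the answer): a zero-width check that the previous
# character is not "(", which also succeeds at the start of the input.
LOOKBEHIND = re.compile(
    r"(?<!\()https?://(?:[^.]+\.)?momento360\.com/e/(.*)", re.IGNORECASE
)

sample = (
    "https://momento360.com/e/u/a9b53aa8f8b0403ba7a4e18243aabc66\n"
    "[click here](https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be"
    "?utm_campaign=embed&utm_source=other&size=medium)\n"
    "https://momento360.com/e/uc/1478291a8dd94a8198339f1ffe4b97be"
    "?upload-key=e84e1fb3567546a885a2a223bde6ef32"
)

# findall returns Group 1 for each match because the pattern has one group.
consuming_hits = CONSUMING.findall(sample)
lookbehind_hits = LOOKBEHIND.findall(sample)

print(consuming_hits)
print(lookbehind_hits)
```

With this input the consuming version finds only the last URL, while the lookbehind version finds both bare URLs; neither matches the URL inside the Markdown link.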

Upvotes: 1
