Regex for finding URLs inside text and parsing them into URI parts

Question

I have a long text that may contain links like schema://example.com/{entity}/{id}.

I need to extract them as pairs like this:

{entity1} => {id1}
{entity1} => {id2}
{entity2} => {id3}
{entity2} => {id4}

I can extract all the URLs with

\bschema:\/\/(?:(?!&[^;]+;)[^\s"'<>)])+\b

and then parse each one with

schema:\/\/example\.com\/(.*)\/(.*)

But I need a more efficient way. Could you help me, please?

Andreas · Accepted Answer

I'm not sure I fully understood the question, but this should do what you need.

The pattern captures the entity and the id, and I then combine the two capture groups with array_combine.

// Group 1 is the entity, group 2 is the id; the id ends at a dot, whitespace, or end of text.
preg_match_all("~schema://example\.com/(.*?)/(.*?)(\.|\s|$)~", $txt, $matches);

$arr = array_combine($matches[1], $matches[2]);
var_dump($arr);
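
One caveat: array_combine uses the entities as array keys, so when the same entity appears more than once (as in the question's entity1 => id1, entity1 => id2), only the last id for that entity survives. A minimal sketch of one way around this, assuming ids contain no dots, slashes, or whitespace, is to collect the ids into a list per entity. The function name groupIdsByEntity and the sample text are made up for illustration.

```php
<?php
// Hypothetical helper: returns [entity => [id, id, ...]] for every
// schema://example.com/{entity}/{id} link found in $txt.
function groupIdsByEntity(string $txt): array
{
    // Assumption: entity and id contain no "/" or whitespace, and the id contains no ".".
    $pattern = '~schema://example\.com/([^/\s]+)/([^/\s.]+)~';

    // PREG_SET_ORDER yields one array per match: [full match, entity, id].
    preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER);

    $grouped = [];
    foreach ($matches as $m) {
        // Append instead of overwrite, so repeated entities keep all their ids.
        $grouped[$m[1]][] = $m[2];
    }
    return $grouped;
}

// Made-up sample text with repeated entities.
$txt = "See schema://example.com/user/12 and schema://example.com/user/34. "
     . "Also schema://example.com/order/56 or schema://example.com/order/78";

var_dump(groupIdsByEntity($txt));
```

This still scans the text only once, so it keeps the single-pass approach while preserving every entity/id pair.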
